Extract RBF logic into policy/rbf (refactoring, tx fees and policy, validation)

https://github.com/bitcoin/bitcoin/pull/22675

Host: glozow  -  PR author: glozow

Notes

  • Replace by Fee (RBF) is a method of fee-bumping a transaction by creating a higher-feerate replacement transaction that spends the same inputs as their original transaction. BIP125 specifies how users can signal replaceability in their transactions and a set of conditions that replacement transactions must meet in order to be accepted.

  • RBF is an example of mempool policy, a set of validation rules applied to unconfirmed transactions.

  • PR #22675 extracts the RBF logic into a policy/rbf module and adds documentation. This allows each function to be unit tested and reused for future projects such as RBF in package validation and Witness Replacement.

Questions

  1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK?

  2. What are some benefits of extracting the RBF logic into its own module?

  3. Why is the error returned here a TxValidationResult::TX_CONSENSUS instead of TxValidationResult::TX_POLICY, given that this is RBF-related logic?

  4. In BIP125 Rule #2, why is it important for the replacement transaction to not introduce any new unconfirmed inputs?

  5. The code in PaysMoreThanConflicts() doesn’t seem to directly correspond to a rule in BIP125. Why do we have it, and why don’t we merge it with the other fee-related checks in PaysForRBF()?

  6. In BIP125 Rule #4, what does it mean for a transaction to “pay for its own bandwidth?” Why don’t we just allow any replacement as long as it has a higher feerate?

  7. In BIP125 Rule #5, how is it possible for a transaction to conflict with more than 100 mempool transactions, given that mempool ancestor and descendant limits don’t allow a chain of more than 25 transactions?

  8. In PaysForRBF(), why is hash passed by reference but original_fees by value? Should the fee parameters be const? Why or why not?

  9. When considering a transaction for submission to the mempool, we check to see if it would cause any mempool entries to exceed their descendant limits. How do we make sure we don’t overestimate by counting both the new and to-be-replaced transactions, since they would share ancestors?

  10. This PR introduces a circular dependency: policy/rbf -> txmempool -> validation -> policy/rbf. This can be avoided by not making policy/rbf depend on txmempool or by using the approach in PR #22677. Which approach do you prefer, and why?

Meeting Log

  118:00 <glozow> #startmeeting
  218:00 <larryruane> hi
  318:00 <glozow> Welcome to PR Review Club!
  418:01 <svav> Hi
  518:01 <glozow> We're continuing the RBF theme this week, with #22675 extract RBF logic into policy/rbf
  618:01 <schmidty> hi
  718:01 <michaelfolkson> hi
  818:01 <glozow> notes: https://bitcoincore.reviews/22675
  918:01 <glozow> PR: https://github.com/bitcoin/bitcoin/pull/22675
 1018:01 <merkle_noob[m]> Hello everyone.
 1118:01 <dunxen> hi!
 1218:01 <dariusparvin> hi!
 1318:01 <t-bast> hi everyone!
 1418:01 <glozow> Did anybody get a chance to review the PR? y/n
 1518:02 <glozow> or look at the notes?
 1618:02 <glozow> does everybody have BIP125 memorized?
 1718:02 <dunxen> y-ish
 1818:02 <Azorcode> Hello Everyone
 1918:02 <dunxen> BIP125 not memorized sorry :(
 2018:02 <ccdle12> hihi
 2118:02 <michaelfolkson> yes to all questions, I can recite BIP 125 word for word by memory
 2218:02 <dopedsilicon> Hello
 2318:03 <glozow> We're basically going over BIP125 today. And you'll get extra credit for knowing some of the finer details :)
 2418:03 <glozow> First question, since this is moveonly PR: What are some benefits of extracting the RBF logic into its own module?
 2518:04 <dunxen> we can reduce bloat in validation.cpp and test RBF stuff in isolation
 2618:04 <svav> de-bloating validation.cpp
 2718:04 <dopedsilicon> will prepare for future PRs
 2818:05 <svav> more modular code
 2918:05 <dariusparvin> Package RBF can use the same logic
 3018:05 <glozow> dunxen: svav: dopedsilicon: dariusparvin: yes! :)
 3118:05 <larryruane> this is allow unit tests (tests in isolation), but we don't have such tests now, do we? (just want to confirm)
 3218:05 <glozow> so many good things
 3318:05 <glozow> larryruane: correct, I believe the best way to test rbf logic is to run test/functional/feature_rbf.py right now
 3418:06 <glozow> we also have a fuzzer src/test/fuzz/rbf.cpp which tests the `SignalsOptInRBF` function
 3518:06 <dunxen> i also learnt about witness replacement today!
 3618:06 <glozow> dunxen: yay! wanna tell us what it is? :)
 3718:06 <glozow> (or larryruane?)
 3818:07 <dunxen> replacing same-txid-different-wtxid when the witness is smaller because higher fee rate? can do that in a BIP125 like way right?
 3918:07 <glozow> Okie anyway, we can also continue with the questions
 4018:07 <glozow> Why is the error returned here a TxValidationResult::TX_CONSENSUS instead of TxValidationResult::TX_POLICY, given that this is RBF-related logic?https://github.com/bitcoin/bitcoin/blob/0ed5ad1023d9ced8cb0930747539c78edd523dc8/src/validation.cpp#L774
 4118:08 <glozow> dunxen: yep!
 4218:08 <larryruane> yes I think it's that currently, if we get a tx that is the same as one already in our mempool except that its witness is different, we currently reject it, but we should accept it if its witness is smaller (higher fee rate), at least by a certain margin
 4318:09 <glozow> larryruane: right exactly
 4418:09 <larryruane> and what is "nice" about that type of replacement is that we don't have to evict any ancestors or decendents, since the replacement tx must specify exactly the same of those
 4518:10 <larryruane> so in a way it's a simpler replacement
 4618:10 <michaelfolkson> Cos we are talking about a potentially invalid transaction (consensus not policy)
 4718:10 <dunxen> for it being a consensus error, i'm not sure about it in depth, but isn't it basically checking for a tx that spends outputs that would be replaced by it. seems like consensus issue
 4818:11 <larryruane> glozow: ".. CONSENSUS .." because if the replacement were to happen (let's say B replaces A), then B couldn't possibly be valid, since one of its inputs is one of B's outputs!
 4918:11 <larryruane> sorry meant one of A's outputs
 5018:11 <glozow> dunxen: larryruane: yes exactly. this is a consensus error because this transaction must be inconsistent if it's dependencies and conflicts intersect
 5118:11 <glozow> its*
 5218:12 <glozow> next question: why is it important for the replacement transaction to not introduce any new unconfirmed inputs?
 5318:12 <larryruane> i couldn't figure this one out!
 5418:12 <glozow> This is BIP125 Rule #2, described here: https://github.com/bitcoin/bips/blob/master/bip-0125.mediawiki#implementation-details
 5518:13 <larryruane> maybe introduces some kind of DoS vector?
 5618:13 <dunxen> i was actually trying to think of an attack for this one
 5718:13 <glozow> larryruane: not DoS vector, no
 5818:13 <glozow> hint: https://github.com/bitcoin/bitcoin/blob/7e75400bb568fe8a653246c4e76f6baab2455a61/src/validation.cpp#L842-L851
 5918:14 <dunxen> Might we be able to invalidate another unconfirmed transaction that didn't opt into RBF?
 6018:14 <glozow> this restriction is more to make our RBF logic easier rather than prevent attacks, I think
 6118:15 <dunxen> oh cool checking the link
 6218:15 <t-bast> If I'm side-tracking, you can answer later: will we have a way to get rid of this rule for package-RBF? It's an annoying pinning vector and needed the special carve-out rule to be added, which is somewhat hackish...
 6318:15 <larryruane> ok sounds like in the future, we may allow this (which we can do because it's policy not consensus)
 6418:16 <glozow> t-bast: ah interesting, yes we can get rid of it, though I wasn't planning to.
 6518:17 <glozow> I agree it would improve RBF
 6618:17 <glozow> t-bast: to clarify, do you mean the CPFP carve out rule, https://github.com/bitcoin/bitcoin/pull/15681 ?
 6718:17 <t-bast> glozow: we should double-check with BlueMatt, but I think it would be nice to get rid of it, it made anchor outputs useless for lightning until Matt added the carve-out exception...
 6818:18 <t-bast> yes exactly, CPFP carve-out rule
 6918:18 <dunxen> does it simplify things since we don't need to check fee rate of the new tx corresponding to the new input which might be low fee rate?
 7018:19 <larryruane> going back to question 3 for a second, this really helped me understand why it's a consensus failure https://github.com/glozow/bitcoin/blob/2021-08-rbf/test/functional/feature_rbf.py#L324
 7118:20 <glozow> dunxen: yes! :) you can imagine that, if the new replacement transaction adds higher fees, but has a low ancestor fee, it's not actually going to be a better candidate for miners
 7218:20 <dunxen> larryruane: thanks for the test link!
 7318:21 <glozow> so let's say we're replacing mempool tx, A, with a new tx, B
 7418:21 <glozow> A is 3 sat/vb and B is 5 sat/vb
 7518:21 <glozow> but B has an unconfirmed input, another transaction in the mempool with 1 sat/vb
 7618:22 <glozow> this isn't an "attack" but it's a case we want to consider when we're deciding whether or not a new transaction is a better candidate for mining
 7718:22 <glozow> hopefully this makes sense?
 7818:22 <dunxen> clears it up a lot, thanks!
 7918:22 <glozow> great!
 8018:23 <glozow> next question: The code in `PaysMoreThanConflicts()` doesn’t seem to directly correspond to a rule in BIP125. Why do we have it, and why don’t we merge it with the other fee-related checks in `PaysForRBF()`?
 8118:23 <glozow> `PaysMoreThanConflicts()`: https://github.com/bitcoin/bitcoin/blob/a33fdd0b981965982754b39586eedb7ae456ba57/src/policy/rbf.h#L76
 8218:23 <glozow> `PaysForRBF()`: https://github.com/bitcoin/bitcoin/blob/a33fdd0b981965982754b39586eedb7ae456ba57/src/policy/rbf.h#L88
 8318:24 <larryruane> as a higher-level point then, we're trying to run our mempool management as if we're a miner (like, what would a miner think is good), really (i think?) so that we can *forward* transactions appropriately ... but miners could act differently than we expect! E.g. a miner *could* replace that tx (that has a new unconfirmed input)
 8418:25 <dunxen> i guess in this case anti-DoS can be argued somewhat? also want to make sure there is incentive for a miner to use the new one?
 8518:25 <larryruane> (i hope that's right, a question, not an assertion!)
 8618:26 <glozow> larryruane: yes, I consider this to be one of the main goals of mempool - to keep the most incentive-compatible transaction candidates for mining, even if we're not a miner
 8718:26 <glozow> dunxen: why anti-DoS?
 8818:27 <sipa> there is a fundamental conflict between DoS protection (e.g. not allowing people to use the P2P network to broadcast transactions that will never confirm), and miner incentives (maximizing joint fee of the top 1 block worth of the mempool)... our goal so far is minimizing / specifying where that conflict occurs, but to an extent, it's inevitble
 8918:28 <glozow> sipa: right. if we had infinite resources, we would only need to apply consensus rules
 9018:28 <dunxen> i'd think it's expensive to replace transactions, so the cost should be higher fee rate each time you want to do that right?
 9118:29 <sipa> i guess there is an additional one: our mempool is also _our_ prediction of what will confirm, and in theory, miners could keep mempools with e.g. conflicting transactions around, and pick the best combination at the time of mining
 9218:29 <sipa> but that's incompatible with us trying to guess what will happen
 9318:30 <michaelfolkson> sipa: And BlueMatt said the other day Lightning wants the most relaxed mempool policies possible (partly in jest perhaps). So that is another factor in the conflict
 9418:30 <sipa> michaelfolkson: well, everything wants them to be as relaxed as possible
 9518:30 <sipa> the world would be a lot simpler :p
 9618:30 <michaelfolkson> Ha
 9718:30 <sipa> but that's not DoS friendly
 9818:31 <glozow> dunxen: yes! the best way to avoid expensive operations is to disallow replacing transactions, but we also want to allow replacements so that we can get better-fee transactions. so this is a happy middle
 9918:31 <sipa> it's particularly worrisome for Lightning and perhaps other 2nd layer protocols, as their correctness relies on being able to predict what will confirm, which is... eh
10018:31 <larryruane> there's been that dust limit discussion lately on the mailing list, maybe that's another instance of this, sort of
10118:31 <sipa> larryruane: yes, though hopefully one that's moot in scenarios where fees are high enough
10218:32 <sipa> (as fees will make everything that's remotely classifiable as dust also actually uneconomical, so nobody will create it)
10318:33 <glozow> i feel like dust doesn't fall into the buckets of anti-DoS and useful mempool, but into a 3rd category of "(non-consensus) rules we think are good for the network to follow"
10418:34 <larryruane> glozow: interesting! makes sense
10518:35 <glozow> anyway, I'll answer the last question: `PaysMoreThanConflicts()` is an early-exit thing. We haven't gone digging in the mempool for descendants-of-conflicts yet, but we can compare fees with the direct conflicts because we'd be able to exit immediately if we're not paying more than them.
10618:35 <michaelfolkson> A big mempool and the prediction element isn't important. A small mempool and then we're potentially talking retrieving transactions we don't have in our mempool to validate a block (not ideal)
10718:36 <glozow> in `PaysMoreThanConflicts()`, we're making sure the replacement transaction fees >= direct conflicts fees. if it's not better, we can quit immediately. anybody have an idea why?
10818:36 <glozow> if it is better, we'll also make sure that the replacement tx fees >= conflicts + descendants fees.
10918:37 <glozow> ^why do we need to do that?
11018:37 <sipa> glozow: well i'd consider dust accumulation in the mempool and p2p bandwidth/validation cost both DoS concerns, though very different ones admittedly
11118:37 <larryruane> glozow: so it's just performance .. would be great to add a comment to `PaysMoreThanConflicts()` to explain that
11218:38 <glozow> sipa: in mempool, if the tx ends up confirmed, then I'd consider it a good use of resources right?
11318:39 <glozow> larryruane: comment is here https://github.com/bitcoin/bitcoin/pull/22855/files#diff-fa5cb2d84034ff72cdb9d479b17cf8c744a9bf3fc932b3a77c1a017edd767dfaR132-R141
11418:39 <glozow> wait sorry no, this is the comment: https://github.com/bitcoin/bitcoin/pull/22855/files#diff-97c3a52bc5fad452d82670a7fd291800bae20c7bc35bb82686c2c0a4ea7b5b98R785-R790
11518:40 <glozow> (btw, if you think RBF is confusing and want more docs, #22855 would be good to review :P)
11618:41 <glozow> I'm going to continue with the notes, but feel free to ask questions about past stuff
11718:41 <glozow> In BIP125 Rule #4, what does it mean for a transaction to “pay for its own bandwidth?” Why don’t we just allow any replacement as long as it has a higher feerate?
11818:42 <michaelfolkson> I think BIP 125 needs some footnotes explaining the rationale for the rules too
11918:42 <michaelfolkson> We hounded ariard to explain their rationale at the Sydney Socratic. Perhaps they were explained in an old mailing list post.... https://btctranscripts.com/sydney-bitcoin-meetup/2021-07-06-socratic-seminar/#bip-125-rules-and-their-rationale
12018:42 <michaelfolkson> I don't think they are in the code comments
12118:43 <dariusparvin> paying for its own bandwidth means paying a fee which includes an additional amount that would cover its minimum relay fee?
12218:44 <dariusparvin> *the node's minimum relay fee
12318:44 <larryruane> dariusparvin: I think you're right -- we already relayed the original tx(es) and if we accept the replacement, we have to replay it too (?)
12418:46 <glozow> dariusparvin: larryruane: yep. we'll be relaying the transaction. and we already relayed the original one(s). so it needs to pay for that second relay
12518:46 <dariusparvin> i guess having rule #4 rather than just #3 is also a kind of DoS protection? So that the sender doesn't keep incrementing the replacement tx by 1 sat
12618:46 <larryruane> so in effect the new tx has to pay for the old tx's relay (which was in a sense an unnecessary relay (of the old one))
12718:46 <glozow> what would happen if we allowed any replacement as long as it pays more fees?
12818:46 <glozow> dariusparvin: indeed, this is definitely a DoS protection
12918:46 <larryruane> glozow: bump by one sat each replacement?
13018:46 <dunxen> we could just change the fees by 1 sat if it didn't pay for its own bandwidth right? that would mean we could replace a lot pretty cheaply and DoS?
13118:47 <glozow> larryruane: dunxen: yes! and we'll be spamming the network with transactions
13218:47 <dopedsilicon> One can keep increasing the fees by insignifacnt amounts
13318:47 <dunxen> yes that too haha
13418:47 <glozow> and if everyone else is also accepting them, they'll also be flooding the transactions
13518:48 <dunxen> oh that would be kinda bad
13618:48 <glozow> pretty gross abuse of mempool resources and network bandwidth
13718:49 <glozow> so here's a nice tradeoff between protecting mempool resources + wanting higher fee transactions :)
13818:49 <glozow> next question: In BIP125 Rule #5, how is it possible for a transaction to conflict with more than 100 mempool transactions, given that mempool ancestor and descendant limits don’t allow a chain of more than 25 transactions?
13918:50 <dunxen> could it be like a wide tree of descendants?
14018:50 <larryruane> is it because the new tx could have (say) 110 inputs, all of which are the outputs of separate mempool transactions?
14118:50 <glozow> larryruane: ding ding ding
14218:50 <glozow> it can conflict with a bunch of independent transactions in the mempool, yep
14318:51 <glozow> dunxen: wide tree of descendants would still be counted together
14418:51 <dunxen> oh got it, so like independent ancestors would not?
14518:52 <glozow> dunxen: yeah, they're not connected, so they won't hit the ancestor/descendant limits
14618:52 <glozow> next question. here's the signature for `PaysForRBF()`: https://github.com/bitcoin/bitcoin/blob/a33fdd0b981965982754b39586eedb7ae456ba57/src/policy/rbf.h#L88
14718:52 <glozow> why is `hash` passed by reference but `original_fees` by value? Should the fee parameters be const? Why or why not?
14818:54 <larryruane> i think reference versus non-reference is often (not always!) a performance consideration.. we don't want to pass a 32-byte thing on the stack (copy), but a `CAmount` is just 8 bytes, usually the same size as a pointer (which is what a reference really is)
14918:54 <larryruane> so passing a CAmount by reference is passing the same amount of data (8 bytes), but is slower to access (memory lookup)
15018:55 <glozow> larryruane: yes! we often want to pass by reference because (1) we want to mutate it in the function or (2) the size of the reference is smaller than the object itself, e.g. `CAmount`
15118:56 <larryruane> it's kinda too bad which of those two reasons is operative isn't obvious from reading the function prototype!
15218:56 <glozow> and what about the other part of the question - should it be const? why or why not?
15318:57 <glozow> `original_fees`, `replacement_fees`, and `replacement_vsize`
15418:57 <larryruane> specifying a pass-by-value (as in this case with the CAmount) as a `const` in the prototype actually is always wrong, because the caller doesn't care whether the called functions changes the variable *internally*
15518:57 <larryruane> (or shouldn't)
15618:57 <glozow> right, what's the scope of `original_fees`?
15718:58 <larryruane> but in the definition of the function, `const` can make sense
15818:58 <larryruane> glozow: do you mean in the calling function or the called function?
15918:59 <glozow> larryruane: the callee. or both, if you want to elaborate :)
16018:59 <larryruane> in pass-by-value, the called function gets a copy, and the scope of that copy is only within that called function (out of scope when the function returns)
16119:00 <glozow> larryruane: yep!
16219:00 <larryruane> in the calling function, the scope is before and after the call
16319:00 <glozow> and even if the callee mutates it, it's a copy, so it doesn't matter to the caller
16419:00 <glozow> looks like we have run out of time. thanks for diving into RBF with me today, everyone! hope it was fun
16519:00 <glozow> #endmeeting