Extract RBF logic into policy/rbf (refactoring, tx fees and policy, validation)

Sep 1, 2021

https://github.com/bitcoin/bitcoin/pull/22675

Host: glozow - PR author: glozow

Notes

Replace by Fee (RBF) is a method of fee-bumping a transaction by creating a higher-feerate replacement transaction that spends the same inputs as the original transaction. BIP125 specifies how users can signal replaceability in their transactions and a set of conditions that replacement transactions must meet in order to be accepted.
RBF is an example of mempool policy, a set of validation rules applied to unconfirmed transactions.
PR #22675 extracts the RBF logic into a policy/rbf module and adds documentation. This allows each function to be unit tested and reused for future projects such as RBF in package validation and Witness Replacement.

Questions

Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK?
What are some benefits of extracting the RBF logic into its own module?
Why is the error returned here a TxValidationResult::TX_CONSENSUS instead of TxValidationResult::TX_POLICY, given that this is RBF-related logic?
In BIP125 Rule #2, why is it important for the replacement transaction to not introduce any new unconfirmed inputs?
The code in PaysMoreThanConflicts() doesn’t seem to directly correspond to a rule in BIP125. Why do we have it, and why don’t we merge it with the other fee-related checks in PaysForRBF()?
In BIP125 Rule #4, what does it mean for a transaction to “pay for its own bandwidth?” Why don’t we just allow any replacement as long as it has a higher feerate?
In BIP125 Rule #5, how is it possible for a transaction to conflict with more than 100 mempool transactions, given that mempool ancestor and descendant limits don’t allow a chain of more than 25 transactions?
In PaysForRBF(), why is hash passed by reference but original_fees by value? Should the fee parameters be const? Why or why not?
When considering a transaction for submission to the mempool, we check to see if it would cause any mempool entries to exceed their descendant limits. How do we make sure we don’t overestimate by counting both the new and to-be-replaced transactions, since they would share ancestors?
This PR introduces a circular dependency: policy/rbf -> txmempool -> validation -> policy/rbf. This can be avoided by not making policy/rbf depend on txmempool or by using the approach in PR #22677. Which approach do you prefer, and why?

Meeting Log

18:00

<glozow> #startmeeting

18:00

<larryruane> hi

18:00

<glozow> Welcome to PR Review Club!

18:01

<svav> Hi

18:01

<glozow> We're continuing the RBF theme this week, with #22675 extract RBF logic into policy/rbf

18:01

<schmidty> hi

18:01

<michaelfolkson> hi

18:01

<glozow> notes: https://bitcoincore.reviews/22675

18:01

<glozow> PR: https://github.com/bitcoin/bitcoin/pull/22675

18:01

<merkle_noob[m]> Hello everyone.

18:01

<dunxen> hi!

18:01

<dariusparvin> hi!

18:01

<t-bast> hi everyone!

18:01

<glozow> Did anybody get a chance to review the PR? y/n

18:02

<glozow> or look at the notes?

18:02

<glozow> does everybody have BIP125 memorized?

18:02

<dunxen> y-ish

18:02

<Azorcode> Hello Everyone

18:02

<dunxen> BIP125 not memorized sorry :(

18:02

<ccdle12> hihi

18:02

<michaelfolkson> yes to all questions, I can recite BIP 125 word for word by memory

18:02

<dopedsilicon> Hello

18:03

<glozow> We're basically going over BIP125 today. And you'll get extra credit for knowing some of the finer details :)

18:03

<glozow> First question, since this is moveonly PR: What are some benefits of extracting the RBF logic into its own module?

18:04

<dunxen> we can reduce bloat in validation.cpp and test RBF stuff in isolation

18:04

<svav> de-bloating validation.cpp

18:04

<dopedsilicon> will prepare for future PRs

18:05

<svav> more modular code

18:05

<dariusparvin> Package RBF can use the same logic

18:05

<glozow> dunxen: svav: dopedsilicon: dariusparvin: yes! :)

18:05

<larryruane> this is to allow unit tests (tests in isolation), but we don't have such tests now, do we? (just want to confirm)

18:05

<glozow> so many good things

18:05

<glozow> larryruane: correct, I believe the best way to test rbf logic is to run test/functional/feature_rbf.py right now

18:06

<glozow> we also have a fuzzer src/test/fuzz/rbf.cpp which tests the `SignalsOptInRBF` function

18:06

<dunxen> i also learnt about witness replacement today!

18:06

<glozow> dunxen: yay! wanna tell us what it is? :)

18:06

<glozow> (or larryruane?)

18:07

<dunxen> replacing same-txid-different-wtxid when the witness is smaller because higher fee rate? can do that in a BIP125 like way right?

18:07

<glozow> Okie anyway, we can also continue with the questions

18:07

<glozow> Why is the error returned here a TxValidationResult::TX_CONSENSUS instead of TxValidationResult::TX_POLICY, given that this is RBF-related logic? https://github.com/bitcoin/bitcoin/blob/0ed5ad1023d9ced8cb0930747539c78edd523dc8/src/validation.cpp#L774

18:08

<glozow> dunxen: yep!

18:08

<larryruane> yes I think it's that currently, if we get a tx that is the same as one already in our mempool except that its witness is different, we currently reject it, but we should accept it if its witness is smaller (higher fee rate), at least by a certain margin

18:09

<glozow> larryruane: right exactly

18:09

<larryruane> and what is "nice" about that type of replacement is that we don't have to evict any ancestors or descendants, since the replacement tx must specify exactly the same of those

18:10

<larryruane> so in a way it's a simpler replacement

18:10

<michaelfolkson> Cos we are talking about a potentially invalid transaction (consensus not policy)

18:10

<dunxen> for it being a consensus error, i'm not sure about it in depth, but isn't it basically checking for a tx that spends outputs that would be replaced by it. seems like consensus issue

18:11

<larryruane> glozow: ".. CONSENSUS .." because if the replacement were to happen (let's say B replaces A), then B couldn't possibly be valid, since one of its inputs is one of B's outputs!

18:11

<larryruane> sorry meant one of A's outputs

18:11

<glozow> dunxen: larryruane: yes exactly. this is a consensus error because this transaction must be inconsistent if it's dependencies and conflicts intersect

18:11

<glozow> its*
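
The consistency check discussed above can be sketched as a set-disjointness test. This is only an illustration, not the code in the PR: `TxId` and `AncestorsAndConflictsDisjoint` are hypothetical names, and txids are modelled as strings rather than 256-bit hashes.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for a txid; strings keep the sketch short.
using TxId = std::string;

// Returns true when no transaction is both an unconfirmed ancestor the
// replacement spends from and a transaction the replacement would evict.
// If the sets intersect, the replacement would spend an output of a
// transaction it conflicts with, so it could never be valid.
bool AncestorsAndConflictsDisjoint(const std::set<TxId>& ancestors,
                                   const std::set<TxId>& conflicts)
{
    for (const TxId& txid : conflicts) {
        if (ancestors.count(txid) > 0) return false;
    }
    return true;
}
```

In larryruane's example, B spends an output of A while also conflicting with A, so A is in both sets and the check fails.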

18:12

<glozow> next question: why is it important for the replacement transaction to not introduce any new unconfirmed inputs?

18:12

<larryruane> i couldn't figure this one out!

18:12

<glozow> This is BIP125 Rule #2, described here: https://github.com/bitcoin/bips/blob/master/bip-0125.mediawiki#implementation-details

18:13

<larryruane> maybe introduces some kind of DoS vector?

18:13

<dunxen> i was actually trying to think of an attack for this one

18:13

<glozow> larryruane: not DoS vector, no

18:13

<glozow> hint: https://github.com/bitcoin/bitcoin/blob/7e75400bb568fe8a653246c4e76f6baab2455a61/src/validation.cpp#L842-L851

18:14

<dunxen> Might we be able to invalidate another unconfirmed transaction that didn't opt into RBF?

18:14

<glozow> this restriction is more to make our RBF logic easier rather than prevent attacks, I think

18:15

<dunxen> oh cool checking the link

18:15

<t-bast> If I'm side-tracking, you can answer later: will we have a way to get rid of this rule for package-RBF? It's an annoying pinning vector and needed the special carve-out rule to be added, which is somewhat hackish...

18:15

<larryruane> ok sounds like in the future, we may allow this (which we can do because it's policy not consensus)

18:16

<glozow> t-bast: ah interesting, yes we can get rid of it, though I wasn't planning to.

18:17

<glozow> I agree it would improve RBF

18:17

<glozow> t-bast: to clarify, do you mean the CPFP carve out rule, https://github.com/bitcoin/bitcoin/pull/15681 ?

18:17

<t-bast> glozow: we should double-check with BlueMatt, but I think it would be nice to get rid of it, it made anchor outputs useless for lightning until Matt added the carve-out exception...

18:18

<t-bast> yes exactly, CPFP carve-out rule

18:18

<dunxen> does it simplify things since we don't need to check fee rate of the new tx corresponding to the new input which might be low fee rate?

18:19

<larryruane> going back to question 3 for a second, this really helped me understand why it's a consensus failure https://github.com/glozow/bitcoin/blob/2021-08-rbf/test/functional/feature_rbf.py#L324

18:20

<glozow> dunxen: yes! :) you can imagine that, if the new replacement transaction adds higher fees, but has a low ancestor fee, it's not actually going to be a better candidate for miners

18:20

<dunxen> larryruane: thanks for the test link!

18:21

<glozow> so let's say we're replacing mempool tx, A, with a new tx, B

18:21

<glozow> A is 3 sat/vb and B is 5 sat/vb

18:21

<glozow> but B has an unconfirmed input, another transaction in the mempool with 1 sat/vb

18:22

<glozow> this isn't an "attack" but it's a case we want to consider when we're deciding whether or not a new transaction is a better candidate for mining

18:22

<glozow> hopefully this makes sense?

18:22

<dunxen> clears it up a lot, thanks!

18:22

<glozow> great!
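
glozow's example can be worked through numerically. The feerates (3, 5 and 1 sat/vb) come from the discussion; the transaction sizes below are assumptions chosen for illustration, and `TxSketch` and `PackageFeerate` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Minimal model of a transaction for feerate arithmetic.
struct TxSketch {
    int64_t fee_sat;   // absolute fee in satoshis
    int64_t vsize_vb;  // virtual size in vbytes
};

// Feerate a miner effectively earns when `child` can only be mined
// together with its unconfirmed `parent`.
double PackageFeerate(const TxSketch& child, const TxSketch& parent)
{
    assert(child.vsize_vb > 0 && parent.vsize_vb > 0);
    return static_cast<double>(child.fee_sat + parent.fee_sat) /
           static_cast<double>(child.vsize_vb + parent.vsize_vb);
}
```

Assuming B is 200 vb (1000 sat at 5 sat/vb) and its unconfirmed parent is 800 vb (800 sat at 1 sat/vb), the pair pays 1800 sat over 1000 vb, or 1.8 sat/vb. That is below A's 3 sat/vb, even though B alone looks better than A. With other sizes the result could differ, which is why the case needs to be considered at all.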

18:23

<glozow> next question: The code in `PaysMoreThanConflicts()` doesn’t seem to directly correspond to a rule in BIP125. Why do we have it, and why don’t we merge it with the other fee-related checks in `PaysForRBF()`?

18:23

<glozow> `PaysMoreThanConflicts()`: https://github.com/bitcoin/bitcoin/blob/a33fdd0b981965982754b39586eedb7ae456ba57/src/policy/rbf.h#L76

18:23

<glozow> `PaysForRBF()`: https://github.com/bitcoin/bitcoin/blob/a33fdd0b981965982754b39586eedb7ae456ba57/src/policy/rbf.h#L88

18:24

<larryruane> as a higher-level point then, we're trying to run our mempool management as if we're a miner (like, what would a miner think is good), really (i think?) so that we can *forward* transactions appropriately ... but miners could act differently than we expect! E.g. a miner *could* replace that tx (that has a new unconfirmed input)

18:25

<dunxen> i guess in this case anti-DoS can be argued somewhat? also want to make sure there is incentive for a miner to use the new one?

18:25

<larryruane> (i hope that's right, a question, not an assertion!)

18:26

<glozow> larryruane: yes, I consider this to be one of the main goals of mempool - to keep the most incentive-compatible transaction candidates for mining, even if we're not a miner

18:26

<glozow> dunxen: why anti-DoS?

18:27

<sipa> there is a fundamental conflict between DoS protection (e.g. not allowing people to use the P2P network to broadcast transactions that will never confirm), and miner incentives (maximizing joint fee of the top 1 block worth of the mempool)... our goal so far is minimizing / specifying where that conflict occurs, but to an extent, it's inevitable

18:28

<glozow> sipa: right. if we had infinite resources, we would only need to apply consensus rules

18:28

<dunxen> i'd think it's expensive to replace transactions, so the cost should be higher fee rate each time you want to do that right?

18:29

<sipa> i guess there is an additional one: our mempool is also _our_ prediction of what will confirm, and in theory, miners could keep mempools with e.g. conflicting transactions around, and pick the best combination at the time of mining

18:29

<sipa> but that's incompatible with us trying to guess what will happen

18:30

<michaelfolkson> sipa: And BlueMatt said the other day Lightning wants the most relaxed mempool policies possible (partly in jest perhaps). So that is another factor in the conflict

18:30

<sipa> michaelfolkson: well, everything wants them to be as relaxed as possible

18:30

<sipa> the world would be a lot simpler :p

18:30

<michaelfolkson> Ha

18:30

<sipa> but that's not DoS friendly

18:31

<glozow> dunxen: yes! the best way to avoid expensive operations is to disallow replacing transactions, but we also want to allow replacements so that we can get better-fee transactions. so this is a happy middle

18:31

<sipa> it's particularly worrisome for Lightning and perhaps other 2nd layer protocols, as their correctness relies on being able to predict what will confirm, which is... eh

18:31

<larryruane> there's been that dust limit discussion lately on the mailing list, maybe that's another instance of this, sort of

18:31

<sipa> larryruane: yes, though hopefully one that's moot in scenarios where fees are high enough

18:32

<sipa> (as fees will make everything that's remotely classifiable as dust also actually uneconomical, so nobody will create it)

18:33

<glozow> i feel like dust doesn't fall into the buckets of anti-DoS and useful mempool, but into a 3rd category of "(non-consensus) rules we think are good for the network to follow"

18:34

<larryruane> glozow: interesting! makes sense

18:35

<glozow> anyway, I'll answer the last question: `PaysMoreThanConflicts()` is an early-exit thing. We haven't gone digging in the mempool for descendants-of-conflicts yet, but we can compare fees with the direct conflicts because we'd be able to exit immediately if we're not paying more than them.

18:35

<michaelfolkson> A big mempool and the prediction element isn't important. A small mempool and then we're potentially talking retrieving transactions we don't have in our mempool to validate a block (not ideal)

18:36

<glozow> in `PaysMoreThanConflicts()`, we're making sure the replacement transaction fees >= direct conflicts fees. if it's not better, we can quit immediately. anybody have an idea why?

18:36

<glozow> if it is better, we'll also make sure that the replacement tx fees >= conflicts + descendants fees.

18:37

<glozow> ^why do we need to do that?

18:37

<sipa> glozow: well i'd consider dust accumulation in the mempool and p2p bandwidth/validation cost both DoS concerns, though very different ones admittedly

18:37

<larryruane> glozow: so it's just performance .. would be great to add a comment to `PaysMoreThanConflicts()` to explain that
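
The early-exit structure described above can be sketched as two stages. This is a simplified model, not the PR's code: the struct and function names are hypothetical, the cheap stage is assumed to compare feerates against each direct conflict, and descendant fees are assumed to be precomputed per conflict with no descendants shared between conflicts.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified view of a mempool transaction the replacement conflicts with.
struct MempoolTxSketch {
    int64_t fee_sat;              // its own fee
    int64_t vsize_vb;             // its own virtual size
    int64_t descendant_fees_sat;  // fees of its in-mempool descendants
};

// Stage 1 (cheap): only the direct conflicts are needed, no mempool walk.
bool BeatsEveryDirectConflict(int64_t repl_fee_sat, int64_t repl_vsize_vb,
                              const std::vector<MempoolTxSketch>& direct_conflicts)
{
    assert(repl_vsize_vb > 0);
    for (const MempoolTxSketch& c : direct_conflicts) {
        assert(c.vsize_vb > 0);
        // Compare feerates by cross-multiplying to stay in integers.
        if (repl_fee_sat * c.vsize_vb <= c.fee_sat * repl_vsize_vb) return false;
    }
    return true;
}

// Stage 2 (costly in a real mempool): needs every descendant of every conflict.
bool PaysAtLeastEvictedFees(int64_t repl_fee_sat,
                            const std::vector<MempoolTxSketch>& direct_conflicts)
{
    int64_t evicted_fees = 0;
    for (const MempoolTxSketch& c : direct_conflicts) {
        evicted_fees += c.fee_sat + c.descendant_fees_sat;
    }
    return repl_fee_sat >= evicted_fees;
}

// Run the cheap check first so a hopeless replacement exits before stage 2.
bool ReplacementFeeChecks(int64_t repl_fee_sat, int64_t repl_vsize_vb,
                          const std::vector<MempoolTxSketch>& direct_conflicts)
{
    if (!BeatsEveryDirectConflict(repl_fee_sat, repl_vsize_vb, direct_conflicts)) return false;
    return PaysAtLeastEvictedFees(repl_fee_sat, direct_conflicts);
}
```

A replacement that fails stage 1 is rejected without anyone having to gather the conflicts' descendants, which is the performance point larryruane summarizes.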

18:38

<glozow> sipa: in mempool, if the tx ends up confirmed, then I'd consider it a good use of resources right?

18:39

<glozow> larryruane: comment is here https://github.com/bitcoin/bitcoin/pull/22855/files#diff-fa5cb2d84034ff72cdb9d479b17cf8c744a9bf3fc932b3a77c1a017edd767dfaR132-R141

18:39

<glozow> wait sorry no, this is the comment: https://github.com/bitcoin/bitcoin/pull/22855/files#diff-97c3a52bc5fad452d82670a7fd291800bae20c7bc35bb82686c2c0a4ea7b5b98R785-R790

18:40

<glozow> (btw, if you think RBF is confusing and want more docs, #22855 would be good to review :P)

18:41

<glozow> I'm going to continue with the notes, but feel free to ask questions about past stuff

18:41

<glozow> In BIP125 Rule #4, what does it mean for a transaction to “pay for its own bandwidth?” Why don’t we just allow any replacement as long as it has a higher feerate?

18:42

<michaelfolkson> I think BIP 125 needs some footnotes explaining the rationale for the rules too

18:42

<michaelfolkson> We hounded ariard to explain their rationale at the Sydney Socratic. Perhaps they were explained in an old mailing list post.... https://btctranscripts.com/sydney-bitcoin-meetup/2021-07-06-socratic-seminar/#bip-125-rules-and-their-rationale

18:42

<michaelfolkson> I don't think they are in the code comments

18:43

<dariusparvin> paying for its own bandwidth means paying a fee which includes an additional amount that would cover its minimum relay fee?

18:44

<dariusparvin> *the node's minimum relay fee

18:44

<larryruane> dariusparvin: I think you're right -- we already relayed the original tx(es) and if we accept the replacement, we have to relay it too (?)

18:46

<glozow> dariusparvin: larryruane: yep. we'll be relaying the transaction. and we already relayed the original one(s). so it needs to pay for that second relay

18:46

<dariusparvin> i guess having rule #4 rather than just #3 is also a kind of DoS protection? So that the sender doesn't keep incrementing the replacement tx by 1 sat

18:46

<larryruane> so in effect the new tx has to pay for the old tx's relay (which was in a sense an unnecessary relay (of the old one))

18:46

<glozow> what would happen if we allowed any replacement as long as it pays more fees?

18:46

<glozow> dariusparvin: indeed, this is definitely a DoS protection

18:46

<larryruane> glozow: bump by one sat each replacement?

18:46

<dunxen> we could just change the fees by 1 sat if it didn't pay for its own bandwidth right? that would mean we could replace a lot pretty cheaply and DoS?

18:47

<glozow> larryruane: dunxen: yes! and we'll be spamming the network with transactions

18:47

<dopedsilicon> One can keep increasing the fees by insignificant amounts

18:47

<dunxen> yes that too haha

18:47

<glozow> and if everyone else is also accepting them, they'll also be flooding the transactions

18:48

<dunxen> oh that would be kinda bad

18:48

<glozow> pretty gross abuse of mempool resources and network bandwidth

18:49

<glozow> so here's a nice tradeoff between protecting mempool resources + wanting higher fee transactions :)
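
The two fee rules discussed above can be sketched together. This is an illustration under assumptions, not the body of `PaysForRBF()`: the function names are hypothetical, the relay feerate is left as a parameter rather than tied to a particular node setting, and the relay fee is rounded up.

```cpp
#include <cassert>
#include <cstdint>

using Amount = int64_t;  // satoshis; stand-in for CAmount

// Fee owed for relaying `vsize_vb` vbytes at `feerate_sat_per_kvb`,
// rounded up (the rounding direction is an assumption of this sketch).
Amount RelayFee(Amount feerate_sat_per_kvb, int64_t vsize_vb)
{
    assert(feerate_sat_per_kvb >= 0 && vsize_vb > 0);
    return (feerate_sat_per_kvb * vsize_vb + 999) / 1000;
}

bool PaysForReplacementSketch(Amount original_fees, Amount replacement_fees,
                              int64_t replacement_vsize_vb,
                              Amount relay_feerate_sat_per_kvb)
{
    // Rule #3: pay at least as much in absolute fees as everything evicted.
    if (replacement_fees < original_fees) return false;

    // Rule #4: the extra fees must cover relaying the replacement itself,
    // so a bump of one satoshi is not enough.
    const Amount additional_fees = replacement_fees - original_fees;
    return additional_fees >= RelayFee(relay_feerate_sat_per_kvb, replacement_vsize_vb);
}
```

With a relay feerate of 1000 sat/kvB and a 200 vb replacement, the replacement must add at least 200 sat on top of the original fees. The one-satoshi bump that larryruane and dunxen describe fails the second check.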

18:49

<glozow> next question: In BIP125 Rule #5, how is it possible for a transaction to conflict with more than 100 mempool transactions, given that mempool ancestor and descendant limits don’t allow a chain of more than 25 transactions?

18:50

<dunxen> could it be like a wide tree of descendants?

18:50

<larryruane> is it because the new tx could have (say) 110 inputs, all of which are the outputs of separate mempool transactions?

18:50

<glozow> larryruane: ding ding ding

18:50

<glozow> it can conflict with a bunch of independent transactions in the mempool, yep

18:51

<glozow> dunxen: wide tree of descendants would still be counted together

18:51

<dunxen> oh got it, so like independent ancestors would not?

18:52

<glozow> dunxen: yeah, they're not connected, so they won't hit the ancestor/descendant limits
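
The Rule #5 count can be sketched by modelling the mempool as a map from each transaction to its children. Everything here is hypothetical (the names, string txids, and the map itself); only the limit of 100 comes from BIP125.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

using TxId = std::string;                           // stand-in for a txid
using ChildMap = std::map<TxId, std::vector<TxId>>; // tx -> in-mempool children

// Limit from BIP125 Rule #5; the constant name is made up for this sketch.
constexpr std::size_t MAX_REPLACEMENT_EVICTIONS = 100;

// Collect the direct conflicts plus all of their descendants, counting
// each transaction once even if it is reachable through several paths.
std::set<TxId> CollectEvictions(const ChildMap& children,
                                const std::set<TxId>& direct_conflicts)
{
    std::set<TxId> evicted;
    std::vector<TxId> stack(direct_conflicts.begin(), direct_conflicts.end());
    while (!stack.empty()) {
        const TxId txid = stack.back();
        stack.pop_back();
        if (!evicted.insert(txid).second) continue;  // already visited
        const auto it = children.find(txid);
        if (it == children.end()) continue;          // no children
        for (const TxId& child : it->second) stack.push_back(child);
    }
    return evicted;
}

bool WithinEvictionLimit(const ChildMap& children,
                         const std::set<TxId>& direct_conflicts)
{
    return CollectEvictions(children, direct_conflicts).size() <= MAX_REPLACEMENT_EVICTIONS;
}
```

A replacement with 110 inputs, each spending an output that a different, unrelated mempool transaction also spends, has 110 direct conflicts and exceeds the limit, although no individual chain is longer than one transaction. A single conflict with 24 descendants counts as 25 and passes.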

18:52

<glozow> next question. here's the signature for `PaysForRBF()`: https://github.com/bitcoin/bitcoin/blob/a33fdd0b981965982754b39586eedb7ae456ba57/src/policy/rbf.h#L88

18:52

<glozow> why is `hash` passed by reference but `original_fees` by value? Should the fee parameters be const? Why or why not?

18:54

<larryruane> i think reference versus non-reference is often (not always!) a performance consideration.. we don't want to pass a 32-byte thing on the stack (copy), but a `CAmount` is just 8 bytes, usually the same size as a pointer (which is what a reference really is)

18:54

<larryruane> so passing a CAmount by reference is passing the same amount of data (8 bytes), but is slower to access (memory lookup)

18:55

<glozow> larryruane: yes! we often want to pass by reference because (1) we want to mutate it in the function or (2) the size of the reference is smaller than the object itself, e.g. `CAmount`

18:56

<larryruane> it's kinda too bad which of those two reasons is operative isn't obvious from reading the function prototype!

18:56

<glozow> and what about the other part of the question - should it be const? why or why not?

18:57

<glozow> `original_fees`, `replacement_fees`, and `replacement_vsize`

18:57

<larryruane> specifying a pass-by-value (as in this case with the CAmount) as a `const` in the prototype actually is always wrong, because the caller doesn't care whether the called function changes the variable *internally*

18:57

<larryruane> (or shouldn't)

18:57

<glozow> right, what's the scope of `original_fees`?

18:58

<larryruane> but in the definition of the function, `const` can make sense

18:58

<larryruane> glozow: do you mean in the calling function or the called function?

18:59

<glozow> larryruane: the callee. or both, if you want to elaborate :)

18:59

<larryruane> in pass-by-value, the called function gets a copy, and the scope of that copy is only within that called function (out of scope when the function returns)

19:00

<glozow> larryruane: yep!

19:00

<larryruane> in the calling function, the scope is before and after the call

19:00

<glozow> and even if the callee mutates it, it's a copy, so it doesn't matter to the caller
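
The points about parameter passing can be sketched in a few lines. The types and functions are stand-ins invented for illustration: `Amount` plays the role of `CAmount` (an 8-byte integer) and `Hash256` the role of the 32-byte hash.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Amount = int64_t;  // stand-in for CAmount
struct Hash256 { std::array<unsigned char, 32> bytes{}; };  // stand-in for a 32-byte hash

// An amount is no bigger than a typical 64-bit pointer, so copying it is
// cheap; the hash is several times larger, so it is passed by reference.
static_assert(sizeof(Amount) == 8, "Amount is 8 bytes");
static_assert(sizeof(Hash256) >= 32, "Hash256 holds at least 32 bytes");

// Declaration: top-level const on a by-value parameter is not part of the
// function's type, so callers gain nothing from seeing it here.
Amount ExtraFees(Amount original_fees, Amount replacement_fees, const Hash256& hash);

// Definition: const on the copies only stops the body from modifying them.
Amount ExtraFees(const Amount original_fees, const Amount replacement_fees, const Hash256& hash)
{
    (void)hash;  // unused in this sketch
    return replacement_fees - original_fees;
}

// Mutates its private copy; the caller's variable is untouched.
Amount DoubleLocally(Amount fees)
{
    fees *= 2;
    return fees;
}
```

Calling `DoubleLocally(a)` with `a == 1000` returns 2000 and leaves `a` at 1000, which is glozow's closing point: the callee's copy goes out of scope when the function returns.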

19:00

<glozow> looks like we have run out of time. thanks for diving into RBF with me today, everyone! hope it was fun

19:00

<glozow> #endmeeting