zmq, mempool)

Jun 22, 2022

https://github.com/bitcoin/bitcoin/pull/23418

Host: MarcoFalke - PR author: MarcoFalke

Notes

In C++, unsigned integer overflow is well defined and often used by design. Signed integer overflow, on the other hand, is undefined behavior.
In practice, signed integer overflow will usually manifest by “wrapping around”. For example, adding two postive values int{0x7ffffffd} and int{10} will result in a negative value of -2147483641.
This doesn’t mean that signed integers should be avoided. In fact, signed integers should normally be preferred for arithmetic calculations such as addition or subtraction. Care should be taken to pick an integer width that is large enough to fit all possible values at runtime. Commonly, this is int, int32_t, or int64_t. User provided values should also be checked to be in range as early as possible.
Compilers such as gcc and clang can instrument the binary to detect signed integer overflow with the flag -fsanitize=signed-integer-overflow. In Bitcoin Core it can be set via ./configure --with-sanitizers=signed-integer-overflow.

Questions

Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK?
What does the prioritisetransaction RPC do?
Were you able to compile Bitcoin Core with the signed-integer-overflow sanitizer? Were you able to reproduce the bug manually by calling the RPC or by running the fuzz test?
Why is it impossible to fix this issue by limiting the range of possible values a user can enter?
Enumerate the different approaches that were discussed in the pull request and summarize each in one sentence. See abort-approach, return-early-approach, saturating-approach.
Which approach was picked in the pull request? Do you agree that the discarded approaches are inferior?

Meeting Log

17:00

<MacroFake> #startmeeting

17:00

<michaelfolkson> hi

17:00

<BlueMoon> Hello!!

17:00

<dunxen> hi!

17:00

<MacroFake> Anyone here for the first time?

17:01

<afmencken> Hi, I'm a first timer

17:01

<Amirreza> MacroFake, Me

17:01

<Amirreza> Hello

17:01

<larryruane> hi

17:01

<MacroFake> afmencken: Amirreza: Nice, and welcome

17:01

<Amirreza> MacroFake: Thanks

17:02

<MacroFake> to get started, you may also share whether you reviewed the pr (y/n). https://bitcoincore.reviews/23418

17:02

<effexzi> Hello every1

17:03

<Amirreza> MacroFake: no

17:03

<svav> n but I read the notes

17:03

<larryruane> y

17:03

<dunxen> y

17:03

<effexzi> Y

17:03

<paul_c> Hey everyone

17:04

<dpr54> hi

17:04

<MacroFake> ok, if there are any questions later, just go ahead and ask

17:04

<Bayer[m]> hello!

17:04

<MacroFake> Let's jump in: What does the prioritisetransaction RPC do?

17:04

<svav> afmencken: where did you hear about this meeting please?

17:05

<svav> prioritisetransaction Gives a fee value in satoshis to add or subtract to the existing fee value. It’s a value to modify the absolute fee of the TX. It is not paid initially itself, but only considered by the algorithm that selects transactions into a block as it means the transaction will pay a higher or lower absolute fee.

17:05

<MacroFake> svav: Correct

17:05

<michaelfolkson> Just for miners right when constructing the block to mine?

17:05

<larryruane> I think it's a way to change a mempool transaction's effective fee (thus feerate) ... hence the name, you can increase it's probability of being included in a block (if you're a miner)

17:06

<larryruane> *effective fee meaning not its actual fee, but how it's treated in the mempool

17:06

<MacroFake> michaelfolkson: Good q. I think it is also used for relay. But let me double check...

17:07

<michaelfolkson> I'd guess it wouldn't be particularly helpful for relay. Sure you propagate it but your peers still won't include it in their mempool or propagate it further?

17:07

<svav> prioritisetransaction gives the value of the fee adjustment to reprioritise the transaction, right? It is considered by the transaction selection algorithm, because it means an adjusted absolute fee will be taken, if the transaction is actually selected.

17:08

<MacroFake> michaelfolkson: Actually for relay the normal fee is used

17:08

<MacroFake> https://github.com/bitcoin/bitcoin/blob/b9122e95f0f4ff5d2b2e21a5caf6c69d488c0347/src/txmempool.h#L241-L260

17:09

<Amirreza> It may seem very basic question, so sorry to ask. Why we should use an RPC to modify the TRX fee? Why not sending a second TRX with higher fee? If we are lucky, the first TRX would accept (which has less fee) and if no, the second would be included in the block.

17:10

<michaelfolkson> Amirreza: This isn't actually modifying the fee. It is pretending the fee is higher so that it is treated differently

17:10

<dpr54> interesting

17:10

<larryruane> Amirreza: It's the miner who is using this RPC, on a transaction that they likely didn't create

17:10

<michaelfolkson> So I think you are asking why use RBF (replace-by-fee) which isn't related to this PR? :)

17:10

<MacroFake> Amirreza: Good question. This is primarily supposed to be used by miners to prioritise a transaction.

17:11

<larryruane> If we have time, I would like to understand better the motivation for the use of this RPC

17:11

<sanya> what reasons they might have to prioritise certain transaction?

17:12

<MacroFake> They may have been paid out-of-band

17:12

<afmencken> I'm trying to imagine a use case for this - the first one that comes to mind is if a miner has some out of band incentive to (de)prioritize the transaction.

17:12

<MacroFake> Or it is one of their own txs, for example a pool payout tx

17:12

<larryruane> If someone submits a tx and it's not getting mined, I suppose the tx sender could pay a miner on the side to mine it, and the miner would do that using this RPC?

17:13

<larryruane> (oh marco just said that :) )

17:13

<Amirreza> Sorry I don't get it. What I interpret by the name, it should sort TRXs by fee rat in descending order. Because miners want TRXs with higher fees. Is this correct?

17:13

<Amirreza> fee rate*

17:14

<larryruane> Amirreza: That is correct -- this gives the miner a way to modify that selection decision (for just themself)

17:14

<michaelfolkson> Amirreza: They generally do want transactions with the highest fee rates yes. But occasionally like in examples described above they'll want to force a transaction into a block even if it has lower fee rate

17:15

<Amirreza> larryruane: michaelfolkson: thanks I get it

17:15

<svav> Who gets to use prioritisetransaction, is it just a miner?

17:16

<MacroFake> svav: Anyone can use it. It will also prevent the tx to "fall off" when the prioritization puts it above the minimum mempool fee

17:17

<MacroFake> However, your peers may reject the transaction if it doesn't meet their minimum mempool fee

17:17

<MacroFake> Let's go on: Were you able to compile Bitcoin Core with the signed-integer-overflow sanitizer? Were you able to reproduce the bug manually by calling the RPC or by running the fuzz test?

17:18

<larryruane> yes! and even better, the bug doesn't reproduce with the PR!

17:18

<svav> So is it just a way to make it more or less likely a transaction gets included in a block, after you have submitted the transaction for processing, i.e. its in the mempool already?

17:18

<MacroFake> larryruane: Thanks for testing

17:18

<MacroFake> svav: Only if your node creates blocks (or block templates that get mined on)

17:18

<larryruane> tiny question, when building for fuzzing, is there no bitcoind?

17:20

<MacroFake> larryruane: There won't be a bitcoind if you pass --enable-fuzz (to link a fuzz engine), however the fuzz tests are also built (by default) along with all other stuff without a fuzz engine

17:20

<afmencken> svav: I think at better way to say it is that it is a way to increase the likelihood that the transaction stays in _your_ mempool (but not necessarily into a block).

17:21

<svav> OK thanks

17:21

<MacroFake> Why is it impossible to fix this issue by limiting the range of possible values a user can enter?

17:22

<larryruane> MarcoFake: because the user can execute this RPC an unlimited number of times?

17:23

<michaelfolkson> Or is it the descendant transactions issue?

17:23

<svav> Because you don't know how many descendant transactions may be added to the mempool??

17:23

<MacroFake> Right, but why would that be an issue? (Hint: Explain what the call does internally)

17:24

<svav> MacroFake can you explain this in a bit more detail? It is impossible to predict when and if an overflow occurs, since

17:24

<svav> the overflow caused by a prioritisetransaction RPC might only be

17:24

<svav> later hit when descendant txs are added to the mempool.

17:25

<MacroFake> svav: Correct

17:25

<MacroFake> Though, there is still a second answer to the question

17:26

<MacroFake> Hint: Think about calling the rpc on the same txid

17:26

<larryruane> svav: that's really interesting (can't simply fail the RPC)

17:27

<michaelfolkson> MarcroFake: You mean calling it twice on same txid?

17:27

<larryruane> that's what your RPC repro steps do

17:27

<larryruane> (calls twice on same txid)

17:27

<larryruane> first time doesn't overflow, second time does

17:28

<larryruane> (this is what I was thinking with my earlier answer about calling the RPC an unlimited number of times)

17:28

<michaelfolkson> Change the fee by a different amount the second time? Or same fee adjustment?

17:29

<MacroFake> (For referenc we are looking at https://github.com/bitcoin/bitcoin/blob/b9122e95f0f4ff5d2b2e21a5caf6c69d488c0347/src/txmempool.cpp#L923-L939 )

17:30

<MacroFake> If there are no ancestors and no descendants, it will simply call UpdateFeeDelta on the tx itself: https://github.com/bitcoin/bitcoin/blob/b9122e95f0f4ff5d2b2e21a5caf6c69d488c0347/src/txmempool.cpp#L92-L97

17:32

<MacroFake> nModFeesWithDescendants would then be equal to nModFeesWithAncestors and both may overflow if the applied fee delta is too high

17:32

<svav> Can someone tell me what the overflow is occurring in? What variable or whatever?

17:32

<MacroFake> michaelfolkson: Any fee delta that would result in an overflow should wok

17:32

<MacroFake> *work

17:34

<MacroFake> svav: The overflow may happen for any variable that tracks the modified fee.

17:34

<MacroFake> For example, it may happen for the CAmount in mapDeltas[hash] , see https://github.com/bitcoin/bitcoin/blob/b9122e95f0f4ff5d2b2e21a5caf6c69d488c0347/src/txmempool.cpp#L920

17:34

<svav> Is the problem essentially that you don't know how many times the fee delta will be applied?? If it's applied too many times it can cause an overflow??

17:35

<MacroFake> svav: You know that the fee delta will be applied to all ancestors and all descendants (even if they will be added to the mempool later)

17:36

<michaelfolkson> svav: It's not too many times. It is a value not fitting within the integer width (notes of the PR review club)

17:36

<larryruane> ah so one of those ancestors or descendants could have a large actual fee ... so for that tx, the modified fee could overflow

17:36

<svav> MarcoFalke why are you called MacroFake today? :)

17:37

<MacroFake> svav: I changed my IRC nick

17:38

<MacroFake> larryruane: Right

17:39

<michaelfolkson> When you say "leave it up to the user to use the RPC endpoint responsibly" you essentially mean don't introduce the fee by something ludicrous?

17:39

<MacroFake> michaelfolkson: Yes

100

17:40

<MacroFake> However, this may also be hit when reading the mempool.dat from disk

101

17:40

<michaelfolkson> *increase the fee (correction)

102

17:40

<michaelfolkson> Ok thanks

103

17:40

<MacroFake> Next q: Enumerate the different approaches that were discussed in the pull request and summarize each in one sentence. See abort-approach, return-early-approach, saturating-approach.

104

17:40

<MacroFake> abort-approach: https://github.com/bitcoin/bitcoin/pull/23418#discussion_r742736167

105

17:41

<MacroFake> return-early-approach: https://github.com/bitcoin/bitcoin/commit/e8522d082e5b7ab421e62c8e76fabbea24531a8d#diff-c065d4cd2398ad0dbcef393c5dfc53f465bf44723348892395fffd2fb3bac522R37

106

17:41

<MacroFake> saturating-approach: https://github.com/bitcoin/bitcoin/pull/23418#discussion_r744953272

107

17:42

<larryruane> easy part, I'm pretty sure we're taking the last one (saturating)

108

17:43

<michaelfolkson> Yeah not the abort approach

109

17:43

<larryruane> abort approach is just to violate an assert and crash the node ... but (ideally) any user input shouldn't be able to crash the node

110

17:43

<MacroFake> Can someone summarize each one, so that everyone is on the same page, please :)

111

17:43

<larryruane> return-early is where we just don't do the increment at all if it would overflow (so the result could be significantly off)

112

17:44

<larryruane> saturating approach means make the result as close as we can

113

17:44

<michaelfolkson> And abort is abort the program if the overflow occurs

114

17:44

<larryruane> i agree with the saturating approach

115

17:44

<MacroFake> larryruane: thx

116

17:44

<MacroFake> larryruane: Yes, user inputs shouldn't crash the program. That is a good rule of thumb.

117

17:45

<larryruane> (even if it is RPC, which is sort of the user shooting himself in the foot ... input from P2P would be much worse!)

118

17:46

<michaelfolkson> And this saturating approach still doesn't put any restrictions on what the fuzzers can do. They can go wild

119

17:46

<MacroFake> Well, it still requires a call to the RPC with a large/small value. But the crash could (in theory) be triggered by a P2P tx.

120

17:46

<afmencken> One additional approach that I didn't see mentioned would be to have the RPC return an error code. Is that a non-starter for some obvious reason that I'm not seeing?

121

17:46

<larryruane> MacroFake: +1 good point

122

17:47

<svav> Did we do Q4? Why is it impossible to fix this issue by limiting the range of possible values a user can enter?

123

17:47

<MacroFake> afmencken: Good question. The thing is that it is not possible to determine whether an overflow will (eventually) occur, as the overflow may happen after the RPC finished sucessfully.

124

17:47

<larryruane> afmencken: It doesn't solve the problem, because a tx could be added later (via P2P) that causes the overflow

125

17:48

<afmencken> MacroFake: larryruane: thanks, that makes sense.

126

17:49

<Amirreza> yeah, can someone explain why it can't be solved by limiting the range of possible inputs?

127

17:49

<svav> Q4?

128

17:49

<MacroFake> To recap Q4: Limiting the range is not sufficient. For example, you can prioritise by +1 in a loop on the same txid. This will eventually overflow.

129

17:50

<larryruane> by the way, it's interesting how the prioritisation map can contain txids that don't exist yet in the mempool! maybe they just haven't arrived yet!

130

17:50

<svav> MacroFake could you try and give a few sentences on how this bug appears, because I still don't understand it fully.

131

17:50

<Amirreza> MacroFake: Can't we limit the number of attempt to prioritise a tx? I know it's not a good solution, but in theory, does this solve?

132

17:51

<michaelfolkson> Amirreza: The problem isn't the number of attempts. A single "attempt" can cause the integer overflow

133

17:52

<michaelfolkson> So no limiting the number of times you can change the priorisation of the tx doesn't solve it

134

17:52

<svav> I get that someone could set a massive fee delta and that would overflow ... but what else causes it?

135

17:53

<MacroFake> Amirreza: Good question. I think it would be another possible solution. However, you'd need to limit both the allowed range (based on maximum allowed ancestors/descendants), and the number of allowed calls.

136

17:53

<Amirreza> michaelfolkson: MacroFake: I got it, thanks.

137

17:53

<MacroFake> Amirreza: While this is a theoretical solution, it would be harder to implement, as it requires tracking state

138

17:54

<Amirreza> MacroFake: Yeah I agree

139

17:54

<larryruane> Also just to be clear, it's not like we're worried that this could actually happen in practice, right? It's more that we have the fuzzers and sanitizers, and we want them to run cleanly as much as possible, without having to declare exceptions (the ubsan file that this PR removes a line from)

140

17:55

<MacroFake> It might have happened in practise (no one knows), but I don't expect it to happen regurlarly.

141

17:55

<MacroFake> last q: Which approach was picked in the pull request? Do you agree that the discarded approaches are inferior?

142

17:57

<larryruane> MacroFake: (saturation, I agree!) If it wouldn't take too long to explain, why is signed integer overflow considered undefinied behavior? Especially while unsigned is not?

143

17:59

<MacroFake> unsigned integer overflow is used commonly to implement low level cryptographic primitives. I think this wouldn't be possible if the language didn't have unsigned integer overflow

144

17:59

<MacroFake> not sure why signed integer overflow is UB, but it usually seems unwanted if the program can run into it.

145

18:00

<larryruane> thanks makes sense

146

18:00

<MacroFake> #endmeeting