Fix signed integer overflow in prioritisetransaction RPC (rpc/rest/zmq, mempool)

https://github.com/bitcoin/bitcoin/pull/23418

Host: MarcoFalke  -  PR author: MarcoFalke

Notes

  • In C++, unsigned integer overflow is well defined and often used by design. Signed integer overflow, on the other hand, is undefined behavior.

  • In practice, signed integer overflow will usually manifest by “wrapping around”. For example, adding two postive values int{0x7ffffffd} and int{10} will result in a negative value of -2147483641.

  • This doesn’t mean that signed integers should be avoided. In fact, signed integers should normally be preferred for arithmetic calculations such as addition or subtraction. Care should be taken to pick an integer width that is large enough to fit all possible values at runtime. Commonly, this is int, int32_t, or int64_t. User provided values should also be checked to be in range as early as possible.

  • Compilers such as gcc and clang can instrument the binary to detect signed integer overflow with the flag -fsanitize=signed-integer-overflow. In Bitcoin Core it can be set via ./configure --with-sanitizers=signed-integer-overflow.

Questions

  1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK?

  2. What does the prioritisetransaction RPC do?

  3. Were you able to compile Bitcoin Core with the signed-integer-overflow sanitizer? Were you able to reproduce the bug manually by calling the RPC or by running the fuzz test?

  4. Why is it impossible to fix this issue by limiting the range of possible values a user can enter?

  5. Enumerate the different approaches that were discussed in the pull request and summarize each in one sentence. See abort-approach, return-early-approach, saturating-approach.

  6. Which approach was picked in the pull request? Do you agree that the discarded approaches are inferior?

Meeting Log

  117:00 <MacroFake> #startmeeting
  217:00 <michaelfolkson> hi
  317:00 <BlueMoon> Hello!!
  417:00 <dunxen> hi!
  517:00 <MacroFake> Anyone here for the first time?
  617:01 <afmencken> Hi, I'm a first timer
  717:01 <Amirreza> MacroFake, Me
  817:01 <Amirreza> Hello
  917:01 <larryruane> hi
 1017:01 <MacroFake> afmencken: Amirreza: Nice, and welcome
 1117:01 <Amirreza> MacroFake: Thanks
 1217:02 <MacroFake> to get started, you may also share whether you reviewed the pr (y/n). https://bitcoincore.reviews/23418
 1317:02 <effexzi> Hello every1
 1417:03 <Amirreza> MacroFake: no
 1517:03 <svav> n but I read the notes
 1617:03 <larryruane> y
 1717:03 <dunxen> y
 1817:03 <effexzi> Y
 1917:03 <paul_c> Hey everyone
 2017:04 <dpr54> hi
 2117:04 <MacroFake> ok, if there are any questions later, just go ahead and ask
 2217:04 <Bayer[m]> hello!
 2317:04 <MacroFake> Let's jump in: What does the prioritisetransaction RPC do?
 2417:04 <svav> afmencken: where did you hear about this meeting please?
 2517:05 <svav> prioritisetransaction Gives a fee value in satoshis to add or subtract to the existing fee value. It’s a value to modify the absolute fee of the TX. It is not paid initially itself, but only considered by the algorithm that selects transactions into a block as it means the transaction will pay a higher or lower absolute fee.
 2617:05 <MacroFake> svav: Correct
 2717:05 <michaelfolkson> Just for miners right when constructing the block to mine?
 2817:05 <larryruane> I think it's a way to change a mempool transaction's effective fee (thus feerate) ... hence the name, you can increase it's probability of being included in a block (if you're a miner)
 2917:06 <larryruane> *effective fee meaning not its actual fee, but how it's treated in the mempool
 3017:06 <MacroFake> michaelfolkson: Good q. I think it is also used for relay. But let me double check...
 3117:07 <michaelfolkson> I'd guess it wouldn't be particularly helpful for relay. Sure you propagate it but your peers still won't include it in their mempool or propagate it further?
 3217:07 <svav> prioritisetransaction gives the value of the fee adjustment to reprioritise the transaction, right? It is considered by the transaction selection algorithm, because it means an adjusted absolute fee will be taken, if the transaction is actually selected.
 3317:08 <MacroFake> michaelfolkson: Actually for relay the normal fee is used
 3417:08 <MacroFake> https://github.com/bitcoin/bitcoin/blob/b9122e95f0f4ff5d2b2e21a5caf6c69d488c0347/src/txmempool.h#L241-L260
 3517:09 <Amirreza> It may seem very basic question, so sorry to ask. Why we should use an RPC to modify the TRX fee? Why not sending a second TRX with higher fee? If we are lucky, the first TRX would accept (which has less fee) and if no, the second would be included in the block.
 3617:10 <michaelfolkson> Amirreza: This isn't actually modifying the fee. It is pretending the fee is higher so that it is treated differently
 3717:10 <dpr54> interesting
 3817:10 <larryruane> Amirreza: It's the miner who is using this RPC, on a transaction that they likely didn't create
 3917:10 <michaelfolkson> So I think you are asking why use RBF (replace-by-fee) which isn't related to this PR? :)
 4017:10 <MacroFake> Amirreza: Good question. This is primarily supposed to be used by miners to prioritise a transaction.
 4117:11 <larryruane> If we have time, I would like to understand better the motivation for the use of this RPC
 4217:11 <sanya> what reasons they might have to prioritise certain transaction?
 4317:12 <MacroFake> They may have been paid out-of-band
 4417:12 <afmencken> I'm trying to imagine a use case for this - the first one that comes to mind is if a miner has some out of band incentive to (de)prioritize the transaction.
 4517:12 <MacroFake> Or it is one of their own txs, for example a pool payout tx
 4617:12 <larryruane> If someone submits a tx and it's not getting mined, I suppose the tx sender could pay a miner on the side to mine it, and the miner would do that using this RPC?
 4717:13 <larryruane> (oh marco just said that :) )
 4817:13 <Amirreza> Sorry I don't get it. What I interpret by the name, it should sort TRXs by fee rat in descending order. Because miners want TRXs with higher fees. Is this correct?
 4917:13 <Amirreza> fee rate*
 5017:14 <larryruane> Amirreza: That is correct -- this gives the miner a way to modify that selection decision (for just themself)
 5117:14 <michaelfolkson> Amirreza: They generally do want transactions with the highest fee rates yes. But occasionally like in examples described above they'll want to force a transaction into a block even if it has lower fee rate
 5217:15 <Amirreza> larryruane: michaelfolkson: thanks I get it
 5317:15 <svav> Who gets to use prioritisetransaction, is it just a miner?
 5417:16 <MacroFake> svav: Anyone can use it. It will also prevent the tx to "fall off" when the prioritization puts it above the minimum mempool fee
 5517:17 <MacroFake> However, your peers may reject the transaction if it doesn't meet their minimum mempool fee
 5617:17 <MacroFake> Let's go on: Were you able to compile Bitcoin Core with the signed-integer-overflow sanitizer? Were you able to reproduce the bug manually by calling the RPC or by running the fuzz test?
 5717:18 <larryruane> yes! and even better, the bug doesn't reproduce with the PR!
 5817:18 <svav> So is it just a way to make it more or less likely a transaction gets included in a block, after you have submitted the transaction for processing, i.e. its in the mempool already?
 5917:18 <MacroFake> larryruane: Thanks for testing
 6017:18 <MacroFake> svav: Only if your node creates blocks (or block templates that get mined on)
 6117:18 <larryruane> tiny question, when building for fuzzing, is there no bitcoind?
 6217:20 <MacroFake> larryruane: There won't be a bitcoind if you pass --enable-fuzz (to link a fuzz engine), however the fuzz tests are also built (by default) along with all other stuff without a fuzz engine
 6317:20 <afmencken> svav: I think at better way to say it is that it is a way to increase the likelihood that the transaction stays in _your_ mempool (but not necessarily into a block).
 6417:21 <svav> OK thanks
 6517:21 <MacroFake> Why is it impossible to fix this issue by limiting the range of possible values a user can enter?
 6617:22 <larryruane> MarcoFake: because the user can execute this RPC an unlimited number of times?
 6717:23 <michaelfolkson> Or is it the descendant transactions issue?
 6817:23 <svav> Because you don't know how many descendant transactions may be added to the mempool??
 6917:23 <MacroFake> Right, but why would that be an issue? (Hint: Explain what the call does internally)
 7017:24 <svav> MacroFake can you explain this in a bit more detail? It is impossible to predict when and if an overflow occurs, since
 7117:24 <svav> the overflow caused by a prioritisetransaction RPC might only be
 7217:24 <svav> later hit when descendant txs are added to the mempool.
 7317:25 <MacroFake> svav: Correct
 7417:25 <MacroFake> Though, there is still a second answer to the question
 7517:26 <MacroFake> Hint: Think about calling the rpc on the same txid
 7617:26 <larryruane> svav: that's really interesting (can't simply fail the RPC)
 7717:27 <michaelfolkson> MarcroFake: You mean calling it twice on same txid?
 7817:27 <larryruane> that's what your RPC repro steps do
 7917:27 <larryruane> (calls twice on same txid)
 8017:27 <larryruane> first time doesn't overflow, second time does
 8117:28 <larryruane> (this is what I was thinking with my earlier answer about calling the RPC an unlimited number of times)
 8217:28 <michaelfolkson> Change the fee by a different amount the second time? Or same fee adjustment?
 8317:29 <MacroFake> (For referenc we are looking at https://github.com/bitcoin/bitcoin/blob/b9122e95f0f4ff5d2b2e21a5caf6c69d488c0347/src/txmempool.cpp#L923-L939 )
 8417:30 <MacroFake> If there are no ancestors and no descendants, it will simply call UpdateFeeDelta on the tx itself: https://github.com/bitcoin/bitcoin/blob/b9122e95f0f4ff5d2b2e21a5caf6c69d488c0347/src/txmempool.cpp#L92-L97
 8517:32 <MacroFake> nModFeesWithDescendants would then be equal to nModFeesWithAncestors and both may overflow if the applied fee delta is too high
 8617:32 <svav> Can someone tell me what the overflow is occurring in? What variable or whatever?
 8717:32 <MacroFake> michaelfolkson: Any fee delta that would result in an overflow should wok
 8817:32 <MacroFake> *work
 8917:34 <MacroFake> svav: The overflow may happen for any variable that tracks the modified fee.
 9017:34 <MacroFake> For example, it may happen for the CAmount in mapDeltas[hash] , see https://github.com/bitcoin/bitcoin/blob/b9122e95f0f4ff5d2b2e21a5caf6c69d488c0347/src/txmempool.cpp#L920
 9117:34 <svav> Is the problem essentially that you don't know how many times the fee delta will be applied?? If it's applied too many times it can cause an overflow??
 9217:35 <MacroFake> svav: You know that the fee delta will be applied to all ancestors and all descendants (even if they will be added to the mempool later)
 9317:36 <michaelfolkson> svav: It's not too many times. It is a value not fitting within the integer width (notes of the PR review club)
 9417:36 <larryruane> ah so one of those ancestors or descendants could have a large actual fee ... so for that tx, the modified fee could overflow
 9517:36 <svav> MarcoFalke why are you called MacroFake today? :)
 9617:37 <MacroFake> svav: I changed my IRC nick
 9717:38 <MacroFake> larryruane: Right
 9817:39 <michaelfolkson> When you say "leave it up to the user to use the RPC endpoint responsibly" you essentially mean don't introduce the fee by something ludicrous?
 9917:39 <MacroFake> michaelfolkson: Yes
10017:40 <MacroFake> However, this may also be hit when reading the mempool.dat from disk
10117:40 <michaelfolkson> *increase the fee (correction)
10217:40 <michaelfolkson> Ok thanks
10317:40 <MacroFake> Next q: Enumerate the different approaches that were discussed in the pull request and summarize each in one sentence. See abort-approach, return-early-approach, saturating-approach.
10417:40 <MacroFake> abort-approach: https://github.com/bitcoin/bitcoin/pull/23418#discussion_r742736167
10517:41 <MacroFake> return-early-approach: https://github.com/bitcoin/bitcoin/commit/e8522d082e5b7ab421e62c8e76fabbea24531a8d#diff-c065d4cd2398ad0dbcef393c5dfc53f465bf44723348892395fffd2fb3bac522R37
10617:41 <MacroFake> saturating-approach: https://github.com/bitcoin/bitcoin/pull/23418#discussion_r744953272
10717:42 <larryruane> easy part, I'm pretty sure we're taking the last one (saturating)
10817:43 <michaelfolkson> Yeah not the abort approach
10917:43 <larryruane> abort approach is just to violate an assert and crash the node ... but (ideally) any user input shouldn't be able to crash the node
11017:43 <MacroFake> Can someone summarize each one, so that everyone is on the same page, please :)
11117:43 <larryruane> return-early is where we just don't do the increment at all if it would overflow (so the result could be significantly off)
11217:44 <larryruane> saturating approach means make the result as close as we can
11317:44 <michaelfolkson> And abort is abort the program if the overflow occurs
11417:44 <larryruane> i agree with the saturating approach
11517:44 <MacroFake> larryruane: thx
11617:44 <MacroFake> larryruane: Yes, user inputs shouldn't crash the program. That is a good rule of thumb.
11717:45 <larryruane> (even if it is RPC, which is sort of the user shooting himself in the foot ... input from P2P would be much worse!)
11817:46 <michaelfolkson> And this saturating approach still doesn't put any restrictions on what the fuzzers can do. They can go wild
11917:46 <MacroFake> Well, it still requires a call to the RPC with a large/small value. But the crash could (in theory) be triggered by a P2P tx.
12017:46 <afmencken> One additional approach that I didn't see mentioned would be to have the RPC return an error code. Is that a non-starter for some obvious reason that I'm not seeing?
12117:46 <larryruane> MacroFake: +1 good point
12217:47 <svav> Did we do Q4? Why is it impossible to fix this issue by limiting the range of possible values a user can enter?
12317:47 <MacroFake> afmencken: Good question. The thing is that it is not possible to determine whether an overflow will (eventually) occur, as the overflow may happen after the RPC finished sucessfully.
12417:47 <larryruane> afmencken: It doesn't solve the problem, because a tx could be added later (via P2P) that causes the overflow
12517:48 <afmencken> MacroFake: larryruane: thanks, that makes sense.
12617:49 <Amirreza> yeah, can someone explain why it can't be solved by limiting the range of possible inputs?
12717:49 <svav> Q4?
12817:49 <MacroFake> To recap Q4: Limiting the range is not sufficient. For example, you can prioritise by +1 in a loop on the same txid. This will eventually overflow.
12917:50 <larryruane> by the way, it's interesting how the prioritisation map can contain txids that don't exist yet in the mempool! maybe they just haven't arrived yet!
13017:50 <svav> MacroFake could you try and give a few sentences on how this bug appears, because I still don't understand it fully.
13117:50 <Amirreza> MacroFake: Can't we limit the number of attempt to prioritise a tx? I know it's not a good solution, but in theory, does this solve?
13217:51 <michaelfolkson> Amirreza: The problem isn't the number of attempts. A single "attempt" can cause the integer overflow
13317:52 <michaelfolkson> So no limiting the number of times you can change the priorisation of the tx doesn't solve it
13417:52 <svav> I get that someone could set a massive fee delta and that would overflow ... but what else causes it?
13517:53 <MacroFake> Amirreza: Good question. I think it would be another possible solution. However, you'd need to limit both the allowed range (based on maximum allowed ancestors/descendants), and the number of allowed calls.
13617:53 <Amirreza> michaelfolkson: MacroFake: I got it, thanks.
13717:53 <MacroFake> Amirreza: While this is a theoretical solution, it would be harder to implement, as it requires tracking state
13817:54 <Amirreza> MacroFake: Yeah I agree
13917:54 <larryruane> Also just to be clear, it's not like we're worried that this could actually happen in practice, right? It's more that we have the fuzzers and sanitizers, and we want them to run cleanly as much as possible, without having to declare exceptions (the ubsan file that this PR removes a line from)
14017:55 <MacroFake> It might have happened in practise (no one knows), but I don't expect it to happen regurlarly.
14117:55 <MacroFake> last q: Which approach was picked in the pull request? Do you agree that the discarded approaches are inferior?
14217:57 <larryruane> MacroFake: (saturation, I agree!) If it wouldn't take too long to explain, why is signed integer overflow considered undefinied behavior? Especially while unsigned is not?
14317:59 <MacroFake> unsigned integer overflow is used commonly to implement low level cryptographic primitives. I think this wouldn't be possible if the language didn't have unsigned integer overflow
14417:59 <MacroFake> not sure why signed integer overflow is UB, but it usually seems unwanted if the program can run into it.
14518:00 <larryruane> thanks makes sense
14618:00 <MacroFake> #endmeeting