bug fix: update for ancestor inclusion using modified fees, not base (mining)

https://github.com/bitcoin/bitcoin/pull/24538

Host: glozow  -  PR author: glozow

Notes

  • Miners can retrieve a block template, a consensus-valid block excluding the proof of work (usually computed on separate, dedicated hardware) using the getblocktemplate RPC. The ā€œminerā€ (BlockAssembler) generates this template using transactions from the mempool, attempting to maximize the fees in the block while staying within the block weight and sigop limits.

  • Miners can also use the prioritisetransaction RPC to artificially raise or lower the fees of specific transactions in their own mempools. The prioritisation is achieved through a fee delta. The ā€œmodified feeā€ of a transaction is the sum of its base fee (total output value subtracted from total input value) and the fee delta.

  • PR #7594 added ancestor package tracking to the mempool. The mempool caches every transactionā€™s ancestor feerate (total modified fees divided by total virtual size of the transaction and all of the transactions it depends upon to be mined).

  • PR #7600 changed the mining algorithm to use ancestor packages rather than individual transactions, which improves assessments of the incentive compatibility of transactions and enables Child Pays for Parent (CPFP) fee-bumping.

    • The algorithm adds transactions from the mempool in ancestor feerate order; every time it adds a transaction to the block template, it also adds each of its ancestors and updates the remaining transactions in the mempool accordingly.

    • Rather than edit the mempool transactions itself, the miner creates a copy of the updated entries in mapModifiedTx.

    • Review on #24364 unearthed some unexpected behavior in the way these entries are edited.

    • PR #24538 fixes this unexpected behavior and adds unit tests for mining prioritised transactions.

Questions

  1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK? How did you review the PR - did you try reproducing the bug?

  2. What does ancestor feerate include?

  3. In your own words, how does the mining algorithm work (Hint: the main logic can be found in addPackageTxs())?

    3a. In what scenario does a transaction get added to mapModifiedTx?

    3b. In what scenario does an entry in mapModifiedTx get further modified?

  4. What is the bug fixed by this PR? Can you construct a specific case in which the bug leads to a lower-fee transaction being included in the mempool? (Hint: the PR adds a test).

(Bonus Mining Questions)

  1. Why is MAX_CONSECUTIVE_FAILURES necessary?

  2. Could the prioritisetransaction RPC (and fee deltas) be replaced with parameters to getblocktemplate to force-include or force-exclude transactions?

  3. What two indexes can the indexed_modified_transaction_set be sorted by?

Meeting Log

  117:00 <glozow> #startmeeting
  217:00 <glozow> hi there!
  317:00 <stickies-v> hi
  417:00 <svav> Hi
  517:00 <glozow> Welcome to PR Review Club!
  617:00 <lightlike> hi
  717:00 <emzy> Hi
  817:00 <theStack> hi
  917:00 <glozow> We're reviewing a miner bug fix today, "update for ancestor inclusion using modified fees, not base"
 1017:01 <glozow> Notes: https://bitcoincore.reviews/24538
 1117:01 <larryruane> hi
 1217:01 <glozow> Any first-timers?
 1317:01 <Dweezahr> yeah first time for me
 1417:01 <glozow> welcome Dweezahr!
 1517:02 <Dweezahr> Thank you
 1617:02 <glozow> This is our first time looking at the mining code in pr review club (afaik), so hopefully there's something new to learn for everyone
 1717:02 <svav> Dweezahr where did you hear about this meeting, if you don't mind sharing?
 1817:02 <glozow> Did y'all get a chance to review the PR or look at the notes? y/n
 1917:02 <Dweezahr> svav, I found it through the CONTRIBUTING file in the root of bitcoin/bitcoin on github
 2017:02 <effexzi> Hi every1
 2117:03 <lightlike> i read the PR title as "minor bug fix" and thought "how modest!"
 2217:03 <svav> Ok thanks Dweezahr, and welcome!
 2317:03 <Dweezahr> I merged the PR into a local git repo and compiled, ran the tests fine, but needed a special flag in ./configure
 2417:03 <glozow> lightlike: xD
 2517:03 <theStack> lightlike: heh
 2617:03 <stickies-v> n, couldn't properly review so I'm here to lurk and learn
 2717:03 <larryruane> looked at the actual fix (easy!) but trying to puzzle out the test changes
 2817:03 <Dweezahr> ./configure --enable-experimental
 2917:03 <emzy> n, just read the notes.
 3017:04 <glozow> larryruane: awesome! did you try reproducing the bug and whatnot?
 3117:04 <theStack> n
 3217:04 <svav> I read the notes, but it seems like quite a difficult issue
 3317:04 <larryruane> i was just going to ask that ... isn't it a good review practice to run any new or modified test without the production code change, and make sure the test fails?
 3417:04 <larryruane> (sadly i didn't have time to do that)
 3517:04 <ccdle12> hi - semi reviewed
 3617:04 <glozow> larryruane: yeah! that's what i'd recommend.
 3717:05 <effexzi> N
 3817:05 <glozow> Is anybody able to summarize how the mining algorithm works?
 3917:05 <glozow> (and by mining, i mean block template building)
 4017:05 <glozow> Hint: we're looking at the code here https://github.com/bitcoin/bitcoin/blob/f3e0ace8ecd84009a23da6b0de47f01d79c45772/src/node/miner.cpp#L303
 4117:06 <svav> Well, firstly, it works to maximise profit for miner, right?
 4217:06 <glozow> svav: yes exactly. we want to maximize the total fees of the block.
 4317:06 <larryruane> because if we don't, miners are encouraged to write their own algorithms, and that disadvatages newcomer miners
 4417:06 <effexzi> Picks up a bunch of transactions, adds previous header, a nonce and hashes until difficulty is met.
 4517:07 <glozow> and it needs to be consensus-valid. Here, the most relevant constraints are = maximum block weight and sigops.
 4617:07 <stickies-v> at a VERY high level: sorting tx packages by their ancestor feerate and picking the highest fee rate ones until the block is full?
 4717:07 <glozow> stickies-v: yes, great start!
 4817:07 <glozow> What is ancestor feerate?
 4917:07 <stickies-v> the combined fee rate of a tx and all of its unconfirmed parents
 5017:07 <larryruane> i always forget about sigops ... is it common that a block is less than max weight because it's at the max sigops? or is that more of a sanity check?
 5117:08 <stickies-v> so, all the fees of tx + parents, divided by the weight of tx + parents
 5217:08 <glozow> effexzi: yeah that's the idea for mining in general. Right now we're specifically talking about the process of picking the transactions.
 5317:08 <glozow> larryruane: AFAIK, that's very uncommon
 5417:08 <larryruane> stickies-v: without double-(multiple-) counting, right? so if a tx has 2 parents, and each of those shares a parent, we count that "grandparent" only once?
 5517:09 <Kaizen_Kintsugi_> hm
 5617:09 <stickies-v> larryruane good point, yes it should be the unique set of ancestors
 5717:09 <glozow> stickies-v: not just parents :) parents' parents, parents' parents' parents, etc.
 5817:09 <glozow> in other words, a tx's ancestor set is the set of all transactions that it depends upon
 5917:09 <glozow> larryruane: correct, we don't double count.
 6017:10 <glozow> why ancestor feerate in particular?
 6117:10 <larryruane> so when a new block is mined, it's possible for a tx's ancestor fee (and ancestor size) to decrease since some of its ancestors may be included in the new block
 6217:10 <glozow> Relevant PR: https://github.com/bitcoin/bitcoin/pull/7600
 6317:10 <svav> Could someoneĀ  give a definition for mapTx?
 6417:11 <glozow> larryruane: yes exactly. a subset of your ancestors may be included without you.
 6517:11 <larryruane> is that the crazy multimap thing that is essentially the mempool??
 6617:11 <ccdle12> svav: the main datastructure in the mempool that tracks txs according 5 indexes
 6717:11 <glozow> larryruane: yes xD it's the multi-index container that stores all mempool entries
 6817:12 <glozow> here is the definition: https://github.com/bitcoin/bitcoin/blob/094d9fda5ccee7d78a2e3d8b1eec17b8b6a33466/src/txmempool.h#L458-L488
 6917:13 <svav> and a definition for mapModifiedTx for clarity? Thanks
 7017:13 <glozow> this transitions nicely into our next question - what is `mapModifiedTx` ?
 7117:13 <glozow> svav: haha jinx
 7217:13 <theStack> that is quite some lines of code for a single typedef ^^
 7317:13 <glozow> here is the typedef for `mapModifiedTx`: https://github.com/bitcoin/bitcoin/blob/094d9fda5ccee7d78a2e3d8b1eec17b8b6a33466/src/node/miner.h#L93-L108
 7417:14 <glozow> What is `mapModifiedTx` used for?
 7517:15 <lightlike> so it's kind of a poor man's mempool with just one index?
 7617:15 <Dweezahr> like std multimap?
 7717:16 <Dweezahr> with modified transactions
 7817:16 <ccdle12> `mapModifiedTx` stores copies of txs in the mempool but only sorted by ancestor fee rate?
 7917:16 <svav> I am guessing now but ... is mapModifiedTx some sort of snapshot for a "potential" mempool, which has added a given transaction into the mempool to then evaluate total fee rates, and see how this compared to previous mapTx?
 8017:16 <stickies-v> we want to have a copy of the mempool where we can remove ancestors that have already been selected as part of a package, without actually affecting the mempool, I think?
 8117:16 <glozow> yes, it's not storing the same information as mapTx. How are they modified? When do we add a transaction to it?
 8217:17 <glozow> stickies-v: bingo
 8317:18 <glozow> lightlike: there are 2, you can index by iter and by ancestor feerate
 8417:18 <lightlike> oh, right
 8517:18 <lightlike> why do we call UpdatePackagesForAdded() right at the beginning of addPackageTxs() ? what could have been already added at this points so that we might need to change mapModifiedTx?
 8617:19 <glozow> lightlike: I'm not sure, I also had the same question
 8717:20 <glozow> AFAIK you can't pre-populate the template with transactions, but that would have been my guess
 8817:20 <lightlike> there is a comment talking about "previously added" transactions, but I didn't find any code that does that
 8917:21 <glozow> maybe it was removed and this wasn't cleaned up? idk
 9017:21 <glozow> Does everybody understand what `mapModifiedTx` is used for?
 9117:22 <glozow> To summarize, it contains transactions that have not been selected yet, but some subset of their ancestors have. So we can't just use the ancestor feerate cached in their mempool entries.
 9217:22 <glozow> (We don't modify the actual mempool while selecting transactions)
 9317:23 <svav> So basically it's a mechanism to ensure that fees available from packages are not erroneously counted multiple times?
 9417:23 <Kaizen_Kintsugi_> that is my understanding
 9517:23 <glozow> svav: yes, that's another way to look at it
 9617:23 <larryruane> and when you say some ancestors have been selected, you mean for inclusion in a block that we're creating?
 9717:24 <glozow> let's give a concrete example and we can use it for the next few questions
 9817:24 <glozow> Let's say you have tx C. It has parent B, and grandparent A. A <- B <- C
 9917:24 <glozow> Let's say A is 10sat/vB, B is 5sat/vB, and C is 1sat/vB
10017:25 <glozow> mapTx says A's ancestor feerate is 10sat/vB, B's ancestor feerate is 7.5sat/vB, and C's is 5.3sat/vB
10117:26 <glozow> A gets selected first. We store B and C in mapModifiedTx. B's new ancestor feerate is 5sat/vB. C's new ancestor feerate is 3sat/vB.
10217:26 <glozow> This makes sense yes?
10317:26 <glozow> larryruane: yes, selected = included in the block template we're building
10417:26 <larryruane> (so we're assuming all tx are the same size)
10517:26 <glozow> larryruane: correct. thanks
10617:27 <stickies-v> makes sense!
10717:27 <theStack> yup, that sounds alright
10817:27 <larryruane> so this way, if we don't end up mining the next block, it's very easy to "undo" this
10917:27 <glozow> larryruane: yep!
11017:28 <larryruane> (we just toss out those entries in mapModifiedTx)
11117:28 <glozow> Great. So in this example, what happens next? Which transaction gets selected for inclusion, and how do we update mapModifiedTx?
11217:29 <theStack> B gets selected, and C is stored in mapModifiedTx with an ancestor feerate of 1sat/vB?
11317:29 <glozow> theStack: exactly!
11417:29 <larryruane> oh so C appears in mapModifiedTx twice?
11517:30 <theStack> it's just updated i guess?
11617:30 <larryruane> yes you're probably right
11717:30 <glozow> yes, it's updated. there is only 1 entry.
11817:30 <glozow> sorry for the confusion
11917:30 <glozow> we update using `update_for_parent_inclusion`: https://github.com/bitcoin/bitcoin/blob/094d9fda5ccee7d78a2e3d8b1eec17b8b6a33466/src/node/miner.h#L115
12017:31 <glozow> called here in `UpdatePackagesForAdded`: https://github.com/bitcoin/bitcoin/blob/f3e0ace8ecd84009a23da6b0de47f01d79c45772/src/node/miner.cpp#L258
12117:32 <glozow> https://github.com/bitcoin/bitcoin/blob/f3e0ace8ecd84009a23da6b0de47f01d79c45772/src/node/miner.cpp#L251-L259
12217:32 <glozow> here shows that it's updated. we have 2 branches: for creating a new entry and for updating an existing one.
12317:32 <glozow> This brings us to the next question - notice anything fishy? What's the bug?
12417:33 <theStack> so generally it's only ever the size and the fees which are updated separately, and the resulting feerate is calculated later when needed?
12517:33 <theStack> (not referring to the bug, just a general question)
12617:33 <glozow> theStack: yes
12717:34 <glozow> CFeeRate doesn't remember what the size and amount were, so it's not possible to deduct a transaction from a package feerate that way.
12817:34 <glozow> we have to just remember the total fees and total size
12917:34 <theStack> ok that makes sense
13017:35 <glozow> `CFeeRate` definition: https://github.com/bitcoin/bitcoin/blob/094d9fda5ccee7d78a2e3d8b1eec17b8b6a33466/src/policy/feerate.h#L29
13117:36 <glozow> Anyone find the bug?
13217:37 <larryruane> https://github.com/bitcoin/bitcoin/blob/f3e0ace8ecd84009a23da6b0de47f01d79c45772/src/node/miner.cpp#L258 passes a closure (function pointer), which is why `update_for_parent_inclusion` has to be written in that operator() style?
13317:37 <glozow> larryruane: good question :) was going to be my bonus question
13417:39 <stickies-v> I suppose the bug is that update_for_parent_inclusion uses GetFee instead of GetModifiedFee?
13517:39 <svav> Finding this bug is way beyond my capabilities I'm afraidĀ ;(
13617:39 <lightlike> when adjusting the modified entry, the actual feerate was used, not the modified one. so things would be wrong if miners had prioritised the transaction.
13717:39 <glozow> larryruane: I think the answer is simply = this code was written before we used C++11, so you couldn't use lambdas
13817:39 <glozow> stickies-v: winner winner
13917:40 <stickies-v> it's the only changed line of code that's not in a test file ĀÆ\_(惄)_/ĀÆ
14017:40 <glozow> lightlike: exactly
14117:40 <glozow> stickies-v: very smart :P
14217:40 <theStack> xD
14317:40 <larryruane> so _normally_ the two are the same, but if the tx had its feerate modified (using the prioritisetransaction RPC, then it will be wrong without this fix
14417:41 <glozow> larryruane: yeah. I'm not sure how common it is to use prioritisetransaction
14517:41 <glozow> we didn't really have test coverage for it
14617:41 <lightlike> yes, I was wondering wherther there is evidence/statistics of miners using prioritisetransaction much?
14717:42 <larryruane> ok now i have a question, who is the world found this bug?? Oh the PR description (first comment) explains it, the result of an earlier review! that's great
14817:42 <glozow> yeah, technically Marco found it
14917:42 <stickies-v> oh okay so the "modified" in "GetModifiedFee" has nothing to do with the "modified" in "mapModifiedTx"?
15017:43 <theStack> what is the real use case for prioritisetransaction? miners accepting bribes? :)
15117:43 <glozow> stickies-v: correct, haha. a bit confusing
15217:43 <sipa> or mining their own transactions
15317:43 <theStack> (OTOH the mining fee itself is kind of a bribe already)
15417:43 <theStack> sipa: makes sense yes
15517:43 <glozow> theStack: lightlike: sipa: I think we should just get rid of it. And replace it with an option to force-include transactions in the template
15617:43 <glozow> Would save 64b per mempool entry
15717:44 <lightlike> or miners censoring transactions, the modification can also be negative
15817:44 <randomcrow> marathon would be pleased
15917:45 <theStack> lightlike: interesting point!
16017:45 <larryruane> prioritysettransaction seems like one of those features that if core didn't implement it, someone else would (so may as well standardize it)
16117:45 <glozow> lightlike: indeed. you can censor by prioritising with -MAX_MONEY
16217:45 <larryruane> would it be easier to not let the tx into the mempool in the first place?
16317:46 <glozow> larryruane: I mostly disagree. If it's a feature that a small fraction of miners (also small fraction of users) use, seems unnecessary.
16417:46 <larryruane> really basic question: the mempool gets persisted to disk, right? so if the node goes down, then when we come back up again, we'll have the mempool from before, with all the modifications?
16517:47 <glozow> larryruane: modified fees are used in mempool acceptance logic, too. If you prioritise with a negative amount, it'll also not make it into your mempool
16617:47 <glozow> code here: https://github.com/bitcoin/bitcoin/blob/094d9fda5ccee7d78a2e3d8b1eec17b8b6a33466/src/validation.cpp#L822
16717:48 <larryruane> glozow: :+1
16817:48 <theStack> playing devils advocate: maybe prioritisetransaction will be used more once blocks get full regularly in the future (right now they aren't)
16917:48 <theStack> not saying that this a strong or good argument to keep it though
17017:48 <glozow> larryruane: yes, fee deltas are persisted to disk. code here: https://github.com/bitcoin/bitcoin/blob/094d9fda5ccee7d78a2e3d8b1eec17b8b6a33466/src/validation.cpp#L4729-L4731
17117:49 <glozow> theStack: it would be nice if people could fee-bump the normal way :) if people need to pay miners out-of-band, there's something wrong with our fee bumping
17217:49 <glozow> it is a valid argument though ofc
17317:50 <theStack> glozow: true! i assume with "normal" you mean both RBF and CPFP?
17417:50 <glozow> yep!
17517:51 <glozow> we have one more question that we haven't covered from the notes: Why is MAX_CONSECUTIVE_FAILURES necessary? code here: https://github.com/bitcoin/bitcoin/blob/f3e0ace8ecd84009a23da6b0de47f01d79c45772/src/node/miner.cpp#L323
17617:51 <randomcrow> spam
17717:53 <Dweezahr> as the first items do no longer fit, it is unlikely that future items will fit as they are decremental
17817:53 <theStack> seems to be used to avoid taking too much time building a block which is almost full anyway
17917:53 <lightlike> to save time - aborting early instead of trying out the entire mempool when the block is almost full so most transaction won't fit anymore.
18017:53 <glozow> yep exactly
18117:54 <glozow> like if we only have 5 weight units left, which no transaction will fit
18217:54 <glozow> there's no need to try every transaction in the mempool
18317:54 <Kaizen_Kintsugi_> so its a probability thing, if we start failing a lot, the liklihood of finding a transaction that does fit drops
18417:54 <theStack> i wonder where the magic number 4000 comes from btw... is this derived from a consensus limit on how large the coinbase is allowed to be? (if there is such a limit)
18517:54 <larryruane> would you say it's an anti-DOS measure too?
18617:55 <glozow> larryruane: not really. nobody can force you to build a block template
18717:56 <stickies-v> and you also have your mempool size limit, in case someone wanted to spam you with a trillion transactions
18817:57 <glozow> theStack: oh that's a good question. I'm not sure, maybe sipa knows? code added here https://github.com/bitcoin/bitcoin/pull/9868/
18917:57 <larryruane> glozow: the code you linked to most recently, `addPackageTxs` ... git blame seems to show it was added 6 years ago, is that accurate? I thought packages were a recent addition (that you mostly implemented)
19017:57 <Dweezahr> why was int64_t chosen over uint64_t?
19117:57 <glozow> larryruane: nope. I'm adding packages to mempool validation logic. We've had packages in mempool and miner for years!
19217:58 <larryruane> ok, TIL ... even though they haven't been used (because not supported by P2P)? Or do I have that wrong?
19317:58 <svav> It will be something to do with that it's 4 x 1000
19417:58 <svav> The 4 is a conversion factor
19517:58 <glozow> Dweezahr: which item are you referring to?
19617:59 <Kaizen_Kintsugi_> I think int64_t is parsed by this object that outputs JSON
19717:59 <Dweezahr> nConsecutiveFailed
19817:59 <theStack> svav: yes, 4000 WU = 1000 vbytes... but then, where do the 1000 come from? :p
19917:59 <sipa> glozow theStack My (vague) recollection is that these min/max weight limits on blocks were there before.
20017:59 <sipa> Having a max size is useful, in case the exact size of the coinbase isn't known yet.
20118:00 <glozow> I guess a max 1000vB coinbase sounds reasonable
20218:00 <theStack> indeed
20318:00 <glozow> Ah we're out of time. Thanks for coming everyone!
20418:01 <lightlike> larryruane: I think the package logic has been used, child-pays-for-parent works after all. It's just that the parent currently needs a high enough feerate to make it into the mempool (even if it's not enough to get mined)
20518:01 <glozow> I'm looking for somebody to host next week, so if you're interested please lmk!
20618:01 <glozow> #endmeeting
20718:01 <theStack> thanks for hosting glozow! that was fun
20818:01 <emzy> Thank you glozow and all!
20918:01 <lightlike> thanks glozow !
21018:01 <larryruane> glozow: thanks!
21118:02 <glozow> Yeah larryruane: to answer your question about the packages, the nice thing is we've had CPFP for 6 years, but the problem is it only works for transactions already in the mempool.
21218:02 <stickies-v> ty glozow and everyone for the discussion!
21318:02 <Kaizen_Kintsugi_> thank you! I learned a lot
21418:02 <larryruane> makes sense glozow thanks
21518:02 <svav> Thanks glozow and all!
21618:03 <glozow> and Dweezahr: not sure why it's a signed integer. really it could just be a uint16