The PR branch HEAD was 082c5bf099 at the time of this review club meeting.
We recently looked at PR #22677, which cuts the circular dependency
between validation and txmempool. This week's PR was split off from that PR
and takes a slightly different approach to simplifying the CTxMemPool::check()
function.
Notes
The CTxMemPool class is a data structure that stores the set of unconfirmed
transactions, along with various methods to update or query that set. As well
as the transactions themselves, the mempool stores metadata about the
transactions (such as ancestor/descendant dependencies) that makes it highly
optimized for various operations, such as constructing blocks.
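As a rough mental model only (the real mempool is a multi-indexed container of CTxMemPoolEntry objects), an entry can be pictured as a transaction plus cached package metadata. The struct and field names below are illustrative stand-ins, not Core's:

```cpp
// Illustrative sketch only: a simplified stand-in for CTxMemPoolEntry showing
// the kind of cached package metadata the mempool keeps alongside each
// transaction. Field names are hypothetical, not Bitcoin Core's.
#include <cstdint>
#include <string>

struct ToyMempoolEntry {
    std::string txid;            // transaction id
    int64_t fee{0};              // fee paid by this transaction alone
    // Cached aggregates over this entry plus all of its in-mempool ancestors:
    uint64_t ancestor_count{1};  // includes the entry itself
    uint64_t ancestor_size{0};
    int64_t ancestor_fees{0};
    // ...and over this entry plus all of its in-mempool descendants:
    uint64_t descendant_count{1};
    uint64_t descendant_size{0};
    int64_t descendant_fees{0};
};
```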
The mempool must only contain non-conflicting, valid transactions which
could theoretically be included in the next block. There are therefore
many invariants such as:
a transaction may only appear in the mempool if all of its inputs are
either in the UTXO set or are the outputs of other mempool transactions
no two transactions in the mempool may spend the same output
the {descendant|ancestor} {fees|count|size} cached for the transaction must
be the correct value
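As an illustration of the second invariant, here is a minimal, self-contained sketch of a conflict check over a toy mempool; the types are simplified stand-ins, not Core's CTxMemPool, CTransaction or COutPoint:

```cpp
// Minimal sketch of the "no two mempool transactions spend the same output"
// invariant, using toy types rather than Core's data structures.
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <utility>
#include <vector>

using ToyOutpoint = std::pair<std::string, uint32_t>; // (funding txid, output index)

struct ToyTx {
    std::string txid;
    std::vector<ToyOutpoint> inputs;
};

// Walk the whole (toy) mempool and assert that every outpoint is spent at most once.
void CheckNoConflicts(const std::vector<ToyTx>& mempool)
{
    std::set<ToyOutpoint> spent;
    for (const ToyTx& tx : mempool) {
        for (const ToyOutpoint& in : tx.inputs) {
            // set::insert().second is false if the outpoint was already seen
            assert(spent.insert(in).second && "conflicting spend in mempool");
        }
    }
}

int main()
{
    std::vector<ToyTx> mempool{
        {"b", {{"a", 0}}},
        {"c", {{"a", 1}, {"b", 0}}},
    };
    CheckNoConflicts(mempool); // passes: no outpoint is spent twice
}
```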
There are several public methods that can alter the contents of the
mempool. Those public methods can be called in many different
circumstances, such as:
removing transactions when a block is connected
expiring old transactions or evicting transactions by feerate when limiting
the mempool size
replacing transactions for RBF
re-inserting transactions from disconnected blocks during a re-org
Maintaining the mempool’s invariants during all of those operations is
very delicate, and failure to do so can lead to subtle bugs
such as those fixed in PR #2876 or
PR #5267.
For that reason, PR #2876
introduced a CTxMemPool::check() method, which asserts many of the
mempool’s invariants. Later, PR
#5267 extended those checks.
The checks are computationally expensive, so the -checkmempool command
line option can control how frequently they’re run. By default, mempool
checks are always disabled for mainnet, signet and testnet nodes, and always
enabled for regtest nodes.
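For intuition, here is a hedged sketch of gating an expensive check on a 1-in-N ratio. The option plumbing and the RNG that Bitcoin Core actually uses differ; the names g_check_ratio and MaybeCheck below are hypothetical:

```cpp
// Illustrative sketch of running an expensive sanity check only 1 out of every
// N calls, in the spirit of -checkmempool. Names are hypothetical, not Core's.
#include <iostream>
#include <random>

static int g_check_ratio = 0; // 0 = never check; N = check roughly 1 in N calls

void ExpensiveConsistencyCheck()
{
    std::cout << "running full mempool consistency check\n";
}

void MaybeCheck()
{
    if (g_check_ratio == 0) return; // checks disabled (the mainnet default)
    static std::mt19937 rng{std::random_device{}()};
    std::uniform_int_distribution<int> dist(0, g_check_ratio - 1);
    if (dist(rng) != 0) return;     // skip this call most of the time
    ExpensiveConsistencyCheck();
}

int main()
{
    g_check_ratio = 1; // check on every call, as the notes say regtest does by default
    MaybeCheck();
}
```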
Questions
Did you run the bench tests before and after the changes to
CTxMemPool::check()? Were your results the same as reported in the PR?
Is it important for these checks to be fast? Why/why not?
What does the UpdateCoins() function do? What is it replaced with in
this PR?
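As a starting point for this question, the sketch below shows the general concept of applying a transaction to a coins view (spend its inputs, add its outputs), using toy types rather than the real UpdateCoins()/CCoinsViewCache machinery:

```cpp
// Toy sketch of applying a transaction to a UTXO view: erase the coins it
// spends, insert the coins it creates. A simplification of what Core does
// against CCoinsViewCache; these types are illustrative only.
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

using ToyOutpoint = std::pair<std::string, uint32_t>; // (txid, output index)

struct ToyTx {
    std::string txid;
    std::vector<ToyOutpoint> inputs;
    std::vector<int64_t> output_values; // one value per created output
};

using ToyUtxoSet = std::map<ToyOutpoint, int64_t>; // outpoint -> value

void ApplyTxToView(const ToyTx& tx, ToyUtxoSet& view)
{
    for (const ToyOutpoint& in : tx.inputs) view.erase(in);      // spend inputs
    for (size_t i = 0; i < tx.output_values.size(); ++i) {       // add outputs
        view[{tx.txid, static_cast<uint32_t>(i)}] = tx.output_values[i];
    }
}

int main()
{
    ToyUtxoSet view{{{"a", 0}, 50}};     // one coin: a:0 worth 50
    ToyTx tx{"b", {{"a", 0}}, {20, 29}}; // spends a:0, creates b:0 and b:1
    ApplyTxToView(tx, view);             // view now contains b:0 and b:1
}
```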
What was waitingOnDependants used for before this PR? Explain what
this while
loop was doing. Why can it now be removed?
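As background for this question, here is a simplified sketch of the shape of the pre-PR algorithm: entries are visited in arbitrary order, anything whose in-mempool parents have not yet been verified is deferred to a waiting list, and a loop keeps cycling that list until it drains. The names and types are stand-ins, and the real code also asserts that each pass makes progress:

```cpp
// Simplified sketch of the pre-PR check() structure: defer entries whose
// in-mempool parents haven't been verified yet, then drain the waiting list.
// In the worst case (children listed before parents) this re-scans the list
// repeatedly, which is roughly quadratic in the length of the chain.
#include <deque>
#include <set>
#include <string>
#include <vector>

struct ToyEntry {
    std::string txid;
    std::vector<std::string> parent_txids; // in-mempool parents
};

bool ParentsVerified(const ToyEntry& e, const std::set<std::string>& verified)
{
    for (const std::string& p : e.parent_txids) {
        if (!verified.count(p)) return false;
    }
    return true;
}

void CheckAllEntries(const std::vector<ToyEntry>& mempool)
{
    std::set<std::string> verified;
    std::deque<ToyEntry> waiting_on_dependants;

    // First pass: verify what we can, defer the rest.
    for (const ToyEntry& e : mempool) {
        if (ParentsVerified(e, verified)) {
            verified.insert(e.txid);            // "check" the entry
        } else {
            waiting_on_dependants.push_back(e); // parents not verified yet
        }
    }

    // Second pass: keep cycling the waiting list until it drains. This
    // terminates only because every parent really is in the mempool.
    while (!waiting_on_dependants.empty()) {
        ToyEntry e = waiting_on_dependants.front();
        waiting_on_dependants.pop_front();
        if (ParentsVerified(e, verified)) {
            verified.insert(e.txid);
        } else {
            waiting_on_dependants.push_back(e); // try again on a later cycle
        }
    }
}

int main()
{
    // Worst case discussed in the meeting log: children listed before parents.
    std::vector<ToyEntry> mempool{
        {"E", {"D"}}, {"D", {"C"}}, {"C", {"B"}}, {"B", {"A"}}, {"A", {}},
    };
    CheckAllEntries(mempool);
}
```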
How does the GetSortedDepthAndScore() function work? Why does the
returned vector not contain any descendants before their ancestors?
GetSortedDepthAndScore() uses the CTxMemPoolEntry’s cached ancestor
count to sort the transactions. What would happen if those cached values
were incorrect?
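To make this concrete, here is a hedged sketch of a depth-and-score style sort: ascending cached ancestor count first, with score and txid only as tie-breakers. The exact comparator in GetSortedDepthAndScore() differs in detail, but the key property is the same: a parent always has strictly fewer ancestors than any of its descendants, so it sorts earlier, provided the cached counts are correct.

```cpp
// Hedged sketch of a GetSortedDepthAndScore()-style ordering: sort by cached
// ancestor count ascending, then by a score (e.g. a feerate-based value) and
// txid as tie-breakers. Types and names are simplified stand-ins.
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

struct ToySortedEntry {
    std::string txid;
    uint64_t ancestor_count; // cached value: includes the entry itself
    double score;            // e.g. a feerate-based score
};

std::vector<ToySortedEntry> SortedDepthAndScore(std::vector<ToySortedEntry> entries)
{
    std::sort(entries.begin(), entries.end(),
              [](const ToySortedEntry& a, const ToySortedEntry& b) {
                  if (a.ancestor_count != b.ancestor_count) {
                      return a.ancestor_count < b.ancestor_count; // ancestors first
                  }
                  if (a.score != b.score) return a.score > b.score; // higher score first
                  return a.txid < b.txid;                           // deterministic tie-break
              });
    return entries;
}
```

Note that if a cached ancestor count were too low, a descendant could sort ahead of one of its ancestors, breaking the parents-before-children guarantee that callers such as the reworked check() rely on.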
Meeting Log
<larryruane> If this check is enabled via `-checkmempool`, does that mean the check code runs on each and every modification to the mempool? I would think that's too much for mainnet (even with these improvements)
<jnewbery> larryruane: I agree that running on mainnet is probably not what we want to do. Even with the performance gains, I think it'd add too much load (and we maybe don't want to be adding asserts to nodes running mainnet)
<jnewbery> It's actually only called in two places from net_processing, and once in validation. If you grep for ".check(" or ">check(" you'll find them
<jnewbery> theStack: correct. And the other one in net_processing is after processing an orphan transaction. The call from validation is at the end of ActivateBestChainStep()
<jnewbery> Did anyone try running the functional tests? I wonder if these changes make any difference to runtime (possibly not, since the mempools in the functional tests are mostly empty)
<jnewbery> check() is iterating over the entries in mapTx, which may be in any order. Say my mempool contains 5 transactions A -> B -> C -> D -> E. What's the worst order they could be in?
<jnewbery> right, so in the first iteration, we'll go through all the transactions and add (E,D,C,B) to the waitingOnDependants list, and then verify A
<jnewbery> then in the while loop, we'll iterate over E,D,C and process B, then iterate over E,D and process C, and so on, then iterate over E and process D, and finally process E.
<KaizenKintsugi> I will guess the number of dependents. I read in the git comments that we want to sort by "topological order" but I don't know what that means
<jnewbery> raj: sorting by ascending ancestor count gives a partial ordering. The output of the topo sort is a total ordering that doesn't violate any of the partially ordered pairs
<jnewbery> We've mostly done the next question, but perhaps someone could add some more detail. How does the GetSortedDepthAndScore() function work? Why does the returned vector not contain any descendants before their ancestors?
<theStack> hope I'm not being too off-topic, but is there any specific reason why the hash value is included as the third criterion for sorting? I can't think of a scenario where this really matters
<jnewbery> ok, final question. GetSortedDepthAndScore() uses the CTxMemPoolEntry’s cached ancestor count to sort the transactions. What would happen if those cached values were incorrect?
<jnewbery> part of check() is checking that those cached values are correct. If any of those are wrong, then the mempool will be indexed incorrectly, which could lead to all kinds of problems
<raj> since we don't run check() by default, in regular cases are we just hoping that this kind of mempool inconsistency doesn't happen?
<jnewbery> we run check() by default in the functional tests. It's pretty usual to have these kinds of sanity checks enabled in test and disabled in production
<jnewbery> nodes are free to implement whatever mempool they want, and whatever policy rules for accepting transactions into their mempool that they want.
<larryruane> jnewbery: yes, I thought I'd heard that miners likely have made proprietary changes to the mempool code (even if running core), but we don't know for sure