Cluster mempool: add txgraph diagrams/mining/eviction (mempool)

https://github.com/bitcoin/bitcoin/pull/31444

Host: glozow  -  PR author: sipa

Notes

Motivations and Background

  • Block building (done by the BlockAssembler) is the process of selecting ~4MWU worth of mempool transactions while trying to maximize the total fees.
  • Non-mining nodes might never use the block builder, but are still interested in comparing the incentive compatibility of transactions as a metric for whether it is worth keeping: if the mempool exceeds its maximum memory allowance, it should evict the transactions that are least likely to be mined soon.
  • The block building algorithm on master uses ancestor set-based sort to dynamically linearize mempool transactions by including ancestor sets in order of highest ancestor score (minimum between ancestor feerate and individual feerate). As transactions are included, their descendants’ ancestor scores can change; the BlockAssembler keeps a copy of mempool entries with their updated ancestor information in a separate map, which avoids modifying the mempool itself. We covered this algorithm in more detail in a previous meeting.
  • Ancestor set-based sorting can be used to linearize the entire mempool to find the “worst” transaction for eviction, but this algorithm would be too slow to use in practice.
  • Instead, eviction runs a similar algorithm with ascending descendant score (maximum between descendant feerate and individual feerate). This also linearizes transactions in an approximate order of least to most likely to be mined, but the linearization is not exactly opposite to the result of ancestor set-based sorting. This asymmetry is problematic:
    • What if the first transaction selected for eviction is also the transaction that would be selected first for mining?
    • Similarly, what if we have transactions that are “junk” (e.g. fall below the node’s configured -blockmintxfee and would thus never be selected for mining) but can’t be kicked out because they have a high descendant score? This mempool limitation necessitated the requirement that package transactions be above the minimum relay feerate (see PR #26933).
  • Due to similar limitations, Replace-by-Fee code cannot properly determine whether the proposed replacement(s) are more incentive compatible, so it uses approximations like individual feerate comparisons. These imperfect heuristics are a major cause of pinning problems and fee-bumping inefficiencies.
  • The main motivations for cluster mempool are to address these problems.

TxGraph

  • We have reviewed TxGraph basic functionality in a previous meeting. If TxGraph is new to you, the host recommends reading some of those notes and getting a feel for TxGraph by reviewing SimTxGraph in the fuzzer, which has similar functionality but a far simpler design:
    • It uses a single DepGraph to represent all transactions across all clusters, which means it doesn’t need to implement merges or track much information about the set of clusters.
    • It implements the staging level by creating a copy of main and deleting either the main or the staging to commit or abort staged changes. This approach is much more memory and CPU-intensive, but doesn’t need to track the differences between the levels.
  • A cluster mempool essentially keeps the entire mempool linearized at all times (lazily using TxGraph), which makes it easy to quickly determine a transaction’s linearization position (including what the highest and lowest transactions are) and compare the current mempool with a potential one.
  • PR #31444 adds TxGraph functionality for these purposes:
    • TxGraph adds a chunk index, a total ordering of all chunks in the graph across all clusters. A ChunkOrder comparator defines the order in which chunks would be mined.
    • BlockBuilder can just iterate through the chunks in order. Also, since a total orderering of all graph entries exists, we can get the “least likely to be mined” transaction without building a full mempool worth of block templates: simply take the last chunk in the index.
    • TxGraph exposes a BlockBuilder to iterate over the chunks from best to worst. BlockBuilder can Skip chunks (i.e. if they are too large to fit in the remaining block space), and those chunks’ descendants will be subsequently skipped as well.
    • TxGraph also exposes a GetWorstMainChunk method to identify the transactions that would be included in a block last.
  • The PR also adds a GetMainStagingDiagrams method to make it possible to compare the feerate diagram of the TxGraph with and without its staged changes.
    • This Delving post discusses the theory behind evaluating replacements using feerate diagrams.
    • Feerate diagram comparison is already used in package RBF. Package RBF is limited to clusters of size 2, which are easy to linearize even without cluster mempool.

Questions

  1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK? What was your review approach?
  2. Why are block building and eviction relevant to each other? Wouldn’t it be easier to evict transactions by the order they entered the mempool?
  3. True / false: if all clusters are singletons (have 1 transaction each), m_main_chunkindex would just be sorting the transactions by feerate.
  4. In English, using the approach in this PR, what is the algorithm for selecting transactions in order for block building? And for eviction?
  5. How would a client of BlockBuilder use it to build a block? When would GetCurrentChunk, Include, and Skip be called?
  6. What is the expected lifetime of BlockBuilder (is it similar to CTxMemPool’s or very different)?
  7. Can you create a BlockBuilder when staging exists? Can you build a block using the TxGraph’s state with its staged changes?
  8. Does BlockBuilder modify TxGraph (a “yes and no”-style answer is ok)?
  9. Why does BlockBuilder need to remember the set of skipped transactions? Why can it be represented as a set of Clusters?
  10. This commit adds new fields in data structures that need to point to each other: Entry now contains an iterator to the transaction’s ChunkData in m_main_chunkindex, and ChunkData refrence Entrys by their position in m_entries. In your review, how did you check that these pointers are always kept up-to-date?
  11. Conceptually, what are all the ways that an entry’s chunk index can change?
  12. In the ChunkOrder comparator, when cmp_feerate != 0, why can it be returned directly without comparing position within the cluster?
  13. m_main_chunkindex_observers indicates the existence of aBlockBuilder. Why is it an integer instead of a boolean?
  14. This call to GetClusterRefs gets the vector of Refs corresponding to the chunk. Why are the arguments ret.first and start_pos = chunk_end_entry.m_main_lin_index + 1 - chunk_data.m_chunk_count correct?
  15. Continuing from the last question, why is std::reverse called on the result?
  16. What is GetMainStagingDiagrams useful for? Why might we exclude the clusters that are identical in main and staging?