Skip this section if you are just interested in reviewing the code and don't need to hear the pitch for why.
The primary goal of Cluster Mempool
(tracked here) is to enable the mempool to more
intelligently assess which transactions would be most incentive compatible to keep. In simple terms: "which transactions
would earn a miner the most in fees?" The complexity of incentive compatibility assessment comes from dependencies
between transactions, i.e. where a transaction spends the output of another.
If the motivations for this project are unclear to you, try to spend up to 60min on these reading materials:
"But aren't fees a trivially small portion of block rewards? Why do we care so much?"
An original motivation for trying to come up with a metric for "incentive compatibility" was within the context of
RBF pinning. Our
current feerate-based approach causes us to reject transactions that would be more incentive compatible, creating
censorship problems.
"Why is cluster-tracking the solution for incentive compatibility?"
This Delving post explains why incentive
compatibility is complicated and why tracking and limiting clusters is necessary for a complete solution.
We talked about clusters in previous review club meetings about auto-bumping unconfirmed UTXOs in coin selection
(which added CTxMemPool::GatherClusters) and Topologically Restricted Until Confirmation transactions. As we discussed,
one of the benefits of TRUC is to have an effective cluster limit within the existing framework of
ancestor/descendant limits (a cluster count limit of 2 is equivalent to limiting both ancestor and
descendant counts to 2). However, a limit of 2 is quite prohibitive.
Mempool transactions and their dependency relationships can naturally be represented as a directed graph
where nodes represent transactions and edges represent UTXO spending relationships. This graph may contain multiple connected
components, each called a cluster. A transaction's cluster includes all of the transactions it is connected to, which
can be defined recursively: all parents and children (nodes with an edge to or from this one), each of their parents and children, etc.
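The recursive definition above amounts to finding a connected component. As a minimal sketch (the `Tx` struct and `GetCluster` function are illustrative names, not Bitcoin Core code), a transaction's cluster can be computed by breadth-first search over parent/child edges:

```cpp
// Illustrative sketch: a transaction's cluster is the connected component
// containing it, found by BFS over parent and child edges.
#include <cassert>
#include <queue>
#include <set>
#include <vector>

struct Tx {
    std::vector<int> parents;   // indices of transactions this one spends from
    std::vector<int> children;  // indices of transactions spending this one
};

// Returns the set of transaction indices in the same cluster as `start`.
std::set<int> GetCluster(const std::vector<Tx>& txs, int start)
{
    std::set<int> cluster{start};
    std::queue<int> todo;
    todo.push(start);
    while (!todo.empty()) {
        int cur = todo.front();
        todo.pop();
        // Visit every neighbor (parent or child) not yet in the cluster.
        for (int nb : txs[cur].parents) {
            if (cluster.insert(nb).second) todo.push(nb);
        }
        for (int nb : txs[cur].children) {
            if (cluster.insert(nb).second) todo.push(nb);
        }
    }
    return cluster;
}
```

Note that the traversal must follow edges in both directions: two transactions with no ancestor/descendant relationship still share a cluster if they have a common child.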
Cluster Mempool includes some large work components, including (1) adding cluster linearization algorithms which are
used to order (and thus compare) transactions by incentive compatibility, (2) changing the mempool's internal structure
to enable cluster tracking and always maintain a linearization order, and (3) upgrading RBF, eviction, and mining to
use the improved incentive compatibility assessments. This PR is pretty much only relevant to the second component; you
can review it without knowing anything about the cluster linearization algorithm or specific RBF
rules. The code for the first component has been merged (see PR #30126
and its followups), and work-in-progress code for the third component can be found at PR #28676.
Here is everything you need to know about linearization for this PR:
A linearization is topologically valid if it ensures that a parent transaction appears somewhere before its child.
There are often multiple valid linearizations for the same cluster, and linearization L1 is better than L2 if taking the
first N bytes of L1 would yield higher fees than taking the first N bytes of L2. The optimal linearization of a
cluster is better than or as good as all the others.
Computation needed for linearization scales with the size of the cluster, which is why cluster size needs to be limited.
A cluster requires re-linearization each time a transaction is added or removed.
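The prefix-based comparison of linearizations can be sketched as follows. This is illustrative only (`TxInfo`, `FeeAtSize`, and `AtLeastAsGood` are hypothetical names, and the sketch counts a transaction's fee only once the whole transaction fits, a simplification of the chunk-based feerate diagrams the real code uses):

```cpp
// Illustrative sketch: comparing two linearizations of the same cluster by
// cumulative fee collected within equal-size prefixes.
#include <cassert>
#include <cstdint>
#include <vector>

struct TxInfo { int64_t fee; int32_t size; };

// Fee collected after taking the first `bytes` of the linearization.
// Simplification: a transaction's fee counts only once it fully fits.
int64_t FeeAtSize(const std::vector<TxInfo>& lin, int32_t bytes)
{
    int64_t fee = 0;
    int32_t used = 0;
    for (const auto& tx : lin) {
        if (used + tx.size > bytes) break;
        used += tx.size;
        fee += tx.fee;
    }
    return fee;
}

// l1 is at least as good as l2 if it never collects less fee at any prefix.
bool AtLeastAsGood(const std::vector<TxInfo>& l1, const std::vector<TxInfo>& l2,
                   int32_t total_size)
{
    for (int32_t b = 0; b <= total_size; ++b) {
        if (FeeAtSize(l1, b) < FeeAtSize(l2, b)) return false;
    }
    return true;
}
```

For example, with a parent A (fee 1) and two children B (fee 10) and C (fee 2), both [A, B, C] and [A, C, B] are topologically valid, but [A, B, C] is better because its prefixes collect fees at least as fast everywhere.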
If you have more time to spend on background concepts or need more convincing, here is additional optional reading material.
BIP 431 provides an overview of why RBF pinning
is problematic and some of the other pieces that come together to address its issues.
This Delving post provides an overview of the
algorithms used in cluster linearization.
This Delving post discusses how to evaluate
replacements using feerate diagrams, which are enabled with cluster mempool.
Background on Mempool Architecture and Interfaces
Skip this section if you are already familiar with current mempool code.
"The Mempool" usually refers to the class CTxMemPool.
This class holds a map of
CTxMemPoolEntrys
and other supporting data structures. It is responsible for keeping itself
internally consistent. Much of the code is dedicated to dealing with dependency relationships.
Outside of the CTxMemPool class, here are the major pieces of mempool functionality to think about
while reviewing cluster mempool:
Mempool validation
decides what unconfirmed transactions to keep in memory. This code also enforces transaction relay policies.
Block connection
removes mempool transactions that would be inconsistent with chainstate.
Block disconnection
(aka reorganization) resubmits the transactions from the previous chain that are no longer in the new chain.
Block assembly
(aka mining) fills a block template using mempool transactions, optimizing for fee revenue.
Mempool eviction
is an internal process that decides what transactions to delete when the memory usage exceeds configured limits. It is
triggered in multiple places, including Mempool validation and Block disconnection.
On master, Bitcoin Core's mempool does not track clusters, but it does track ancestor and descendant sets.
CTxMemPoolEntry
stores references to its parents and children and tracks the {count, total vsize, total fees} of its ancestor and
descendant sets. Each time a transaction is added or removed, CTxMemPool uses the references to traverse ancestors
and/or descendants recursively, updating the cached ancestor and descendant information.
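To make the cost of this bookkeeping concrete, here is a sketch of what computing a transaction's ancestor statistics by traversal looks like. The `Entry`, `AncestorStats`, and `ComputeAncestorStats` names are illustrative, not Bitcoin Core's; on master the totals are cached in each CTxMemPoolEntry and updated incrementally, but every add or remove still requires a traversal like this one:

```cpp
// Illustrative sketch: recomputing ancestor {count, vsize, fees} for one
// entry by walking its parent references.
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

struct Entry {
    int64_t fee;
    int32_t vsize;
    std::vector<int> parents;  // indices of direct parents in the pool
};

struct AncestorStats { int count; int32_t vsize; int64_t fees; };

// Ancestor set includes the entry itself, matching mempool convention.
AncestorStats ComputeAncestorStats(const std::vector<Entry>& pool, int idx)
{
    std::set<int> seen{idx};
    std::vector<int> todo{idx};
    while (!todo.empty()) {
        int cur = todo.back();
        todo.pop_back();
        for (int p : pool[cur].parents) {
            if (seen.insert(p).second) todo.push_back(p);
        }
    }
    AncestorStats stats{0, 0, 0};
    for (int i : seen) {
        ++stats.count;
        stats.vsize += pool[i].vsize;
        stats.fees += pool[i].fee;
    }
    return stats;
}
```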
Design and Usage with Existing Mempool
While the current (on master) CTxMemPool class does not contain a data structure dedicated to storing the graph representation
itself, "the graph already exists" in the form of CTxMemPoolEntry's parent/child references and cached
ancestor/descendant information.
The responsibility of keeping this graph up-to-date and within its specified limits largely belongs to CTxMemPool, but
occasionally bleeds out to other areas. For example, in mempool validation, MemPoolAccept queries CTxMemPool for the
ancestors of a transaction in order to:
Calculate whether this transaction would exceed the ancestor/descendant limits. However, sometimes, when evaluating a
replacement, we apply an RBF Carve Out,
literally increasing the limit by 1, to avoid double-counting the replacement and to-be-replaced transactions. Yuck!
When evaluating a package, the precise ancestor/descendant counts of each package transaction and connected mempool
transactions are not calculated before submission. Instead, the aggregate package count and
size are used. This
approximation avoids complexity, but can result in overestimation for certain topologies.
These examples also illustrate current accuracy problems with enforcing topology limits.
Design and Usage with TxGraph
PR #31363
adds a TxGraph class, used by CTxMemPool to encapsulate the transaction graph. The
PR description provides an
overview of its interface and implementation. TxGraph supports "main" and "staged" changes:
transaction additions and removals can be staged and later discarded or applied. Additionally,
removals and additions are applied lazily, allowing them to be batched.
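To illustrate the staging concept (not the real TxGraph API; `ToyGraph` and its methods are a hypothetical, much-simplified analogy), changes can be buffered in a staged copy that is later committed to main or discarded:

```cpp
// Toy sketch of main/staged change sets: additions go into a staging copy
// that can be committed to main or discarded. The real TxGraph is far more
// sophisticated (lazy batching, per-cluster state, conflict resolution).
#include <cassert>
#include <set>
#include <string>

class ToyGraph {
    std::set<std::string> m_main;
    std::set<std::string> m_staged;
    bool m_staging{false};

public:
    void StartStaging() { m_staging = true; m_staged = m_main; }
    void Add(const std::string& tx) { (m_staging ? m_staged : m_main).insert(tx); }
    void CommitStaging() { m_main = m_staged; m_staging = false; }
    void AbortStaging() { m_staged.clear(); m_staging = false; }
    // Queries observe staged changes while a staging session is active.
    bool Exists(const std::string& tx) const {
        return (m_staging ? m_staged : m_main).count(tx) > 0;
    }
};
```

This is the pattern mempool validation wants for replacements: stage the new transactions, inspect the result, and commit only if it is an improvement and within limits.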
You can view the full cluster mempool implementation PR to see the usage of TxGraph
and how these new functions are used in the surrounding logic and contrast them with the existing code:
This commit
makes CTxMemPoolEntry derive from TxGraph::Ref.
Mempool validation stages the addition of the new transaction(s) to a changeset,
which stages changes to the mempool TxGraph. To check cluster limits, it just
queries
whether it is oversized. It no longer needs to know what the ancestor/descendant limits are or apply carveouts in a
replacement.
RBF incentive compatibility simply entails
asking
the TxGraph if the staged subgraph's feerate diagram improves upon the original feerate diagram.
PrioritiseTransaction now
just calls TxGraph::SetTransactionFee() and expects it to handle everything from there.
Before,
it needed to iterate through all ancestors and descendants to update their cached fees.
Additionally, no cluster calculation or linearization is performed until the graph is actually queried, e.g. for oversizedness or a feerate diagram.
Taking a look at PR #28676 can help illustrate some of the benefits of having TxGraph, including:
The mechanical details of updating the dependency graph and its impact on incentive compatibility are abstracted away
from higher level mempool logic.
CTxMemPool's code is less cluttered with graph-wrangling logic. Lots of code is removed, and much of the
remaining ancestor/descendant tracking can be cleaned up in the future.
TxGraph doesn't know that it represents a mempool. It doesn't know what a transaction is, and
doesn't care whether elements are being added and removed for replacements or block (dis)connection. It has a
smaller scope and well-specified behavior.
We now have the ability to reason accurately about the addition of multiple transactions and removal of their
collective replacements.
Batched updates and linearization improve the overall performance of mempool operations.
Deferred linearization can also allow slower linearization improvements to be applied later, e.g. in the
background, giving us the benefits of optimal linearizations without sacrificing performance.
Implementation
The rest of the notes are about reviewing the PR itself. Advice on review approach:
The author has written recommendations for reviewing the code in this comment.
The PR is largely split into feature and optimization commits. The features are introduced iteratively, so it
may be easiest to review the PR in chunks of commits for each feature if the overall changes seem intimidating.
Another approach is to first read through the overall changes to TxGraph's and TxGraphImpl's
respective interfaces, and then review implementation. You could also review the interface, then
the fuzzer implementation, then the real implementation.
These notes are split by feature to make it easier to review the PR iteratively.
A TxGraph has a vector of Entrys (where each entry is a single transaction), and a vector of Clusters (which
represent clusters as defined earlier). Clients can AddTransaction, RemoveTransaction, AddDependency, and
SetTransactionFee. When a transaction is added, it starts in a cluster of its own. When AddDependency is called for
two transactions that are not already in the same cluster, their clusters are merged.
It also has queues of "work" to do, which may or may not need to be completed in order to perform certain tasks or answer
external queries. Some tasks have other tasks as prerequisites.
This is the order in which things have to be resolved:
ApplyRemovals: Apply all of the queued removals.
SplitAll: Split up clusters if they contain components that are no longer connected to each other.
GroupClusters: Calculate what clusters would need to be merged if dependencies are applied. However, these changes are not yet applied.
ApplyDependencies: Apply new dependencies and merge clusters as computed in the previous step.
Relinearize: Relinearize clusters.
Much of the work is done lazily. For example, calling RemoveTransaction does not actually mean its entry is deleted, just that
the index has been added to m_to_remove. However, TxGraph cannot answer a call to Exists without first calling ApplyRemovals.
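The lazy-removal pattern can be sketched as follows (a simplified, hypothetical `LazySet`, mirroring the m_to_remove idea rather than the real implementation). It also shows why a query like Exists cannot be const: answering it may first flush the pending removal queue, mutating internal state.

```cpp
// Illustrative sketch of lazy removal: Remove only queues an index;
// queries flush the queue before answering, so they mutate state.
#include <cassert>
#include <set>
#include <vector>

class LazySet {
    std::set<int> m_entries;
    std::vector<int> m_to_remove;  // queued removals, applied in batch

    void ApplyRemovals() {
        for (int i : m_to_remove) m_entries.erase(i);
        m_to_remove.clear();
    }

public:
    void Add(int i) { m_entries.insert(i); }
    void Remove(int i) { m_to_remove.push_back(i); }  // cheap, deferred
    // Not const: must apply queued removals before answering.
    bool Exists(int i) {
        ApplyRemovals();
        return m_entries.count(i) > 0;
    }
};
```

Batching removals this way lets many cheap Remove calls share one pass of expensive cleanup (in TxGraph: cluster splitting and relinearization) instead of paying that cost per call.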
A TxGraph is oversized if it has at least one cluster that violates the cluster count or size limits.
A TxGraph only knows if it is oversized after it has completed steps 1-3.
When a TxGraph is oversized, mutation functions (like RemoveTransaction and AddDependency) and
GetIndividualFeerate can still be called, but inspection functions like GetChunkFeerate are unavailable because we won't
merge or linearize clusters that are oversized. (GetChunkFeerate becomes GetMainChunkFeerate in a later commit).
Additionally, step 4 will not proceed if the TxGraph is oversized; step 5 is thus also impossible.
The GroupClusters function implements union-find, a
favorite amongst software engineering interviewers.
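For readers unfamiliar with the structure, here is a minimal union-find sketch (illustrative, not the PR's implementation, which adds more machinery). Each element points to a parent; following parents reaches a representative, and merging two groups just links their representatives:

```cpp
// Minimal union-find (disjoint-set) sketch: the structure GroupClusters
// uses to compute which clusters would merge once dependencies are applied.
#include <cassert>
#include <numeric>
#include <vector>

class UnionFind {
    std::vector<int> m_parent;  // m_parent[i] == i means i is a representative

public:
    explicit UnionFind(int n) : m_parent(n) {
        std::iota(m_parent.begin(), m_parent.end(), 0);  // everyone alone
    }
    // Find the representative of i's group, with path halving.
    int Find(int i) {
        while (m_parent[i] != i) {
            m_parent[i] = m_parent[m_parent[i]];
            i = m_parent[i];
        }
        return i;
    }
    // Merge the groups containing a and b.
    void Union(int a, int b) { m_parent[Find(a)] = Find(b); }
};
```

In GroupClusters' terms: each cluster starts as its own group, each new dependency is a Union of the two clusters it connects, and the resulting groups are the clusters that would exist after ApplyDependencies.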
This feature allows users of TxGraph to stage changes and then apply or discard them, much like git.
Many of TxGraph's members are grouped together into a ClusterSet. This refactor is done so that a TxGraph can have multiple subgraphs or levels represented as
ClusterSets: "main" (level 0) and "staged" (level 1).
By default, mutations like AddTransaction, RemoveTransaction, AddDependency are applied directly to main, but
clients of TxGraph can create a staging session:
To StartStaging,
a new subgraph is added containing copies of information from the main graph. It starts with no clusters and no
transactions.
To CommitStaging, everything is copied from staged to main after finding conflicting clusters and deleting them from
main. Conflicting clusters include those belonging to all staged removals and any transactions the subgraphs have in common.
Then, clusters, work queues, and statistics are copied from staged to main.
AbortStaging largely consists of simply deleting the staged subgraph.
There are various functions added to TxGraph to obtain specialized information.
It may be useful to look at the code surrounding mempool to understand why:
CountDistinctClusters and GetMainStagingDiagram are used in RBF logic?
GetBlockBuilder is used in block assembly?
GetWorstMainChunk is used in mempool eviction?
Trim is used in reorgs? It instructs TxGraph to delete transactions until it is no longer oversized,
and allows it to decide which transactions to evict.
What is the mempool "graph" and to what extent does it exist in the mempool code on master?
What are the benefits of having a TxGraph, in your own words? Can you think of drawbacks?
Can TxGraph help to more accurately calculate (and thus enforce) topology limits, or is it just a refactor?
Basic TxGraph, Lazy and Oversized
What is the difference between TxGraph and Cluster?
What does it mean for a TxGraph to be oversized? Is that the same as the mempool being full?
If a TxGraph is oversized, which functions can still be called, and which ones canât?
Under what circumstances could GetIndividualFeerate and GetChunkFeerate return an empty FeeFrac?
Is it possible that one returns an empty FeeFrac and the other doesnât? How?
(Note we are talking about pre-staging support code).
What are the 4 mutation functions? Which ones necessitate relinearization?
After calling TxGraph::RemoveTransaction, has the transaction been removed from m_entries? If not, what happens
instead, and when are transactions removed from m_entries?
Why canât ApplyDependencies continue if the TxGraph is oversized?
Why isnât Exists a const function like one would expect?
What is the benefit of processing removals, dependencies, merges, and linearizations lazily?
Why is AddTransaction annotated with [[nodiscard]]?
For the QualityLevels NEEDS_RELINEARIZE, ACCEPTABLE, and OPTIMAL, what can / canât {TxGraph internally, a
client of TxGraph} do with a cluster that is linearized to that quality?
Staging Support
What is the purpose of TxGraph supporting the staging of update operations? In what situation
would we use it, and do you think the benefits warrant the extra code complexity?
What are the TxGraph functions analogous to git add, git commit, and git stash? In the
other direction, what is analogous to AddTransaction, RemoveTransaction, and AddDependency?
What does PullIn do, and when is it necessary? How many times can the while (level < to_level) loop run?
In the commit to "cache oversizedness," why is it ok that the m_oversized value for staged is set to that of main? Why is calling SplitAll() necessary?
How many levels are possible?
Which of the four combinations are possible {main, staged} {oversized, not oversized}?
Specialized Functionality
Why are these specialized functions (CountDistinctClusters,
GetMainStagingDiagram, GetBlockBuilder, GetWorstMainChunk, Trim) necessary? Can you think
of a slimmer interface to serve these purposes?
What are all of the places where the mempool may need to evict transactions in order to stay
within cluster limits?
Should TxGraph be allowed to decide which transactions to evict when it is oversized? Why or
why not - does it depend on the situation?
<sipa> marcofleon: well, arguably txgraph doesn't change that... there is still just the set of ancestors and descendants for each transaction there, but it's better encapsulated, and uses more efficient data structures
<marcofleon> Just more structured and better encapsulated as sipa said previously. Makes it easier to reason about. Also the atomic operations with staging seems like a good approach. I'm not sure if we do something like that today?
<stringintech> First thing I noticed when looking at the code was how relatively easy it was to understand things. Even not knowing the previous logic.
<abubakarsadiq> many benefits, we now group connected transactions into clusters, we maintain the list of this clusters, which make it easy for getting ancestors of transaction, linearization, maintaining cluster size, mining and eviction
<sipa> as for comparison with *today*, i think the answer is primarily that it brings in all of the cluster mempool goodness: chunk feerates, linearizations, the ability to observe a full ordering of transactions in the mempool
<sipa> as for comparison with a theoretical alternative where cluster mempool was brought it, without an intermediary layer that models the transaction graph: encapsulation, and the fact that it's fully specified (almost...) which significantly increases the power of tests for it
<glozow> I really like that it abstracts away all the topological details so e.g. validation isn't trying to figure out how to not double-count replacements and combine package/mempool ancestors etc
<marcofleon> 0 drawbacks for sure... no i guess some overhead with all the additional data structures? but I think the encapsulation and the fact that its much easier to test makes it worth it
<stickies-v> wrt drawbacks: I suppose encapsulation sometimes doesn't allow you to make certain (hacky) performance optimizations without changing the interface?
<sipa> jasan: in the PR description i state that TxGraph doesn't know about prioritization; that doesn't mean it's incompatible with it - the point is just that it doesn't *care* why the mempool things a particular transaction has a particular fee, but it's perfectly possible for the mempool to apply prioritization before handing fees to txgraph (and will)
<sipa> monlovesmango: there is indeed some overhead, though it's tiny (and probably well-compensated by using more efficient data structures internally)... the fact that transaction references always need to be translated from/to TxGraph::Ref* pointers is a runtime cost throughout the design
<sipa> glozow: no reason why functionality for testing for particular topologies can't be added to txgraph, but the abstraction does mean there is an extra cost
<sipa> schmidty: no, because oversized clusters internally just cannot be _represented_, so if there are batched operations present which would require an oversized Cluster to be created, they just don't get applied, and instead the TxGraph goes into "oversized" state
<sipa> instagibbs: i think the threshold is anything that requires O(n^2) work or more (which includes computing the ancestors/descendants of a transaction)
<glozow> Under what circumstances could GetIndividualFeerate and GetChunkFeerate return an empty FeeFrac? Is it possible that one returns an empty FeeFrac and the other doesn't?
<sipa> monlovesmango: GetChunkFeerate is just *not allowed* to be called when oversized, so talking about its correctness or what it would return isn't really well-defined
<sipa> marcofleon: or rather, if it might fail, you're just not allowed to call anything that needs MakeAcceptable (because that implies you're oversized)
<glozow> the answer I was looking for is essentially: txgraph is oversized but also lazy (wow that's kind of mean to say) and none of the mutation functions synchronously do cluster merging or linearization
<stringintech> Earlier glozow: asked if `main` can be oversized too which became my question too :)) I guess it was not answered or I missed the answer.
<stringintech> As far as I saw in the code (not deep enough though), operations were happening on the upmost clusterset. So if we do not have staging the main could be oversized.
<glozow> stringintech: yes, it depends on usage. Intended usage is to first stage changes and discard them if they result in an oversized txgraph. But you can also apply things directly to main and make it oversized
<sipa> stringintech: in practice (after follow-up PRs to make the mempool code actually use txgraph), staging will be used for rbf evaluations, while main will be used directly from blocks... and a reorg can add transactions back to the mempool, which could temporarily cause main to be oversized
<sipa> marcofleon: sorry, it was a trick question (actually, i just forgot that had changed recently), Entrys are not removed from m_entries until the Ref is destructed