The PR branch HEAD was b80cd25 at the time of this review club meeting.
Apart from coinbase transactions, every Bitcoin transaction has at least one
parent, another transaction whose outputs are spent as its inputs. We call
the spending transaction the child. Ancestors include parents, their
parents, and so forth. Descendants include children, their children, and so
forth.

Each mempool entry has information on its ancestors and descendants in the
mempool. This speeds up some operations (e.g. selecting which transactions to
include in a block and which to evict for low feerates) and slows down others
(adding and removing transactions).
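As a rough sketch, each entry caches totals like the following (a simplified
model of CTxMemPoolEntry's cached fields; the names here are abbreviations,
not the actual members):

```cpp
#include <cstdint>

// Simplified sketch of the package state cached on each mempool entry,
// modeled on CTxMemPoolEntry (hypothetical field names).
struct MemPoolEntrySketch {
    int64_t vsize;  // this transaction's virtual size
    int64_t fee;    // this transaction's (modified) fee

    // Cached ancestor state, including this transaction itself.
    uint64_t count_with_ancestors;
    int64_t vsize_with_ancestors;
    int64_t fees_with_ancestors;

    // Cached descendant state, including this transaction itself.
    uint64_t count_with_descendants;
    int64_t vsize_with_descendants;
    int64_t fees_with_descendants;
};
```

Block assembly can read ancestor feerates straight from these cached totals,
but every addition or removal has to walk the entry's relatives to keep the
totals consistent, which is the cost the limits below are bounding.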
We have ancestor/descendant limits as part of our mempool policy with the goal
of constraining the maximum computation needed for such operations. For
example, if seeing a conflicting transaction in a block could cause us to
remove a transaction with 10,000 descendants in the mempool, our mempool could
become a sink of computational resources (which could also be triggered
intentionally by an attacker) rather than a useful cache of unconfirmed
transactions.

The current ancestor/descendant default limits are: no transaction can have
more than 25 or 101KvB of ancestors or descendants, including itself. The
mempool member function CalculateMemPoolAncestors()
calculates and enforces limits for a transaction that is in or outside the
mempool.
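These defaults correspond to the following constants (from src/validation.h
around the time of this PR; the size limits are in kilo-virtual-bytes):

```cpp
// Default ancestor/descendant policy limits (src/validation.h).
static const unsigned int DEFAULT_ANCESTOR_LIMIT = 25;         // count, incl. self
static const unsigned int DEFAULT_ANCESTOR_SIZE_LIMIT = 101;   // KvB
static const unsigned int DEFAULT_DESCENDANT_LIMIT = 25;       // count, incl. self
static const unsigned int DEFAULT_DESCENDANT_SIZE_LIMIT = 101; // KvB
```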
PR #21800 defines and
implements an ancestor/descendant limit for packages (groups of transactions
being validated together for submission to the mempool). The following figure
illustrates some examples of why we cannot simply check
CalculateMemPoolAncestors() on the transactions individually.
In Scenario [A], calling CalculateMemPoolAncestors() on transaction A
will show that it is within limits (24 in-mempool ancestors). Calling
CalculateMemPoolAncestors() on transaction B will also show that it is
within limits (0 in-mempool ancestors).
In Scenario [B], transaction A and B both have 13 in-mempool ancestors,
but they’re the same transactions. If we naively added their ancestor
counts together, we would be grossly overestimating.
In Scenario [H], transaction C’s ancestor limits are only exceeded when
both transactions A and B are considered together; it isn’t sufficient to
only try pair combinations or simply subtract total package sizes from the
limits.
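As a concrete illustration of Scenario [A], here is a toy simulation
(hypothetical model and helper names, not Bitcoin Core code): both
per-transaction checks pass, yet accepting the package would leave
transaction B over the 25-ancestor limit.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

using TxId = int;
// Toy mempool: txid -> set of parent txids.
std::map<TxId, std::set<TxId>> g_mempool;

// Collect a transaction's in-mempool ancestors, given its parents.
std::set<TxId> InMempoolAncestors(const std::set<TxId>& parents)
{
    std::set<TxId> result;
    std::vector<TxId> work(parents.begin(), parents.end());
    while (!work.empty()) {
        TxId tx = work.back();
        work.pop_back();
        // Skip parents that are not in the mempool (e.g. other package
        // members) and ancestors we have already visited.
        if (g_mempool.count(tx) == 0 || !result.insert(tx).second) continue;
        for (TxId parent : g_mempool.at(tx)) work.push_back(parent);
    }
    return result;
}

int main()
{
    // A chain of 24 unconfirmed transactions already in the mempool:
    // 0 <- 1 <- ... <- 23.
    g_mempool[0] = {};
    for (TxId i = 1; i < 24; ++i) g_mempool[i] = {i - 1};

    // Package: A (txid 100) spends tx 23; B (txid 101) spends A.
    std::set<TxId> parents_a{23};
    std::set<TxId> parents_b{100};  // A is not in the mempool yet

    // Checked individually, both pass the 25-including-self ancestor limit.
    assert(InMempoolAncestors(parents_a).size() + 1 == 25);  // A: 24 + itself
    assert(InMempoolAncestors(parents_b).size() + 1 == 1);   // B: 0 + itself

    // But if both are accepted, B ends up with 24 + A = 25 ancestors,
    // i.e. 26 transactions including itself, exceeding the limit.
    size_t b_with_ancestors = InMempoolAncestors(parents_a).size() + 2;
    assert(b_with_ancestors == 26);
}
```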
1. Why do we have ancestor/descendant limits in the mempool? Why do we use 25
transactions and 101K virtual bytes as our limit?
2a. Is it possible for a block to have a chain of 100 transactions (assuming
all other consensus rules are met)?
2b. Imagine a “family” to be any group of transactions connected by
parent-child relationships, including descendants-of-ancestors
(e.g. “siblings,” two transactions sharing a parent in the group) and
ancestors-of-descendants (e.g. “co-parents,” two transactions sharing a
child in the group). What is the largest possible family that can exist
in the mempool, assuming the default limits are applied? (Hint: it’s not 25.)
3. In our package validation code,
we run CalculateMemPoolAncestors() on each transaction individually during
PreChecks(). Why is this insufficient?
4. What is the package ancestor/descendant policy proposed by this PR? What are
your thoughts? Do you have any alternative proposals?
5. How might we test this algorithm? What are some edge cases we might want to consider?
6. This PR turns the original CalculateMemPoolAncestors() operating on a
CTxMemPoolEntry into a wrapper function, constructing a vector from the
singular entry. What could happen if the wrapper function used a copy of
the entry instead of the entry itself? (Hint: what happens if the
precondition is not met?)
7. What is a CTxMemPoolEntryRef?
Why don’t we just operate on a std::vector<CTxMemPoolEntry> instead?
<FelixWeis> if you know you have to create these 25 outputs, you can do it in 1 batched payment. electrum-wallet even supports "amend by fee" so if you have already an unconfirmed tx in the mempool, you can replace that one with the additional output
<harding> For the vsize limit, there's also a fundamental limit at the size of a block's transaction space. If the last descendant of a package is further away than that from its oldest unconfirmed ancestor, it can't contribute fee to that ancestor because they can't be included in the same block.
<glozow> Imagine a "family" to be any group of transactions connected by parent-child relationships, including descendants-of-ancestors and ancestors-of-descendants. What is the largest possible family that can exist in the mempool, assuming the default limits are applied?
<harding> lightlike: the miner of block n doesn't know that they'll mine block n+1, so by including less profitable ancestor transactions in block n, their competitor mining block n+1 profits more. I don't think that's economically rational.
<jnewbery> lightlike: the miner is not incentivized to mine the first batch of transactions, since it doesn't expect that it'll mine the next block (unless it has a majority of hash rate on the network)
<dergoegge> CalculateMemPoolAncestors() only takes txs into account that are already in the mempool, a tx in a package that has all its parents within the package might exceed the limits but CalculateMemPoolAncestors() would be fine with it because it has 0 ancestors in the mempool (example A)
<harding> lightlike: beyond having a mempool whose size is greater by default than the size of a block, we don't include transactions in the mempool that can't be included in the next block (e.g. timelocked transactions). This makes the transaction selection algorithm simple. A descendant transaction that can't be included in the next block because it had >1e6 vbytes of ancestors would violate that rule.
<harding> It's the same policy enforced by Bitcoin Core for standard relay and mempool inclusion, but applied using a simple one-line heuristic that each transaction in the package is counted as having as many more ancestors and descendants as there are transactions in the package being evaluated.
<glozow> raj_: yes! implementation-wise, we're putting everyone's ancestors in the same pot while calculating, and using package_count instead of 1 and total_package_size instead of 1 transaction's size
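A minimal sketch of that pooled calculation, assuming a toy package
representation and a hypothetical CheckPackageAncestorLimits() helper (not
the PR's actual code, which also enforces descendant limits):

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Toy model of a package member: its vsize plus its in-mempool ancestors
// (txid -> ancestor vsize). Hypothetical types.
struct PackageTx {
    int64_t vsize;
    std::map<int, int64_t> mempool_ancestors;
};

// Conservative package check: union everyone's ancestors into the same pot,
// then count the whole package (package count instead of 1, total package
// vsize instead of one transaction's vsize) against the ancestor limits.
bool CheckPackageAncestorLimits(const std::vector<PackageTx>& package,
                                size_t limit_count = 25,
                                int64_t limit_vsize = 101'000)
{
    int64_t package_vsize = 0;
    std::map<int, int64_t> pooled;  // all members' ancestors, deduplicated
    for (const PackageTx& tx : package) {
        package_vsize += tx.vsize;
        pooled.insert(tx.mempool_ancestors.begin(), tx.mempool_ancestors.end());
    }
    if (package.size() + pooled.size() > limit_count) return false;

    int64_t vsize_with_ancestors = package_vsize;
    for (const auto& [txid, vsize] : pooled) vsize_with_ancestors += vsize;
    return vsize_with_ancestors <= limit_vsize;
}

int main()
{
    // Scenario [B]: two package transactions sharing the same 13 in-mempool
    // ancestors. The pot counts the shared ancestors once: 13 + 2 <= 25.
    std::map<int, int64_t> shared_ancestors;
    for (int i = 0; i < 13; ++i) shared_ancestors[i] = 100;
    std::vector<PackageTx> package{{100, shared_ancestors},
                                   {100, shared_ancestors}};
    return CheckPackageAncestorLimits(package) ? 0 : 1;
}
```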
<harding> One alternative would be to do the full computation on the package just like we would do if each tx was submitted individually, but for the sake of limiting the CPU/memory needed for an eventual P2P package relay, severely restrict the number of transactions allowed in a package from 25 currently to, say, 5.
<harding> Another alternative, also aimed at allowing P2P package relay, would be allowing the submitter to include a proof of the longest transaction chain in the package. I *think* it would be possible to take the txids and input outpoints of all the transactions in the package, put them into a tree-like structure, and then to create compact proofs about the shape of that tree. That's a ridiculously heavyweight solution if we expect most
<murch> I think it would be hard to set a meaningful limit that doesn't restrict naturally occurring clusters. In turn, a relatively low limit might be easy to leverage to do things akin to transaction pinning or restricting other parties from bumping their txs
<glozow> the only reason you would put transactions in a package instead of submitting them individually is if some package-specific policy will get you something that individual transactions won't. imo the only policy that you'd want is for a tx to be considered for its descendant feerate (or more generally, i guess it's feerate with sponsors in the package) rather than base feerate
<harding> Testing is a bit tricky because it is more restrictive than the actual relay policy, so we can't be lazy and just spin up a synced node with no mempool and just feed it random packages from another node with a current mempool.
<glozow> michaelfolkson: so the hope is to be able to create random packages, submit them to mempool, and make sure that we aren't exceeding limits (which would indicate that our algorithm underestimated). but i've had a hard time getting it to create valid random packages
<glozow> Ok next question: This PR turns the original `CalculateMemPoolAncestors()` operating on a `CTxMemPoolEntry` into a wrapper function, constructing a vector from the singular entry. What could happen if the wrapper function used a copy of the entry instead of the entry itself?
<glozow> dergoegge: yes, bonus points to you for being well prepared ^_^ the container's `iterator_to` function requires it to be inside the container, so we'd get a bad value for the copy since it's not there
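A self-contained illustration of that precondition, using a toy
boost::multi_index container rather than the mempool itself: iterator_to()
is only valid for a reference to an element that actually lives inside the
container, so handing it a copy is undefined behavior.

```cpp
#include <cassert>
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/identity.hpp>
#include <boost/multi_index/ordered_index.hpp>

namespace bmi = boost::multi_index;
using Pool = boost::multi_index_container<
    int, bmi::indexed_by<bmi::ordered_unique<bmi::identity<int>>>>;

int main()
{
    Pool pool;
    auto [it, inserted] = pool.insert(42);
    assert(inserted);

    // OK: iterator_to() is given a reference to an element that is
    // actually stored inside the container.
    assert(pool.iterator_to(*it) == it);

    int copy = *it;
    // Precondition violation (undefined behavior): `copy` lives outside
    // the container, so there is no valid iterator to return.
    // auto bad = pool.iterator_to(copy);  // don't do this
}
```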