The PR branch HEAD was f87e07c6f at the time of this review club meeting.
represents the node’s knowledge of a block chain, with an (optional)
mempool that is synchronized with that chain. It was introduced in
PR 10279 to eventually be an
exposed “libconsensus”. Initially it was intended to clarify the internal
interfaces within bitcoind.
is the class in validation that manages one or more CChainState objects. It
was introduced in PR 17737
in order to provide unified interface to the multiple chainstates required by
the AssumeUTXO project. Now
that a validation instance can contain multiple chainstates, it is eventually
intended to be the main interface into validation from the rest of the code.
The ChainstateManager::ProcessNewBlock() and
ChainstateManager::ProcessNewBlockHeaders() public functions were added in
PR 18698 as high-level
interfaces methods to validation. The callers (most notably net_processing) do
not need to be aware of validation’s chainstates (or even whether there are
multiple chainstates), and simply pass the block or block headers to be
(often called ATMP) is a free function in the global namespace. Prior to this PR, it was the
interface through which the rest of the code could call into validation to
submit a transaction to the mempool. The calling code would need to provide the
chainstate and mempool as arguments to the function, which breaks the abstraction
that calling code doesn’t need to know about validation’s internals.
This PR adds a new
interface function, similar to ProcessNewBlock() and
ProcessNewBlockHeaders(), which provides a high-level interface method for
submitting transactions to be validated. The caller no longer needs to know
about the chainstate or mempool inside validation.
The PR also removes the responsibility of calling the CTxMemPool::check()
consistency check function from the caller (net_processing) and
moves it to validation.
Which components currently call AcceptToMemoryPool()? List all the places that we’ll call ATMP.
What does CTxMemPool::check() do? Whose responsibility is it to call that function?
What does the bypass_limits argument in ATMP do? In which circumstances is ATMP call with bypass_limits set to true?
One of the commit logs reads:
“This logic is not necessary for memory pool transactions, as
AcceptToMemoryPool already refuses previously-known transaction ids
entirely.” refers to the logic at
which was later removed in commit 450cbb0944cd20a06ce806e6679a1f4c83c50db2.
What was the change that removed that logic?
The last commit is to Always call mempool.check() after processing a new transaction. Does that add any overhead to processing/relaying transactions? Why/why not?
<stickies-v> code outside of validation.cpp should not need to care about chainstate management, so through ChainstateManager::ProcessTransaction those callers can now add transactions without needing a reference to mempool or chaintip
<jnewbery> Kaizen_Kintsugi: AssumeUTXO is a huge project that touches lots of parts of validation and init. This PR wasn't motivated specifically by AssumeUTXO, but I'd argue that clarifying interfaces should help add new features if done right
<stickies-v> a mutex - a bit fuzzy on the details but i think it tries to ensure validation consistency by only allowing a single thread at the same time to modify crucial components like chainstate etc?
<jnewbery> so to summarize, cs_main is the big lock in Bitcoin Core. It _should_ be predominantly to protect chainstate and validation, but there's also a lot of state in net_processing and other places that are guarded by cs_main
<jnewbery> follow up question. Out of (1), (2), (3), (4) and (4b), which are calls from inside validation, and which are from external client code (where by client I mean other components within bitcoin core, not external programs)
<jnewbery> stickies-v: yes, the same logic is involved in package validation, although that doesn't actually use the ATMP function. Package validation would be from the same call sites as (2) and (4) when it gets merged
<stickies-v> CTxMemPool::check() verifies that all the transactions in the mempool are consistent, e.g. they don't spend the same inputs twice. Since it's by default not used on mainnet, I think this is mostly a dev/debug feature?
<jnewbery> stickies-v's point about it being off by default is important. If that wasn't the case, then this PR might be a performance pessimization. But since check() is generally only used for testing, it's not a problem.
<jnewbery> lsilva_: "connecting" a block is applying its transactions to the UTXO set (removing outputs that are spent by txs in the block and adding new outputs that are created) and updating the chainstate
<stickies-v> hmm so during reorg we ignore the default mempool size of 300MB but instead use a (default) MAX_DISCONNECTED_TX_POOL_SIZE of 20MB of disconnected transactions? Then doesn't it make more sense to just keep the 300MB mempool limit?
<jnewbery> "disconnecting" a block is the reverse: removing the UTXOs that were created in the block and re-adding those that were spent in the block (which were stored in the 'undo data' for that block on disk)
<jnewbery> I think it may be due to ancestor/descendant chains. Since the transactions can only be added sequentially, we can't look at the ancestor/descendant feerates when trying to add the transactions back, so it's perhaps better to add everything, and then limit the mempool size if necessary
<glozow> well if you were at 290mb and disconnectpool is 20mb, your only choices are (1) cut down to 280mb and then accept or (2) allow going to 310mb and then cut right? the former means you might end up with lower fee transactions
<michaelfolkson> sipa: What is the UTXO set implied by the mempool? There's no predictive element of what will be in the next mined block is there? So the mempool is treated by the codebase as irrelevant to any future changes to the UTXO set?