The BlockManager::m_block_index is a map which is used to keep track of which block headers exist, how they interconnect, and where on disk the block data (if any) is stored 1. Generally speaking, it is updated whenever a new valid header with sufficient Proof-of-Work (PoW) is received. It contains entries for blocks in the current most-PoW chain, alternative chains, and even invalid blocks.
The CBlockIndex objects in this map can be considered the nodes in a tree shaped structure with the genesis block at its root. By definition, each block can only point to a single predecessor, but multiple blocks can point to the same predecessor. Of course, in a single chain, this tree is pruned so that each block will never have more than one block pointing to it. The entire tree structure is kept to enable efficiently handling chain reorgs.
Block validation is a process that consists of multiple steps, gradually bumping up the CBlockIndex’s nStatus from its default-initialized BLOCK_VALID_UNKNOWN to BLOCK_VALID_SCRIPTS. The validation flow which is relevant to this PR can be summarised as:
When the header is successfully validated with AcceptBlockHeader(), it is added to m_block_index, and generally never removed from it anymore, even if the block fails a subsequent validation step.
When the partial block validation in AcceptBlock() succeeds, the block is persisted to disk. This validation is partial, because full validation requires the predecessor block’s data, which the node may not have received yet.
As soon as a block header becomes eligible to be part of the fully validated current chain (i.e. it is part of the most-PoW chain, and all data for the block’s predecessors is available), ActivateBestChain() will attempt to connect the block. If this final validation step succeeds, the block is connected and its nStatus is raised to the ultimate BLOCK_VALID_SCRIPTS.
CBlockIndex holds a pprev pointer to its predecessor which makes it trivial to iterate backwards (towards the genesis block) over the block tree. Iterating forwards (towards the chain tip) is not possible, because a block can have multiple successors pointing to it. The lack of forward iteration makes operations such as finding a new best block header expensive, because we have to iterate over the entire m_block_index block tree again.
ChainstateManager::m_best_header and ActiveChain.Tip() are both CBlockIndex pointers. During normal operation, they’ll often converge to the same block with the highest PoW. However, it is crucial to distinguish their meaning and use. ActiveChain.Tip() returns the CBlockIndex* of the fully validated and connected block that is the tip of the currently active chain. m_best_header points to the header with the most PoW that the node knows about. It represents a node’s view of what the most-PoW chain might look like, but it is possible that validation won’t allow it to progress there. As such, m_best_header is just a hint. The distinction is especially visible during IBD, when ActiveChain().Tip() progresses towards m_best_header, but can only be equal to it when IBD is finished.
Which purpose(s) does ChainstateManager::m_best_header serve?
Prior to this PR, which of these statements are true, if any?
A) a CBlockIndex with an INVALID predecessor will ALWAYS have a BLOCK_FAILED_CHILDnStatus
B) a CBlockIndex with only VALID predecessors will NEVER have a BLOCK_FAILED_CHILDnStatus
One of the goals of this PR is to ensure m_best_header, and the nStatus of successors of an invalid block are always correctly set. Which functions are directly responsible for updating these values?
Would we still need the cand_invalid_descendants cache if we were able to iterate forwards (i.e. away from the genesis block) over the block tree? What would be the pros and cons of such an approach, compared to the one taken in this PR?
<stringintech> I looked at the changes but found out about the notes not soon enough. (Not sure I missed them or they were not uploaded when I looked at the PR)
<stickies-v> strat___ already left a nice ACK on the PR, but for everyone else able to review: would you give it a Concept ACK, Approach ACK, Tested ACK, or NACK? what was your review approach?
<marcofleon> Concept ACK I'd say. I think makes sense to have these values be set correctly instead of estimated. The PR descriptions makes a good case for it
<stickies-v> i think we can definitely talk more about rationale and general understanding in this PR, i too have quite a bit of outstanding questions so hopefully we can all learn together
<stringintech> Concept ACK. Almost everything was new to me though so I had to spend some time to grasp the context a bit. I also wrote some small integration test (python test framework) to understand the behaviour in some cases.
<stickies-v> oh that's a great approach stringintech, perhaps they're worth sharing on the PR too for other people's understanding and/or adding them to the codebase?
<catnip> @xgclass introduced to manage one or more CChainState objects, which track the state of the blockchain (e.g., the UTXO set and the active chain of blocks
<stringintech> :stickies-v I doubt they could be added to the codebase (not clean/comprehensive enough) but I will definitely check if I can share them as learning material on the PR page.
<lightlike> I think of it as the header for the block we would like our chain to move to (by downloading missing block data, doing validation etc.) - which may work out or not.
<stickies-v> lightlike: yeah, kinda like a north start (that might turn out to have died millions of years ago when the light/block data finally reaches us)
<stickies-v> for example, it also seems like it's used as a proxy for current time, e.g. to determine if a block is old enough to qualify as historic (which in turn affects how much we'll serve it to peers etc)
<lightlike> monlovesmango: You could imagine that (and I think it could've been implemented that way), but I if I remember correctly I think It's actually not being used during IBD to determine which blocks to download next - but it is for related things such as whether to apply -assumevalid (i.e. skip script validation) or not.
<stickies-v> as lightlike pointed out on a previous PR, bluematt also listed out most use cases on a ~6yo PR: https://github.com/bitcoin/bitcoin/pull/16974 - that's pretty interesting to read!
<stickies-v> lightlike: it is used when we receive unconnecting headers (in `HandleUnconnectingHeaders()`), to determine which headers to request from our peer, but I'm not sure if that function is actually reached in IBD? didn't look into that
<stickies-v> I'm going to launch the next question already, but as always, feel free to continue the discussion on previous topics if there's anything else you find
<marcofleon> I wanna say false for A but not entirely sure. I think in AcceptBlockHeader we do walk back along the block tree and mark blocks with `BLOCK_FAILED_CHILD`?
<strat___> A is false - before this PR - only in the active chain we were supposed to mark children as BLOCK_FAILED_CHILD (there is an incorrect traversal
<strat___> which #31835 fixes). other blocks in the block index which descend from BLOCK_FAILED_VALID were not marked as invalid. and this PR fixes that.
<stringintech> :stickies-v Oh. I have not seen "reconsiderblock". But perhaps if we reconsider a parent block (and it becomes valid) its descendant could become valid too.
<lightlike> "invalidateblock" and "reconsiderblock" make these kind of changes more complicated, because they are kind of artificial: Normally, a block that was once completely validated will be valid forever, "invalidateblock" changes that assumption it's basically the user overruling: "this block is invalid because I say so".
<stickies-v> monlovesmango: what about this scenario: B is a child of A, and B is in the active chain when A gets invalidated. Then, the node receives a B' (with higher cumulative work than B), also building on A. Then we make A valid again with `reconsiderblock()`. Does it still hold?
<stickies-v> strat___: lightlike: well, actually, I think B holds true even with `reconsiderblock` (and the scenario I outlined), because `reconsiderblock` does nicely walk all the descendants of the reconsider block
<lightlike> A used to be true in even more cases, my previous PR from last summer https://github.com/bitcoin/bitcoin/pull/30666 already fixed some of them (but not all) - so this PR fixes the remaining ones so that we can add the assumptions to CheckBlockIndex().
<monlovesmango> stickies-v: Interesting scenario. if B is immediately invalidated then maybe (but i think this PR is what would guarantee B is immediately invalidated). B' should be reconsidered when reconsidering A..?
<stringintech> :stickies-v Regarding your the last scenario, you mean that invalidated blocks through invalidate RPC may not remain invalid forever? (makes sense to me just wanted to make sure)
<stickies-v> monlovesmango: yes indeed, and B` would be reconsidered! but the "gotcha" here could be that B is now no longer in the most-pow chain, so it wouldn't get reconnected, so we need to check that there's some other process to update its validity flags
<stickies-v> yes sorry you're right we only consider ancestors and descendants, i meant that we traverse the entire block index to find them but that's not super relevant
<monlovesmango> yes I suppose so! just to check my understanding, prior to this PR would B get stuck with a valid status (since child blocks aren't invalidated immediately)?
<stickies-v> 3. One of the goals of this PR is to ensure `m_best_header`, and the `nStatus` of successors of an invalid block are always correctly set. Which functions are directly responsible for updating these values?
<stringintech> SetBlockFailureFlags (for nStatus), RecalculateBestHeader (for best header), and InvalidateBlock (both)? (if we are considering those who change the values directly)
<lightlike> stringintech: these are the non-normal cases for invalid block edge cases. I'd say the most important spot where m_best_header is set is AddToBlockIndex() in blockmanager when a new block index is received.
<stickies-v> catnip: there are no guarantees that m_best_header is a valid block, it's a best effort attempt, but yes you can traverse back to genesis from there
<stickies-v> marcofleon: stringintech: yeah , my understanding for Q5 is that it's an optimization to avoid unnecessarily traversing `m_block_index` - lightlike do you have anything to add there? we talked about this offline a bit
<lightlike> stickies-v: yes, exactly. The idea is to traverse m_block_index only once, not multiple times (imagine someone invalidateblock for a lot of blocks, which is already quite slow)
<stickies-v> 8. Would we still need the `cand_invalid_descendants` cache if we were able to iterate forwards (i.e. away from the genesis block) over the block tree? What would be the pros and cons of such an approach, compared to the one taken in this PR?
<stickies-v> strat___: yes!! thank you for making a diagram, that's so much more helpful. So my concern (which turned out to be unfounded) was that because B was no longer in active chain, it wouldn't get connected again, and so its status might not get updated (but `reconnectblock` does indeed handle it well)
<monlovesmango> I don't think so. the cons would be there would be a whole new data structure to maintain on CBlockIndex (and it would be more complex than the pprev pointer as you can multiple decendants)
<stickies-v> monlovesmango: it would just be a simpler container of pointers, like a `std::vector<CBlockIndex*>`. and since we never remove items from `m_block_index`, we'd only have to add a pointer to a predecessor once, and then never look at it again
<stickies-v> catnip: yes! iterating backwards is trivial, because each block has 1 parent. But iterating forward, even if we explicitly link the block indexes, would still be non-trivial because we have to cover all branches
<stickies-v> alright, went a bit over time already, thank you everyone for your participation on today's review club, and lightlike for making variables less wrong, yay!