Improve Indices on pruned nodes via prune blockers (utxo db and indexes, rpc/rest/zmq)

https://github.com/bitcoin/bitcoin/pull/21726

Host: mzumsande  -  PR author: fjahr

The PR branch HEAD was 2faa5b5179 at the time of this review club meeting.

Notes

  • Indexes are optional modules that scan each block of the chain and store index-specific additional data in a separate LevelDB database. There are currently three indexes in bitcoin core with very different use cases that are all using a common framework (index/base).

  • All indexes have a best block which is the block height up to which they are synced.

  • In principle, indexes can also work with a node that is running in pruned mode - as long as the data of each block was indexed at a point in time when the block was still available. After a block has been processed by the index, this block could be pruned (the indexed data itself is never pruned).

  • However, when pruning with an index enabled, extra care must be taken:

    • We must ensure no blocks are pruned that still need to be indexed.

    • We must take into account the possibility of a reorg. In particular, coinstatsindex has a running hash for the UTXO set, so when a block is disconnected from the tip, this block’s data needs to be available in order to be able to adjust the running hash.

    • Being optional, indexes can be switched off (and switched back on again) at any point. If it is impossible to fully sync an index because of missing block data, the user must be notified.

  • PR #15946 enabled pruning for blockfilterindex - however, a negative side-effect of this was that it introduced a circular dependency between validation and the index code.

  • This PR introduces a new method for pruning, prune locks. It removes the circular dependency and enables pruning for coinstatsindex.

Questions

  1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK?

  2. Can you summarize what indexes currently exist in bitcoin core, and what each of them does?

  3. What is a circular dependency, and why do we want to avoid these when possible?

  4. Please explain in your own words how the prune blockers introduced in commit 527ef44 work.

  5. What is the difference to the old approach (which is removed here)?
    • Is there a cost related to the new approach?
  6. Why do you think a buffer of 10 blocks (PRUNE_LOCK_BUFFER) was introduced?

Meeting Log

  110:00 <lightlike> #startmeeting
  210:00 <larryruane> hi
  310:00 <danielabrozzoni> hi
  410:00 <lightlike> Hi, welcome to the PR Review Club!
  510:01 <theStack> hi
  610:01 <ls55> hi
  710:01 <lightlike> Feel free to say hi if you're here - any first-timerst today?
  810:01 <RobinAdams> First time
  910:01 <a_GucciPoet> Hey everyone
 1010:01 <b10c> hi (lurking-only today)
 1110:02 <lightlike> Welcome RobinAdams!
 1210:02 <lightlike> Today's meeting is on #21726 (Improve Indices on pruned nodes via prune blockers)
 1310:02 <lightlike> notes are at https://bitcoincore.reviews/21726
 1410:02 <svav> Hi
 1510:02 <lightlike> Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK?
 1610:03 <RobinAdams> I have it open now, looking at it
 1710:03 <ls55> Approach ACK. What is the best way to test this PR? Because indexing takes a lot of time.
 1810:03 <danielabrozzoni> Approach ACK, I still need to test 🙂
 1910:04 <theStack> light concept ack -- seems reasonable to allow more indices to be run on pruned nodes and clean up some circular dependencies by the way
 2010:04 <lightlike> ls55: If you test on other networks than mainnet, it doesn't take that much time.
 2110:05 <larryruane> theStack: +1 same for me
 2210:05 <lightlike> ls55: one way is to test with the pyhton framework on regtests (there is a functional test added in this PR)
 2310:05 <lightlike> another possibility would be to test on signet.
 2410:06 <a_GucciPoet> can someone explain circular dependency
 2510:06 <lightlike> a_GucciPoet: We'll come to this soon, it's the second question :)
 2610:07 <a_GucciPoet> ok thanks
 2710:07 <lightlike> Let's start with a general one:
 2810:07 <lightlike> Can you summarize what indexes currently exist in bitcoin core, and what each of them does?
 2910:08 <larryruane> I think they're all nicely organized into the `src/index` directory
 3010:08 <danielabrozzoni> coin stats index - statistics on the utxo set; block filter index - to retrieve BIP157 block filters, hashes and headers; tx index - to retrieve transactions by hash
 3110:08 <lightlike> danielabrozzoni: correct!
 3210:09 <larryruane> what's listed there are base (common stuff), and then blockfilterindex, coinstatsindex, disktxpos, txindex (4 of them)
 3310:09 <lightlike> larryruane: yes, I think it started with the txindex, and was then organized into a common framework when blockfilterindex was added
 3410:10 <lightlike> yes, disktxpos is not an index by itself, it's just an auxiliary file I think
 3510:11 <lightlike> ok, so all of these indexes use a common framework (index/base) and on top of that have their own specific code relating to what data they index and how to handle this
 3610:12 <theStack> the block filter index is the most recent one of the indices i think? (remember some weekly review club sessions in summer 2020)
 3710:12 <larryruane> and i think these are all leveldb indices (?)
 3810:13 <ls55> -coinstatsindex: coinstats index used by the gettxoutsetinfo RPC
 3910:13 <ls55> -txindex: a full transaction index, used by the getrawtransaction rpc call
 4010:13 <ls55> -blockfilterindex: an index of compact filters by block
 4110:13 <lightlike> theStack: I think coinstatsindex is newer, https://github.com/bitcoin/bitcoin/pull/19521
 4210:14 <lightlike> larryruane: yes! although blockfilterindex has the special property that the filters themselves are not saved in the levelDB, but in a flatfile - the levelDB has the file positions where one finds the filter.
 4310:14 <theStack> lightlike: oh good to know; i thought coinstatsindex was there for a longer time and only muhash was added more recently
 4410:16 <lightlike> Moving on: What is a circular dependency? Why do we want to avoid these when possible?
 4510:16 <ls55> If A uses B and conversely then there is a circular dependency. However, the circular dependency maybe subtler. For instance, it may be A that uses B that uses C that uses A.
 4610:17 <larryruane> (sorry I'm late with this:) I always wonder how stuff is laid out on disk, it looks like the `$HOME/.bitcoin/index` directory has subdirectories for each kind of index
 4710:17 <larryruane> so they can easily be removed if disabled
 4810:18 <larryruane> I think circular dependencies make testing harder, because it's hard to link in only the code you want to test? (i'm not exactly sure)
 4910:18 <lightlike> larryruane: yes, that's where the data is saved. one could also just delete the respective folder and reindex again. I find myself doing that a lot doing testing.
 5010:18 <ls55> I think all these indexes are saved in levelDB
 5110:19 <larryruane> if you remove an index from the config, does the node automatically delete its corresponding directory?
 5210:19 <lightlike> ls55: yes, good explanaion on a circular dependency. there is actuall a script (part of th linter?) which detects new ones.
 5310:19 <sipa> note that circular dependency can mean two different things
 5410:20 <sipa> there are "code dependencies", as in: file X includes file Y - if these form a cycle, you code will often just not compile
 5510:20 <lightlike> ls55: yes, and all indexes hava a separate levelDB database, each in a different directory.
 5610:21 <sipa> code dependencies can be broken by just using forward declarations or so
 5710:21 <lightlike> larryruane: no, it won't. You can enable it back on later on, and the index will catch up with the chain, starting from the point it was synced before (and not from genesis)
 5810:21 <a_GucciPoet> is a circular dependency a security issue?
 5910:21 <sipa> there are also "conceptual circular dependencies" or so... as in: "module X cannot really be used without module Y"; this is much broader, and not necessarily a problem in the code - just in your design
 6010:22 <sipa> ideally, modules in your code, if well-organized, are layers - they build on top of each other. If two modules both cannot be used without the other, that's a sign that they should be either just one module, or the interface between them is bad
 6110:23 <sipa> a_GucciPoet: No, they're just "code smell"
 6210:23 <sipa> The script can detect some forms of circular dependencies but not all (e.g. it won't catch ones you work around using forward declarations)
 6310:24 ← BlueMoon left (~BlueMoon@189.233.142.104):
 6410:24 <larryruane> I think an easy code circular dependency to get your mind around, that I've seen often in other projects I've worked on, involves logging. If logging takes a mutex, and if the mutex code can possible write log messages... !
 6510:25 <lightlike> for example, I think that the existing circular dependency between validation and index caused the indexes to be part of the initial draft of Carl's libbitcoinkernel library for consensus code - as an optional module they really shouldn't be there, but it takes refactoring to make that possible.
 6610:25 <sipa> Right. Ideally it's always possible for any two modules to use at least one of them without the other one. Circular dependencies break that ability.
 6710:28 <lightlike> ok, next question:
 6810:28 <lightlike> Please explain in your own words how the prune blockers introduced in this PR work.
 6910:30 <ls55> Are you referring to `m_prune_locks` ?
 7010:30 <lightlike> ls55: yes!
 7110:31 <ls55> It is a map representing the index name ("txindex", "coinstatsindex" or "basic blockfilterindex") and the height of the earliest block that should be kept and not pruned for each index.
 7210:31 <larryruane> prune blockers are a beautiful thing, it's a list of heights, one per kind of index, which set a lower bound on what can be pruned away (removed)
 7310:31 <lightlike> right! and when are these updated (and by whom?)
 7410:32 <ls55> It is read in `CChainState::FlushStateToDisk`
 7510:32 <otech> Every time any of the 3 indices (since they all inherit the Base Index class) updates the best block index
 7610:32 <larryruane> if we maintained just a single lowest-height (across all index types), then when that lowest index can advance to a higher height, we wouldn't know what would be the new "lowest" ... so they must be kept separately per-index
 7710:33 <lightlike> otech: exactly! The index code tells validation to update the prune locks.
 7810:33 <ls55> In `BaseIndex::SetBestBlockIndex()`
 7910:34 <larryruane> if not too much of a side point, I did have a question on this loop https://github.com/bitcoin-core-review-club/bitcoin/commit/527ef4463b23ab8c80b8502cd833d64245c5cfc4#diff-97c3a52bc5fad452d82670a7fd291800bae20c7bc35bb82686c2c0a4ea7b5b98R2282 Could "destructering" be used here so we wouldn't have to reference `.second` which is not very symbolic?
 8010:34 <lightlike> yes. and ls55 is right too, they are read in CChainState::FlushStateToDisk , which is the point where the node decides whether it should prune away blocks or not.
 8110:36 <lightlike> larryruane: what do you mean by "destructuring"?
 8210:37 <larryruane> (i'm finding an example)
 8310:38 <lightlike> ok, I'll move to the next q then: Now that we have talked about the new approach, which is the difference to the old one?
 8410:38 <lightlike> (which is removed in https://github.com/bitcoin-core-review-club/bitcoin/commit/3b8b898d96f570489238a4aa21cf4fe27a4a7e73#diff-97c3a52bc5fad452d82670a7fd291800bae20c7bc35bb82686c2c0a4ea7b5b98L2278-L2281 )
 8510:40 <ls55> The old code handled only the "blockfilterindex" and this new code iterates over all available indexes to find last height the node can prune. This data is got from `CBlockIndex BaseIndex::m_best_block_index`.
 8610:40 <ls55> The new code also subtracts `PRUNE_LOCK_BUFFER - 1` from the height of earliest block that should be kept and not pruned.
 8710:40 <ls55> Regarding the cost, the number of iterations has increased, but it doesn't seem to add much more processing than the previous code.
 8810:40 <larryruane> lightlike: like this https://github.com/bitcoin/bitcoin/blob/master/src/wallet/scriptpubkeyman.cpp#L1272
 8910:41 <larryruane> it just makes `first` and `second` into local variables with nice names
 9010:41 <lightlike> ls55: Yes! So previously, validation would reach into the indexes, and query for their best height.
 9110:42 <lightlike> Now, the indexes tell validation their best height by themselves. So validation doesn't need to know about the indexes anymore, and there is no longer a circular dependency.
 9210:43 <lightlike> ls55: what do you mean with "number of iterations"?
 9310:43 <lightlike> (this refers to the next question): Is there a cost related to the new approach?
 9410:44 <ls55> Yes. I thought they were the same question.
 9510:45 <lightlike> larryruane: looks sensible to me, without knowing too much about it.
 9610:47 <lightlike> I also think that there is no meaningful cost related to it. I think one difference is that the prune locks might be updated more often (even when our node doesn't want to prune), but that should be completely negligible.
 9710:48 <lightlike> I meant "when our node doesnt call FlushStateToDisk"
 9810:49 <lightlike> Ok, moving on to the last q: Why do you think a buffer of 10 blocks (PRUNE_LOCK_BUFFER) was introduced?
 9910:49 <lightlike> or maybe first: what does this buffer do?
10010:49 <ls55> Because it's higher than expected in regular mainnet reorgs.
10110:50 <theStack> the point of the buffer seems for taking potential reorgs into account
10210:50 <ls55> But how many blocks do mainnet reorgs usually involve and how often does this happen?
10310:51 <danielabrozzoni> I think 1-2 block reorgs happen quite frequently, but I might be wrong
10410:52 <lightlike> I think that if we are in prune mode, we'll still keep at least 550MB of block data and don't prune it (irrespective of any index code).
10510:53 <otech> I think more than a few blocks reorg would be rare, but I can see it being prudent to set the buffer a bit higher in case of an targeted eclipse attack especially since the use case is for pruned nodes... but not sure if that intuition is wrong...
10610:54 <lightlike> so even without the buffer, we wouldn't prune anywhere near the range of blocks that could be affected by a reorg.
10710:54 <RobinAdams> I would think putting this value in a config file would make sense, hardcoded seems bad
10810:55 <sipa> Rule for making something configurable: you need to be able to express when someone would want to change it, and give advice about it.
10910:55 <RobinAdams> Agreed, but then would need rationale for a default as well
11010:55 <lightlike> RobinAdams: the prune threshold is configurable "-prune=XYZ" you just have to pick a number >550, otherwise there will be an error.
11110:56 <sipa> that's just a guideline of course, but my point is more: "being unable to decide" or "feeling bad about hardcoding" should not be reasons to make something configurable. It's just you as designer not doing your job and trying to shove it off to the user.
11210:56 <larryruane> I'm curious to know what happens if we DO have a very large reorg (such that some indices "break" because height decreases too much) ... does the node shut down? No, couldn't be, that would be bad!
11310:57 <theStack> didn't look deeper into the code, but is there the theoretical possibility of an integer underflow? "const int lock_height{prune_lock.second.height_first - PRUNE_LOCK_BUFFER - 1};" (if height_first is <= 10)
11410:57 <larryruane> sipa: very interesting ... would you ever be in favor of a temporary hidden config option so that if there's some unexpected problem, nodes can be "patched" without needing a new binary?
11510:58 <larryruane> theStack: since it's signed and not unsigned, should be ok?
11610:58 <sipa> larryruane: Sure, like the "invalidateblock" RPC? I can easily express when I'd advise someone to use it.
11710:58 <larryruane> sipa: +1
11810:59 <lightlike> theStack: I think the following line makes sure that last_prune is > 1
11910:59 <theStack> larryruane: true! even if it's not a real underflow, i wonder if it has any bad consequences if that number is negative
12010:59 <theStack> lightlike: oh, right
12111:00 <lightlike> larryruane: The functional test looks at such a case.
12211:00 <lightlike> oh, it's time already
12311:00 <lightlike> #endmeeting