Don't wipe indexes again when continuing a prior reindex (utxo db and indexes)

https://github.com/bitcoin/bitcoin/pull/30132

Host: stickies-v  -  PR author: TheCharlatan

The PR branch HEAD was eeea0818c1a20adc5225b98b185953d386c033e0 at the time of this review club meeting.

Notes

  • Recommended reading from earlier review clubs:
  • Bitcoin Core currently implements five indexes. Two are required: the UTXO set (CCoinsViewDB, also often referred to as “coins db”) and the block index (BlockTreeDB). Three are optional: a transaction index (enabled with -txindex), a compact block filter index (enabled with -blockfilterindex=<type>), and a coinstats index (enabled with -coinstatsindex).

  • When running with -reindex, all indexes are wiped and rebuilt (generally from the block files on disk). This process can take quite a while, and it can be aborted by the user before it is finished.

  • Because the node needs to have an up to date UTXO set and block index, the reindexing state is persisted on disk. When a reindex is started, a flag is set, and it will only be unset when the reindex is finished. This way, when the node starts, it can detect that it should continue reindexing, even if the user didn’t provide the flag as a startup option.

  • This PR can make node startup more efficient by avoiding the wiping of the optional indexes when it is not necessary.

Questions

  1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK? What was your review approach?

  2. What is the behaviour change introduced by this PR. Can you reproduce it, and if so - how?

  3. What are the two ways an index can process new blocks? How does this PR affect that logic?

  4. What are the potential risks of not wiping the optional indexes when a reindex is continued? How are these addressed, if at all?

  5. What is the difference between ChainstateLoadOptions::reindex and BlockManager::Options::reindex? Why do we need both?

  6. Does 9de8b26 introduce any behaviour change? What is the relation between chainman.m_blockman.m_reindexing and blockman_opts.reindex? When are they the same, when are they different?

  7. This PR fixes a bug introduced in b47bd95. What is the bug, and under which circumstances could it manifest?

  8. The 0d04433 commit message states that “Log messages indicating when indexes and chainstate databases are loaded exist in other places.”. What are these places?

  9. Are there any circumstances under which 9de8b26 will cause an optional index to be wiped, where prior to this commit it wouldn’t be wiped?

Meeting Log

  117:00 <stickies-v> #startmeeting
  217:01 <stickies-v> anyone here for the review club?
  317:02 <TheCharlatan> hi :)
  417:02 <monlovesmango> hey i'm here
  517:02 <emc99> hi
  617:02 <kevkevin> hi
  717:02 <stickies-v> ah, yay. welcome everyone! today we're looking at #30132, authored by TheCharlatan. the notes and questions are available on https://bitcoincore.reviews/30132
  817:02 <kevkevin> havent had a chance to review so will be lurking
  917:03 <stickies-v> anyone joining us for the first time today? even if you're just lurking, feel free to say hi!
 1017:04 <stickies-v> who got the chance to review the PR or read the notes? (y/n)
 1117:04 <TheCharlatan> y :P
 1217:04 <monlovesmango> y
 1317:04 <stickies-v> that's a okay kevkevin , great that you're joining anyway!
 1417:04 <stickies-v> mmm wondering about your Concept (N)ACK TheCharlatan but we'll find out soon enough
 1517:05 <stickies-v> Did you review the PR? [Concept ACK, approach ACK, tested ACK, or NACK](https://github.com/bitcoin/bitcoin/blob/master/CONTRIBUTING.md#peer-review)? What was your review approach?
 1617:06 <monlovesmango> a bit. concept ack but don't know enough to comment on approach
 1717:07 <stickies-v> we'll mostly be talking about the concept here anyway so that works :-D
 1817:08 <stickies-v> let's dig into those
 1917:08 <stickies-v> 2. What is the behaviour change introduced by this PR. Can you reproduce it, and if so - how?
 2017:09 <monlovesmango> start bitcoind with reindex flag with additional optional indexes, restart bitcoind, check that optional index's progress isn't wiped out
 2117:10 <kevkevin> looks like we want to stop the reindex from being deleted if the user restarts their bitcoind without the reindex flag after one has been started
 2217:10 <kevkevin> we should resume the previous reindex process
 2317:11 <stickies-v> monlovesmango: exactly, on master you should observe that the leveldbs are being wiped, whereas that shouldn't be the case with 30132
 2417:11 <stickies-v> kevkevin: i think you're close but i don't fully comprehend what you're saying (reindex is a process, so it can't be deleted)
 2517:12 <stickies-v> but yes we want to resume the reindex process without deleting the *optional* indices
 2617:12 <stickies-v> is that the only behaviour change introduced in this PR?
 2717:13 <kevkevin> yea sorry I was referring to the indexes we have stored in the datadir
 2817:13 <monlovesmango> there is also a bug fix but I wasn't able to get to understanding the bug itself
 2917:13 <kevkevin> ya looks like a bug fix in https://github.com/bitcoin/bitcoin/pull/30132/commits/f27290c39d63df36a1e1baa7f9c1609ebb65ca97
 3017:14 <stickies-v> monlovesmango: which part don't you understand? how it manifests, or why it manifests?
 3117:15 <kevkevin> looks like we want to use the reindex param provided by the user, but currently it is not clear if it will be true or false in options.reindex
 3217:17 <monlovesmango> I guess why? I don't understand what the bug is (but I also didn't spend time looking at that)
 3317:17 <stickies-v> so we have this convenience feature for GUI users where if loading the chainstate fails, we ask them (in a popup box, so GUI only) if they'd like to try again with reindex
 3417:18 <stickies-v> but in b47bd959207e82555f07e028cc2246943d32d4c3, that behaviour was accidentally changed a bit - if the user responded yes to that, we would try again but without reindexing, and then we'd keep asking the user (ad infinitum) if they want to try again with reindex
 3517:19 <stickies-v> both of those are fixed in this bugfix commit
 3617:19 <monlovesmango> ahh ok. and the user's choice on whether to reindex was stored in chainman.m_blockman.m_reindexing?
 3717:20 <stickies-v> indeed it was - and why is that not a safe thing to do?
 3817:20 <TheCharlatan> yes, exactly monlovesmango
 3917:20 <monlovesmango> bc that gets destroyed in initialization (AppInitMain)
 4017:20 <monlovesmango> (taking this from ryanofsky's comment https://github.com/bitcoin/bitcoin/pull/30132#issuecomment-2120741121)
 4117:21 <monlovesmango> thanks for diving into the bug stickies-v! its very clear now
 4217:22 <kevkevin> ahh I didnt know that either, thats helpful!
 4317:22 <TheCharlatan> yes, being more concrete the problem is that it (the chainman) gets destroyed and recreated on each for loop iteration. So the users choice is immediately discarded.
 4417:23 <stickies-v> it gets destroyed because we have this rather unintuitive for-loop that, at the beginning constructs a new chainman and blockman: https://github.com/bitcoin/bitcoin/blob/ff7d2054c4f1d7ff98078b9695e7c36e79a476c6/src/init.cpp#L1534
 4517:23 <monlovesmango> got to step away for a few minutes!
 4617:23 <stickies-v> blessed be them who improve that code to make it less footgunny
 4717:24 <stickies-v> okay - any other behaviour change introduced in this PR or is that it?
 4817:25 <TheCharlatan> yes, improvements to that retry logic would be very welcome. It is constantly tripping up people if they have to deal with it.
 4917:26 <TheCharlatan> yes, there is a tiny log line being removed :P
 5017:26 <stickies-v> YAAAAA
 5117:27 <stickies-v> alright, moving on
 5217:27 <stickies-v> 3. What are the two ways an index can process new blocks? How does this PR affect that logic?
 5317:31 <stickies-v> one way is to process them through the validation interface (after registering itself with `RegisterValidationInterface()`) on a per-block basis, or another is to do it in batch through a background sync (through `BaseIndex::StartBackgroundSync()`)
 5417:33 <stickies-v> so, any guesses as how this PR affects how blocks are processed?
 5517:34 <kevkevin> not sure does it have to do with the background sync?
 5617:37 <stickies-v> do you know when background sync is used vs when the validation interface is used?
 5717:40 <monlovesmango> does it have to do with whether it is processed via start up verses blocks arriving in real time?
 5817:41 <kevkevin> looks like in BaseIndex::Init we use RegisterValidationInterface and it seems like background one (StartIndexBackgroundSync) is used in AppInitMain
 5917:41 <stickies-v> monlovesmango: it does indeed! although arriving not just in a "received from peer" sense, but importantly in this case also when processed while reindexing
 6017:42 <stickies-v> so basically in `BaseIndex::Init()` we check if we're currently caught up with the chaintip: https://github.com/bitcoin/bitcoin/blob/ff7d2054c4f1d7ff98078b9695e7c36e79a476c6/src/index/base.cpp#L123
 6117:44 <stickies-v> if we're synced up, we will process all new blocks through the validation interface. if we're not synced up, we'll wait for the background sync to finish (started here: https://github.com/bitcoin/bitcoin/blob/ff7d2054c4f1d7ff98078b9695e7c36e79a476c6/src/init.cpp#L1755), and then process new blocks through the validation interface
 6217:45 <stickies-v> kevkevin: indeed! we register the index early on, to make sure we don't miss out on any events
 6317:45 <stickies-v> so, does this PR mean we start processing more through validation interface, more through background sync, or both/can't say?
 6417:46 <emc99> both
 6517:46 <monlovesmango> it depends whether people use reindex more often when they are completely synced up
 6617:46 <monlovesmango> ?
 6717:47 <monlovesmango> which I think would be the case (if you aren't synced I don't think there is much point having reindex flag right?)
 6817:47 <kevkevin> I would think through background sync if users tend to not use reindex, but I am not sure
 6917:47 <monlovesmango> so I guess my vote is validatoni interface
 7017:48 <stickies-v> so this is the scenario: a user has started bitcoind with -reindex, aborted after half an hour (well before reindex has finished), and then started bitcoind again *without* the -reindex flag
 7117:49 <stickies-v> prior to this PR, when their node is started for the second time, the optional indices will be wiped
 7217:49 <stickies-v> in that case, will they start syncing through validation interface right away, or will they go through background sync first?
 7317:51 <kevkevin> well if they were wiped they would go through the validation interface is my guess
 7417:52 <stickies-v> kevkevin: alas! as per https://github.com/bitcoin/bitcoin/blob/ff7d2054c4f1d7ff98078b9695e7c36e79a476c6/src/index/base.cpp#L123, the index will be considered out-of-sync, and when we're out of sync we first catch up through background validation
 7517:53 <stickies-v> so, how does that behaviour change with this PR?
 7617:54 <monlovesmango> ahh ok were considering whether index is out of sync, not whether block db is out of sync. got mixed up
 7717:56 <stickies-v> ah yes, very good point!
 7817:56 <monlovesmango> i don't think the behavior between whether validation interface or background sync is used has changed..? i must be missing something. whether indexes are wiped or not on restart, the index is still out of sync
 7917:56 <monlovesmango> whether or not*
 8017:57 <stickies-v> monlovesmango: well, what this PR does in https://github.com/bitcoin/bitcoin/pull/30132/commits/9de8b263dabd6dd2f86f1f0253c6ee3fac7a4407 is to not actually wipe the optional indixes when we don't have to
 8117:58 <stickies-v> so in that case, the optional indices should be still be in sync with the chain tip - and we don't need to do a background sync
 8217:58 <monlovesmango> omg you are referring to local chain tip right? not peer's chain tip? if so that is what I am missing
 8317:59 <stickies-v> ah, i see
 8417:59 <stickies-v> reindex is a local process, i.e. we rebuild all of our indices again *from disk*, so generally there's no network activity involved (although we may have to request blocks from peers if they are missing on disk)
 8518:00 <monlovesmango> yep that makes sense
 8618:00 <stickies-v> a peer's chain tip is not relevant in any of the code we're looking at here (but the confusion is very understandable), that's handled in the p2p code
 8718:00 <kevkevin> looks like because m_synced = true it will skip the background sync?
 8818:00 <stickies-v> kevkevin: indeed
 8918:01 <kevkevin> and we will only need to use RegisterValidationInterface
 9018:01 <stickies-v> alright folks, we're unfortunately at time already so we'll have to wrap up here
 9118:01 <kevkevin> ok thank you! this was very informational!
 9218:01 <monlovesmango> thank you for hosting stickies-v! learned a lot
 9318:02 <stickies-v> thanks for the discussion, and thanks a lot TheCharlatan for your work on improving this bit of code!
 9418:02 <TheCharlatan> thanks for digging into this last point stickies-v
 9518:02 <stickies-v> #endmeeting