Don't wipe indexes again when continuing a prior reindex (utxo db and indexes)

Jun 5, 2024

https://github.com/bitcoin/bitcoin/pull/30132

Host: stickies-v - PR author: TheCharlatan

The PR branch HEAD was eeea0818c1a20adc5225b98b185953d386c033e0 at the time of this review club meeting.

Notes

Recommended reading from earlier review clubs:
- #21726: Improve Indices on pruned nodes via prune blockers for more background on indexes.
- #16981: Improve runtime performance of –reindex for more background on the reindexing process.
Bitcoin Core currently implements five indexes. Two are required: the UTXO set (CCoinsViewDB, also often referred to as “coins db”) and the block index (BlockTreeDB). Three are optional: a transaction index (enabled with -txindex), a compact block filter index (enabled with -blockfilterindex=<type>), and a coinstats index (enabled with -coinstatsindex).
When running with -reindex, all indexes are wiped and rebuilt (generally from the block files on disk). This process can take quite a while, and it can be aborted by the user before it is finished.
Because the node needs to have an up to date UTXO set and block index, the reindexing state is persisted on disk. When a reindex is started, a flag is set, and it will only be unset when the reindex is finished. This way, when the node starts, it can detect that it should continue reindexing, even if the user didn’t provide the flag as a startup option.
This PR can make node startup more efficient by avoiding the wiping of the optional indexes when it is not necessary.

Questions

Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK? What was your review approach?
What is the behaviour change introduced by this PR. Can you reproduce it, and if so - how?
What are the two ways an index can process new blocks? How does this PR affect that logic?
What are the potential risks of not wiping the optional indexes when a reindex is continued? How are these addressed, if at all?
What is the difference between ChainstateLoadOptions::reindex and BlockManager::Options::reindex? Why do we need both?
Does 9de8b26 introduce any behaviour change? What is the relation between chainman.m_blockman.m_reindexing and blockman_opts.reindex? When are they the same, when are they different?
This PR fixes a bug introduced in b47bd95. What is the bug, and under which circumstances could it manifest?
The 0d04433 commit message states that “Log messages indicating when indexes and chainstate databases are loaded exist in other places.”. What are these places?
Are there any circumstances under which 9de8b26 will cause an optional index to be wiped, where prior to this commit it wouldn’t be wiped?

Meeting Log

17:00

<stickies-v> #startmeeting

17:01

<stickies-v> anyone here for the review club?

17:02

<TheCharlatan> hi :)

17:02

<monlovesmango> hey i'm here

17:02

<emc99> hi

17:02

<kevkevin> hi

17:02

<stickies-v> ah, yay. welcome everyone! today we're looking at #30132, authored by TheCharlatan. the notes and questions are available on https://bitcoincore.reviews/30132

17:02

<kevkevin> havent had a chance to review so will be lurking

17:03

<stickies-v> anyone joining us for the first time today? even if you're just lurking, feel free to say hi!

17:04

<stickies-v> who got the chance to review the PR or read the notes? (y/n)

17:04

<TheCharlatan> y :P

17:04

<monlovesmango> y

17:04

<stickies-v> that's a okay kevkevin , great that you're joining anyway!

17:04

<stickies-v> mmm wondering about your Concept (N)ACK TheCharlatan but we'll find out soon enough

17:05

<stickies-v> Did you review the PR? [Concept ACK, approach ACK, tested ACK, or NACK](https://github.com/bitcoin/bitcoin/blob/master/CONTRIBUTING.md#peer-review)? What was your review approach?

17:06

<monlovesmango> a bit. concept ack but don't know enough to comment on approach

17:07

<stickies-v> we'll mostly be talking about the concept here anyway so that works :-D

17:08

<stickies-v> let's dig into those

17:08

<stickies-v> 2. What is the behaviour change introduced by this PR. Can you reproduce it, and if so - how?

17:09

<monlovesmango> start bitcoind with reindex flag with additional optional indexes, restart bitcoind, check that optional index's progress isn't wiped out

17:10

<kevkevin> looks like we want to stop the reindex from being deleted if the user restarts their bitcoind without the reindex flag after one has been started

17:10

<kevkevin> we should resume the previous reindex process

17:11

<stickies-v> monlovesmango: exactly, on master you should observe that the leveldbs are being wiped, whereas that shouldn't be the case with 30132

17:11

<stickies-v> kevkevin: i think you're close but i don't fully comprehend what you're saying (reindex is a process, so it can't be deleted)

17:12

<stickies-v> but yes we want to resume the reindex process without deleting the *optional* indices

17:12

<stickies-v> is that the only behaviour change introduced in this PR?

17:13

<kevkevin> yea sorry I was referring to the indexes we have stored in the datadir

17:13

<monlovesmango> there is also a bug fix but I wasn't able to get to understanding the bug itself

17:13

<kevkevin> ya looks like a bug fix in https://github.com/bitcoin/bitcoin/pull/30132/commits/f27290c39d63df36a1e1baa7f9c1609ebb65ca97

17:14

<stickies-v> monlovesmango: which part don't you understand? how it manifests, or why it manifests?

17:15

<kevkevin> looks like we want to use the reindex param provided by the user, but currently it is not clear if it will be true or false in options.reindex

17:17

<monlovesmango> I guess why? I don't understand what the bug is (but I also didn't spend time looking at that)

17:17

<stickies-v> so we have this convenience feature for GUI users where if loading the chainstate fails, we ask them (in a popup box, so GUI only) if they'd like to try again with reindex

17:18

<stickies-v> but in b47bd959207e82555f07e028cc2246943d32d4c3, that behaviour was accidentally changed a bit - if the user responded yes to that, we would try again but without reindexing, and then we'd keep asking the user (ad infinitum) if they want to try again with reindex

17:19

<stickies-v> both of those are fixed in this bugfix commit

17:19

<monlovesmango> ahh ok. and the user's choice on whether to reindex was stored in chainman.m_blockman.m_reindexing?

17:20

<stickies-v> indeed it was - and why is that not a safe thing to do?

17:20

<TheCharlatan> yes, exactly monlovesmango

17:20

<monlovesmango> bc that gets destroyed in initialization (AppInitMain)

17:20

<monlovesmango> (taking this from ryanofsky's comment https://github.com/bitcoin/bitcoin/pull/30132#issuecomment-2120741121)

17:21

<monlovesmango> thanks for diving into the bug stickies-v! its very clear now

17:22

<kevkevin> ahh I didnt know that either, thats helpful!

17:22

<TheCharlatan> yes, being more concrete the problem is that it (the chainman) gets destroyed and recreated on each for loop iteration. So the users choice is immediately discarded.

17:23

<stickies-v> it gets destroyed because we have this rather unintuitive for-loop that, at the beginning constructs a new chainman and blockman: https://github.com/bitcoin/bitcoin/blob/ff7d2054c4f1d7ff98078b9695e7c36e79a476c6/src/init.cpp#L1534

17:23

<monlovesmango> got to step away for a few minutes!

17:23

<stickies-v> blessed be them who improve that code to make it less footgunny

17:24

<stickies-v> okay - any other behaviour change introduced in this PR or is that it?

17:25

<TheCharlatan> yes, improvements to that retry logic would be very welcome. It is constantly tripping up people if they have to deal with it.

17:26

<TheCharlatan> yes, there is a tiny log line being removed :P

17:26

<stickies-v> YAAAAA

17:27

<stickies-v> alright, moving on

17:27

<stickies-v> 3. What are the two ways an index can process new blocks? How does this PR affect that logic?

17:31

<stickies-v> one way is to process them through the validation interface (after registering itself with `RegisterValidationInterface()`) on a per-block basis, or another is to do it in batch through a background sync (through `BaseIndex::StartBackgroundSync()`)

17:33

<stickies-v> so, any guesses as how this PR affects how blocks are processed?

17:34

<kevkevin> not sure does it have to do with the background sync?

17:37

<stickies-v> do you know when background sync is used vs when the validation interface is used?

17:40

<monlovesmango> does it have to do with whether it is processed via start up verses blocks arriving in real time?

17:41

<kevkevin> looks like in BaseIndex::Init we use RegisterValidationInterface and it seems like background one (StartIndexBackgroundSync) is used in AppInitMain

17:41

<stickies-v> monlovesmango: it does indeed! although arriving not just in a "received from peer" sense, but importantly in this case also when processed while reindexing

17:42

<stickies-v> so basically in `BaseIndex::Init()` we check if we're currently caught up with the chaintip: https://github.com/bitcoin/bitcoin/blob/ff7d2054c4f1d7ff98078b9695e7c36e79a476c6/src/index/base.cpp#L123

17:44

<stickies-v> if we're synced up, we will process all new blocks through the validation interface. if we're not synced up, we'll wait for the background sync to finish (started here: https://github.com/bitcoin/bitcoin/blob/ff7d2054c4f1d7ff98078b9695e7c36e79a476c6/src/init.cpp#L1755), and then process new blocks through the validation interface

17:45

<stickies-v> kevkevin: indeed! we register the index early on, to make sure we don't miss out on any events

17:45

<stickies-v> so, does this PR mean we start processing more through validation interface, more through background sync, or both/can't say?

17:46

<emc99> both

17:46

<monlovesmango> it depends whether people use reindex more often when they are completely synced up

17:46

<monlovesmango> ?

17:47

<monlovesmango> which I think would be the case (if you aren't synced I don't think there is much point having reindex flag right?)

17:47

<kevkevin> I would think through background sync if users tend to not use reindex, but I am not sure

17:47

<monlovesmango> so I guess my vote is validatoni interface

17:48

<stickies-v> so this is the scenario: a user has started bitcoind with -reindex, aborted after half an hour (well before reindex has finished), and then started bitcoind again *without* the -reindex flag

17:49

<stickies-v> prior to this PR, when their node is started for the second time, the optional indices will be wiped

17:49

<stickies-v> in that case, will they start syncing through validation interface right away, or will they go through background sync first?

17:51

<kevkevin> well if they were wiped they would go through the validation interface is my guess

17:52

<stickies-v> kevkevin: alas! as per https://github.com/bitcoin/bitcoin/blob/ff7d2054c4f1d7ff98078b9695e7c36e79a476c6/src/index/base.cpp#L123, the index will be considered out-of-sync, and when we're out of sync we first catch up through background validation

17:53

<stickies-v> so, how does that behaviour change with this PR?

17:54

<monlovesmango> ahh ok were considering whether index is out of sync, not whether block db is out of sync. got mixed up

17:56

<stickies-v> ah yes, very good point!

17:56

<monlovesmango> i don't think the behavior between whether validation interface or background sync is used has changed..? i must be missing something. whether indexes are wiped or not on restart, the index is still out of sync

17:56

<monlovesmango> whether or not*

17:57

<stickies-v> monlovesmango: well, what this PR does in https://github.com/bitcoin/bitcoin/pull/30132/commits/9de8b263dabd6dd2f86f1f0253c6ee3fac7a4407 is to not actually wipe the optional indixes when we don't have to

17:58

<stickies-v> so in that case, the optional indices should be still be in sync with the chain tip - and we don't need to do a background sync

17:58

<monlovesmango> omg you are referring to local chain tip right? not peer's chain tip? if so that is what I am missing

17:59

<stickies-v> ah, i see

17:59

<stickies-v> reindex is a local process, i.e. we rebuild all of our indices again *from disk*, so generally there's no network activity involved (although we may have to request blocks from peers if they are missing on disk)

18:00

<monlovesmango> yep that makes sense

18:00

<stickies-v> a peer's chain tip is not relevant in any of the code we're looking at here (but the confusion is very understandable), that's handled in the p2p code

18:00

<kevkevin> looks like because m_synced = true it will skip the background sync?

18:00

<stickies-v> kevkevin: indeed

18:01

<kevkevin> and we will only need to use RegisterValidationInterface

18:01

<stickies-v> alright folks, we're unfortunately at time already so we'll have to wrap up here

18:01

<kevkevin> ok thank you! this was very informational!

18:01

<monlovesmango> thank you for hosting stickies-v! learned a lot

18:02

<stickies-v> thanks for the discussion, and thanks a lot TheCharlatan for your work on improving this bit of code!

18:02

<TheCharlatan> thanks for digging into this last point stickies-v

18:02

<stickies-v> #endmeeting