Bitcoin Core PR Review Club

Improve runtime performance of --reindex (resource usage, validation)

Mar 4, 2020

https://github.com/bitcoin/bitcoin/pull/16981

Host: mzumsande - PR author: LarryRuane

The PR branch HEAD was a50f956e at the time of this review club meeting.

NOTE! Earlier versions of this PR had a different scope from the current branch. The discussion prior to this comment by the author is therefore no longer relevant for the PR in its current form.

Notes

The -reindex argument wipes and rebuilds the two leveldb databases for the block index and the chainstate, using the block files saved on disk. This procedure is completely local and requires no interactions with the p2p network. For help on the argument see bitcoind -help | grep -A1 reindex.
Reindexing is a lengthy procedure taking several hours on mainnet and consists of two steps:
1. Rebuilding the index of blocks based on the blk*.dat files saved in .bitcoin/blocks.
2. Rebuilding the chainstate (UTXO set) by fully validating each block starting from genesis using the block index created in step 1.
There is a second command, reindex-chainstate that will only perform step 2. This PR improves the runtime of step 1 and does not affect step 2 (which is already highly optimized since it uses the same validation functions that are used for connecting new blocks received from the network).
Reindex uses the CBufferedFile stream, introduced in #1962, which has a buffer to allow “rewinding” the stream position to an earlier position without additional disk access. In the merged PR #16577, the author of this week’s PR fixed a bug in CBufferedFile and added comprehensive unit tests.
The reindexing happens in LoadExternalBlockFile() (in validation.cpp), which is called for each block file. The block index is rebuilt by calling AcceptBlock() for each block in the chain in correct order.
However, blocks are usually not saved to disk in the correct order during IBD (listen to Pieter Wuille’s recent Chaincode podcast for more background, starting at 3:30). Therefore, the parent hash of blocks that are encountered without a known parent is saved in the map mapBlocksUnknownParent. After accepting a block, the reindex algorithm recursively finds all blocks from mapBlocksUnknownParent which are ready for processing now, tries to accept them and removes them from the map.
Before this PR, we would always read entire blocks from disk into our buffer. If the predecessor of a block was not available, the blocks would have to be read again at a later point. This PR changes behavior such that initially only the 80 byte block header is read into the buffer, and if we can’t accept the block yet, the rest of the block is skipped for now.

Questions

Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK? (Don’t forget to put your PR review on GitHub.)
In which situations is reindex useful? When is it better to use reindex-chainstate?
What does the block index database look like? What information is stored there?
Can you explain why this PR reduces the runtime of reindexing considerably?
How good is the test coverage of the reindexing (functional and unit tests)? Do you think it can be improved?
Can you think of further optimizations of reindexing beyond this PR that are possible/worthwhile?

Meeting Log

13:00

<jnewbery> #startmeeting

13:00

<jnewbery> Hi folks! Welcome to Bitcoin Core PR Review Club. Feel free to say hi to let everyone know you're at keyboard.

13:00

<emilengler> hi

13:00

<kanzure> hi

13:00

<jonatack> hi

13:00

<lightlike> hi

13:00

<fjahr> hi

13:00

<LarryRuane> hi

13:00

<okkkk> Hello!

13:00

<jnewbery> As always, we'll have a host to guide discussion, but feel free to jump in at any time to ask questions.

13:00

<amiti> hi

13:00

<michaelfolkson> hi

13:00

<ajonas> hi

13:00

<emzy> hi

13:00

<jnewbery> Today's PR is #16981. Notes and questions are here: https://bitcoincore.reviews/16981.html

13:01

<willcl_ark> hi

13:01

<jnewbery> lightlike is hosting for us today. Thanks lightlike! We also have the PR author, LarryRuane here. Thanks for joining us, Larry

13:01

<jnewbery> ok, I'll pass over to lightlike

13:01

<lightlike> Ok so today's PR is about improving the performance of reindexing.

13:01

<jkczyz> hi

13:01

<lightlike> Did everyone get a chance to review the PR? ( y/n )

13:01

<emilengler> y

13:01

<jkczyz> n

13:01

<emzy> n

13:01

<Guest83> y

13:01

<jnewbery> 0.3y

13:02

<amiti> ~.2 y

13:02

<willcl_ark> brief code review, but not tested

13:02

<fjahr> y

13:02

<michaelfolkson> ~.1 y

13:02

<ajonas> read comments and code but not tested

13:02

<lightlike> So for those who did get a chance to look at it, what is your opinion on the PR? Concept ACK, approach ACK, tested ACK, or NACK?

13:02

<emilengler> To summarize it up: It does a more low level job when reading the reindexing plus it only reads the block header instead fo the entire block

13:02

<emilengler> Correct me if I missed something

13:03

<emilengler> lightlike: I did a Concept ACK. Was going to test once my chain is fully synced but I reviewed the code

13:03

<willcl_ark> if implemented correctly, seems like a win with no downsides

13:03

<jnewbery> Concept ACK. Better is better

13:04

<fjahr> Concept ACK but i only skimmed the test so far and did not have time to test

13:04

<lightlike> emilengler: I wouldn't really say that the level really changed ( we still use CBufferedFile, which is pretty low-level).

13:04

<lightlike> But first to some more general questions:

13:04

<lightlike> In which situations is reindex useful? When is it better to use reindex-chainstate?

13:05

<instagibbs> reindex is when block index is fubar, right?

13:05

<instagibbs> block data is ok, but block index broken

13:05

<fjahr> both rebuild the utxo set, full reindex rebuilds the block index as well

13:05

<elichai2> Hi

13:06

<LarryRuane> yes or for example, isn't it needed if you enable `txindex`?

13:06

<willcl_ark> I think if you enable txindex (e.g. ElectrumX) then you _require_ a reindex also

13:06

<fjahr> so it depends on what you need

13:06

<elichai2> about note 2.2, is reindex really fully validating? why? and does it have the same logic as IBD?(ie assumevalid)

13:06

<lightlike> instagibbs, fjahr: yes.

13:06

<jnewbery> instagibbs: that's how I understand it. reindex-chainstate only rebuilds the UTXO set. reindex also rebuilds the block index

13:06

<LarryRuane> reindex is in a way similar to initial block sync except it can be done completely offline

13:06

<michaelfolkson> In which situations is reindex useful? Data corruption yes. Anything else?

13:06

<lightlike> elichai2: yes, I think it does, didn't think of that until later today.

13:07

<jnewbery> willcl_ark: that used to be true. Since jimpo refactored the indexing code, I think that building a txindex doesn't require a -reindex

13:07

<instagibbs> right, incremental indexing is a thing now AFAIU :)

13:08

<willcl_ark> jnewbery: that's good news! I recall doing it myself a while ago for that

13:08

<emilengler> michaelfolkson: I'm not sure but maybe for verification

13:08

<emilengler> For example if you transferred the data from a disk to another

13:08

<michaelfolkson> As always there's good sipa StackExchange answer: https://bitcoin.stackexchange.com/questions/60709/when-should-i-use-reindex-chainstate-and-when-reindex

13:08

<lightlike> I also think reindexing might possibly be needed if the format of the index db would ever be changed in the future between releases

13:08

<willcl_ark> although, as IBD is usually saturated at CPU, perhaps --reindex is not actually that useful (unless you are bandwidth/data constrained in some way for a db error)

13:08

<nehan_> hi

13:09

<michaelfolkson> Yeah makes sense emilengler

13:09

<jnewbery> willcl_ark: PR 13033 I think

13:09

<fjahr> michaelfolkson: definitely when you were running pruned node

13:09

<fjahr> and now you switch to fully validating

13:09

<raj_> hi

13:10

<jnewbery> fjahr: not sure I understand. A pruned node _is_ fully validating

13:11

<michaelfolkson> Yeah that's good one fjahr. You switch to unpruned I assume you mean

13:11

<luke-jr> indeed

13:11

<fjahr> yeah, sorry, i meant non pruned node %)

13:11

<lightlike> just noting that in some situations (only utxo db is broken), reindex-chainstate is sufficient.

13:11

<lightlike> ok, next q:

13:11

<lightlike> What does the block index database look like? What information is stored there?

13:12

<lightlike> it's quite a general question, just name some of the most important infos...

13:13

<Guest83> Indexed on header?

13:13

<willcl_ark> block hash, tx hashes, block file names

13:13

<michaelfolkson> Where block is stored on disk

13:14

<LarryRuane> i was just reading sipa's reply on the stackexchange (linked above).. "You should use -reindex only when you were running in pruning mode..." should that be non-pruning mode? to reindex requires that you have all 300+ gb of blocks, right?

13:14

<lightlike> I think a great overview is on https://bitcoin.stackexchange.com/questions/28168/what-are-the-keys-used-in-the-blockchain-leveldb-ie-what-are-the-keyvalue-pair (again by Pieter Wuille :-))

13:14

<emilengler> Yes reindexing is not possible if pruning

13:15

<emilengler> I think it throws a runtime error then

13:15

<willcl_ark> LarryRuane: does --reindex init a full IBD if it's missing the data?

13:15

<LarryRuane> good question, i don't know

13:16

<fjahr> LarryRuane: if you were running it pruned and you switch to non pruning I think you have to start the node with -reindex if i remember correctly

13:16

<lightlike> so willcl_ark: I think that it would sync up the last block we can connect from disk, and then switch to IBD for the rest of the chain

13:17

<willcl_ark> that would seem logical

13:17

<jnewbery> I'm looking in init.cpp, and there appears to be logic for when using -reindex and -prune, so I guess it'll just redownload blocks from the network

13:17

<jnewbery> (from here: https://github.com/jnewbery/bitcoin/blob/a50f956e304adc7c75428ec391f1a76a99c1ddde/src/init.cpp#L628)

13:18

<molly> once i tested reindex on a laptop a few years ago, it took a week to fully sync, so i wouldn't use reindex if i have a corrupt database, i would resync the node from scratch, it's faster

13:19

<willcl_ark> maybe not after this PR :)

13:19

<jnewbery> molly: I'd be surprised if that were true in general, even before this PR

13:19

<lightlike> willcl_ark: I think that is what happened to me once when I had a corrupt block file (and slow internet connection): I would reindex up until the corrupt block, and then download all blocks beyond that from peers

13:19

<molly> jnewbery, i haven't looked at this PR

13:20

<LarryRuane> for me it took slightly less time to reindex than IBD last time i checked, several months ago ... but also reindexing puts less traffic on the network on load on peers

13:21

<lightlike> ok, next question: Can you explain why this PR reduces the runtime of reindexing considerably?

13:21

<emilengler> LarryRuane: If you use the same hardware it jsut takes longer becasue of the growing size

13:21

<LarryRuane> (do you want me to answer? :) )

13:21

<nehan_> it avoids deserializing blocks that will need to be deserialized later

13:22

<emilengler> AFAIK because it reads the header instead of the entire block

13:22

<lightlike> LarryRuane: maybe give others a chance first, but feel free to add your view on anything :-)

13:22

<theme> reduced read operations?

13:23

<jnewbery> it allows us to seek through the block file for headers rather than reading everything

13:23

<LarryRuane> nehan_ that's correct, a lot of CPU is spent deserializing blocks unnecessarily (throwing that work away)

13:23

<cprkrn> Yup. Doesn't require reading all of the data

13:23

<lightlike> emilengler: yes, but just at the first encounter, because based on the header we can decide if wa want to deserialize the block already now (or need it only later)

13:23

<raj_> by checking into memory for a chilld block instead of reading it from the disk when the parent is found.

13:24

<LarryRuane> so... i made a comment on the PR yesterday that explains in more detail but it doesn't actually reduce reads from disk (either num of reads or length of reads), but it only saves CPU time spent deserializing

13:25

<cprkrn> Gotcha

13:25

<LarryRuane> lightlike yes that is correct

13:25

<jnewbery> https://bitcoin.stackexchange.com/questions/28168/what-are-the-keys-used-in-the-blockchain-leveldb-ie-what-are-the-keyvalue-pair is excellent, but a little out of date now. I believe the transaction index record is now in its own database structure, and the chain state databse is per-txout rather than per-tx

13:26

<lightlike> yes, I was not precise there in my notes: The reading from disk, in my understanding, is done in the Fill() method, which happens also if we skip the block

13:26

<willcl_ark> jnewbery: so thats how #13033 did it

13:27

<lightlike> jnewbery: do you know if a good documentation exists somewhere that is up to date?

13:27

<MarcoFalke> Ideally the documentation about Bitcoin Core should be in the source code :)

13:28

<jnewbery> lightlike: I do not. Someone should suggest edits to sipa's SE answer

13:28

<MarcoFalke> So someone should copy-past the useful parts into our code base

13:29

<jnewbery> MarcoFalke: for design documentation, is https://github.com/bitcoin-core/bitcoin-devwiki/wiki better?

13:30

<lightlike> I think it is also helpful to mention the observation by Larry in the PR, that this change is only an improvement because IBD typically saves block out of order. If everything was in order, this change would make things (slightly) slower.

13:32

<nehan_> LarryRuane: I added a comment. You are now holding cs_main while doing a disk read (i think) in Skip(). might that affect performance?

13:32

<raj_> just one tangent question. While reindexing, is transaction validation process again repeated?

13:32

<LarryRuane> yes because when we encounter a block and its parent has already been seen, it backs up (in the memory buffer) by 80 bytes (header) and deserializes the entire block (incuding header, again) .. so header is deserialized twice .. but since it's only 80 bytes probably not much impact

13:33

<michaelfolkson> That's just used for release notes jnewbery? achow101 said that got vandalized recently as no merge restrictions

13:33

<LarryRuane> raj_ i believe yes, similar to IBD, all checks are performed, only difference is blocks come from disk instead of peers

13:34

<lightlike> raj_: although as elichai2 noted earlier, with the same restrictions as in IBD (afaik not every signature of very old blocks is validated, but I don't know the details there)

13:36

<lightlike> ok, next q: How about test coverage of the reindexing (functional and unit tests)? Do you think it can be improved?

13:36

<LarryRuane> nehan_ thank you, i'll reply there, but today, where of course there is no Skip(), instead the deserialization (`blkdat >> block;`) can trigger the same disk read, so there is no difference

13:37

<nehan_> LarryRuane: I think that's outside the cs_main lock though

13:37

<jnewbery> raj_: yes, exactly the same as for IBD. You can imagine we're treating the blk files as untrusted and revalidating everything again. assumevalid means that we don't check scripts/signatures before a certain height

13:39

<LarryRuane> nehan_ you're right! good catch, i'll investigate how to improve that (or try to see if it may be acceptable)

13:40

<willcl_ark> jnewbery: which also might give us another --reindex use-case: recieving an offline copy of the block files from a friend (or torrent download?) which you might want to reindex before trusting

13:40

<nehan_> LarryRuane: I'm having a lot of trouble figuring out how nRewind is updated given that it moved around. I'm still trying to get my head around CBufferedFile semantics...

13:41

<LarryRuane> _nehan_ this lock can be dropped before calling Skip() .. Is there a way to drop a LOCK() besides it going out of scope? i don't think i've ever seen that

13:41

<jonatack> michaelfolkson: there are several different things in https://github.com/bitcoin-core/bitcoin-devwiki/wiki... mempool, p2p design, wallet structure

13:41

<jnewbery> nehan_: good catch! In practice, I don't think it matters too much. If we're doing reindex then the critical path is single-threaded in this thread (the ThreadImport thread)

13:41

<raj_> jnewbery: Thanks. So how much is actuallly the difference between a fresh IBD and reindexing in terms of time?

13:41

<jnewbery> definitely worth benching performance though

13:43

<LarryRuane> nehan_ yes that rewind stuff is somewhat obtuse... basic idea of rewind is it marks a point in the stream that we can reposition to if deserialization throws

13:43

<raj_> nehan_: I cant seem to find where cs_main is held in read(). :(

13:43

<jnewbery> raj_: depends on bandwidth/quality of peers/disk access speed/CPU

13:43

<lightlike> raj_: that would depend a lot on the speed of your internet connection (and that of your peers).

13:44

<nehan_> raj_: it's not. It's held while this is called, which calls read: https://github.com/bitcoin-core-review-club/bitcoin/commit/a50f956

13:44

<michaelfolkson> Re tests. Any additional tests would be separate PR not related to this specific performance improvement?

13:45

<raj_> ok makes sense. So in % term what kind of reduction we are talking about here?

13:45

<lightlike> michaelfolkson: yes, plus this PR adds unit tests for the new Skip() functionality.

13:45

<raj_> Oh thanks nehan_

13:45

<jnewbery> in general, you'd expect reading files from disk to be faster than downloading that data from peers, but exact quantitative difference depends on those factors

13:45

<nehan_> LarryRuane: oh so that's another thing, read() could throw while holding the lock. does that matter?

13:45

<lightlike> but in general I think that while the unit tests of CBufferedFile are really great, the validation code is tested quite poorly in the functional tests:

13:46

<LarryRuane> just for those who may not be that familiar with this code ... (this wasn't at all clear to me initially) ... deserialization (`blkdat >> ....`) can throw an exception, and then we magically end up the catch and continue that top-level loop

13:46

<LarryRuane> nehan_ nope, the compiler (semantics of LOCK) deal with that correctly, one aspect of its magic

13:46

<nehan_> LarryRuane: cool thanks

13:47

<lightlike> feature_reindex.py exists but seems quite rudimentary.

13:48

<lightlike> Last question: Can you think of further optimizations of reindexing beyond this PR that are possible/worthwhile?

13:48

<jnewbery> lightlike: I agree that feature_reindex.py is rudimentary. What kind of additional tests would you like to see?

13:49

<LarryRuane> lightlike yes it's tiny, tbh i didn't even really look at it while doing this PR .. yes, my question too, should it do more?

13:49

<LarryRuane> i could look into that

13:49

<lightlike> LarryRuane: That was in no way meant a criticism - it was like that for years, and you added lots of units tests :-)

13:50

<lightlike> jnewbery: I think having multiple block files, and also full blocks with blocks being serialized out of order, would be nice.

13:50

<jnewbery> LarryRuane: also, you don't change functionality, so there shouldn't need to be changes to functional tests!

13:50

<lightlike> *full block files (MB)

13:50

<LarryRuane> the testing for this PR isn't really great, it does have a nice test for the new Skip method, but there really aren't any (new) tests for the changes to validation.cpp .... oh no, not taken that way!

13:50

<jnewbery> but improving functional test coverage in a separate PR seems like it'd be worthwhile

13:51

<LarryRuane> yes i agree, i could take that on if people like that idea ... since i have learned a little about this area already

13:52

<jonatack> +1 on that LarryRuane

13:52

<jnewbery> lightlike: to answer your question about further optimizations, it seems like a lot of the performance degredation is due to blocks being out of order in the blk files. Maybe this is too wacky, but could bitcoind sort those blocks in the background, so that on reindex we're less likely to have to skip forwards and backwards in those files?

13:53

<LarryRuane> ok i would love to ... what's the convention, make a ticket? or just a PR okay?

13:54

<docallag> How often would you expect to reindex?

13:54

<lightlike> jnewbery: do you mean on the fly during IBD, or some kind of reorganization script that can be run on demand?

13:54

<jnewbery> docallag: basically never

13:55

<LarryRuane> that is an interesting idea jnewbery.... i like it .. i suggested a different way to improve this at the end of the last comment i made on the PR (yesterday), maybe take a look at that too .. but my suggestion would create more files in `blocks/`

13:55

<jonatack> LarryRuane: maybe ask for input on #bitcoin-core-dev, could add it as a weekly meeting topic

13:55

<jnewbery> lightlike: I was thinking in the background after IBD, but could also be on demand

13:55

<michaelfolkson> Why do the blocks go out of order in the first place?

13:56

<LarryRuane> headers-first IBD

13:56

<jnewbery> I don't necessarily think it's worth it, since it would be changing mainline behavior to optimize for something we expect never to do

13:56

<LarryRuane> @jnew

13:56

<jnewbery> michaelfolkson: pieter talks about it here: https://podcast.chaincode.com/2020/01/27/pieter-wuille-1.html

13:56

<LarryRuane> jnewbery good point

13:57

<fjahr> jnewbery: I was just going to ask, such on the fly optimization might be hard to get in if it slows down iBD even slightly?

13:57

<nehan_> also crazy on the fly optimizations often increase bug surface area :)

13:57

<jnewbery> fjahr: right, if such an idea was considered, it should be in the background after IBD

13:57

<lightlike> yes, it seems a bit pessimistic to think too much about reindexing during IBD (plus, It's just an hour)

13:57

<fjahr> ah, I see, you wrote after ibd earlier

13:57

<jnewbery> nehan_: +1

13:58

<lightlike> ok, we are almost at the end of the hour. Any questions left?

13:59

<jnewbery> I don't know the serving-blocks-to-other-peers-doing-IBD logic well enough to know whether having blocks serialized on disk in order would be an improvement there too

13:59

<docallag> How would you know you needed to reindex?

13:59

<jnewbery> docallag: https://bitcoin.stackexchange.com/questions/60709/when-should-i-use-reindex-chainstate-and-when-reindex

14:00

<LarryRuane> jnewbery i think *probably* not, because i think the indices record the exact start (and length) of each block (to serve to peers during IBD)

14:00

<docallag> Sorry I meant would Core fall over and then you'd know to reindex?

14:00

<lightlike> let's wrap it up, thanks all for participating!

14:00

<michaelfolkson> Use cases covered earlier in the meeting docollag if that's your question. Basically something goes wrong or you want to do something you can't do

14:00

<LarryRuane> thank you everyone! been great

14:00

<emilengler> thanks for hosting :)

14:01

<nehan_> thank you!

14:01

<emilengler> and thanks to LarryRuane for the PR

14:01

<jnewbery> Great meeting. Thanks for hosting lightlike, and thanks for joining us LarryRuane!

14:01

<lightlike> and thanks LarryRuane for answering questions

14:01

<willcl_ark> thanks lightlike + LarryRuane

14:01

<emilengler> And also thanks to anyone else for his/her feedback

14:01

<emzy> thank you!

14:01

<docallag> tks

14:01

<michaelfolkson> Thanks all. And congrats jonatack :)

14:02

<jnewbery> LarryRuane: I think you're right, but it seems possible that we could be more efficient in serving blocks if they were serialized in order

14:02

<LarryRuane> ah i see, perhaps less seeking

14:02

<jonatack> Thanks lightlike, LarryRuane, jnewbery and everyone! (Thank you, michaelfolkson)

14:02

<jnewbery> like we could have a new P2P message to serve a range of blocks(?)

14:03

<jnewbery> sorry - probably way off topic!

14:03

<LarryRuane> makes sense

14:14

<emilengler> Is this meeting over now? Maybe it is time to do hashtag end meeting

14:16

<lightlike> right

14:16

<lightlike> #endmeeting

14:16

<lightlike> (though I'm not sure if there are any bots here that would listen to it)

Meeting Log – Asia time zone

Host: kallewoof

05:01

<jnewbery> hi!

05:01

<kallewoof> #startmeeting

05:01

<aj> hey

05:01

<meshcollider> hi :)

05:01

<akionak> Hi!

05:01

<kallewoof> If you're here, say hi so we know

05:01

<coinsureNZ> yep thats the one meshcollider , albiet expat now

05:01

<jnewbery> it's 11pm here so I'm not going to stick around for too long. I just wanted to be here at the start of the first one :)

05:01

<anditto> hi! ^_^

05:01

<kallewoof> or to quote our illustrious leader, < jnewbery> Hi folks! Welcome to Bitcoin Core PR Review Club. Feel free to say hi to let everyone know you're at keyboard.

05:02

<kallewoof> jnewbery: thanks for sticking around :)

05:02

<fanquake> hi

05:02

<meshcollider> Night John

05:02

<kallewoof> I'm going to guide myself through the other log so forgive some amount of copy-pasting.

05:03

<kallewoof> Today's PR is #16981. Notes and questions are here: https://bitcoincore.reviews/16981.html

05:03

<kallewoof> Did everyone get a chance to review the PR? ( y/n )

05:03

<kallewoof> y

05:03

<coinsureNZ> y

05:03

<jnewbery> kallewoof: I'll post meeting logs sooner in future. I had to run out straight after the meeting today

05:03

<aj> ish

05:03

<coinsureNZ> not the code- just the gist of the nodes

05:03

<fanquake> 1/2 y

05:03

<coinsureNZ> *notes

05:04

<kallewoof> jnewbery: No worries. I am still wondering if it's better to NOT look at logs to reduce bias, but we'll use it this time as I've never done one of these before

05:04

<kallewoof> What are people's general opinion on the PR?

05:05

<kallewoof> Concept/code/approach/etc.

05:06

<fanquake> Concept ACK improving performance. +1 new test. I can't actually bench it atm though.

05:06

<fanquake> *test code

05:07

<fanquake> I don't think I've actually run a --reindex in quite a while.

05:08

<jnewbery> goodnight all. Have fun!

05:08

<kallewoof> fanquake: why can't you bench it?

05:08

<kallewoof> jnewbery: night night

05:08

<RubenSomsen> hey guys, and goodnight John :)

05:09

<kallewoof> hi Ruben :)

05:09

<fanquake> I nuked my datadir, and need to redownload block data

05:09

<aj> i wonder how the performance changes depending on the layout of the blocks. i think you get different layouts depending on the network speed of the peers you do IBD from -- if you've got 1 peer that gives you a block per second (~10Mbps), and another that gives you a block per minute (~133kbps), then I think you'll tend to have sets of 60 out of order blocks (from the fast peer) when you're

05:09

<aj> expecting a block from the slow peer

05:10

<kallewoof> aj: does this PR impact that?

05:11

<aj> i think the PR's performance improvement should rely on the "60 out of order blocks" size matching the choice of memory size constant?

05:13

<kallewoof> aj: I don't think I follow. Can you point out file/line no?

05:13

<aj> no not really, maybe i'm way off

05:13

<kallewoof> i don't know :)

05:14

<kallewoof> to summarize what the PR does, IIUC, is to instead of reading every block one at a time, it reads only the block header, then skips over the block if it's not the desired one

05:15

<kcalvinalvin> Does it keep that skipped block in memory?

05:15

<kallewoof> No, I don't think so: https://github.com/bitcoin/bitcoin/blob/a50f956e304adc7c75428ec391f1a76a99c1ddde/src/streams.h#L797-L800

05:17

<aj> ah, the bits i'm thinking of are in the second change which was removed from this PR https://github.com/bitcoin/bitcoin/pull/16981#issuecomment-589791783

05:17

<meshcollider> aj: that's an interesting thought anyway

05:17

<kallewoof> aj: ahh, okay

05:18

<aj> https://github.com/bitcoin/bitcoin/pull/16981#issuecomment-542020115 they were getting an additional ~10% so almost noise by comparison

05:18

<kcalvinalvin> Little bit of a basic question but what data does the txindex=1 save vs when you do txindex=0?

05:19

<kallewoof> aj: I didn't realize there was a 'keep in memory' component. I only looked at this PR yesterday.

05:19

<aj> kallewoof: well there's not any more, so you're right!

05:19

<kallewoof> kcalvinalvin: I think it stores the block hash for each transaction only. Would have to check

05:21

<kallewoof> Any other general opinions on the concept / idea of this code change? Or if someone reviewed it and have questions about the code or such, we can go through that too.

05:23

<aj> kcalvinalvin: txindex is run as a separate process, that triggers off new blocks being accepted, and stores the index data in a separate ldb in .bitcoin/indexes/txindex

05:23

<aj> separate thread, not process i suppose

05:24

<kcalvinalvin> Is reindex also a separate thread?

05:25

<aj> reindex is updating the chain state which is the most important thing, so it's kind of the main thread?

05:26

<fanquake> and it's also using the loadblk thread

05:29

<kallewoof> I'm gonna steal a question from the IRC log, as I was actually unsure about this one myself: In which situations is reindex useful? When is it better to use reindex-chainstate?

05:31

<aj> https://bitcoin.stackexchange.com/questions/60709/when-should-i-use-reindex-chainstate-and-when-reindex -- only when you were pruning or you think your disk might be corrupted?

05:37

<kallewoof> so reindex recreates the index pointing out where in the blk files the blocks are located

05:37

<kallewoof> and also does what reindex-chainstate does.

05:39

<kallewoof> aj: seems you also do it when you enable txindex

05:42

<kallewoof> The notes say "Reindex uses the CBufferedFile stream, introduced in #1962, which has a buffer to allow “rewinding” the stream position to an earlier position without additional disk access." -- does that mean it keeps all the read data from the blk file in memory until it's destroyed?

05:45

<aj> i think it's a fixed size buffer, size declared as second arg to constructor, so 8MB currently

05:47

<aj> kallewoof: i'm pretty sure the new txindex code lets you enable txindex *without* doing a reindex? PR#13033

05:48

<kallewoof> ohh, didn't know that. nice.

05:51

<kallewoof> Does anyone have any other questions about the PR? I think these meetings are partially meant to mentor people through the review process, so don't hesitate to speak up even if you feel unsure of yourself.

05:51

<kallewoof> Me and AJ can keep going for days unless somebody stops us. :P

05:53

<fanquake> I'd be interested if someone on Linux could benchmark this with and without the thread prioritization from SCHED_BATCH, to see how much difference it actually makes

05:54

<fanquake> (unrelated to the changeset though)

05:54

<fanquake> macOS doesn't support SCHED_BATCH so I can't check easily.

05:54

<kallewoof> fanquake: I have a linux full node synced up. How do I toggle SCHED_BATCH, though?

05:54

<fanquake> Easiest way is probably just to comment out the call to ScheduleBatchPriority()

05:55

<fanquake> https://github.com/bitcoin/bitcoin/blob/master/src/init.cpp#L672

05:55

<kallewoof> Ahh, gotcha

05:56

<kallewoof> So if I get this right, the entire reindex part is in init.cpp's ThreadImport function. That while loop runs until it gets to the end and then it continues on with the rest.

05:59

<kallewoof> Yeah that seems to be the case. The author said he put an assert(0) after the "Reindexing finished" and used time to benchmark it. Guess I'll do that, with and without the ScheduleBatchPriority part.

06:02

<kallewoof> Anyway, unless people have more questions or opinions on the PR, I think we can end the meeting.

06:02

<meshcollider> Yeah it all seems sane to me and I don't have any specific questions, I'd like to test it tomorrow when I get back to uni though

06:03

<meshcollider> That seems the most useful thing to do

06:03

<fanquake> SGTM

06:03

<anditto> I don't remember last time I used -reindex & never thought about it, so for this particular PR I'm trying to learn as much as I can. It's also nice to have this meeting in an accessible timezone. Thanks everyone.

06:03

<kallewoof> meshcollider: cool, yeah i'm going to test it on my linux machine with/without the sched batch tweak

06:04

<kallewoof> I'm up for talking about more general bitcoin core related things, but maybe we need to switch to another channel (#bitcoin-core-dev?)

06:04

<meshcollider> Thanks for hosting btw kallewoof :)

06:05

<aj> +1

06:05

<fanquake> Thanks kallewoof

06:05

<kallewoof> my pleasure! I think I'm deviating a bit from the original concept, but I'll read up and do better in the future, if we do these again.

06:05

<anditto> Thanks kallewoof.

06:05

<akionak> Thanks kallewoof

06:06

<kallewoof> I know people were talkign about doing this & a more general bitcoin core related meeting (like the one that is happening tonight in #bitcoin-core-dev). Since we're a smaller crowd it may make sense to combine the two into one.

06:10

<coinsureNZ> thanks guys, didnt have much to add just eager to sit in on these discussions for a while and understand how PRs are evaluated. Its been informative and I appreciate being able to added in a workable time slot

06:10

<RubenSomsen> Thanks kalle :)

06:10

<kallewoof> coinsureNZ: Thanks for joining!

06:11

<kallewoof> #endmeeting