During IBD, prune as much as possible until we get close to where we will eventually keep blocks (validation)
Dec 29, 2021
The PR branch HEAD was 24f3936 at the time of this review club meeting.
When pruning is enabled in Bitcoin Core, data about old blocks is deleted to limit disk space usage.
Users can configure a pruning target with the -prune=<target in MB> argument, which defines how much disk space to use for block and undo data.
The minimum target is 550MB.
Bitcoin Core keeps a write buffer of UTXOs (aka dbcache).
If the buffer didn’t exist, creating a UTXO and deleting a UTXO would both cause a write operation to disk.
As UTXOs are often short lived, modifying the buffer is a lot faster than writes to disk.
Reading from the buffer is also cheaper than looking UTXOs up on disk.
The buffer is flushed to disk, for example, when it grows too large.
Depending on the buffer size, flushes can take a while.
Node operators can control the buffer size with the -dbcache=<size in MB> argument.
A larger buffer takes up more system memory but takes longer to fill and thus requires fewer flushes.
This speeds up the initial block download (IBD).
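The effect of the write buffer can be illustrated with a toy model. The sketch below is not Bitcoin Core's implementation: `ToyCoinsCache`, `run_workload`, and the entry-count limit are hypothetical simplifications (the real dbcache is limited by memory usage in MB, not by the number of entries). It only shows why UTXOs that are created and spent between two flushes never cause a disk write.

```python
class ToyCoinsCache:
    """Toy write buffer in front of a slow UTXO store (a plain dict here)."""

    def __init__(self, max_entries):
        self.max_entries = max_entries  # hypothetical stand-in for -dbcache
        self.disk = {}        # stands in for the on-disk UTXO database
        self.added = {}       # outputs created since the last flush
        self.spent = set()    # on-disk outputs spent since the last flush
        self.disk_writes = 0
        self.flushes = 0

    def _flush_if_full(self):
        if len(self.added) + len(self.spent) > self.max_entries:
            self.flush()

    def add(self, outpoint, value):
        self.added[outpoint] = value
        self._flush_if_full()

    def spend(self, outpoint):
        if outpoint in self.added:
            # Created and spent between two flushes: never reaches the disk.
            del self.added[outpoint]
        else:
            self.spent.add(outpoint)
        self._flush_if_full()

    def flush(self):
        for outpoint, value in self.added.items():
            self.disk[outpoint] = value
            self.disk_writes += 1
        for outpoint in self.spent:
            self.disk.pop(outpoint, None)
            self.disk_writes += 1
        self.added.clear()
        self.spent.clear()
        self.flushes += 1


def run_workload(max_entries, n_outputs=1000, lifetime=5):
    """Create n_outputs outputs; each one is spent `lifetime` steps later."""
    cache = ToyCoinsCache(max_entries)
    for i in range(n_outputs):
        cache.add(i, value=1)
        if i >= lifetime:
            cache.spend(i - lifetime)
    cache.flush()  # final flush, as at shutdown
    return cache


if __name__ == "__main__":
    print("no buffer:   ", run_workload(max_entries=0).disk_writes, "disk writes")
    print("large buffer:", run_workload(max_entries=10_000).disk_writes, "disk writes")
```

In this toy workload, the unbuffered run performs 1995 disk writes (1000 creations plus 995 deletions), while the buffered run writes only the 5 outputs that are still unspent at the end.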
Pruning forces a flush of the dbcache regardless of its memory usage.
As a result, on pruned nodes the maximum configured dbcache size is often not reached.
This PR changes the pruning behavior.
Previously, we’d prune just enough files for us to be able to continue the IBD.
We now aggressively prune all prunable files, which lets us continue with IBD without having to prune again too soon.
Fewer prunes also mean fewer dbcache flushes, potentially speeding up IBD for pruned nodes.
Higher dbcache sizes can be reached before the dbcache is flushed.
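To get a feeling for how much the number of prune events (and therefore forced dbcache flushes) can differ between the two strategies, here is a rough simulation. It is a sketch under strong assumptions, not the PR's code: every block is taken to be exactly 1 MB, undo data is ignored, block files hold 128 MB, the buffer is a fixed 17 MB, and files containing any of the most recent 288 blocks are never deleted. The function name `count_prune_events` and its parameters are hypothetical.

```python
def count_prune_events(total_blocks, target_mb, aggressive,
                       file_mb=128, keep_blocks=288, buffer_mb=17):
    """Count how often a toy node has to prune while syncing total_blocks."""
    files = []   # oldest first; each entry is [size_mb, highest_block_height]
    events = 0
    for height in range(total_blocks):
        # Start a new block file once the current one is full.
        if not files or files[-1][0] >= file_mb:
            files.append([0, height])
        files[-1][0] += 1        # assumption: every block is exactly 1 MB
        files[-1][1] = height

        usage = sum(size for size, _ in files)
        if usage + buffer_mb < target_mb:
            continue             # still below the prune target

        # A file is prunable only if all its blocks are old enough.
        last_prunable = height - keep_blocks
        pruned_any = False
        while files[0][1] <= last_prunable:
            if not aggressive and usage + buffer_mb < target_mb:
                break            # old behavior: stop once we are just below target
            usage -= files.pop(0)[0]
            pruned_any = True
        if pruned_any:
            events += 1          # each prune event also forces a dbcache flush
    return events


if __name__ == "__main__":
    for aggressive in (False, True):
        n = count_prune_events(10_000, target_mb=2000, aggressive=aggressive)
        print("aggressive" if aggressive else "just enough", "->", n, "prune events")
```

Under these assumptions, syncing 10,000 blocks with a 2000 MB target takes 63 prune events with the "just enough" strategy and 5 with the aggressive one. Real numbers will differ, since block sizes vary and undo data also counts towards disk usage.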
PR #12404 attempted aggressive pruning too, but was closed in favor of PR #11658.
PR #11658 added 10% of the prune target to the
This is being overwritten by PR #20827.
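The buffer arithmetic can be written down as a small worked example. The figures are the ones quoted in the meeting log below (a 16 MB block file chunk plus a 1 MB undo file chunk, 10% of the prune target during IBD before this PR, and 1 MB per remaining block with this PR). The function and constant names are hypothetical, and any capping or rounding the real code may apply is not modeled.

```python
BLOCKFILE_CHUNK_MB = 16      # pre-allocation chunk for blk*.dat, per the meeting log
UNDOFILE_CHUNK_MB = 1        # pre-allocation chunk for rev*.dat, per the meeting log
AVERAGE_BLOCK_SIZE_MB = 1    # assumption made by the PR


def buffer_before_pr(prune_target_mb, in_ibd):
    """Buffer in MB as described for PR #11658: +10% of the target during IBD."""
    buffer_mb = BLOCKFILE_CHUNK_MB + UNDOFILE_CHUNK_MB
    if in_ibd:
        buffer_mb += prune_target_mb // 10
    return buffer_mb


def buffer_with_pr(blocks_left_to_sync):
    """Buffer in MB as described for PR #20827: room for all remaining blocks."""
    base_mb = BLOCKFILE_CHUNK_MB + UNDOFILE_CHUNK_MB
    return base_mb + blocks_left_to_sync * AVERAGE_BLOCK_SIZE_MB


if __name__ == "__main__":
    print(buffer_before_pr(550, in_ibd=True))   # minimum prune target during IBD
    print(buffer_with_pr(100))                  # 100 blocks left to sync
```

For the minimum prune target of 550 MB, the first call gives 17 + 55 = 72 MB, matching the figure worked out in the meeting.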
What does this PR do? What is the goal of this PR?
Where in the code do we check if we need to prune old block data? (hint: look for usages of the
What is removed during pruning and under which conditions? What is not pruned?
What is the variable
being used for? How large is the buffer (in MB)?
The PR assumes 1MB for average_block_size. How accurate does this assumption have to be?
The PR description mentions IBD speed improvements for pruned nodes. What can we measure to benchmark the improvement? With which prune targets and dbcache sizes should we test?
Edge case: Is aggressively pruning during IBD a problem if there are longer forks in the chain?
1 18:00 <b10c> #startmeeting
2 18:00 <b10c> Welcome to the last Bitcoin Core review club meeting of 2021!
3 18:00 <b10c> Feel free to say hi!
4 18:00 <shapleigh1842> hi!
8 18:01 <michaelfolkson> hi
9 18:01 <b10c> anyone got a chance to have a look at this over the holidays?
10 18:02 <scavr> yes read the notes and had a look at the changes
12 18:03 <michaelfolkson> Yup read the notes too
13 18:04 <b10c> cool! the diff is only a few lines, this one is more about understanding how pruning in Bitcoin Core works. Let's dive right in with the questions, but feel free to ask questions any time!
14 18:04 <b10c> What does this PR do? What is the goal of this PR?
15 18:05 <svav> I read the notes too ...
16 18:05 <scavr> The goal is to optimize the pruning strategy during initial block download
17 18:06 <b10c> svav: was everything clear? any questions?
18 18:06 <michaelfolkson> More aggressive pruning to speed up IBD for a pruned node
19 18:06 <svav> Why was this PR felt necessary?
20 18:07 <b10c> scavr michaelfolkson: correct! What do we prune now that we previously didn't prune?
21 18:07 <michaelfolkson> IBD speedups are always good. I was more unsure why there wasn't already aggressive pruning in the original code
22 18:07 <svav> I would say if the IBD acronym is used, it should be defined once as Initial Block Download for clarity for newbies
23 18:08 <shapleigh1842> ^^yes I just figured this acronym out
24 18:08 <michaelfolkson> ^
25 18:09 <b10c> svav: performance improvements are always welcome. This helps people running Bitcoin Core on lower end hardware. e.g. Raspberry Pi's
26 18:09 <michaelfolkson> b10c: Just more blocks right? blk.dat and rev.dat files
27 18:09 <svav> b1c: and do we know how significant a performance increase this gives?
28 18:09 <shapleigh1842> context Q: during an IBD on a "pruned" node, does the node still download the entire blockchain, albeit verifying and pruning as it goes?
29 18:10 <sipa> IBD is no different from normal synchronization really. It just changes some heuristics/policies.
30 18:10 <sipa> All blocks are still downloaded and verified the same, whether IBD or not.
31 18:11 <michaelfolkson> Random q: Are rev.dat files what undo.dat files used to be called?
32 18:11 <shapleigh1842> sipa: thank you.
33 18:12 <b10c> michaelfolkson: yes! we just free up more space once we decide to prune
34 18:12 <sipa> @michaelfolkson I can't remember Bitcoin Core ever having had an undo.dat file.
35 18:13 <b10c> I think it's called undo data in the code and the files are called rev*.dat (?)
36 18:13 <sipa> Yeah, rev*.dat files contain undo data.
38 18:14 <b10c> hm this should probably be rev*.dat files, not sure if this got changed at some point
39 18:15 <b10c> Next question: Where in the code do we check if we need to prune old block data?
40 18:15 <sipa> That's a typo I think.
41 18:15 <sipa> In what branch do you see that?
42 18:15 <michaelfolkson> Master
43 18:16 <svav> validation.cpp
44 18:16 <scavr> we check it in CChainState::FlushStateToDisk before we check if we need to flush the dbcache
45 18:17 <shapleigh1842> so just browsing this diff [and I'm sure I could look this up in a readme] it looks like the bitcoin codebase standard is to only provide parameter comments for [out] parameters? (i.e. no comments required for normal params or return?)
46 18:18 <sipa> michaelfolkson: It was introduced in commit f9ec3f0fadb11ee9889af977e16915f5d6e01944 in 2015, which introduced pruning in the first place. Even then the files were called rev*.dat.
47 18:19 <b10c> svav scavr: correct! I somehow assumed it would it's done when connecting a new block, but I guess we call FlushStateToDisk often enough (but don't actually flush the cache)
48 18:20 <b10c> sipa: maybe they were called undo*.dat in a first interation of the pruning feature, but got renamed during development
49 18:20 <michaelfolkson> sipa: So a typo then. I'll open a PR to correct (or someone new can)
50 18:21 <sipa> No, because the rev*.dat files predate pruning. I introduced the concept of rev*.dat files ;)
51 18:21 <b10c> sipa: oh, I see :D
52 18:21 <b10c> next question: Under which conditions do we prune and what is removed? What is not pruned?
53 18:22 <scavr> we prune once disk_usage + buffer >= prune target
54 18:23 <svav> Pruning will never delete a block within a defined distance (currently 288) from the active chain's tip.
55 18:23 <scavr> and we stop once that's no longer the case
56 18:23 <b10c> scavr svav: correct!
57 18:24 <b10c> svav: to be clear, we don't delete a block file with a block 288 blocks from tip
58 18:25 <b10c> rev*.dat files are also pruned
60 18:25 <michaelfolkson> The block index isn't deleted?
61 18:26 <sipa> No, only the blocks.
62 18:26 <sipa> We don't want to forget about old blocks, just their contents is forgotten.
63 18:27 <b10c> There are flags in the index that indicate if we HAVE_BLOCK_DATA and HAVE_UNDO_DATA
64 18:27 <b10c> I guess they are set to false when we prune?
65 18:28 <sipa> Yeah, I believe so.
66 18:28 <b10c> What is the variable nBuffer in BlockManager::FindFilesToPrune() being used for? How large is the buffer (in MB)?
67 18:29 <sipa> In the very first commit that added undo files they were called "<HEIGHT>.und", actually: 8adf48dc9b45816793c7b98e2f4fa625c2e09f2c.
68 18:29 <michaelfolkson> I think of blocks as transactions (rather than UTXO diffs) and deleting transactions but this is deleting from a UTXO database right? Effectively spent txo
69 18:29 <sipa> No, pruning is unrelated to the UTXO set.
70 18:30 <scavr> the nBuffer and the current disk usage are summed up when checking if the prune target has been reached
71 18:30 <sipa> The UTXO set is already UTXO: it only contains spent outputs already.
72 18:30 <sipa> unspent, sorry
73 18:31 <b10c> another acronym worth mentioning: UTXO = unspent transaction output
74 18:31 <sipa> Pruning is literally deleting the block files (blk*.dat) and undo files (rev*.dat) from disk, nothing more. It does not touch the UTXO set, and doesn't delete anything from any database.
75 18:31 <michaelfolkson> Ok thanks
76 18:31 <sipa> (apart from marking the pruned blocks as pruned in the database).
77 18:31 <b10c> scavr: do you know how big the buffer is?
78 18:32 <scavr> it starts as 17 MB (16MB block chunk size + 1 MB undo chunk size)
79 18:34 <b10c> correct! we pre-allocate the files in chunks so we want to keep this as a buffer
80 18:35 <scavr> when in IBD we add 10% of the prune target to the buffer
81 18:35 <scavr> so 17MB + 55MB = 72MB?
82 18:35 <scavr> with a prune target of 550MB
83 18:36 <michaelfolkson> 550 being the minimum
84 18:36 <scavr> this is similar to PR 20827 also an optimization, right?
85 18:36 <b10c> yes! that's my understanding too. why?
86 18:37 <b10c> yep, we leave a bit bigger buffer to have to flush too soon again (causing another dbcache flush)
87 18:38 <b10c> With 20827 the buffer will be 17 MB + `number of blocks left to sync` * 1 MB
88 18:39 <michaelfolkson> The downside of having a big dbcache is that when it fills it takes longer to flush so time trade-offs I'm guessing. Saves time overall as infrequent flushing
89 18:39 <sipa> That point is that any pruning involves flushing, whether it's deleting a small or large amount of blocks.
90 18:40 <b10c> michaelfolkson: yes, with a large dbcache you could do IBD without ever flushing
91 18:40 <sipa> And frequent flushing is what kills IBD performance.
92 18:40 <sipa> So by deleting more when we prune, the pruning (and thus flushing) operation needs to be done less frequently.
93 18:40 <michaelfolkson> Frequent flushing of small dbcaches is a lot worse than infrequent flushing of big dbcaches, right
94 18:41 <sipa> The size isn't relevant. If you flush frequently, the dbcache just won't grow big.
95 18:41 <b10c> not really about dbcache size in this PR, more about number of flushes we do
96 18:41 <sipa> Speed is gained by having lots of things cached. When you flush, the cahce is empty.
97 18:42 <sipa> These two are related: the longer you go without flushing, the bigger the database cache grows.
98 18:42 <sipa> The dbcache parameter sets when you have to forcefully flush because otherwise the cache size would exceed the dbcache parameter.
99 18:42 <b10c> The PR assumes 1MB for average_block_size. How accurate does this assumption have to be?
100 18:44 <michaelfolkson> b10c: Presumably not very accurate :)
101 18:44 <scavr> i don't know but guess not too accurate as may blocks are >1MB
102 18:44 <michaelfolkson> The average for the first few years must have been much lower than 1MB
103 18:44 <michaelfolkson> The start of IBD
104 18:45 <michaelfolkson> Today's blocks are consistently much bigger than 1MB, 1.6MB average?
105 18:46 <Kaizen_Kintsugi_> I'm surprised at this as-well, intuitively with my limited knowledge, I come to the conclusion that the average block size could be computed.
106 18:46 <svav> b10c: Were people having performance problems that made this PR necessary? When is it hoped that this PR will be implemented?
107 18:47 <b10c> agree with both of you, yes. It doen't matter for the early blocks and it's an OK assumption for the later blocks. Might leave us with one more or one fewer set of blk/rev dat files when IBD is done
108 18:47 <sipa> I mean, what counts as "performance problem"? Faster is always nicer, no?
109 18:47 <michaelfolkson> I think if the estimate was totally wrong e.g. 0.1 MB or 6 MB it would be a problem
110 18:47 <sipa> And certainly on some hardware IBD is painfully slow... hours to days to weeks sometimes
111 18:47 <b10c> It is totally wrong for regtest, signet and testnet
112 18:48 <michaelfolkson> Hmm but on the high side right
113 18:48 <michaelfolkson> So just a massive underestimate would be a problem?
114 18:49 <b10c> Kaizen: computing would be possible for sure, but this would probably be to complex here
115 18:49 <michaelfolkson> So 0.1 MB would be a problem but 6MB wouldn't be a problem?
116 18:50 <michaelfolkson> Just less efficient
117 18:50 <b10c> e.g. on testnet you could finish IBD with only one pair of blk/rev files left when we prune just before we catch up with the tip
118 18:51 <b10c> as we assume the next blocks will all be 1MB each and make space for them
119 18:52 <michaelfolkson> So the pruning is too aggressive for testnet
120 18:52 <b10c> could maybe even be a problem if there is a big reorg on testnet as we can't reverse to a previous UTXO set state?
121 18:52 <b10c> michaelfolkson: yes, maybe. I need to think about this a bit more
122 18:53 <sipa> During IBD there shouldn't be reorgs in the first place.
123 18:53 <sipa> Which is presumably the justification why more aggressive pruning is acceptable there.
124 18:53 <b10c> I mean after
125 18:53 <sipa> Oh, oops.
126 18:54 <b10c> assume we just finished IBD with only a few blocks+undo data left on disk
127 18:54 <b10c> (you anwered question 7 :) )
128 18:55 <b10c> we do a headers first sync, so we don't download and blocks from stale chains
129 18:56 <b10c> (that's the answer to questions 7, let's do question 6 now):
130 18:56 <b10c> The PR description mentions IBD speed improvements for pruned nodes. What can we measure to benchmark the improvement? With which prune targets and dbcache sizes should we test?
131 18:56 <michaelfolkson> A non-zero very low probability of having to deal with re-orgs during IBD. Even lower probability with headers first sync
132 18:57 <michaelfolkson> (if someone provides wrong headers)
133 18:57 <scavr> ohh I didnt know about headers first sync. you mean once we have the longest chain we ignore forks in IBD?
135 18:58 <sipa> even with headers-first sync we download blocks simultaneously with headers
136 18:58 <sipa> but we only fetch blocks along the path of what we currently know to be the best headers chain
137 18:59 <sipa> and i'm not sure "probability" is the issue to discuss here; you can't *randomly* end up with an invalid headers chain if your peers are honest
138 18:59 <sipa> if you peers are dishonest however, it is possible, but that's a question of attack cost, not probability
139 18:59 <svav> To benchmark the improvement, can we measure IBD download time?
140 19:00 <michaelfolkson> The header has the PoW in it.. so massive attack cost :)
141 19:00 <b10c> svav: yes!
142 19:00 <shapleigh1842> We should measure / benchmark IBD sync time with low cost hardware and default dbcache and other settings
143 19:01 <b10c> yup, preferably with different prune targets (more important) and dbcache sizes
144 19:01 <shapleigh1842> yeah, well def with the minimum 550
145 19:02 <b10c> I'd assume this has a bigger effect for people pruning with larger prune targets
146 19:03 <b10c> since you still flush quite often with the 550 prune target, but if you can download 10GB and only need to flush (for pruning) once, that's a lot better than before
147 19:03 <sipa> It should mostly matter for people with large prune target and large dbcache, I think.
148 19:03 <b10c> Ok, time to wrap up! Thanks for joining, I wish everyone a happy new year.
149 19:04 <Kaizen_Kintsugi_> Thanks for hosting!
150 19:04 <b10c> #endmeeting
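The pruning conditions discussed in the meeting (prune once disk usage plus buffer reaches the prune target, stop once that is no longer the case, and never delete a file containing a block within 288 blocks of the tip) can be summarized in a short sketch. This is an illustration of the answers given above, not the code of `FindFilesToPrune()`; `ToyBlockFile` and `files_to_prune` are hypothetical names, and each file's size is assumed to already include its undo data.

```python
from dataclasses import dataclass

MIN_BLOCKS_TO_KEEP = 288   # distance from the tip mentioned in the meeting


@dataclass
class ToyBlockFile:
    number: int            # file index, as in blk<number>.dat / rev<number>.dat
    size_mb: int           # assumption: block data plus undo data
    highest_block: int     # height of the newest block stored in this file


def files_to_prune(files, tip_height, prune_target_mb, buffer_mb):
    """Return the numbers of the files that would be deleted, oldest first."""
    usage = sum(f.size_mb for f in files)
    last_prunable = tip_height - MIN_BLOCKS_TO_KEEP
    selected = []
    for f in sorted(files, key=lambda item: item.number):
        if usage + buffer_mb < prune_target_mb:
            break                      # below target again: stop pruning
        if f.highest_block > last_prunable:
            continue                   # holds a block too close to the tip
        selected.append(f.number)
        usage -= f.size_mb
    return selected


if __name__ == "__main__":
    files = [ToyBlockFile(0, 134, 1000), ToyBlockFile(1, 134, 2000),
             ToyBlockFile(2, 134, 2900), ToyBlockFile(3, 134, 3000)]
    print(files_to_prune(files, tip_height=3000, prune_target_mb=550, buffer_mb=17))
```

Note that the block index is untouched in this model as well: only whole files are selected for deletion, mirroring the point made in the log that pruning removes blk*.dat and rev*.dat files and nothing else.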