Return the AcceptBlock CValidationState directly in ProcessNewBlock (
validation) Oct 16, 2019
is net processing’s entry point into validation, called whenever net
processing receives a new block. It can be called in
BLOCKTXN message is received and net processing
is ready to hand a block to the validation layer.
CheckBlock() to carry out non-contextual
validation tests on the block. If those checks pass, then
AcceptBlock(), which does more checks on the block
and then stores it to disk.
AcceptBlock() passes and the block is valid, then
is called to update the node’s view of the blockchain. If the new block has
more work than the node’s current tip, then
ActivateBestChain will try to
connect it to the chain and update the UTXO set. This can fail, for
example if a transaction in the block spends non-existent coins.
If a peer is sending us invalid blocks, then depending on the invalidity
reason, we may wish to disconnect that peer.
AcceptBlock() fails, then the failure reason
will be saved in a
CValidationState object and returned to
Prior to this PR,
ProcessNewBlock() would then call the
CValidationInterface. net processing would receive the
BlockChecked callback and then potentially disconnect the peer.
This PR changes
ProcessNewBlock so that if
fail, then it will return the
CValidationState directly to the caller, which
can then decide whether to ban the peer.
This PR is a step towards
16175, which makes
ProcessNewBlock() asynchronous, and would allow parallelization between net
processing and validation.
Did you review the PR?
Concept ACK, approach ACK, tested ACK, or
(Don’t forget to put your PR review on GitHub.)
What steps did you take, beyond reading the code?
Did anything about the way new blocks are checked and accepted surprise you?
Do you have any questions about the series of steps to add a block to the
The author describes this PR as
“a first step towards a series of cleanups
leading to a (more-complete) #16175. It’s just two rather-nice (IMO)
cleanups.” Do you agree? Is this just a cleanup PR or are there behaviour
Are there any other
CValidationInterface clients that use the
The other place that the
BlockChecked method is called is in
ActivateBestChain(). Why does that need to be a
callback? Why does net processing need to keep a
mapBlockSource map from the
block to the peer that provided it?
6 13:01 <jnewbery> what did everyone think of this week's PR?
7 13:01 <sebastianvstaa> hi
9 13:01 <zenogais> Was interesting, small change that interacts with a lot of crucial systems.
10 13:01 <jnewbery> zenogais: yes! I agree
11 13:02 <jnewbery> very little functional change, but there's lots to dig into
13 13:02 <zenogais> Definitely caused me to explore parts of the codebase I'd never seen before.
14 13:02 <amiti> yeah agreed, not a ton changed but I learned/explored a lot to get the context
15 13:02 <jnewbery> we've touched on the CValidationInterface in a few previous reviews, but I think this is the first time we've looked at changes in validation.cpp
17 13:03 <amiti> very context heavy, as many of the conversations revealed
19 13:03 <jnewbery> that's where the useful comments are
20 13:04 <jnewbery> you can see that there are 8 methods in the interface. Most of them are asynchronous, but a couple are synchronous. Can you see which ones?
21 13:05 <pinheadmz> well, like RegisterBackgroundSignalScheduler takes a callback as an arg?
22 13:06 <pinheadmz> so id guess that is async
23 13:06 <jnewbery> RegisterBackgroundSignalScheduler is adding the scheduler thread. I think it's only called at startup/initialization
24 13:06 <jnewbery> The methods are in the CValidationInterface() class
25 13:07 <jnewbery> UpdatedBlockTip, TransactionAddedToMempool, etc
27 13:07 <ariard> ChainStateFlushed and BlockChecked
28 13:08 <ariard> and NewPowValidBlock?
29 13:08 <fox2p> and NewPoWValidBlock?
31 13:08 <zenogais> So anything that doesn't call `AddToProcessQueue`?
32 13:08 <jnewbery> zenogais: correct
33 13:08 <ajonas> so not ChainStateFlushed then
34 13:09 <jnewbery> BlockChecked and NewPOWValidBlock are synchronous
35 13:09 <ariard> No ChainStateFlushed sorry
36 13:09 <jnewbery> you can see those two methods are making direct function calls. The other methods are all adding a lambda function to the queue
37 13:10 <jnewbery> the scheduler thread will come and service those functions in the background at some point.
38 13:10 <jnewbery> ok, so with that background, let's move onto the PR
39 13:10 <jnewbery> describe briefly what the PR is doing
40 13:11 <jnewbery> all of you... GO!
42 13:12 <pinheadmz> adds a new arg to ProcessNewBlock, a CValidationState
43 13:12 <pinheadmz> so every time you call PNB, you have to init a state first, pass it in, thenafter PNB returns, you can examine it for isValid()
44 13:12 <jnewbery> pinheadmz: right, and what's that CValidationState argument used for
45 13:12 <jnewbery> pinheadmz: yes
46 13:12 <pinheadmz> seems to just return validation errors and state
47 13:12 <pinheadmz> i was hoping the util tests would have a bad block in there and i could see a "bad" response
48 13:13 <pinheadmz> but I think all the tests return boring valid states
49 13:13 <zenogais> Validation state is used for DoS checks
50 13:13 <jnewbery> zenogais: correct. And where do we do those DoS checks?
51 13:13 <jnewbery> Which function handles invalid blocks?
52 13:14 <pinheadmz> is that checkblock and contextualcheckblock ?
53 13:14 <zenogais> InvalidBlockFound
54 13:14 <zenogais> and some others, let me check
55 13:15 <jnewbery> pinheadmz: I'm thinking more about on the net processing layer. Which function sees that we've received an invalid block and then handles whether to punish peers?
56 13:15 <zenogais> Ah and BlockChecked
57 13:15 <pinheadmz> BlockChecked()
58 13:15 <jnewbery> exactly
59 13:15 <pinheadmz> funny to name a function past-tense but ok i get it
60 13:16 <jnewbery> pinheadmz: the function is an implementation of that CValidationInterface method
61 13:16 <zenogais> which calls MaybePunishNode with the state given
62 13:16 <jnewbery> so that method is telling subscribers that 'a block has been checked'
65 13:18 <jnewbery> It's a member function of the PeerLogicValidation class
67 13:19 <jnewbery> so prior to this PR, we had net processing calling into validation (using ProcessNewBlock()), which then called directly into net processing (using the BlockChecked() validation interface method, which is a direct function call)
68 13:20 <jnewbery> so net processing -> validation -> net processing
69 13:20 <jnewbery> does that make sense? any questions about that?
70 13:20 <zenogais> Yep makes sense
71 13:20 <pinheadmz> cool.
73 13:21 <jnewbery> zenogais: exactly. Any client that wants to register with the validation interface needs to call that RegisterValidationInterface() function
74 13:22 <jnewbery> and how does this PR change that?
75 13:22 <jnewbery> (the net processing -> validation -> net processing stack)
76 13:23 <zenogais> Adds CConnman* as an argument
77 13:23 <pinheadmz> oh yeah i was a bit confused - what would be a 'client' in this sense?
78 13:23 <pinheadmz> are we actually talking about a process outside bitcoind? like ZMQ notificaitons ?
79 13:24 <jnewbery> pinheadmz: no, the clients are the components or classes that register with the validation interface. They're all internal to bitcoind
80 13:24 <ariard> like the index or wallet
81 13:24 <jnewbery> ariard: exactly
82 13:25 <jnewbery> other clients are the ZMQ component, which receives notifcations over the validation interface and then sends ZMQ notifications out
83 13:25 <ariard> and mining stack IIRC
84 13:25 <jnewbery> and PeerLogicValidation, which is net processing
85 13:26 <jnewbery> ariard: right. That uses the validation interface to check that a submitted block was valid
86 13:26 <zenogais> Adds an argument to ProcessNewBlock as well to bubble up the CValidationState
87 13:26 <jnewbery> I think that's all of them. Just look for classes which inherit CValidationInterface
88 13:26 <jnewbery> and then call RegisterValidationInterface()
89 13:27 <jnewbery> zenogais: right, so ProcessNewBlock is no longer calling BlockChecked directly
90 13:27 <jnewbery> which means validation is not making a direct functional call into net processing
91 13:28 <jnewbery> why is that important/interesting here?
92 13:28 <zenogais> Breaks the dependency there
93 13:29 <pinheadmz> The PR emphasizes pushing validation into a backgroun thread there as well
94 13:29 <pinheadmz> does it allow the process to be async then?
95 13:29 <amiti> easier to make async if its just a message being passed from net processing -> validation ?
96 13:29 <jnewbery> great! Yes, that's the motivation for this PR
98 13:31 <jnewbery> really, I think we want net processing to just be passing blocks/transactions up to validation, and being informed asynchronously of any validation events
99 13:31 <jnewbery> validation shouldn't be calling functions directly in net processing
101 13:32 <amiti> can you hash that out a bit further?
102 13:32 <amiti> why not?
103 13:32 <amiti> it mostly makes sense to me, but not entirely
104 13:32 <zenogais> My guess would be cleaner separation of concerns, and ability to run more work in parallel.
105 13:33 <jnewbery> zenogais: yeah, that's it
106 13:34 <jnewbery> if we want net processing and validation to run in parallel, we want net processing to be simply handing messages to validation and not have net processing and validation calling into each other
107 13:35 <ariard> we would be able to check validity of blocks in parallel
108 13:35 <amiti> ok ya. thanks
109 13:35 <zenogais> Also just cleaner to understand if its strict message passing rather than nested levels of function calls between components.
110 13:36 <jonatack> Yes, while avoiding reaching into internals across layers.
111 13:36 <jnewbery> Perhaps we can look at PR 16175 in a future review club meeting if people are interested
112 13:36 <jnewbery> to see where these changes are heading
114 13:37 <ariard> about #12934, the OG refactoring PR
115 13:37 <jnewbery> ariard: thanks! Nice find
116 13:37 <jnewbery> ok, onto questions
117 13:37 <jnewbery> What steps did you take, beyond reading the code?
118 13:38 <jnewbery> Did you test this? If so, how?
119 13:38 <pinheadmz> ran the tests, tried adding extra logging with printf to see what happens to dos_state
120 13:38 <zenogais> I ran all unit and functional tests and reviewed the existing DoS tests to look for blind spots.
121 13:38 <jnewbery> pinheadmz: zenogais: that's great!
122 13:38 <zenogais> It seems like most of what is easy to test is tested already
123 13:38 <jnewbery> did you find anything interesting?
124 13:39 <pinheadmz> not really, the validation_block_tests dont seem to test bad blocks
125 13:40 <pinheadmz> probably could have been better for me to log in CValidationState and then run the python tests
126 13:40 <jnewbery> interesting. I don't think I've looked at that file. A lot of the block failure modes are tested in the feature_blocks.py functional test
127 13:40 <pinheadmz> yeah, but the PR didn't touch those :-)
129 13:41 <zenogais> One interesting thing I found was `CheckForkWarningConditions`. Seems untested, but interesting system to have in place.
130 13:41 <jnewbery> ok, next question: Did anything about the way new blocks are checked and accepted surprise you? Do you have any questions about the series of steps to add a block to the block chain?
131 13:42 <pinheadmz> (heh - the link on that 16688.html page goes to the wrong PR...)
132 13:43 <jnewbery> pinheadmz: oops. Thanks - I'll fix after this meeting
133 13:43 <ariard> have a look on BlockStatus in src/chain.h
134 13:44 <ariard> we have different state according the validation work which has been done on a given Block
135 13:44 <zenogais> ah yeah, the BlockStatus stuff was interesting
136 13:44 <zenogais> still don't fully understandit
138 13:45 <jnewbery> (which Matt moved into its own commit today)
139 13:47 <jnewbery> BlockStatus is kind of weird. It seems to be partly treated as an enum and partly as a bitfield
140 13:48 <ariard> jnewbery: where it's treated as a bitfield?
141 13:48 <jnewbery> everything above the bottom three bits are treated as a bitfield, no?
142 13:48 <jnewbery> BLOCK_HAVE_DATA = 8, //!< full block available in blk*.dat
143 13:48 <jnewbery> BLOCK_HAVE_UNDO = 16, //!< undo data available in rev*.dat
145 13:49 <zenogais> The bottom three bits seem to be treated as a bitfield
146 13:49 <zenogais> BLOCK_VALID_MASK ors some of those bits together AFAICT
147 13:49 <ariard> Oh I see what you mean
148 13:49 <zenogais> it's odd to or BLOCK_VALID_TRANSACTIONS w/ BLOCK_VALID_RESERVED and BLOCK_VALID_TREE since some of them flip the same bits on
149 13:50 <jnewbery> the bottom three bits are used to store the validity
150 13:50 <jnewbery> the fourth bit is whether the block is available in a blk*.dat file
151 13:51 <ariard> well a OR of enum is still an enum
152 13:51 <jnewbery> BLOCK_VALID_MASK could just be defined as 7 (or 0b11100000...)
153 13:51 <jnewbery> ok, we've got 10 minutes left. Let's keep moving through the questions
154 13:51 <jnewbery> The author describes this PR as “a first step towards a series of cleanups leading to a (more-complete) #16175. It’s just two rather-nice (IMO) cleanups.” Do you agree? Is this just a cleanup PR or are there behaviour changes?
156 13:52 <jnewbery> ajonas: :)
157 13:52 <ariard> I agree there is a confusion between validity of block and its availability as data in BlockStatus
158 13:52 <zenogais> Hard to draw the line between cleanup and behavior change. I think some of the stuff pointed out with BlockStatus by jnewbery leads me to believe there's a behavior change.
159 13:53 <jnewbery> yeah, I found that change very difficult to review.
160 13:54 <jnewbery> before I put my ACK on a PR I want to understand exactly how behaviour might change, especially in net processing and validation where very tiny changes can have surprising and dangerous effects
161 13:55 <jnewbery> changing from `if (pindex->IsValid(BLOCK_VALID_TRANSACTIONS))` to `if (dos_state.IsValid())` seems reasonable, but it sent me off in lots of different directions trying to work out when BlockStatus is changed, what MarkBlockAsReceived() is doing exactly, etc
162 13:56 <zenogais> Yeah, I found I went down some pretty interesting rabbit holes there
163 13:56 <jnewbery> ok, a couple of minutes left. I had two other questions:
164 13:56 <jnewbery> Are there any other CValidationInterface clients that use the BlockChecked callback?
165 13:56 <jnewbery> The other place that the BlockChecked method is called is in ConnectTip() in ActivateBestChain(). Why does that need to be a callback? Why does net processing need to keep a mapBlockSource map from the block to the peer that provided it?
166 13:57 <jnewbery> I'm happy to extend this for a few extra minutes if everyone else is
167 13:57 <zenogais> submitblock_StateCatcher
168 13:57 <jnewbery> zenogais: nice. What's that used for?
169 13:58 <zenogais> I'm not exactly sure lol
170 13:58 <zenogais> It's in mining.cpp so clearly related to that
171 13:58 <zenogais> But hadn't had a chance to explore it
172 13:58 <jnewbery> right. It temporarily registers with the validation interface
173 13:58 <jnewbery> and then the block is submitted
174 13:59 <jnewbery> and it catches whether a BlockChecked notification was sent
175 13:59 <ariard> You need to keep a mapBlockSource to punish peer in case of invalidity ?
176 13:59 <jnewbery> ariard: yes!
177 13:59 <jnewbery> because we don't necessarily know whether a block was invalid until we try to connect it to the chain
178 13:59 <ariard> exactly
179 14:00 <jnewbery> and so at that point, if it is invalid (eg it contains a double-spend or invalid tx), we want to disconnect from the peer that sent it to us
180 14:00 <jnewbery> so mapBlockSource keeps a map from the block hash to the peer that sent it to us
181 14:01 <jnewbery> ok, we've overrun a bit, but does anyone have any final questions before we wrap it up?
183 14:02 <amiti> if fForceProcessing is set, then you force going through ProcessNewBlock again right
184 14:03 <amiti> cause it passes that value through to AcceptBlock
185 14:03 <jnewbery> yes, and I think it forces you to save to disk
186 14:04 <amiti> so, reasons you would go through ProcessNewBlock are 1. fNewBlock is true 2. you force it with fForceProcessing 3. pruning enabled & trying to re-download a block that was pruned
187 14:05 <jnewbery> fNewBlock I think is a return argument
188 14:05 <amiti> oh right
189 14:05 <amiti> ok so if you force it or pruned the block, you could repeat
190 14:06 <amiti> but otherwise you wouldnt ever repeat with the same hash
191 14:06 <zenogais> Have to head out. Thanks for hosting jnewberry!
192 14:07 <jnewbery> I think the comment is trying to say - we don't need to call ProcessNewBlock again if we receive a block with the same hash, as long as the state returned was valid and (it was fForceProcessing or a new block)
193 14:08 <jnewbery> the state.IsValid() means that the block hasn't been mutated - ie that the PoW is valid
195 14:09 <amiti> I think I get it
196 14:09 <jnewbery> it's a confusing comment!
197 14:09 <amiti> yeah and I was trying to fix it, but didn't actually understand what was trying to be said
198 14:09 <pinheadmz> ive been down this road before - bc a legacy node could send a block w no witness data, but itd have the same hash
199 14:10 <jnewbery> we're about 10 minutes over. Let's close it there.
200 14:10 <amiti> thank you :)
201 14:10 <jnewbery> Thanks everyone. Great discussion this week!
202 14:10 <pinheadmz> yeah thanks JN!!!!
204 14:11 <sebastianvstaa> thanks
205 14:11 <ajonas> thanks John