#15921 validation: Tidy up ValidationState interface (validation)

https://github.com/bitcoin/bitcoin/15921

Host: jnewbery

Happy half-birthday to PR review club 🎉

Notes

  • This is a follow-up to PR 15141, which rewrote how DoS protection was handled. Previously, validation would assign DoS scores if a transaction or block was rejected, and pass the score back to net processing inside a CValidationState object. Net processing could then disconnect or ban the peer that provided the transaction or block. PR 15141 moved DoS handling responsibility to net processing.
  • Net processing (and other components like the wallet/RPC) pass new inventory (transactions, blocks and headers) to validation through different function calls, such as:
  • Both ATMP and PNBH have a CValidationState& argument. The function returns details about the validation of the transaction or headers in that state object. PNB doesn’t currently take a state argument, but instead receives state back from PNB through the CValidationInterface::BlockChecked() callback. PR 16279 (covered in a previous PR review club) would add that state argument to PNB.
  • The CValidationState object was introduced in PR 2224, including DoS score tracking.
  • This PR tidies up some loose ends left by PR 15141. Most notably it adds BlockValidationState and TxValidationState as derived classes of ValidationState, since the reasons that a block and transaction can be rejected are different.
  • The important changes in this PR are in src/consensus/validation.h. Most of the other changes are updating the call sites to use the changed objects.
  • This PR is much easier to review if you look at the commits individually, especially if you use ryanofsky’s review tip in the PR description.

Questions

  1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK? (Don’t forget to put your PR review on GitHub.)

  2. What steps did you take, beyond reading the code?

  3. Does this PR change any behaviour, or is it a pure refactor?

  4. Why do you think we have different validation results for RECENT_CONSENSUS_CHANGE and CONSENSUS?

  5. Prepare one of the following for the review club:

    • a question of your own about the PR or code;
    • an observation you made from reviewing the PR.

Meeting Log

13:00 <@jnewbery> hi!
13:00 < kanzure> hi
13:00 < diogoser1io> hello!
13:00 < amiti> hi
13:00 < fjahr> hi
13:00 < lightlike> hi
13:01 <@jnewbery> Today's PR is 15921. Notes and questions are in the usual place: https://bitcoincore.reviews/15921.html
13:01 <@jnewbery> Thanks to everyone who reviewed/tested/left comments!
13:01 <@jnewbery> the PR was merged today, but that doesn't mean you shouldn't review it
13:01 < schmidty> hola
13:02 <@jnewbery> In my mind, there are (at least) three reasons to review PRs:
13:02 <@jnewbery> 1. Make suggestions to the PR author on how to improve the PR
13:02 <@jnewbery> 2. Leave your ACK if you think the code should be merged or your NACK if you think it shouldn't be
13:02 <@jnewbery> 3. Learn about the code
13:03 <@jnewbery> All three are still valuable after the PR has been merged
13:03 <@jnewbery> (1) can lead to follow-up PRs that can improve the code further, for (2) you might still find a bug that other reviewers have missed, and we should all be doing (3) all the time!
13:04 <@jnewbery> ok, any initial thoughts about the PR?
13:04 <@jnewbery> (as always, question 1 is Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK? (Don’t forget to put your PR review on GitHub.)
13:05 < amiti> I thought it was interesting to review. Since it touched all the sites that care about block / txn validation, it exposed me to new parts of the codebase
13:06 < fjahr> Yeah, I acked since it makes a lot of sense to me to separate Tx and Block validation
13:06 <@jnewbery> amiti: good! The PR title says this is a tidy-up, but I thought it might be interesting because it touches the net_processing-validation interface, which I think are the two most interesting components in Bitcoin Core
13:07 <@jnewbery> fjahr: good
13:07 <@jnewbery> ok, for those that reviewed: what steps did you take to review/test?
13:08 < ajonas> Going through each commit was a must on this one. And ryanofsky's suggestion was very helpful on the first commit.
13:09 < amiti> I mostly just read the code one commit at a time & tried to understand the changes, and poke around the larger codebase to get context for them
13:10 < lightlike> just looked at each line of code that was changed and tried to understand it (saw the hint by ryanofski too late)
13:10 <@jnewbery> yeah, ryanofsky's reviewer tip was very helpful
13:10 < michaelfolkson> Agreed <ajonas> I'd like more of those reviewer tips
13:10 <@jnewbery> lightlike: perhaps I should have added the tip to the commit log?
13:11 <@jnewbery> next question: Does this PR change any behaviour, or is it a pure refactor?
13:11 < lightlike> jnewbery: no, it was clear enough in the first post of the PR, just my fault...
13:13 < michaelfolkson> #15141 changed behavior but this one (#15921) didn't right?
13:13 < fjahr> I did not see any behaviour change. It was consensus code after all :)
13:13 <@jnewbery> Did anyone here review the previous PR (15141)? If not, did any of you go back and look at it?
13:14 < lightlike> it does change an error message of sendtransaction rpc call in case of missing inputs, other than that I didn't see anything.
13:14 <@jnewbery> lightlike: good catch!
13:14 < michaelfolkson> I looked at #15141. I didn't "review" as already merged
13:14 < amiti> I def took a look at 15141 in the context of this PR, but didnt dig very deep
13:14 <@jnewbery> yes, some minor logging/error message change
13:15 <@jnewbery> michaelfolkson: yes, 15141 has some minor behaviour changes in the way we ban/disconnect peers
13:17 < michaelfolkson> That punishment logic was interesting
13:17 < michaelfolkson> But leave that to later, let's stay on #15921 for now ;)
13:18 <@jnewbery> michaelfolkson: yeah, the disconnect logic is very interesting
13:18 <@jnewbery> which leads us into the next question: Why do you think we have different validation results for RECENT_CONSENSUS_CHANGE and CONSENSUS?
13:19 < michaelfolkson> I initially thought it was SegWit related but it actually wasn't
13:19 <@jnewbery> well segwit was the last consensus change
13:19 < michaelfolkson> For a future soft fork/consensus change right?
13:19 < fjahr> I found that RECENT_CONSENSUS_CHANGE relates to everything after segwit
13:19 < fjahr> so it has not been used yet
13:20 < amiti> I didn't understand why thats already been introduced if its not used yet
13:20 < fjahr> but it would be used to prevent a potential network split
13:20 < amiti> fjahr: how so?
13:21 < lightlike> amiti: i had the same question, if it is never used I don't understand how introducing it now can help us in the future.
13:21 <@jnewbery> What do we do to peers that send us txs/blocks that are invalid because of CONSENSUS? What do we do if they're invalid for RECENT_CONSENSUS_CHANGE?
13:22 <@jnewbery> amiti: lightlike: right, it's not currently used, but I think the reasoning behind having it as a validation reason is interesting
13:22 < fjahr> I guess a new change would be flagged with this and then this would be used for a conditional to manage behaviour differently. So we don't ban peers who have not upgraded to segwit yet.
13:22 <@jnewbery> There's a clue here: https://github.com/bitcoin/bitcoin/blob/a6abc94e9307ea05972ef69732bb148acbfa870a/src/validation.cpp#L1544-L1548
13:23 <@jnewbery> fjahr: yes, that's right (if you replace the word 'segwit' with 'current consensus rules')
13:24 <@jnewbery> Does that make sense to everyone?
13:24 <@jnewbery> does anyone have the answer to this:
13:24 <@jnewbery> > What do we do to peers that send us txs/blocks that are invalid because of CONSENSUS? What do we do if they're invalid for RECENT_CONSENSUS_CHANGE?
13:25 < amiti> https://github.com/bitcoin/bitcoin/blob/master/src/net_processing.cpp#L1004
13:25 < fjahr> ah, got it, so segwit is just the point after which this was introduced but there was nothing about segwit that made this necessary in particular.
13:25 <@jnewbery> amiti: you've found the right line. What's it doing?
13:25 < amiti> RECENT_CONSENSUS_CHANGE we just break, but CONSENSUS we invoke misbehaving
13:25 < fjahr> 1. ban 2. nothing
13:26 <@jnewbery> and what does that mean we do the peer?
13:26 <@jnewbery> fjahr: more or less. I think we just disconnect not ban for (1)
13:26 < amiti> do we necessarily ban? or do we just increase the score?
13:27 < amiti> oh, the threshold is 100
13:27 < amiti> and we increase score by 100
13:27 <@jnewbery> no, you're right it's a ban
13:27 <@jnewbery> amiti: right. It's instaban
13:27 < amiti> haha. instaban.
13:28 <@jnewbery> the DOS scores are weird and were somewhar arbitrary until 15141
13:28 < lightlike> i understand it only in parts: Couldn't it be introduced just as well together with the next consensus change instead of now? Has it been implemented now so that future consensus changes don't forget about the idea?
13:29 <@jnewbery> lightlike: Yes, the code isn't exercised now so it could just be added with the next consensus change
13:29 <@jnewbery> I think the best reason for having the code in place now is so future consensus changes don't forget the idea
13:30 <@jnewbery> I think in segwit this was almost forgotten (but that was before i was active in Bitcoin Core so I'm certain about that)
13:30 < michaelfolkson> That would not have been ideal for network stability
13:30 <@jnewbery> the risk is that some portion of the network upgrades, the softfork is activated, and then someone sends a transaction that is invalid under the new rules to nodes that haven't upgraded yet
13:31 <@jnewbery> those old nodes relay the transaction, but if they relay it to an upgraded node then that upgraded node will reject it
13:31 <@jnewbery> we don't want the upgraded nodes to disconnect/ban the unupgraded nodes. That would cause a network split
13:32 <@jnewbery> so we want to treat txs/blocks that fail validation for RECENT_CONSENSUS_CHANGE differently than those that fail for long-standing CONSENSUS rules
13:32 <@jnewbery> (nothing about that changed in this PR, but I thought it was a fun sidetrack)
13:32 < amiti> ok so thats the reason for having the distinction between recent & existing, but the reason that code is _already_ introduced is so its not forgotten about during implementation?
13:32 <@jnewbery> amiti: yes
13:33 < amiti> gotcha. thanks
13:33 < michaelfolkson> I'm wondering if there are better approaches than putting these "placeholders" in
13:33 <@jnewbery> My final question was for all of you to either prepare a question or share an observation you made while reviewing the PR. Anyone have anything?
13:34 < fjahr> What did you think of my idea to move the punishment logic out?
13:34 <@jnewbery> oh, here's some discussion of RECENT_CONSENSUS_CHANGE: https://github.com/bitcoin/bitcoin/pull/15141#discussion_r247736063
13:35 < fjahr> I saw from other PRs that it is not necessarily a goal to reduce net_processing size but I am still interested
13:35 <@jnewbery> fjahr: I'm not sure it achieves much. peer punishment logic is so tied to net processing logic that separating them into separate translation units doesn't seem like an improvement
13:36 < amiti> jnewbery: I have a question about your git workflow- I noticed you pushed fixup commits in response to reviewer comments, and then squashed & force pushed afterwards. I was wondering if there was a specific reason you did that (vs squash and force push initially)... potentially to make it easier for reviewers to see the recent changes?
13:36 < fjahr> jnewbery: ok, thanks
13:36 <@jnewbery> peer punishment logic wouldn't be required by any other component, which is one reason you'd want to separate them
13:38 <@jnewbery> amiti: yeah, that's it. Just so those reviewers could very quickly see what changes I'd made in response to their comments. Lightlike suggested a much better way to do what he'd asked for, so I was able to throw away my commit and use his suggested method
13:38 <@jnewbery> as soon as those reviewers were happy I just squashed the changes into the relevant commits and force pushed. Verifying that a force-push doesn't change the final code is very quick
13:40 <@jnewbery> if I'd squashed and force-pushed initially and then lightlike had made his suggestion, I'd have wasted a bunch of time changing the initial commit and then changing it back and potentially having to resolve merge conflicts
13:40 <@jnewbery> any other comments/questions?
13:40 < amiti> ah I see. ok thanks
13:40 < michaelfolkson> If nobody else has anything I wouldn't mind chatting more about the punishment logic
13:41 <@jnewbery> michaelfolkson: just ask. No need to ask to ask :)
13:41 < michaelfolkson> So I'm assuming there is no way to trick a honest peer into forwarding a consensus invalid transaction
13:42 < michaelfolkson> I'm wondering if this scoring system is optimal and whether it can be gamed
13:42 <@jnewbery> michaelfolkson: not if the tx fails validation on your node
13:42 < michaelfolkson> Instaban seems pretty confident the peer is dishonest
13:43 <@jnewbery> of course if your node is old or isn't validating the same consensus rules as everyone else and an adversary knows that, then they can send you a tx that isn't valid according to the rest of the network and is valid for you. You'd try to relay that one
13:44 < michaelfolkson> What about the weaker punishments? Discussed in https://github.com/bitcoin/bitcoin/pull/15141
13:44 <@jnewbery> what do you mean by weaker punishments?
13:45 < lightlike> i think if you send two version messages, you just a get score of 10
13:45 < lightlike> for examlpe
13:45 < michaelfolkson> Oh a lot of them have moved to max punishment (100)
13:46 <@jnewbery> lightlike: it's actually 1
13:46 <@jnewbery> https://github.com/bitcoin/bitcoin/blob/a6abc94e9307ea05972ef69732bb148acbfa870a/src/net_processing.cpp#L1912
13:46 < sdaftuar> an important idea relating to tricking a peer into forwarding a consensus invalid transaction happens with softfork deployment
13:46 < michaelfolkson> Some of them were previously 10/50 points
13:46 < michaelfolkson> But now full ban
13:47 <@jnewbery> which is incredibly arbitrary. We'll tolerate a peer sending us 99 duplicate VERSION messages, but ban them on the 100th
13:47 < sdaftuar> this is why we strive to ensure that features that transactions which will obey new consensus rules are policy-invalid (though obviously not consensus invalid) to old peers
13:47 < sdaftuar> s/that features//
13:48 < michaelfolkson> Interesting
13:48 <@jnewbery> right. We want consensus changes to be policy-invalid for some length of time before they become consensus-invalid
13:49 < michaelfolkson> And we can overwrite the ban if it is a peer that we trust?
13:49 <@jnewbery> note that we don't ban peers for sending us policy-invalid txs: https://github.com/bitcoin/bitcoin/blob/a6abc94e9307ea05972ef69732bb148acbfa870a/src/net_processing.cpp#L1073-L1079
13:49 <@jnewbery> michaelfolkson: take a look at whitelisting. We'll never bad/disconnect a whitelisted peer
13:49 < sdaftuar> jnewbery: this gets to the heart of the CONSENSUS_CHANGE vs RECENT_CONSENSUS_CHANGE debate
13:49 < michaelfolkson> <jnewbery>: Yup thanks
13:50 <@jnewbery> Last 10 minutes. Any more questions/observations?
13:51 < michaelfolkson> Just to clarify on <fjahr>'s question. Really long files are ok as long as they don't contain parts that multiple components use?
13:52 < michaelfolkson> Length isn't a variable to consider?
13:53 <@jnewbery> michaelfolkson: it'd be nicer to have shorter source files, where code can be logically split up
13:54 < michaelfolkson> But low on the priority list right?
13:54 <@jnewbery> but in this case, the only places that call MaybeDisconnectPeerForX() are in net_processing, so it doesn't make sense to split it out
13:55 < michaelfolkson> Yeah it would only be for readability I suppose
13:55 < michaelfolkson> Ok thanks
13:55 <@jnewbery> any more questions?
13:57 <@jnewbery> ok, let's wrap it up there. Thanks everyone! And thanks sdaftuar for dropping in
13:57 < lightlike> thanks all!
13:57 <@jnewbery> I'll post next week's PR before this weekend
13:57 < michaelfolkson> Indeed, thanks everyone
13:57 < fjahr> thanks john! and all :)
13:57 < amiti> thanks !
13:58 < diogosergio> Thanks, it was good to follow this up!