Tidy up ValidationState interface (validation)

https://github.com/bitcoin/bitcoin/pull/15921

Host: jnewbery  -  PR author: jnewbery

Happy half-birthday to PR review club 🎉

Notes

  • This is a follow-up to PR 15141, which rewrote how DoS protection was handled. Previously, validation would assign DoS scores if a transaction or block was rejected, and pass the score back to net processing inside a CValidationState object. Net processing could then disconnect or ban the peer that provided the transaction or block. PR 15141 moved DoS handling responsibility to net processing.
  • Net processing (and other components like the wallet/RPC) pass new inventory (transactions, blocks and headers) to validation through different function calls, such as:
  • Both ATMP and PNBH have a CValidationState& argument. The function returns details about the validation of the transaction or headers in that state object. PNB doesn’t currently take a state argument, but instead receives state back from PNB through the CValidationInterface::BlockChecked() callback. PR 16279 (covered in a previous PR review club) would add that state argument to PNB.
  • The CValidationState object was introduced in PR 2224, including DoS score tracking.
  • This PR tidies up some loose ends left by PR 15141. Most notably it adds BlockValidationState and TxValidationState as derived classes of ValidationState, since the reasons that a block and transaction can be rejected are different.
  • The important changes in this PR are in src/consensus/validation.h. Most of the other changes are updating the call sites to use the changed objects.
  • This PR is much easier to review if you look at the commits individually, especially if you use ryanofsky’s review tip in the PR description.

Questions

  1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK? (Don’t forget to put your PR review on GitHub.)

  2. What steps did you take, beyond reading the code?

  3. Does this PR change any behaviour, or is it a pure refactor?

  4. Why do you think we have different validation results for RECENT_CONSENSUS_CHANGE and CONSENSUS?

  5. Prepare one of the following for the review club:

    • a question of your own about the PR or code;
    • an observation you made from reviewing the PR.

Meeting Log

  113:00 <@jnewbery> hi!
  213:00 <kanzure> hi
  313:00 <diogoser1io> hello!
  413:00 <amiti> hi
  513:00 <fjahr> hi
  613:00 <lightlike> hi
  713:01 <@jnewbery> Today's PR is 15921. Notes and questions are in the usual place: https://bitcoincore.reviews/15921.html
  813:01 <@jnewbery> Thanks to everyone who reviewed/tested/left comments!
  913:01 <@jnewbery> the PR was merged today, but that doesn't mean you shouldn't review it
 1013:01 <schmidty> hola
 1113:02 <@jnewbery> In my mind, there are (at least) three reasons to review PRs:
 1213:02 <@jnewbery> 1. Make suggestions to the PR author on how to improve the PR
 1313:02 <@jnewbery> 2. Leave your ACK if you think the code should be merged or your NACK if you think it shouldn't be
 1413:02 <@jnewbery> 3. Learn about the code
 1513:03 <@jnewbery> All three are still valuable after the PR has been merged
 1613:03 <@jnewbery> (1) can lead to follow-up PRs that can improve the code further, for (2) you might still find a bug that other reviewers have missed, and we should all be doing (3) all the time!
 1713:04 <@jnewbery> ok, any initial thoughts about the PR?
 1813:04 <@jnewbery> (as always, question 1 is Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK? (Don’t forget to put your PR review on GitHub.)
 1913:05 <amiti> I thought it was interesting to review. Since it touched all the sites that care about block / txn validation, it exposed me to new parts of the codebase
 2013:06 <fjahr> Yeah, I acked since it makes a lot of sense to me to separate Tx and Block validation
 2113:06 <@jnewbery> amiti: good! The PR title says this is a tidy-up, but I thought it might be interesting because it touches the net_processing-validation interface, which I think are the two most interesting components in Bitcoin Core
 2213:07 <@jnewbery> fjahr: good
 2313:07 <@jnewbery> ok, for those that reviewed: what steps did you take to review/test?
 2413:08 <ajonas> Going through each commit was a must on this one. And ryanofsky's suggestion was very helpful on the first commit.
 2513:09 <amiti> I mostly just read the code one commit at a time & tried to understand the changes, and poke around the larger codebase to get context for them
 2613:10 <lightlike> just looked at each line of code that was changed and tried to understand it (saw the hint by ryanofski too late)
 2713:10 <@jnewbery> yeah, ryanofsky's reviewer tip was very helpful
 2813:10 <michaelfolkson> Agreed <ajonas> I'd like more of those reviewer tips
 2913:10 <@jnewbery> lightlike: perhaps I should have added the tip to the commit log?
 3013:11 <@jnewbery> next question: Does this PR change any behaviour, or is it a pure refactor?
 3113:11 <lightlike> jnewbery: no, it was clear enough in the first post of the PR, just my fault...
 3213:13 <michaelfolkson> #15141 changed behavior but this one (#15921) didn't right?
 3313:13 <fjahr> I did not see any behaviour change. It was consensus code after all :)
 3413:13 <@jnewbery> Did anyone here review the previous PR (15141)? If not, did any of you go back and look at it?
 3513:14 <lightlike> it does change an error message of sendtransaction rpc call in case of missing inputs, other than that I didn't see anything.
 3613:14 <@jnewbery> lightlike: good catch!
 3713:14 <michaelfolkson> I looked at #15141. I didn't "review" as already merged
 3813:14 <amiti> I def took a look at 15141 in the context of this PR, but didnt dig very deep
 3913:14 <@jnewbery> yes, some minor logging/error message change
 4013:15 <@jnewbery> michaelfolkson: yes, 15141 has some minor behaviour changes in the way we ban/disconnect peers
 4113:17 <michaelfolkson> That punishment logic was interesting
 4213:17 <michaelfolkson> But leave that to later, let's stay on #15921 for now ;)
 4313:18 <@jnewbery> michaelfolkson: yeah, the disconnect logic is very interesting
 4413:18 <@jnewbery> which leads us into the next question: Why do you think we have different validation results for RECENT_CONSENSUS_CHANGE and CONSENSUS?
 4513:19 <michaelfolkson> I initially thought it was SegWit related but it actually wasn't
 4613:19 <@jnewbery> well segwit was the last consensus change
 4713:19 <michaelfolkson> For a future soft fork/consensus change right?
 4813:19 <fjahr> I found that RECENT_CONSENSUS_CHANGE relates to everything after segwit
 4913:19 <fjahr> so it has not been used yet
 5013:20 <amiti> I didn't understand why thats already been introduced if its not used yet
 5113:20 <fjahr> but it would be used to prevent a potential network split
 5213:20 <amiti> fjahr: how so?
 5313:21 <lightlike> amiti: i had the same question, if it is never used I don't understand how introducing it now can help us in the future.
 5413:21 <@jnewbery> What do we do to peers that send us txs/blocks that are invalid because of CONSENSUS? What do we do if they're invalid for RECENT_CONSENSUS_CHANGE?
 5513:22 <@jnewbery> amiti: lightlike: right, it's not currently used, but I think the reasoning behind having it as a validation reason is interesting
 5613:22 <fjahr> I guess a new change would be flagged with this and then this would be used for a conditional to manage behaviour differently. So we don't ban peers who have not upgraded to segwit yet.
 5713:22 <@jnewbery> There's a clue here: https://github.com/bitcoin/bitcoin/blob/a6abc94e9307ea05972ef69732bb148acbfa870a/src/validation.cpp#L1544-L1548
 5813:23 <@jnewbery> fjahr: yes, that's right (if you replace the word 'segwit' with 'current consensus rules')
 5913:24 <@jnewbery> Does that make sense to everyone?
 6013:24 <@jnewbery> does anyone have the answer to this:
 6113:24 <@jnewbery> > What do we do to peers that send us txs/blocks that are invalid because of CONSENSUS? What do we do if they're invalid for RECENT_CONSENSUS_CHANGE?
 6213:25 <amiti> https://github.com/bitcoin/bitcoin/blob/master/src/net_processing.cpp#L1004
 6313:25 <fjahr> ah, got it, so segwit is just the point after which this was introduced but there was nothing about segwit that made this necessary in particular.
 6413:25 <@jnewbery> amiti: you've found the right line. What's it doing?
 6513:25 <amiti> RECENT_CONSENSUS_CHANGE we just break, but CONSENSUS we invoke misbehaving
 6613:25 <fjahr> 1. ban 2. nothing
 6713:26 <@jnewbery> and what does that mean we do the peer?
 6813:26 <@jnewbery> fjahr: more or less. I think we just disconnect not ban for (1)
 6913:26 <amiti> do we necessarily ban? or do we just increase the score?
 7013:27 <amiti> oh, the threshold is 100
 7113:27 <amiti> and we increase score by 100
 7213:27 <@jnewbery> no, you're right it's a ban
 7313:27 <@jnewbery> amiti: right. It's instaban
 7413:27 <amiti> haha. instaban.
 7513:28 <@jnewbery> the DOS scores are weird and were somewhar arbitrary until 15141
 7613:28 <lightlike> i understand it only in parts: Couldn't it be introduced just as well together with the next consensus change instead of now? Has it been implemented now so that future consensus changes don't forget about the idea?
 7713:29 <@jnewbery> lightlike: Yes, the code isn't exercised now so it could just be added with the next consensus change
 7813:29 <@jnewbery> I think the best reason for having the code in place now is so future consensus changes don't forget the idea
 7913:30 <@jnewbery> I think in segwit this was almost forgotten (but that was before i was active in Bitcoin Core so I'm certain about that)
 8013:30 <michaelfolkson> That would not have been ideal for network stability
 8113:30 <@jnewbery> the risk is that some portion of the network upgrades, the softfork is activated, and then someone sends a transaction that is invalid under the new rules to nodes that haven't upgraded yet
 8213:31 <@jnewbery> those old nodes relay the transaction, but if they relay it to an upgraded node then that upgraded node will reject it
 8313:31 <@jnewbery> we don't want the upgraded nodes to disconnect/ban the unupgraded nodes. That would cause a network split
 8413:32 <@jnewbery> so we want to treat txs/blocks that fail validation for RECENT_CONSENSUS_CHANGE differently than those that fail for long-standing CONSENSUS rules
 8513:32 <@jnewbery> (nothing about that changed in this PR, but I thought it was a fun sidetrack)
 8613:32 <amiti> ok so thats the reason for having the distinction between recent & existing, but the reason that code is _already_ introduced is so its not forgotten about during implementation?
 8713:32 <@jnewbery> amiti: yes
 8813:33 <amiti> gotcha. thanks
 8913:33 <michaelfolkson> I'm wondering if there are better approaches than putting these "placeholders" in
 9013:33 <@jnewbery> My final question was for all of you to either prepare a question or share an observation you made while reviewing the PR. Anyone have anything?
 9113:34 <fjahr> What did you think of my idea to move the punishment logic out?
 9213:34 <@jnewbery> oh, here's some discussion of RECENT_CONSENSUS_CHANGE: https://github.com/bitcoin/bitcoin/pull/15141#discussion_r247736063
 9313:35 <fjahr> I saw from other PRs that it is not necessarily a goal to reduce net_processing size but I am still interested
 9413:35 <@jnewbery> fjahr: I'm not sure it achieves much. peer punishment logic is so tied to net processing logic that separating them into separate translation units doesn't seem like an improvement
 9513:36 <amiti> jnewbery: I have a question about your git workflow- I noticed you pushed fixup commits in response to reviewer comments, and then squashed & force pushed afterwards. I was wondering if there was a specific reason you did that (vs squash and force push initially)... potentially to make it easier for reviewers to see the recent changes?
 9613:36 <fjahr> jnewbery: ok, thanks
 9713:36 <@jnewbery> peer punishment logic wouldn't be required by any other component, which is one reason you'd want to separate them
 9813:38 <@jnewbery> amiti: yeah, that's it. Just so those reviewers could very quickly see what changes I'd made in response to their comments. Lightlike suggested a much better way to do what he'd asked for, so I was able to throw away my commit and use his suggested method
 9913:38 <@jnewbery> as soon as those reviewers were happy I just squashed the changes into the relevant commits and force pushed. Verifying that a force-push doesn't change the final code is very quick
10013:40 <@jnewbery> if I'd squashed and force-pushed initially and then lightlike had made his suggestion, I'd have wasted a bunch of time changing the initial commit and then changing it back and potentially having to resolve merge conflicts
10113:40 <@jnewbery> any other comments/questions?
10213:40 <amiti> ah I see. ok thanks
10313:40 <michaelfolkson> If nobody else has anything I wouldn't mind chatting more about the punishment logic
10413:41 <@jnewbery> michaelfolkson: just ask. No need to ask to ask :)
10513:41 <michaelfolkson> So I'm assuming there is no way to trick a honest peer into forwarding a consensus invalid transaction
10613:42 <michaelfolkson> I'm wondering if this scoring system is optimal and whether it can be gamed
10713:42 <@jnewbery> michaelfolkson: not if the tx fails validation on your node
10813:42 <michaelfolkson> Instaban seems pretty confident the peer is dishonest
10913:43 <@jnewbery> of course if your node is old or isn't validating the same consensus rules as everyone else and an adversary knows that, then they can send you a tx that isn't valid according to the rest of the network and is valid for you. You'd try to relay that one
11013:44 <michaelfolkson> What about the weaker punishments? Discussed in https://github.com/bitcoin/bitcoin/pull/15141
11113:44 <@jnewbery> what do you mean by weaker punishments?
11213:45 <lightlike> i think if you send two version messages, you just a get score of 10
11313:45 <lightlike> for examlpe
11413:45 <michaelfolkson> Oh a lot of them have moved to max punishment (100)
11513:46 <@jnewbery> lightlike: it's actually 1
11613:46 <@jnewbery> https://github.com/bitcoin/bitcoin/blob/a6abc94e9307ea05972ef69732bb148acbfa870a/src/net_processing.cpp#L1912
11713:46 <sdaftuar> an important idea relating to tricking a peer into forwarding a consensus invalid transaction happens with softfork deployment
11813:46 <michaelfolkson> Some of them were previously 10/50 points
11913:46 <michaelfolkson> But now full ban
12013:47 <@jnewbery> which is incredibly arbitrary. We'll tolerate a peer sending us 99 duplicate VERSION messages, but ban them on the 100th
12113:47 <sdaftuar> this is why we strive to ensure that features that transactions which will obey new consensus rules are policy-invalid (though obviously not consensus invalid) to old peers
12213:47 <sdaftuar> s/that features//
12313:48 <michaelfolkson> Interesting
12413:48 <@jnewbery> right. We want consensus changes to be policy-invalid for some length of time before they become consensus-invalid
12513:49 <michaelfolkson> And we can overwrite the ban if it is a peer that we trust?
12613:49 <@jnewbery> note that we don't ban peers for sending us policy-invalid txs: https://github.com/bitcoin/bitcoin/blob/a6abc94e9307ea05972ef69732bb148acbfa870a/src/net_processing.cpp#L1073-L1079
12713:49 <@jnewbery> michaelfolkson: take a look at whitelisting. We'll never bad/disconnect a whitelisted peer
12813:49 <sdaftuar> jnewbery: this gets to the heart of the CONSENSUS_CHANGE vs RECENT_CONSENSUS_CHANGE debate
12913:49 <michaelfolkson> <jnewbery>: Yup thanks
13013:50 <@jnewbery> Last 10 minutes. Any more questions/observations?
13113:51 <michaelfolkson> Just to clarify on <fjahr>'s question. Really long files are ok as long as they don't contain parts that multiple components use?
13213:52 <michaelfolkson> Length isn't a variable to consider?
13313:53 <@jnewbery> michaelfolkson: it'd be nicer to have shorter source files, where code can be logically split up
13413:54 <michaelfolkson> But low on the priority list right?
13513:54 <@jnewbery> but in this case, the only places that call MaybeDisconnectPeerForX() are in net_processing, so it doesn't make sense to split it out
13613:55 <michaelfolkson> Yeah it would only be for readability I suppose
13713:55 <michaelfolkson> Ok thanks
13813:55 <@jnewbery> any more questions?
13913:57 <@jnewbery> ok, let's wrap it up there. Thanks everyone! And thanks sdaftuar for dropping in
14013:57 <lightlike> thanks all!
14113:57 <@jnewbery> I'll post next week's PR before this weekend
14213:57 <michaelfolkson> Indeed, thanks everyone
14313:57 <fjahr> thanks john! and all :)
14413:57 <amiti> thanks !
14513:58 <diogosergio> Thanks, it was good to follow this up!