Tidy up ValidationState interface (validation)

Oct 30, 2019

https://github.com/bitcoin/bitcoin/pull/15921

Host: jnewbery - PR author: jnewbery

Happy half-birthday to PR review club 🎉

Notes

This is a follow-up to PR 15141, which rewrote how DoS protection was handled. Previously, validation would assign DoS scores if a transaction or block was rejected, and pass the score back to net processing inside a CValidationState object. Net processing could then disconnect or ban the peer that provided the transaction or block. PR 15141 moved DoS handling responsibility to net processing.
Net processing (and other components like the wallet/RPC) pass new inventory (transactions, blocks and headers) to validation through different function calls, such as:
- AcceptToMemoryPool() (ATMP) processes a transaction for acceptance to the memory pool.
- ProcessNewBlock() (PNB) processes a new block (and if it’s valid, adds it to our mapBlockIndex and potentially adds it to the block chain).
- ProcessNewBlockHeaders() (PNBH) processes a vector of block headers.
Both ATMP and PNBH have a CValidationState& argument. The function returns details about the validation of the transaction or headers in that state object. PNB doesn’t currently take a state argument, but instead receives state back from PNB through the CValidationInterface::BlockChecked() callback. PR 16279 (covered in a previous PR review club) would add that state argument to PNB.
The CValidationState object was introduced in PR 2224, including DoS score tracking.
This PR tidies up some loose ends left by PR 15141. Most notably it adds BlockValidationState and TxValidationState as derived classes of ValidationState, since the reasons that a block and transaction can be rejected are different.
The important changes in this PR are in src/consensus/validation.h. Most of the other changes are updating the call sites to use the changed objects.
This PR is much easier to review if you look at the commits individually, especially if you use ryanofsky’s review tip in the PR description.

Questions

Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK? (Don’t forget to put your PR review on GitHub.)
What steps did you take, beyond reading the code?
Does this PR change any behaviour, or is it a pure refactor?
Why do you think we have different validation results for RECENT_CONSENSUS_CHANGE and CONSENSUS?
Prepare one of the following for the review club:
- a question of your own about the PR or code;
- an observation you made from reviewing the PR.

Meeting Log

13:00

<@jnewbery> hi!

13:00

<kanzure> hi

13:00

<diogoser1io> hello!

13:00

<amiti> hi

13:00

<fjahr> hi

13:00

<lightlike> hi

13:01

<@jnewbery> Today's PR is 15921. Notes and questions are in the usual place: https://bitcoincore.reviews/15921.html

13:01

<@jnewbery> Thanks to everyone who reviewed/tested/left comments!

13:01

<@jnewbery> the PR was merged today, but that doesn't mean you shouldn't review it

13:01

<schmidty> hola

13:02

<@jnewbery> In my mind, there are (at least) three reasons to review PRs:

13:02

<@jnewbery> 1. Make suggestions to the PR author on how to improve the PR

13:02

<@jnewbery> 2. Leave your ACK if you think the code should be merged or your NACK if you think it shouldn't be

13:02

<@jnewbery> 3. Learn about the code

13:03

<@jnewbery> All three are still valuable after the PR has been merged

13:03

<@jnewbery> (1) can lead to follow-up PRs that can improve the code further, for (2) you might still find a bug that other reviewers have missed, and we should all be doing (3) all the time!

13:04

<@jnewbery> ok, any initial thoughts about the PR?

13:04

<@jnewbery> (as always, question 1 is Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK? (Don’t forget to put your PR review on GitHub.)

13:05

<amiti> I thought it was interesting to review. Since it touched all the sites that care about block / txn validation, it exposed me to new parts of the codebase

13:06

<fjahr> Yeah, I acked since it makes a lot of sense to me to separate Tx and Block validation

13:06

<@jnewbery> amiti: good! The PR title says this is a tidy-up, but I thought it might be interesting because it touches the net_processing-validation interface, which I think are the two most interesting components in Bitcoin Core

13:07

<@jnewbery> fjahr: good

13:07

<@jnewbery> ok, for those that reviewed: what steps did you take to review/test?

13:08

<ajonas> Going through each commit was a must on this one. And ryanofsky's suggestion was very helpful on the first commit.

13:09

<amiti> I mostly just read the code one commit at a time & tried to understand the changes, and poke around the larger codebase to get context for them

13:10

<lightlike> just looked at each line of code that was changed and tried to understand it (saw the hint by ryanofski too late)

13:10

<@jnewbery> yeah, ryanofsky's reviewer tip was very helpful

13:10

<michaelfolkson> Agreed <ajonas> I'd like more of those reviewer tips

13:10

<@jnewbery> lightlike: perhaps I should have added the tip to the commit log?

13:11

<@jnewbery> next question: Does this PR change any behaviour, or is it a pure refactor?

13:11

<lightlike> jnewbery: no, it was clear enough in the first post of the PR, just my fault...

13:13

<michaelfolkson> #15141 changed behavior but this one (#15921) didn't right?

13:13

<fjahr> I did not see any behaviour change. It was consensus code after all :)

13:13

<@jnewbery> Did anyone here review the previous PR (15141)? If not, did any of you go back and look at it?

13:14

<lightlike> it does change an error message of sendtransaction rpc call in case of missing inputs, other than that I didn't see anything.

13:14

<@jnewbery> lightlike: good catch!

13:14

<michaelfolkson> I looked at #15141. I didn't "review" as already merged

13:14

<amiti> I def took a look at 15141 in the context of this PR, but didnt dig very deep

13:14

<@jnewbery> yes, some minor logging/error message change

13:15

<@jnewbery> michaelfolkson: yes, 15141 has some minor behaviour changes in the way we ban/disconnect peers

13:17

<michaelfolkson> That punishment logic was interesting

13:17

<michaelfolkson> But leave that to later, let's stay on #15921 for now ;)

13:18

<@jnewbery> michaelfolkson: yeah, the disconnect logic is very interesting

13:18

<@jnewbery> which leads us into the next question: Why do you think we have different validation results for RECENT_CONSENSUS_CHANGE and CONSENSUS?

13:19

<michaelfolkson> I initially thought it was SegWit related but it actually wasn't

13:19

<@jnewbery> well segwit was the last consensus change

13:19

<michaelfolkson> For a future soft fork/consensus change right?

13:19

<fjahr> I found that RECENT_CONSENSUS_CHANGE relates to everything after segwit

13:19

<fjahr> so it has not been used yet

13:20

<amiti> I didn't understand why thats already been introduced if its not used yet

13:20

<fjahr> but it would be used to prevent a potential network split

13:20

<amiti> fjahr: how so?

13:21

<lightlike> amiti: i had the same question, if it is never used I don't understand how introducing it now can help us in the future.

13:21

<@jnewbery> What do we do to peers that send us txs/blocks that are invalid because of CONSENSUS? What do we do if they're invalid for RECENT_CONSENSUS_CHANGE?

13:22

<@jnewbery> amiti: lightlike: right, it's not currently used, but I think the reasoning behind having it as a validation reason is interesting

13:22

<fjahr> I guess a new change would be flagged with this and then this would be used for a conditional to manage behaviour differently. So we don't ban peers who have not upgraded to segwit yet.

13:22

<@jnewbery> There's a clue here: https://github.com/bitcoin/bitcoin/blob/a6abc94e9307ea05972ef69732bb148acbfa870a/src/validation.cpp#L1544-L1548

13:23

<@jnewbery> fjahr: yes, that's right (if you replace the word 'segwit' with 'current consensus rules')

13:24

<@jnewbery> Does that make sense to everyone?

13:24

<@jnewbery> does anyone have the answer to this:

13:24

<@jnewbery> > What do we do to peers that send us txs/blocks that are invalid because of CONSENSUS? What do we do if they're invalid for RECENT_CONSENSUS_CHANGE?

13:25

<amiti> https://github.com/bitcoin/bitcoin/blob/master/src/net_processing.cpp#L1004

13:25

<fjahr> ah, got it, so segwit is just the point after which this was introduced but there was nothing about segwit that made this necessary in particular.

13:25

<@jnewbery> amiti: you've found the right line. What's it doing?

13:25

<amiti> RECENT_CONSENSUS_CHANGE we just break, but CONSENSUS we invoke misbehaving

13:25

<fjahr> 1. ban 2. nothing

13:26

<@jnewbery> and what does that mean we do the peer?

13:26

<@jnewbery> fjahr: more or less. I think we just disconnect not ban for (1)

13:26

<amiti> do we necessarily ban? or do we just increase the score?

13:27

<amiti> oh, the threshold is 100

13:27

<amiti> and we increase score by 100

13:27

<@jnewbery> no, you're right it's a ban

13:27

<@jnewbery> amiti: right. It's instaban

13:27

<amiti> haha. instaban.

13:28

<@jnewbery> the DOS scores are weird and were somewhar arbitrary until 15141

13:28

<lightlike> i understand it only in parts: Couldn't it be introduced just as well together with the next consensus change instead of now? Has it been implemented now so that future consensus changes don't forget about the idea?

13:29

<@jnewbery> lightlike: Yes, the code isn't exercised now so it could just be added with the next consensus change

13:29

<@jnewbery> I think the best reason for having the code in place now is so future consensus changes don't forget the idea

13:30

<@jnewbery> I think in segwit this was almost forgotten (but that was before i was active in Bitcoin Core so I'm certain about that)

13:30

<michaelfolkson> That would not have been ideal for network stability

13:30

<@jnewbery> the risk is that some portion of the network upgrades, the softfork is activated, and then someone sends a transaction that is invalid under the new rules to nodes that haven't upgraded yet

13:31

<@jnewbery> those old nodes relay the transaction, but if they relay it to an upgraded node then that upgraded node will reject it

13:31

<@jnewbery> we don't want the upgraded nodes to disconnect/ban the unupgraded nodes. That would cause a network split

13:32

<@jnewbery> so we want to treat txs/blocks that fail validation for RECENT_CONSENSUS_CHANGE differently than those that fail for long-standing CONSENSUS rules

13:32

<@jnewbery> (nothing about that changed in this PR, but I thought it was a fun sidetrack)

13:32

<amiti> ok so thats the reason for having the distinction between recent & existing, but the reason that code is _already_ introduced is so its not forgotten about during implementation?

13:32

<@jnewbery> amiti: yes

13:33

<amiti> gotcha. thanks

13:33

<michaelfolkson> I'm wondering if there are better approaches than putting these "placeholders" in

13:33

<@jnewbery> My final question was for all of you to either prepare a question or share an observation you made while reviewing the PR. Anyone have anything?

13:34

<fjahr> What did you think of my idea to move the punishment logic out?

13:34

<@jnewbery> oh, here's some discussion of RECENT_CONSENSUS_CHANGE: https://github.com/bitcoin/bitcoin/pull/15141#discussion_r247736063

13:35

<fjahr> I saw from other PRs that it is not necessarily a goal to reduce net_processing size but I am still interested

13:35

<@jnewbery> fjahr: I'm not sure it achieves much. peer punishment logic is so tied to net processing logic that separating them into separate translation units doesn't seem like an improvement

13:36

<amiti> jnewbery: I have a question about your git workflow- I noticed you pushed fixup commits in response to reviewer comments, and then squashed & force pushed afterwards. I was wondering if there was a specific reason you did that (vs squash and force push initially)... potentially to make it easier for reviewers to see the recent changes?

13:36

<fjahr> jnewbery: ok, thanks

13:36

<@jnewbery> peer punishment logic wouldn't be required by any other component, which is one reason you'd want to separate them

13:38

<@jnewbery> amiti: yeah, that's it. Just so those reviewers could very quickly see what changes I'd made in response to their comments. Lightlike suggested a much better way to do what he'd asked for, so I was able to throw away my commit and use his suggested method

13:38

<@jnewbery> as soon as those reviewers were happy I just squashed the changes into the relevant commits and force pushed. Verifying that a force-push doesn't change the final code is very quick

100

13:40

<@jnewbery> if I'd squashed and force-pushed initially and then lightlike had made his suggestion, I'd have wasted a bunch of time changing the initial commit and then changing it back and potentially having to resolve merge conflicts

101

13:40

<@jnewbery> any other comments/questions?

102

13:40

<amiti> ah I see. ok thanks

103

13:40

<michaelfolkson> If nobody else has anything I wouldn't mind chatting more about the punishment logic

104

13:41

<@jnewbery> michaelfolkson: just ask. No need to ask to ask :)

105

13:41

<michaelfolkson> So I'm assuming there is no way to trick a honest peer into forwarding a consensus invalid transaction

106

13:42

<michaelfolkson> I'm wondering if this scoring system is optimal and whether it can be gamed

107

13:42

<@jnewbery> michaelfolkson: not if the tx fails validation on your node

108

13:42

<michaelfolkson> Instaban seems pretty confident the peer is dishonest

109

13:43

<@jnewbery> of course if your node is old or isn't validating the same consensus rules as everyone else and an adversary knows that, then they can send you a tx that isn't valid according to the rest of the network and is valid for you. You'd try to relay that one

110

13:44

<michaelfolkson> What about the weaker punishments? Discussed in https://github.com/bitcoin/bitcoin/pull/15141

111

13:44

<@jnewbery> what do you mean by weaker punishments?

112

13:45

<lightlike> i think if you send two version messages, you just a get score of 10

113

13:45

<lightlike> for examlpe

114

13:45

<michaelfolkson> Oh a lot of them have moved to max punishment (100)

115

13:46

<@jnewbery> lightlike: it's actually 1

116

13:46

<@jnewbery> https://github.com/bitcoin/bitcoin/blob/a6abc94e9307ea05972ef69732bb148acbfa870a/src/net_processing.cpp#L1912

117

13:46

<sdaftuar> an important idea relating to tricking a peer into forwarding a consensus invalid transaction happens with softfork deployment

118

13:46

<michaelfolkson> Some of them were previously 10/50 points

119

13:46

<michaelfolkson> But now full ban

120

13:47

<@jnewbery> which is incredibly arbitrary. We'll tolerate a peer sending us 99 duplicate VERSION messages, but ban them on the 100th

121

13:47

<sdaftuar> this is why we strive to ensure that features that transactions which will obey new consensus rules are policy-invalid (though obviously not consensus invalid) to old peers

122

13:47

<sdaftuar> s/that features//

123

13:48

<michaelfolkson> Interesting

124

13:48

<@jnewbery> right. We want consensus changes to be policy-invalid for some length of time before they become consensus-invalid

125

13:49

<michaelfolkson> And we can overwrite the ban if it is a peer that we trust?