BIP 157: Add support for getcfheaders (p2p)
May 21, 2020
The PR branch HEAD was 4376797 at the time of this review club meeting.

This is the third PR in our special series on the BIP 157 implementation. See the previous meetings on PR 18877 and PR …
This PR adds support for the getcfheaders P2P message. You should familiarise yourself with the format of that message and the cfheaders response before reviewing the code changes. Both of those messages are defined in BIP 157.

The getcfheaders message contains two fields: StartHeight (the height of the first block in the requested range) and StopHash (the hash of the last block in the requested range). The message handling code starts by deserializing those fields into uint32_t and uint256 local variables, respectively.
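As a rough illustration of the wire layout (a sketch based on BIP 157, not the actual Bitcoin Core serializer — note the payload also begins with a 1-byte filter type before the two fields above):

```python
import struct

def ser_getcfheaders(filter_type: int, start_height: int, stop_hash: bytes) -> bytes:
    """Serialize a BIP 157 getcfheaders payload: 1-byte filter type,
    4-byte little-endian start height, 32-byte stop hash (37 bytes total)."""
    assert len(stop_hash) == 32
    return struct.pack("<BI", filter_type, start_height) + stop_hash

payload = ser_getcfheaders(0, 100, bytes(32))
assert len(payload) == 37  # fixed-size payload
```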
The PrepareBlockFilterRequest() helper function has its interface changed to additionally take a start_height and a max_height_diff argument. The new ProcessGetCFHeaders() function calls a new method on the block filter index class called LookupFilterHashRange().

New tests are added to the p2p_blockfilters.py test file.
1. What happens if the getcfheaders message is malformed and we’re not able to deserialize it? Try testing this for yourself and looking at the debug log.

2. What changes do we need to make to the ProcessGetCFCheckPt() function in this PR? Why?

3. What data is the LookupFilterHashRange() function returning? Why don’t we just return the filter headers to the peer?

4. How is the new functionality tested? Should any other tests be added?
1 13:00 <jnewbery> #startmeeting
6 13:00 <jnewbery> fjahr: Thanks! I've updated #18876
8 13:00 <michaelfolkson> hi
10 13:01 <jnewbery> who had a chance to review the PR? (y/n)
15 13:02 <michaelfolkson> 0.5y
16 13:02 <jnewbery> Let's get into the questions. What happens if the getcfheaders message is malformed and we’re not able to deserialize it?
17 13:03 <theStack> concluding from my log, CDataStream::read() seems to throw an exception
18 13:04 <michaelfolkson> I didn't do this. But what was the best approach here to malform it? Randomly tweak it or deliberately construct a message that didn't follow the definition?
19 13:04 <theStack> [msghand] ProcessMessages(getcfheaders, 11 bytes): Exception 'CDataStream::read(): end of data: iostream error' (NSt8ios_base7failureB5cxx11E) caught
21 13:05 <theStack> michaelfolkson: i just modified the method msg_getcfheaders.serialize() in test_framework/messages.py to create some garbage data
22 13:05 <michaelfolkson> theStack: Ah nice, makes sense
24 13:06 <jnewbery> We don't need to add custom deserialization error handling for new message types
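The point above can be sketched in miniature (a hypothetical illustration, not Bitcoin Core code): a single central dispatcher catches deserialization failures for every message type, so a new message like getcfheaders needs no per-message error handling.

```python
import struct

def deser_getcfheaders(payload: bytes):
    """Toy deserializer; raises on truncated input, mirroring
    CDataStream::read() throwing an end-of-data exception."""
    if len(payload) < 37:
        raise EOFError("end of data")
    return struct.unpack_from("<BI", payload), payload[5:37]

def process_message(msgtype, payload, handlers):
    try:
        return handlers[msgtype](payload)
    except EOFError as e:
        # one catch site for all message types, analogous to the log line
        # theStack quoted from ProcessMessages()
        return f"ProcessMessages({msgtype}, {len(payload)} bytes): Exception '{e}' caught"

result = process_message("getcfheaders", bytes(11), {"getcfheaders": deser_getcfheaders})
assert "Exception" in result
```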
25 13:07 <jnewbery> Next question: What changes do we need to make to the ProcessGetCFCheckPt() function in this PR? Why?
26 13:09 <theStack> were there ever any discussions among bitcoin core devs about whether to use exceptions in general? there seem to be very divided views about that... some say they are evil because they basically behave like goto/longjmp, and some languages like golang don't even include them (sorry for slight off-topic :))
27 13:10 <jkczyz> Adding stop_hash and max_height_dif params to the PrepareBlockFilterRequest call
28 13:10 <jnewbery> theStack: not off-topic. It's a good question!
29 13:11 <theStack> PrepareBlockFilterRequest was extended to test the preconditions in more detail, by also checking the allowed block range
31 13:12 <jnewbery> if it's to handle errors from input that you don't control (as is the case in ProcessMessage()), then it makes sense
32 13:13 <theStack> jnewbery: sounds reasonable. in this case it's definitely a good thing that a deserialization return code doesn't need to be checked every time
34 13:15 <jnewbery> Next question: What data is the LookupFilterHashRange() function returning? Why don’t we just return the block headers to the peer?
35 13:16 <theStack> jnewbery: i think this is supposed to be "filter headers" in the second question? (i submitted a PR shortly before the start of the meeting)
36 13:17 <theStack> the idea is that only the one filter header is part of the cfheaders response, the rest are filter _hashes_, and the headers can be computed on the client-side from that information
37 13:18 <jnewbery> theStack: you're right. I mean filter headers.
38 13:18 <jkczyz> which allows them to verify they are linked correctly
39 13:18 <theStack> jkczyz: +1
40 13:19 <jnewbery> jkczyz: exactly
41 13:19 <jnewbery> meaning you can fetch the headers trustlessly if you have the checkpts
43 13:21 <jnewbery> ok, final question. How is the new functionality tested? Should any other tests be added?
44 13:22 <jkczyz> jnewbery: ah, error handling in relation to PrepareBlockFilterRequest? Or the exception handling discussion?
45 13:23 <jnewbery> the exception handling discussion (or any other wisdom you want to share with us)
46 13:24 <fjahr> basically the test does what the client should do as described above, it asks for cfheaders and then validates the hashes it receives by computing that it ends up at the right header
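The client-side check fjahr describes can be sketched as follows (an illustrative sketch of the BIP 157 rule that each filter header is the double-SHA256 of the filter hash concatenated with the previous filter header; names are hypothetical):

```python
import hashlib

def dsha256(b: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(b).digest()).digest()

def derive_filter_headers(filter_hashes, prev_header):
    """Chain filter hashes into filter headers: one trusted header
    (e.g. from a cfcheckpt checkpoint) plus the hashes lets a client
    verify the whole range trustlessly."""
    headers = []
    for fh in filter_hashes:
        prev_header = dsha256(fh + prev_header)
        headers.append(prev_header)
    return headers

# toy example with made-up filter hashes
hashes = [bytes([i]) * 32 for i in range(3)]
headers = derive_filter_headers(hashes, bytes(32))
assert len(headers) == 3 and all(len(h) == 32 for h in headers)
```

The test then checks that chaining the received hashes from a known header lands on the expected final header.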
47 13:25 <jkczyz> I came from a world where C++ exception handling was discouraged / outright banned. I tend to agree with the reasons behind that. I've come to like variant types where the return value is either the result or an error. Then you never have uninitialized or partially initialized objects around
48 13:25 <jkczyz> Rust has a Result enum type for this. I believe newer versions of C++ have something similar
49 13:26 <jnewbery> yeah, that comment struck me as quite rust-ish
50 13:26 <michaelfolkson> There's no test that the node doesn't respond to getcfheaders with an unsupported filter type
51 13:26 <theStack> some languages go the way of returning two values, one for the actual result and one for an error code (i think golang also does this)
52 13:27 <michaelfolkson> That was outlined in the BIP
53 13:28 <fjahr> theStack: I think that's more of a style pattern? I mean you can write code this way in any language.
54 13:29 <jkczyz> jnewbery: Wrote the comment before I learned Rust but happy to see the idiom is prevalent in that world :)
55 13:29 <michaelfolkson> There is only one filter type currently though right?
57 13:31 <theStack> fjahr: yes, you are right. let's say then some languages propose this style pattern in their guidelines _and_ also provide the syntactic sugar for it (to not need an out-parameter, for example), more than others :)
59 13:32 <theStack> i think one of the ugliest methods of error handling is global variables, like errno in the C standard library
60 13:33 <fjahr> I guess there could be a test that the client is disconnected if it requests invalid or too many filter headers
61 13:33 <jnewbery> michaelfolkson: yes, currently only one filter type. BIP 158 initially defined two filter types, but that was reduced to just the basic filter type
62 13:33 <jnewbery> fjahr: seems reasonable
63 13:34 <michaelfolkson> Thanks jnewbery. What was the motivation for the second (now ditched) filter type?
64 13:35 <jkczyz> theStack: I soured on out params once I saw the result type idiom. Out params seem to encourage uninitialized objects and null ptrs
65 13:35 <jnewbery> michaelfolkson: I'm not sure. I didn't follow early discussion of BIP 158 very closely
66 13:35 <michaelfolkson> I know Pieter said something about needing a soft fork to make this more trust minimized. Maybe it was related to that
67 13:36 <jnewbery> jkczyz: when you say newer C++ has a Result enum type, do you know where I could find out more?
69 13:38 <sipa> michaelfolkson: i argued against the other filter because i thought it was useless
70 13:39 <michaelfolkson> Ok thanks sipa
72 13:40 <jkczyz> jnewbery: Looks like it from a brief glance though I was thinking of std::variant
75 13:40 <sipa> haskell has an (actual data or error string) type
77 13:41 <sipa> variant looks like a crazy hammer to achieve that
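The result-or-error idiom discussed above can be sketched in Python (a hypothetical illustration of the pattern, not code from any project mentioned here):

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")

@dataclass
class Ok(Generic[T]):
    value: T

@dataclass
class Err:
    message: str

Result = Union[Ok[T], Err]

def parse_height(s: str) -> "Result[int]":
    # Return either a fully initialized value or an error --
    # never a half-initialized out-parameter or a null pointer.
    if not s.isdigit():
        return Err(f"not a number: {s!r}")
    return Ok(int(s))

res = parse_height("630000")
assert isinstance(res, Ok) and res.value == 630000
assert isinstance(parse_height("abc"), Err)
```

The caller is forced to distinguish the two cases before touching the value, which is the property jkczyz favours over exceptions and out-params.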
78 13:41 <michaelfolkson> Re the cfheaders cache size being big enough until 2047. I'm assuming this is a direct trade-off with OOM concerns. Could make it a bit bigger but then you are also raising OOM concerns
79 13:42 <jnewbery> unfortunately discussion about basic and extended filters was before the optech newsletter, so you'll just need to go back to the primary source material!
80 13:42 <michaelfolkson> Where are the Optech minus versions?!
81 13:43 <jnewbery> the paper I linked to talks about Haskell monads
82 13:43 <theStack> michaelfolkson: i think the OOM concerns were primarily about it being unlimited
83 13:43 <sipa> the extended filter was including all data elements pushed in scriptSigs i believe, which i think was a huge mistake in bip37
84 13:43 <sipa> or sPKs, i don't remember
85 13:44 <michaelfolkson> So could bump it theStack? Not a big deal but might as well push it out as far as possible if no downsides?
87 13:44 <fjahr> The extended filter contains extra data that is meant to enable applications
88 13:44 <fjahr> with more advanced smart contracts. An extended filter MUST contain exactly the
89 13:44 <fjahr> following items for each transaction in a block ''except the coinbase'':
90 13:44 <fjahr> * Each item within the witness stack of each input (if the input has a witness)
91 13:44 <fjahr> * Each data push in the scriptSig of each input
92 13:44 <jnewbery> (I haven't watched it, but it looks interesting)
93 13:45 <fjahr> (the removed part of BIP158 on extended filter type)
94 13:45 <theStack> sipa: putting all scriptSigs in a filter sounds crazy... would there ever be any use to match signatures?
95 13:46 <theStack> michaelfolkson: in my opinion it could be bumped, the memory usage is negligible, whether it is 2000 or e.g. 4000...
97 13:47 <sipa> jkczyz: i can't remember seeing that
98 13:47 <sipa> theStack: the idea was presumably that it allows matching for the full pubkey
100 13:50 <michaelfolkson> One final one. I know we are going through these PRs pretty quickly week by week. Do they need longer time to be reviewed? Or are they pretty much ready to all be merged?
101 13:50 <theStack> sipa: yes matching only the public key part of scriptSig data sounds reasonable... the signatures themselves not i think (don't know though how easy it is to distinguish)
102 13:51 <michaelfolkson> There should be a few weeks on the first one, a few weeks on the second one etc etc?
103 13:51 <jnewbery> michaelfolkson: they're ready to be merged once they've been reviewed and the maintainers consider them safe to merge.
104 13:51 <sipa> theStack: imho the only thing you should ever care about is the entire sPK as a whole
106 13:53 <jnewbery> (and just means that there may be conflicts that require rebase and rereview)
107 13:54 <jnewbery> The next PR in the sequence is "Serve cfilters requests", which is pretty much more of the same, so I suggest we don't do a review club on that and reconvene when "Signal NODE_COMPACT_FILTERS support" is open.
108 13:55 <theStack> michaelfolkson: concerns about too quick merging on the first BIP 157 PR were expressed, see last two comments on #18877
109 13:55 <michaelfolkson> Makes sense. Doesn't seem applicable here but generally there is a trade-off of waiting for more reviews versus getting the merging over with.
110 13:56 <jnewbery> michaelfolkson: it's down to the maintainer's judgement. If they think it's received enough reviews, then they'll merge it
111 13:56 <fjahr> I also found it a bit unusual that there was a merge target date. On the other hand I agree, the code has been out there for a long time, just in different commits
112 13:56 <jnewbery> There's also nothing to stop people from reviewing a PR after it's been merged.
113 13:57 <michaelfolkson> Not asking because I'm critical, just asking to understand the thought process better ;)
114 13:57 <michaelfolkson> But thanks
115 13:57 <sipa> i also find it strange to have a merge target date
116 13:58 <jnewbery> I put a target date to hold myself accountable to make progress. I'm trying to turn around all review comments on these PRs in less than 24 hours.
117 13:59 <michaelfolkson> I suppose this set of PRs has a unique history of being held up for years due to a couple of Concept NACKs too. If the code has been sitting there for all that time...
118 14:00 <jnewbery> It seems like a monumental waste of effort to have a PR open for 9 months where review happens, there's some delay before the author responds, the reviewer gets bored, the PR needs to be rebased, etc
119 14:00 <michaelfolkson> Yup
120 14:01 <theStack> sipa: hmm if you only put sPK data in a block filter you couldn't detect if a block spent a utxo of your interest, or did you mean something different?
121 14:01 <jnewbery> I hope it's obvious that I'm not proposing that these things get merged on those dates if there hasn't been enough review
122 14:02 <sipa> theStack: BIP158 includes the sPK of UTXOs when they're created *and* when they're spent
123 14:02 <fjahr> jnewbery, Bitcoin Scrum Master :p
124 14:02 <michaelfolkson> Yeah that was clear to me jnewbery
125 14:03 <theStack> sipa: ah, that makes sense then
126 14:03 <jnewbery> jfahr: haha. I'll pass on that title if you don't mind
128 14:04 <jnewbery> alright that's time. Thanks everyone!
129 14:05 <jnewbery> #endmeeting
131 14:05 <theStack> so far i didn't have the impression that something got merged without a good amount of ACKs -- but i guess it's always a good thing if people stay watchful
132 14:05 <theStack> thanks!
133 14:05 <michaelfolkson> Thanks. Great work on progressing these jnewbery
134 14:05 <jkczyz> thanks jnewbery
136 14:08 <jnewbery> (until a couple of weeks ago when I rebased it)
137 14:10 <michaelfolkson> Or general understanding that there was some limited opposition at a conceptual level?
138 14:10 <michaelfolkson> PRs languish due to lack of interest or opposition right? It didn't seem like the former to me but I could be wrong
140 14:17 <michaelfolkson> Yeah interesting. I don't envy the maintainers. We had a Socratic with Jeremy Rubin on CTV and it seems like the design for that proposal has benefitted from the perceived opposition over time
141 14:18 <michaelfolkson> Seems like a much stronger proposal now than if it had been rushed through years ago. But knowing when to delay or oppose versus supporting it being merged seems like a fiendish responsibility
142 14:19 <michaelfolkson> Above my pay grade :)
143 14:19 <theStack> my assumption is that the potential for opposition or controversy would be way larger if it was enabled by default