BIP 157: Signal support for compact block filters with NODE_COMPACT_FILTERS (p2p)
Jun 4, 2020
The PR branch HEAD was f5c003d3e at the time of this review club meeting.
This is the fourth PR in our special series on the BIP 157
implementation. See the previous meetings on PR 18877, PR
18960 and PR 19010.

Notes
This PR adds support for the
NODE_COMPACT_FILTERS service bit, used in version and
addr messages to signal a node’s capabilities. The existing
service bits are defined in protocol.h.
Prior to this PR, even if a node is configured to serve
cfcheckpt messages, it won’t signal that capability to
peers on the network.
This PR includes two important changes from the original implementation:

- PrepareBlockFilterRequest() no longer calls
BlockUntilSyncedToCurrentChain(), to avoid blocking the message handler.
- -peerblockfilters can be set and
NODE_COMPACT_FILTERS can be signaled
before the index is fully built.
See the PR for the rationale for those changes and jimpo’s
responses. See also the
prior discussion in PR 16442.

Questions
What are the implications of
PrepareBlockFilterRequest() calling into
BlockUntilSyncedToCurrentChain()? What are the implications of not calling
BlockUntilSyncedToCurrentChain()? Which is preferable? Are there any
alternative solutions?
What are the potential problems with a node signaling
NODE_COMPACT_FILTERS
before its compact block filters index is built? What are the potential
issues if a node changes its service bits dynamically?
1 13:00 <jnewbery> #startmeeting
2 13:00 <jnewbery> Hi! Welcome to the final special BIP 157 review club meeting.
3 13:00 <michaelfolkson> hi
6 13:01 <jnewbery> I think this might be the most interesting of the BIP 157 PRs, since it touches on some protocol design questions
7 13:02 <jnewbery> The code changes are small and simple, but the design decisions have important implications for BIP 157 clients
8 13:02 <jnewbery> is someone able to summarize the changes?
10 13:03 <jnewbery> hi jon!
11 13:03 <michaelfolkson> You mean the changes between jimpo's code and your code or this PR generally?
12 13:04 <jnewbery> ok, I'll summarize. It's adding signaling for BIP 157 support, using the NODE_COMPACT_FILTERS service bit in version and addr messages
13 13:05 <jnewbery> michaelfolkson: I meant generally in this PR, but if you want to summarize the differences from Jimpo's implementation, that'd be great
14 13:06 <michaelfolkson> Ok. One of the changes is that your code signals a node can provide filters temporarily when it actually can't (as I understand)
15 13:07 <jnewbery> That's not exactly it
16 13:07 <jnewbery> it'll signal support for filters before it's fully built the cache. It's able to serve filters from blocks that it's processed
17 13:08 <michaelfolkson> Ah ok gotcha
18 13:09 <jnewbery> sorry, s/cache/index
19 13:09 <michaelfolkson> So it is not as simple as saying willing to and unable to versus willing to and able to
21 13:10 <jnewbery> with this PR, a node that signals NODE_COMPACT_FILTERS will always be able to serve the filters that it has built, it's just possible that it hasn't fully built the index yet
22 13:11 <michaelfolkson> I'm stuck in this IBD paradigm. Although there are some parallels they don't map perfectly
23 13:11 <jnewbery> There's one other change, which is that PrepareBlockFilterRequest() no longer calls BlockUntilSyncedToCurrentChain().
24 13:11 <jnewbery> What are the implications of PrepareBlockFilterRequest() calling into BlockUntilSyncedToCurrentChain()? What are the implications of not calling BlockUntilSyncedToCurrentChain()? Which is preferable? Are there any alternative solutions?
25 13:12 <jnewbery> michaelfolkson: it's very similar to IBD. A node will signal NODE_NETWORK before it has completed IBD
26 13:12 <jkczyz> Calling it can prevent other messages from being processed. Not calling it may lead to requests not being served if the index does not contain the filter for the requested block
27 13:12 <jnewbery> that means "I can serve blocks". It doesn't mean "I'll be able to serve you any block that you request"
28 13:13 <jnewbery> jkczyz: exactly. Calling that function blocks the message handling thread on the scheduler thread, which I think we should avoid unless absolutely necessary
29 13:13 <jnewbery> the scheduler thread may be doing slow things like flushing the wallet to disk
30 13:14 <jkczyz> jnewbery: What if only some of the requested filters are available? Will we respond with a partial set?
31 13:15 <jkczyz> Haven't looked into this myself but it just came to mind
32 13:15 <jnewbery> And not blocking on the scheduler thread means that if we receive a request before the BlockConnected callback to the compact block index has been called, we won't be able to respond to that request
33 13:16 <jnewbery> I think we'd reply with just a partial set. Let me check
34 13:17 <jnewbery> oh no, I'm wrong, we wouldn't respond with any of the filters
37 13:19 <jonatack> the original impl would have also entailed full node users having to stop and restart the node once the index finished building
38 13:19 <jkczyz> michaelfolkson: I was trying to think if there is some analogy there. As in, if a node knows it shouldn't request a block from a peer, could there be a similar mechanism for filters
39 13:20 <jnewbery> jonatack: right. Well actually the original original implementation toggled the service bit dynamically. Some reviewers (including me) didn't like that, so jimpo changed it to be toggled manually
40 13:20 <michaelfolkson> But what is the reason for not responding with a partial set? Because filters are smaller than blocks and therefore you should expect to receive all of them in one go?
41 13:20 <jonatack> yeah, dynamic isn't better, for the reasons everyone stated
43 13:22 <jnewbery> so the peer knows not to ask for higher blocks than that
44 13:22 <michaelfolkson> jkczyz: Yeah they do not map perfectly but I think they share enough characteristics such that there are a few questions like "Well IBD doesn't do what filter serving is doing. Why not?"
45 13:22 <MarcoFalke> with block filters the peer also knows not to ask for filters of blocks that the node hasn't announced, no?
46 13:24 <jnewbery> MarcoFalke: that's what the BIP says, but I think it's not a great design. I think it should be fine for a node to receive and relay a block before it's built the filter
47 13:24 <jnewbery> tying those two functions together seems like a bad idea
48 13:26 <jnewbery> it constrains implementers to couple together internal components that needn't be (eg here, blocking message handling on the scheduler thread so that the index gets built)
49 13:26 <jkczyz> jnewbery: ah, so doing so would also require re-introducing BlockUntilSyncedToCurrentChain?
50 13:26 <MarcoFalke> I see your point, but the validation interface queue needs to be processed either way at some point. Otherwise it will "overflow"
51 13:27 <jonatack> is it relatively common to see implementation development pushback and drive BIP updates?
52 13:27 <jnewbery> jkczyz: it requires some kind of synchronicity
53 13:27 <MarcoFalke> jonatack: Often issues with a BIP are only realized after it has been deployed. (E.g. bloomfilters, ouch)
54 13:28 <sipa> MarcoFalke: that's a fair example, but the issue there wasn't really discovered through implementation
55 13:28 <michaelfolkson> Steelmanning jimpo's argument. Changing the BIP is no big deal (imo).
56 13:28 <MarcoFalke> jup, not implementation, but more analysis
57 13:28 <michaelfolkson> "The BIP was designed to expose a consistent interface to clients, which is conceptually simpler, making it easier to implement client code correctly."
58 13:29 <michaelfolkson> You disagree with this jnewbery?
59 13:29 <jnewbery> one other way to achieve the same would be to have some kind of 'queue' of outstanding BIP157 requests, and if they can't be serviced, delay processing them (I put queue in quotes because I think we'd want to limit it to one request)
60 13:29 <sipa> jnewbery: i think you can also look at it the other way... why should an implementation detail in core affect the BIP?
61 13:29 <jonatack> MarcoFalke: thanks. that's not terribly surprising.
62 13:30 <sipa> for all kinds of network-serviceable data peers can reasonably expect synchronicity... if you've processed a block, you can give it to others
63 13:30 <sipa> the block filter is weird in that it started as an optional index, but now is becoming a network-serviceable thing
65 13:30 <jnewbery> (requests from a single peer are still treated serially)
66 13:31 <jnewbery> I don't think we should do this because it adds complexity, but there are potentially other ways to achieve what's wanted
67 13:32 <sipa> and if we want to say do block processing in a separate thread as well, like index building, we'll be faced with the same issues, and they need to be solved in a way that doesn't break the protocol synchronicity guarantees
68 13:32 <sipa> so it feels a bit strange to treat a current implementation issue with filters specially
69 13:33 <jnewbery> sipa: I think my point is that we should avoid synchronicity between the core functionality of relaying blocks and the optional functionality of serving filters
70 13:33 <sipa> jnewbery: well we already have that, as long as it's an RPC interface
71 13:33 <sipa> if it's a P2P thing, it should behave P2P like
72 13:34 <jnewbery> I don't think the block example is the same. We wouldn't relay a block through headers if we hadn't yet processed it
73 13:35 <jkczyz> I've heard discussion around eventually committing the filter in the block (sorry if that terminology is not precise) -- at that point, does this problem simply go away (i.e., would the index be updated when the block is connected)? Or does the problem still exist for older blocks?
74 13:35 <sipa> jnewbery: i'm not sure i see the difference
76 13:38 <jnewbery> sipa: if we send a header to a peer, we've already processed the block, so we can respond to a getdata request for it
77 13:38 <jnewbery> the header message itself is an indication that we can serve the block
78 13:38 <jnewbery> but why should it also be an indication that we have built the filter
79 13:39 <sipa> so perhaps we shouldn't send a header message to BIP157 peers before we've built the filter for it?
80 13:39 <jnewbery> we don't know which peers are BIP157 clients
82 13:39 <sipa> that's annoying
83 13:39 <jnewbery> the service bit is on the server side
84 13:41 <jnewbery> jkczyz: if the filter is committed, then presumably it's required for validation, and therefore a peer sending you a header for that block indicates that they have the filter
85 13:41 <MarcoFalke> We know they are clients when they send the first request, so what about disconnecting them instead of having them twiddle their thumbs for 20 seconds (or whatever the timeout is)
86 13:41 <jnewbery> (again, it's slightly different for compact blocks where we relay a compact block before fully validating)
87 13:42 <sipa> yeah, compact blocks are explicitly asynchronously processed, as an exception to the usual protocol guarantees, with a good reason
88 13:42 <sipa> it seems BIP157 is written with an assumption of synchronous processing
89 13:43 <jnewbery> sipa: right, and with an assumption that servers will always be able to service any request
90 13:44 <sipa> and actually implementing that seems to require not announcing headers until the index has caught up?
91 13:44 <jnewbery> I'd prefer to not add this message-handler thread blocking, but I don't think it's the end of the world if we do. We're talking about very uncommon circumstances (usually I expect we'd be able to build the filter in the round-trip time between sending a header and receiving a getcfilters request)
92 13:45 <sipa> given that this functionality is opt-in, perhaps that's not too crazy?
93 13:45 <MarcoFalke> sipa: Yes, that's a different issue from the thread blocking. I'd say we should be nice to peers and disconnect them
94 13:46 <sipa> MarcoFalke: i agree
95 13:46 <jonatack> MarcoFalke: agree
96 13:46 <sipa> (independently of the thread blocking issue)
97 13:46 <MarcoFalke> (when the index itself is in "ibd" or syncing phase)
98 13:46 <MarcoFalke> which might happen when the node itself is already out of ibd
99 13:48 <jnewbery> sorry, to be clear about what you're all agreeing to: you think that if we've completed IBD but not yet finished building the index, we should disconnect any peers that request cfilters?
100 13:48 <MarcoFalke> jup, any blockfilter message that we can't service
101 13:48 <sipa> i think that whenever we would be not responding to a cfilters request, we should disconnect instead
102 13:50 <sipa> i also think it would be cleaner that whenever reasonable (post IBD?), we should wait for the index to sync and respond, instead of not responding/disconnecting
103 13:50 <jnewbery> I think that's reasonable, but would need to think about it
104 13:50 <MarcoFalke> if the peer asks for the filters from block [1000, 1500] and we can send them, sure. Though if we are out of ibd and at block 700k and they ask for the filter for block 700k -> disconnect
105 13:50 <jonatack> In general, when faced with a necessary reasonable tradeoff between user experience for full node users versus that of light clients, I'd favor the full node users.
106 13:50 <jnewbery> sipa: so add the BlockUntilSyncedToCurrentChain() back?
107 13:51 <MarcoFalke> jnewbery: That one can't be solved by a call to Sync()
108 13:51 <jnewbery> MarcoFalke: I understand. Trying to clarify sipa's "we should wait for the index to sync and respond"
109 13:52 <MarcoFalke> ah, missed that msg
110 13:52 <sipa> it feels to me that BIP157, as written, assumes synchronicity - so we should either implement it that way, or change the BIP
111 13:52 <sipa> or not implement it
112 13:52 <sipa> but this is not a trivial change
113 13:53 <jnewbery> We only have 10 minutes left, but I did want to cover the second question
114 13:53 <jnewbery> What are the potential problems with a node signaling NODE_COMPACT_FILTERS before its compact block filters index is built? What are the potential issues if a node changes its service bits dynamically?
115 13:53 <sipa> and given that the protocol is an entirely opt-in extension, i think just adding syncs to make it act synchronously is not a big deal
116 13:53 <jnewbery> We've already talked quite a bit about some of the implications, but I was specifically interested in how the service bits are cached in various places
117 13:54 <jnewbery> like in other nodes' addrmans, and in the seed nodes
118 13:54 <sipa> they're overwritten on connection, so i don't think changing them on the fly is a big deal - as long as a reasonable degree of caching remains useful
119 13:54 <jnewbery> if the service bits change dynamically, is it possible that other nodes will be gossipping our old service bits for a long time?
120 13:55 <sipa> if you'd expect service bits to flip-flop constantly, there is a problem
121 13:56 <sipa> but i don't see much of a problem with changing them once a condition is fulfilled so that they won't change back
122 13:56 <MarcoFalke> It shouldn't matter much either way
123 13:56 <michaelfolkson> jnewbery: I don't know... It depends on how effective gossip floods the entire network?
124 13:56 <michaelfolkson> *effectively
125 13:57 <michaelfolkson> It is not flapping. It is old service bit -> new service bit right?
126 13:58 <michaelfolkson> It is one way. It doesn't return back to the old service bit ever
127 13:58 <jnewbery> Right. It's going from not present to present
128 13:58 <MarcoFalke> It latches to true (assuming no restart)
129 13:58 <sipa> i think (but i'm not sure) that the propagation of the gossip addr messages with the new flag is just as fast as the original announcement
130 13:59 <jnewbery> how does the new flag overwrite the old flag?
131 14:00 <sipa> the addr message with the new flag in is gossiped, which will overwrite the entry in addrman
132 14:00 <sipa> upon connection to the actual peer it's also overwritten
133 14:00 <MarcoFalke> sipa: But the light peer won't connect if the flag isn't set
135 14:01 <MarcoFalke> So it should be slightly slower
136 14:01 <jnewbery> I think I need to look more into how addrs are gossipped and how old data is refreshed
137 14:01 <sipa> i don't think there is a difference really
138 14:02 <sipa> if you initially gossip with flag "incorrectly" on, feature-needing peers will connect, and need to disconnect and retry a different peer
139 14:02 <MarcoFalke> sipa: Oh your point is that the propagation into addrmans does not slow down?
140 14:02 <MarcoFalke> I assumed "light clients will connect to you just as fast as if you were setting the flag a day in advance"
141 14:02 <sipa> yeah, my claim - but i'd need to look this up myself - is that the "updating" of the flag is not slower whether or not it has been broadcast already in the past
142 14:03 <jonatack> hmmmm
143 14:03 <sipa> so there is just the window during which peers think you're still on the old feature, but you're really already synced up to the new one, that matters
144 14:03 <jnewbery> ok, that's time! Thanks for input, everyone.
146 14:04 <michaelfolkson> Thanks jnewbery
147 14:04 * jnewbery goes to dig into addr gossip
148 14:04 <MarcoFalke> jnewbery: Thanks for hosting. And thanks for picking up this pull request. I don't think it would have made it this far by now if you hadn't put in the effort!
149 14:04 <jonatack> this ^
150 14:05 <jkczyz> yeah, thanks jnewbery
151 14:05 <jonatack> thanks! very interesting
152 14:06 <jnewbery> MarcoFalke: thanks, and thanks for all your review :)