The PR branch HEAD was f5c003d3e at the time of this review club meeting.
This is the fourth PR in our special series on the BIP
157
implementation. See the previous meetings on PR 18877, PR
18960 and PR 19010.
Notes
This PR adds support for the NODE_COMPACT_FILTERS service bit, used in
version and addr messages to signal a node’s capabilities. The existing
service bit are
defined in protocol.h.
Prior to this PR, even if a node is configured to serve cfilter,
cfheaders and cfcheckpts messages, it won’t signal that capability to
peers on the network.
This PR includes two important changes from the original implementation
in PR 16442:
PrepareBlockFilterRequest() no longer calls
BlockUntilSyncedToCurrentChain() to avoid blocking the message handler
thread.
-peerblockfilters can be set and NODE_COMPACT_FILTERS can be signaled
before the index is fully built.
What are the implications of PrepareBlockFilterRequest() calling into
BlockUntilSyncedToCurrentChain()? What are the implications of not calling
BlockUntilSyncedToCurrentChain()? Which is preferable? Are there any
alternative solutions?
What are the potential problems with a node signaling NODE_COMPACT_FILTERS
before its compact block filters index is built? What are the potential
issues if a node changes its service bits dynamically?
<jnewbery> with this PR, a node that signals NODE_COMPACT_FILTERS will always be able to serve the filters that it has built, it's just possible that it hasn't fully built the index yet
<jnewbery> What are the implications of PrepareBlockFilterRequest() calling into BlockUntilSyncedToCurrentChain()? What are the implications of not calling BlockUntilSyncedToCurrentChain()? Which is preferable? Are there any alternative solutions?
<jkczyz> Calling it can prevent other messages from being processed. Not calling it may lead to requests not being served if the index does not contain the filter for the requested block
<jnewbery> jkczyz: exactly. Calling that function blocks the message handling thread on the scheduler thread, which I think we should avoid unless absolutely necessary
<jnewbery> And not blocking on the scheduler thread means that if we receive a request before the BlockConnected callback to the compact block index has been called, we won't be able to respond to that thread
<jkczyz> michaelfolkson: I was trying to think if there is some analogy there. As in, if a node knows it shouldn't request a block from a peer, could there be a similar mechanism for filters
<jnewbery> jonatack: right. Well actually the original original implementation toggled the service bit dynamically. Some reviewers (including me) didn't like that, so jimpo changed it to be toggled manually
<michaelfolkson> But what is the reason for not responding with a partial set? Because filters are smaller than blocks and therefore you should expect to receive all of them in one go?
<michaelfolkson> jkczyz: Yeah they do not map perfectly but I think they share enough characteristics such that there are a few questions like "Well IBD doesn't do what filter serving is doing. Why not?"
<jnewbery> MarcoFalke: that's what the BIP says, but I think it's not a great design. I think it should be fine for a node to receive and relay a block before it's built the filter
<jnewbery> it constrains implementers to couple together internal components that needn't be (eg here, blocking message handling on the scheduler thread so that the index gets built)
<michaelfolkson> "The BIP was designed to expose a consistent interface to clients, which is conceptually simpler, making it easier to implement client code correctly."
<jnewbery> one other way to achieve the same would be to have some kind of 'queue' of outstanding BIP157 requests, and if they can't be serviced, delay processing them (I put queue in quotes because I think we'd want to limit it to one request)
<sipa> and if we want to say do block processing in a separate thread as well, like index building, we'll be faced with the same issues, and they need to be solved in a way that doesn't break the protocol synchronicity guarantees
<jnewbery> sipa: I think my point is that we should avoid synchronicity between the core functionality of relaying blocks and the optional functionality of serving filters
<jkczyz> I've heard discussion around eventually committing the filter in the block (sorry if that terminology is no precise) -- at that point, does this problem simply go away (i.e., would the index be updated when the block is connected)? Or does the problem still exist for older blocks?
<jnewbery> jkczyz: if the filter is committed, then presumably it's required for validation, and therefore a peer sending you a header for that block indicates that they have the filter
<MarcoFalke> We know they are clients when they send the first request, so what about disconnecting them instead of having them turn thumbs for 20 seconds (or whatever the timeout is)
<jnewbery> I'd prefer to not add this message-handler thread blocking, but I don't think it's the end of the world if we do. We're talking about very uncommon circumstances (usually I expect we'd be able to build the filter in the round-trip time between sending a header and receiving a getcfilters request)
<jnewbery> sorry, to be clear about what you're all agreeing to: you think that if we've completed IBD but not yet finished building the index, we should disconnect and peers that request cfilters?
<sipa> i also think it would be cleaner that whenever reasonable (post IBD?), we should wait for the index to sync and respond, instead of not responding/disconnecting
<MarcoFalke> if the peer asks for the filters from block [1000, 1500] and we can send them, sure. Though if we are out of ibd and at block 700k and they ask for the filter for block 700k -> disconnect
<jonatack> In general, when faced with a necessary reasonable tradeoff between user experience for full node users versus that of light clients, I'd favor the full node users.
<jnewbery> What are the potential problems with a node signaling NODE_COMPACT_FILTERS before its compact block filters index is built? What are the potential issues if a node changes its service bits dynamically?
<jnewbery> We've already talked quite a bit about some of the implications, but I was specifically interested in how the service bits are cached in various places
<sipa> they're overwritten on connection, so i don't think changing them on the fly is a big deal - as long as a reasonable degree of caching remains useful
<sipa> yeah, my claim - but i'd need to look this up myself - is that the "updating" of the flag is not slower whether or not it has been broadcast already in the past
<sipa> so there is just the window during which peers think you're still on the old feature, but you're really already synced up to the new one, that matters
<MarcoFalke> jnewbery: Thanks for hosting. And thanks for picking up this pull request. I don't think it would have made it this far by now if you hadn't put in the effort!