net processing: Reduce resource usage for inbound block-relay-only connections (p2p)

https://github.com/bitcoin/bitcoin/pull/22778

Host: larryruane  -  PR author: jnewbery

The PR branch HEAD was 9db82f1 at the time of this review club meeting.

Notes

  • Each of our node’s peers is either inbound (our peer initiated the connection) or outbound (we initiated the connection). Once the connection is established, the relationship is largely symmetric (the protocol works the same in both directions), although we treat outbound peers with some preference.

  • When we initiate an outbound connection, we may decide that we won’t relay (forward) transactions or addresses to this peer. This peer is designated a block-relay-only peer. This kind of peer connection was covered in earlier review club meetings (PRs 15759, 19858, 20726). The peer doesn’t know that we’re treating it as block-relay-only.

  • When connecting with a peer, we allocate a Peer object in the net_processing application layer (and a Cnode in the connection layer). If we plan to announce and send transactions and addresses to this peer (for example, if it’s not a block-relay-only peer), we create an additional TxRelay struct object (owned by the Peer object) to track the state needed for relaying transactions to this peer.

  • The TxRelay object includes:
    • the list of transactions we plan to announce to this peer
    • the rolling bloom filter of transactions we expect the peer to know about (because either we sent the transaction to the peer, or it has sent it to us)
    • if we’ve enabled BIP37 (see below), the bloom filter sent by the peer
  • If we advertise support for BIP37 bloom filters to our peer during version handshake, the peer can send us a bloom filter after which we won’t send transactions that aren’t present in the filter (to reduce bandwidth).

  • A node which initiates a connection may indicate that it doesn’t want to receive transactions. It does this using the fRelay field in the version message.

  • If we’ve advertised support for bloom filters to this peer, it may, at any time later, send us a filterload message, which has the side-effect of enabling transaction relay from us to this peer, even if it had sent relay = false in the version message.

Questions

  1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK?

  2. What’s a bloom filter?

  3. Why are some peers block-relay-only? Briefly, what’s the purpose of having some peers being block-relay-only?

  4. This PR reduces resource usage. Which resource is reduced and by how much?

  5. Why is the TxRelay object separate from the Peer object?

  6. Why is NODE_BLOOM not the default behavior (why does a node choose to advertise it, or not, using a service bit)? Why do most nodes apparently not enable this service?

  7. Why would a peer send us fRelay = false in the version message? Why might a peer send us fRelay = false if it wasn’t planning to send us a filterload?

  8. This PR is divided into four commits. What are some principles we should follow when deciding on how to distribute the changes among the commits?

Bonus Questions

  1. This PR adds GetTxRelay() to safely (using a mutex) return a pointer to a peer’s TxRelay structure. If the a call to GetTxRelay() (for example, this one) returns a non-null pointer, then the pointer is used without holding the mutex. What prevents the TxRelay structure from being deallocated just after the pointer to it is returned by GetTxRelay(), making the pointer invalid?

  2. Some class variables are annotated mutable. What does this mean? Hint: This is fairly common with mutex variables.

  3. The pointer variable tx_relay is initialized as const, but it’s used to mutate the TxRelay structure it’s pointing to. How is that possible?

  117:01 <larryruane> #startmeeting
  217:01 <svav> Hi
  317:01 <larryruane> Welcome to PR Review Club! This meeting is intended for beginners to learn about the Bitcoin Core codebase and how to review PRs.
  417:01 <brunoerg> hi
  517:01 <larryruane> Today we're looking at PR #22778 "net processing: Reduce resource usage for inbound block-relay-only connections"
  617:02 <larryruane> Notes are here https://bitcoincore.reviews/22778
  717:02 <yashraj> hi
  817:02 <larryruane> Please feel free to ask questions whenever you want, and don't worry about interrupting - this meeting is for learning!
  917:02 <BlueMoon> Good afternoon everyone, greetings from Mexico.
 1017:02 <larryruane> Who had a chance to review the PR and/or look at the notes? How about a y/n from people who are here
 1117:02 <OliverOff> y
 1217:03 <svav> Looked at notes
 1317:03 <BlueMoon> I checked it a bit
 1417:03 <yashraj> y, mostly notes
 1517:03 <larryruane> If there are any first-time people here, please say let us know and say Hi if you'd like!
 1617:04 <larryruane> As you can see, this PR was merged last week, but that's okay, we can still a lot from it
 1717:04 <theStack> hi
 1817:04 <larryruane> (we decided to review this PR before it was merged)
 1917:05 <svav> Welcome to any newcomers. How did you hear about this meeting?
 2017:06 <effexzi> Hi every1
 2117:07 <b_1o1> hi all
 2217:07 <larryruane> Hi to all! I guess we can start in, we always like to know, if you've reviewed the PR, how did you go about it?
 2317:08 <larryruane> Or in general, for any PR, what are your favorite tips for reviewing?
 2417:08 <BlueMoon> I did some reading on incoming and outgoing connections.
 2517:09 <larryruane> Mine personally is to checkout the branch locally, start up vscode (or just "code" on linux), which has a nice git integration, and look at the commits separately... then I also sometimes run the debugger on both one of the python tests and the node (bitcoind)
 2617:10 <larryruane> BlueMoon: that's good, can you tell us at a high level what's the difference between those kinds of connections?
 2717:11 <yashraj> basically tried to read meeting logs and github conversations for this and related PRs. Everything's just a rabbithole for noobs
 2817:11 <BlueMoon> It seems to me that if our peer initiates the connection it is an incoming peer, if we initiate the connection the peer is an outgoing peer.
 2917:11 <larryruane> BlueMoon: yes exactly!
 3017:12 <BlueMoon> Thanks!
 3117:12 <larryruane> when we initiate the connection (an outgoing connection), there are multiple types, two main ones that are most important for today's PR, can anyone say what they are and what's the difference between them?
 3217:14 <OliverOff> one kind transmits both block data and mempool data (unconfirmed txs), the other kind transmits only block data and it's called block-relay-only connections
 3317:14 <svav> Types are outbound-full-relay connection and block-relay-only connection
 3417:15 <svav> outbound-full-relay will gossip address and transaction information, block-relay-only will not
 3517:15 <larryruane> yes, good, here's the full list: https://github.com/bitcoin/bitcoin/blob/master/src/net.h#L131
 3617:17 <larryruane> why is there such a thing as block-relay-only connections?
 3717:18 <svav> It is for improved security, to help prevent partition attacks
 3817:18 <larryruane> why was that connection type added to the software?
 3917:18 <larryruane> svav: yes, exactly
 4017:18 <svav> For network security. By not relaying transactions or addresses, these connections are harder to detect by a third party, thus helping obfuscate the network topology.
 4117:19 <BlueMoon> The importance of propagation speed is crucial for network decentralisation.
 4217:19 <OliverOff> to make it more difficult to identify the networks topology, which could lead to privacy loss (identifying the IP that originated a transaction) and facilitate GBP attacks
 4317:20 <larryruane> yes, the block-relay-only connection type was added by PR 15759 (2019) https://github.com/bitcoin/bitcoin/pull/15759
 4417:20 <larryruane> you can see some good explanation there about why it was added
 4517:21 <larryruane> notice that block-relay-only applies only to outbound connections, not inbound
 4617:21 <theStack> an interesting fact is that there is also a "blocksonly" mode for reducing traffic by turning off transaction relay completely (option -blocksonly); this can be confusing due to the similar naming to "block-relay-only"... at least it was for me :)
 4717:22 <OliverOff> s/GBP/BGP (Border Gateway ProtocoĂźl)
 4817:23 <larryruane> theStack: yes excellent, i was just going to mention that next! setting up a block-relay-only peer affects only how we interact with that peer, whereas `blocksonly=` (configuration option) affects our behavior with all our peers
 4917:23 <larryruane> do we still have a mempool if we are `blocksonly`?
 5017:25 <theStack> i would guess yes for submitting transactions via wallet or RPC calls? of course it would be pretty empty most of the time
 5117:25 <OliverOff> thought so too but what's the point of submitting if you're not transmitting this tx to the rest of the network?
 5217:25 <OliverOff> only if you're mining yourself
 5317:26 <larryruane> yes I think that's right, maybe the node can use fewer resources (less memory etc.)
 5417:27 <larryruane> you can still initiate a transaction and relay it out to the network, but it's bad for privacy because everyone will know that you initiated it
 5517:28 <larryruane> Okay feel free to continue on that topic, but question 2, what's a bloom filter (very basically)?
 5617:29 <theStack> (for those interested, this is a thread in the bitcointalk board where gmaxwell gives a bit background and bandwidth savings results for the back then newly introduced -blocksonly mode: https://bitcointalk.org/index.php?topic=1377345.0)
 5717:29 <larryruane> theStack: that's great, thanks!
 5817:29 <yashraj> theStack: thanks
 5917:30 <BlueMoon> Thanks, this is very interesting :)
 6017:30 <svav> A Bloom filter is a space-efficient probabilistic data structure, conceived by Burton Howard Bloom in 1970, that is used to test whether an element is a member of a set.
 6117:30 <OliverOff> this interactive learning material was personally helpful in (re)learning about bloom filters: https://llimllib.github.io/bloomfilter-tutorial/
 6217:30 <svav> Bloom filters can have false positives but not false negatives.
 6317:31 <yashraj> OliverOff: thanks!
 6417:31 <larryruane> svav: OliverOff: thanks, very helpful explanation and link
 6517:31 <BlueMoon> Thanks for the link, very nice.
 6617:33 <larryruane> so it's a way of compactly storing a set of values, but you can only test if a value is present (you can't iterate a bloom filter like a map or a vector), and you can get false positives (the filter can say an item is there when it actually isn't)
 6717:33 <b_1o1> OliverOff: thanks!
 6817:33 <larryruane> Ok so briefly, how are bloom filters used in the bitcoin P2P protocol? (just briefly)
 6917:34 <OliverOff> nodes tell you a filter and ask you to only relay transactions that pass that bloom filter
 7017:35 <sipa> Only when BIP37 is enabled, which isn't the case for most nodes.
 7117:35 <BlueMoon> <3
 7217:35 <sipa> There are probably more than 3 nodes with it enabled
 7317:35 <sipa> ;)
 7417:36 <larryruane> OliverOff: good! sipa: yes, that's question 6, why do most nodes not support this bloom filtering service (that is they don't set the `NODE_BLOOM` service bit in the version message they send to their peers)?
 7517:37 <sipa> Apologies for skipping ahead!
 7617:37 <larryruane> oh no that's fine we've already been doing that!
 7717:40 <larryruane> bloom filtering (offering that service) used to be enabled by default (you always had the option of disabling it), but it was changed to default-disabled here: https://github.com/bitcoin/bitcoin/pull/16152
 7817:41 <larryruane> that PR's description indicates the reason .. but it still wasn't entirely clear to me, until I found this: https://blog.lopp.net/could-spv-support-a-billion-bitcoin-users-sizing-up-a-scaling-claim/
 7917:42 <sipa> We use bloom filters for a lot of things in the p2p code and validation. BIP37 is the only place where it's specifically part of the protocol, in order to perform server-side filtering.
 8017:42 <larryruane> search on the page for "SPV server scaling"
 8117:43 <yashraj> looks like bloom filters are used in a bunch of places based on PR 15759 meeting log?
 8217:43 <BlueMoon> Thank you very much, I am learning a lot.
 8317:43 <sipa> But "disabling Bloom filters" is maybe a bit confusing, as it's just about disabling server-side filtering, not all other places where Bloom filters are used.
 8417:43 <sipa> LevelDB (which we use for the utxo database) uses Bloom filters internally, for example.
 8517:43 <larryruane> sipa: thanks, isn't there a newer data structure that's similar to but an improvement on bloom filters? I can't recall its name (if there is one)
 8617:43 <larryruane> sipa: oh I see, thanks
 8717:44 <sipa> Cuckoo filters!
 8817:44 <larryruane> oh that's right, thanks
 8917:45 <sipa> BIP158 (client-side filtering) uses yet another datastructure (golomb filters), which are smaller than Bloom filters and cuckoo filters, but don't support efficient updating after creation.
 9017:46 <larryruane> Thanks, I've learned a lot. Okay so finally we're ready to consider question 4, this PR reduces resource usage, which resource and roughly how much?
 9117:46 <OliverOff> is BIP158 going to supersede BIP37 with regards to SPV?
 9217:46 <theStack> it seems that the "pure" bloom filter CBloomFilter is only used for BIP37, whereas the rolling bloom filter CRollingBloomFilter is used at more places (if i interpreted my "git grep" results correctly :))
 9317:47 <sipa> OliverOff: Yes and no - they're very different designs. BIP157/BIP158 can be used for some things BIP37 was, but not all, and the other way around.
 9417:47 <BlueMoon> I found this https://blog.trezor.io/bip158-compact-block-filters-9b813b07a878
 9517:47 <larryruane> and what is CRollingBloomFilter? Is it one that supports removing items? (that's just a guess)
 9617:48 <sipa> No, it's a Bloom filter from which elements expire.
 9717:48 <OliverOff> and what was the DoS problem with BIP37 that warranted it to be disabled by default? was it sending an avalanche of `filterload` and `filterclear` messages?
 9817:48 <sipa> You can't explicitly delete anything from it, but they only guarantee that inserted elements remain in it for a number of insertions that follow.
 9917:49 <larryruane> OliverOff: No, the best explanation I've seen is the article by Lopp that I linked to above
10017:49 <sipa> OliverOff: The problem with BIP37 is that it causes disk and CPU load that is not proportional to the bandwidth.
10117:49 <OliverOff> sipa: @larr
10217:49 <sipa> BIP37 is also a privacy nightmare.
10317:49 <OliverOff> thanks
10417:50 <sipa> OliverOff: You can send a filter that matches nothing, and then ask for all blocks in the chain. Now the attacker gets only tiny messages (empty filtered blocks), while the victim is reading through the entire blockchain and processing every tx in it.
10517:50 <sipa> Ideally, we want some cost to the attacker (such as bandwidth) that is proportional to how much work they're causing the victim to perform.
10617:51 <theStack> BIP37 is, if at all, used mostly in local networks nowadays i would guess, e.g. if you want to connect a wallet to your own node?
10717:51 <sipa> BIP37 was made optional in BIP111.
10817:52 <larryruane> To maybe help answer question 4, let's look at question 5, what is the `TxRelay` object (struct), and why is it separate from the `Peer` object (also a struct)?
10917:52 <yashraj> About Q4: we save on resource usage by not initialising TxRelay if invound peer has indicated they will never request transaction relay.
11017:53 <svav> Re Q4 we are saving memory
11117:53 <larryruane> yashraj: svav: yes exactly, and why is it helpful to not create an instance of `TxRelay`?
11217:53 <larryruane> (if we don't need to, obviously :) )
11317:54 <BlueMoon> <3
11417:54 <svav> TxRelay is held in memory
11517:54 <sipa> Peer is also held in memory.
11617:55 <svav> It is a case of not creating a TxRelay resource for a particular incoming node, when we know it will never be required
11717:57 <yashraj> do we have to create a TxRelay for every inbound peer separately and if so why? sorry if this is completely wrong I'm (very) new.
11817:58 <svav> if we have not offered NODE_BLOOM to the peer and it has set fRelay to 0, then we know that it will never request transaction announcements, and that we can save resources by not initializing the TxRelay data structure for that incoming node
11917:58 <larryruane> svav: yes, `TxRelay` contains only state that's needed for relaying transactions (outward) to a particular peer ... so if we never need to send transactions to this peer, then there's no reason to allocate one of these for this peer
12017:59 <svav> yashraj the reason for this PR is that there are some circumstances that we know a TXRelay will never be needed for a particular incoming node, and therefore we save memory resources in our node by not creating it - I think!
12117:59 <larryruane> yashraj: yes, the `TxRelay` is per-peer (whether inbound or outbound), because it contains stuff like: which transactions does this peer already know about?
12218:00 <larryruane> we're at time, feel free to continue the discussion! Thanks to all for attending!!
12318:00 <larryruane> #endmeeting

]