net processing: Reduce resource usage for inbound block-relay-only connections (p2p)

Jun 8, 2022

https://github.com/bitcoin/bitcoin/pull/22778

Host: larryruane - PR author: jnewbery

The PR branch HEAD was 9db82f1 at the time of this review club meeting.

Notes

Each of our node’s peers is either inbound (our peer initiated the connection) or outbound (we initiated the connection). Once the connection is established, the relationship is largely symmetric (the protocol works the same in both directions), although we treat outbound peers with some preference.
When we initiate an outbound connection, we may decide that we won’t relay (forward) transactions or addresses to this peer. This peer is designated a block-relay-only peer. This kind of peer connection was covered in earlier review club meetings (PRs 15759, 19858, 20726). The peer doesn’t know that we’re treating it as block-relay-only.
When connecting with a peer, we allocate a Peer object in the net_processing application layer (and a Cnode in the connection layer). If we plan to announce and send transactions and addresses to this peer (for example, if it’s not a block-relay-only peer), we create an additional TxRelay struct object (owned by the Peer object) to track the state needed for relaying transactions to this peer.
The TxRelay object includes:
- the list of transactions we plan to announce to this peer
- the rolling bloom filter of transactions we expect the peer to know about (because either we sent the transaction to the peer, or it has sent it to us)
- if we’ve enabled BIP37 (see below), the bloom filter sent by the peer
If we advertise support for BIP37 bloom filters to our peer during version handshake, the peer can send us a bloom filter after which we won’t send transactions that aren’t present in the filter (to reduce bandwidth).
A node which initiates a connection may indicate that it doesn’t want to receive transactions. It does this using the fRelay field in the version message.
If we’ve advertised support for bloom filters to this peer, it may, at any time later, send us a filterload message, which has the side-effect of enabling transaction relay from us to this peer, even if it had sent relay = false in the version message.

Questions

Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK?
What’s a bloom filter?
Why are some peers block-relay-only? Briefly, what’s the purpose of having some peers being block-relay-only?
This PR reduces resource usage. Which resource is reduced and by how much?
Why is the TxRelay object separate from the Peer object?
Why is NODE_BLOOM not the default behavior (why does a node choose to advertise it, or not, using a service bit)? Why do most nodes apparently not enable this service?
Why would a peer send us fRelay = false in the version message? Why might a peer send us fRelay = false if it wasn’t planning to send us a filterload?
This PR is divided into four commits. What are some principles we should follow when deciding on how to distribute the changes among the commits?

Bonus Questions

This PR adds GetTxRelay() to safely (using a mutex) return a pointer to a peer’s TxRelay structure. If the a call to GetTxRelay() (for example, this one) returns a non-null pointer, then the pointer is used without holding the mutex. What prevents the TxRelay structure from being deallocated just after the pointer to it is returned by GetTxRelay(), making the pointer invalid?
Some class variables are annotated mutable. What does this mean? Hint: This is fairly common with mutex variables.
The pointer variable tx_relay is initialized as const, but it’s used to mutate the TxRelay structure it’s pointing to. How is that possible?

17:01

<larryruane> #startmeeting

17:01

<svav> Hi

17:01

<larryruane> Welcome to PR Review Club! This meeting is intended for beginners to learn about the Bitcoin Core codebase and how to review PRs.

17:01

<brunoerg> hi

17:01

<larryruane> Today we're looking at PR #22778 "net processing: Reduce resource usage for inbound block-relay-only connections"

17:02

<larryruane> Notes are here https://bitcoincore.reviews/22778

17:02

<yashraj> hi

17:02

<larryruane> Please feel free to ask questions whenever you want, and don't worry about interrupting - this meeting is for learning!

17:02

<BlueMoon> Good afternoon everyone, greetings from Mexico.

17:02

<larryruane> Who had a chance to review the PR and/or look at the notes? How about a y/n from people who are here

17:02

<OliverOff> y

17:03

<svav> Looked at notes

17:03

<BlueMoon> I checked it a bit

17:03

<yashraj> y, mostly notes

17:03

<larryruane> If there are any first-time people here, please say let us know and say Hi if you'd like!

17:04

<larryruane> As you can see, this PR was merged last week, but that's okay, we can still a lot from it

17:04

<theStack> hi

17:04

<larryruane> (we decided to review this PR before it was merged)

17:05

<svav> Welcome to any newcomers. How did you hear about this meeting?

17:06

<effexzi> Hi every1

17:07

<b_1o1> hi all

17:07

<larryruane> Hi to all! I guess we can start in, we always like to know, if you've reviewed the PR, how did you go about it?

17:08

<larryruane> Or in general, for any PR, what are your favorite tips for reviewing?

17:08

<BlueMoon> I did some reading on incoming and outgoing connections.

17:09

<larryruane> Mine personally is to checkout the branch locally, start up vscode (or just "code" on linux), which has a nice git integration, and look at the commits separately... then I also sometimes run the debugger on both one of the python tests and the node (bitcoind)

17:10

<larryruane> BlueMoon: that's good, can you tell us at a high level what's the difference between those kinds of connections?

17:11

<yashraj> basically tried to read meeting logs and github conversations for this and related PRs. Everything's just a rabbithole for noobs

17:11

<BlueMoon> It seems to me that if our peer initiates the connection it is an incoming peer, if we initiate the connection the peer is an outgoing peer.

17:11

<larryruane> BlueMoon: yes exactly!

17:12

<BlueMoon> Thanks!

17:12

<larryruane> when we initiate the connection (an outgoing connection), there are multiple types, two main ones that are most important for today's PR, can anyone say what they are and what's the difference between them?

17:14

<OliverOff> one kind transmits both block data and mempool data (unconfirmed txs), the other kind transmits only block data and it's called block-relay-only connections

17:14

<svav> Types are outbound-full-relay connection and block-relay-only connection

17:15

<svav> outbound-full-relay will gossip address and transaction information, block-relay-only will not

17:15

<larryruane> yes, good, here's the full list: https://github.com/bitcoin/bitcoin/blob/master/src/net.h#L131

17:17

<larryruane> why is there such a thing as block-relay-only connections?

17:18

<svav> It is for improved security, to help prevent partition attacks

17:18

<larryruane> why was that connection type added to the software?

17:18

<larryruane> svav: yes, exactly

17:18

<svav> For network security. By not relaying transactions or addresses, these connections are harder to detect by a third party, thus helping obfuscate the network topology.

17:19

<BlueMoon> The importance of propagation speed is crucial for network decentralisation.

17:19

<OliverOff> to make it more difficult to identify the networks topology, which could lead to privacy loss (identifying the IP that originated a transaction) and facilitate GBP attacks

17:20

<larryruane> yes, the block-relay-only connection type was added by PR 15759 (2019) https://github.com/bitcoin/bitcoin/pull/15759

17:20

<larryruane> you can see some good explanation there about why it was added

17:21

<larryruane> notice that block-relay-only applies only to outbound connections, not inbound

17:21

<theStack> an interesting fact is that there is also a "blocksonly" mode for reducing traffic by turning off transaction relay completely (option -blocksonly); this can be confusing due to the similar naming to "block-relay-only"... at least it was for me :)

17:22

<OliverOff> s/GBP/BGP (Border Gateway Protocoßl)

17:23

<larryruane> theStack: yes excellent, i was just going to mention that next! setting up a block-relay-only peer affects only how we interact with that peer, whereas `blocksonly=` (configuration option) affects our behavior with all our peers

17:23

<larryruane> do we still have a mempool if we are `blocksonly`?

17:25

<theStack> i would guess yes for submitting transactions via wallet or RPC calls? of course it would be pretty empty most of the time

17:25

<OliverOff> thought so too but what's the point of submitting if you're not transmitting this tx to the rest of the network?

17:25

<OliverOff> only if you're mining yourself

17:26

<larryruane> yes I think that's right, maybe the node can use fewer resources (less memory etc.)

17:27

<larryruane> you can still initiate a transaction and relay it out to the network, but it's bad for privacy because everyone will know that you initiated it

17:28

<larryruane> Okay feel free to continue on that topic, but question 2, what's a bloom filter (very basically)?

17:29

<theStack> (for those interested, this is a thread in the bitcointalk board where gmaxwell gives a bit background and bandwidth savings results for the back then newly introduced -blocksonly mode: https://bitcointalk.org/index.php?topic=1377345.0)

17:29

<larryruane> theStack: that's great, thanks!

17:29

<yashraj> theStack: thanks

17:30

<BlueMoon> Thanks, this is very interesting :)

17:30

<svav> A Bloom filter is a space-efficient probabilistic data structure, conceived by Burton Howard Bloom in 1970, that is used to test whether an element is a member of a set.

17:30

<OliverOff> this interactive learning material was personally helpful in (re)learning about bloom filters: https://llimllib.github.io/bloomfilter-tutorial/

17:30

<svav> Bloom filters can have false positives but not false negatives.

17:31

<yashraj> OliverOff: thanks!

17:31

<larryruane> svav: OliverOff: thanks, very helpful explanation and link

17:31

<BlueMoon> Thanks for the link, very nice.

17:33

<larryruane> so it's a way of compactly storing a set of values, but you can only test if a value is present (you can't iterate a bloom filter like a map or a vector), and you can get false positives (the filter can say an item is there when it actually isn't)

17:33

<b_1o1> OliverOff: thanks!

17:33

<larryruane> Ok so briefly, how are bloom filters used in the bitcoin P2P protocol? (just briefly)

17:34

<OliverOff> nodes tell you a filter and ask you to only relay transactions that pass that bloom filter

17:35

<sipa> Only when BIP37 is enabled, which isn't the case for most nodes.

17:35

<BlueMoon> <3

17:35

<sipa> There are probably more than 3 nodes with it enabled

17:35

<sipa> ;)

17:36

<larryruane> OliverOff: good! sipa: yes, that's question 6, why do most nodes not support this bloom filtering service (that is they don't set the `NODE_BLOOM` service bit in the version message they send to their peers)?

17:37

<sipa> Apologies for skipping ahead!

17:37

<larryruane> oh no that's fine we've already been doing that!

17:40

<larryruane> bloom filtering (offering that service) used to be enabled by default (you always had the option of disabling it), but it was changed to default-disabled here: https://github.com/bitcoin/bitcoin/pull/16152

17:41

<larryruane> that PR's description indicates the reason .. but it still wasn't entirely clear to me, until I found this: https://blog.lopp.net/could-spv-support-a-billion-bitcoin-users-sizing-up-a-scaling-claim/

17:42

<sipa> We use bloom filters for a lot of things in the p2p code and validation. BIP37 is the only place where it's specifically part of the protocol, in order to perform server-side filtering.

17:42

<larryruane> search on the page for "SPV server scaling"

17:43

<yashraj> looks like bloom filters are used in a bunch of places based on PR 15759 meeting log?

17:43

<BlueMoon> Thank you very much, I am learning a lot.

17:43

<sipa> But "disabling Bloom filters" is maybe a bit confusing, as it's just about disabling server-side filtering, not all other places where Bloom filters are used.

17:43

<sipa> LevelDB (which we use for the utxo database) uses Bloom filters internally, for example.

17:43

<larryruane> sipa: thanks, isn't there a newer data structure that's similar to but an improvement on bloom filters? I can't recall its name (if there is one)

17:43

<larryruane> sipa: oh I see, thanks

17:44

<sipa> Cuckoo filters!

17:44

<larryruane> oh that's right, thanks

17:45

<sipa> BIP158 (client-side filtering) uses yet another datastructure (golomb filters), which are smaller than Bloom filters and cuckoo filters, but don't support efficient updating after creation.

17:46

<larryruane> Thanks, I've learned a lot. Okay so finally we're ready to consider question 4, this PR reduces resource usage, which resource and roughly how much?

17:46

<OliverOff> is BIP158 going to supersede BIP37 with regards to SPV?

17:46

<theStack> it seems that the "pure" bloom filter CBloomFilter is only used for BIP37, whereas the rolling bloom filter CRollingBloomFilter is used at more places (if i interpreted my "git grep" results correctly :))

17:47

<sipa> OliverOff: Yes and no - they're very different designs. BIP157/BIP158 can be used for some things BIP37 was, but not all, and the other way around.

17:47

<BlueMoon> I found this https://blog.trezor.io/bip158-compact-block-filters-9b813b07a878

17:47

<larryruane> and what is CRollingBloomFilter? Is it one that supports removing items? (that's just a guess)

17:48

<sipa> No, it's a Bloom filter from which elements expire.

17:48

<OliverOff> and what was the DoS problem with BIP37 that warranted it to be disabled by default? was it sending an avalanche of `filterload` and `filterclear` messages?

17:48

<sipa> You can't explicitly delete anything from it, but they only guarantee that inserted elements remain in it for a number of insertions that follow.

17:49

<larryruane> OliverOff: No, the best explanation I've seen is the article by Lopp that I linked to above

100

17:49

<sipa> OliverOff: The problem with BIP37 is that it causes disk and CPU load that is not proportional to the bandwidth.

101

17:49

<OliverOff> sipa: @larr

102

17:49

<sipa> BIP37 is also a privacy nightmare.

103

17:49

<OliverOff> thanks

104

17:50

<sipa> OliverOff: You can send a filter that matches nothing, and then ask for all blocks in the chain. Now the attacker gets only tiny messages (empty filtered blocks), while the victim is reading through the entire blockchain and processing every tx in it.

105

17:50

<sipa> Ideally, we want some cost to the attacker (such as bandwidth) that is proportional to how much work they're causing the victim to perform.

106

17:51

<theStack> BIP37 is, if at all, used mostly in local networks nowadays i would guess, e.g. if you want to connect a wallet to your own node?

107

17:51

<sipa> BIP37 was made optional in BIP111.

108

17:52

<larryruane> To maybe help answer question 4, let's look at question 5, what is the `TxRelay` object (struct), and why is it separate from the `Peer` object (also a struct)?

109

17:52

<yashraj> About Q4: we save on resource usage by not initialising TxRelay if invound peer has indicated they will never request transaction relay.

110

17:53

<svav> Re Q4 we are saving memory

111

17:53

<larryruane> yashraj: svav: yes exactly, and why is it helpful to not create an instance of `TxRelay`?

112

17:53

<larryruane> (if we don't need to, obviously :) )

113

17:54

<BlueMoon> <3

114

17:54

<svav> TxRelay is held in memory

115

17:54

<sipa> Peer is also held in memory.

116

17:55

<svav> It is a case of not creating a TxRelay resource for a particular incoming node, when we know it will never be required

117

17:57

<yashraj> do we have to create a TxRelay for every inbound peer separately and if so why? sorry if this is completely wrong I'm (very) new.

118

17:58

<svav> if we have not offered NODE_BLOOM to the peer and it has set fRelay to 0, then we know that it will never request transaction announcements, and that we can save resources by not initializing the TxRelay data structure for that incoming node

119

17:58

<larryruane> svav: yes, `TxRelay` contains only state that's needed for relaying transactions (outward) to a particular peer ... so if we never need to send transactions to this peer, then there's no reason to allocate one of these for this peer

120

17:59

<svav> yashraj the reason for this PR is that there are some circumstances that we know a TXRelay will never be needed for a particular incoming node, and therefore we save memory resources in our node by not creating it - I think!

121

17:59

<larryruane> yashraj: yes, the `TxRelay` is per-peer (whether inbound or outbound), because it contains stuff like: which transactions does this peer already know about?

122

18:00

<larryruane> we're at time, feel free to continue the discussion! Thanks to all for attending!!

123

18:00

<larryruane> #endmeeting

]