Add DISABLETX message for negotiating block-relay-only connections (p2p)

Jan 12, 2022

https://github.com/bitcoin/bitcoin/pull/20726

Host: mzumsande - PR author: sdaftuar

The PR branch HEAD was 28df701eb1 at the time of this review club meeting.

Notes

On block-relay-only connections (introduced in PR #15759), transaction and address relay are disabled. Currently, Bitcoin Core nodes per default make two outbound block-relay-only connections in addition to 8 regular outbound connections.
Block-relay-only connections have been subject of multiple review club sessions, among them #15759 on the PR introducing them.
Currently, block-relay-only connections are established indirectly:
- When making an outbound block-relay-only connection, a node sets the boolean flag fRelay in the version message to false.
- fRelay (introduced in the context of BIP37) does not imply that transactions cannot be sent for the entire duration of the connection - in its original use case with BIP37, relay of transactions can be activated later on.
- fRelay=false is also used in -blocksonly mode, a low-bandwidth option in which a node does not want to receive transactions from any peer, but does participate in address relay.
- Therefore, nodes currently don’t have a notion which of their incoming peers see the connection as block-relay-only and don’t have any logic attached to it.
PR #20726, accompanied by the new BIP proposal BIP338, introduces the new p2p message disabletx for block-relay-only connections, which makes it explicit that no messages related to transaction relay should ever be exchanged over the duration of the connection.

Questions

Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK?
What are the benefits of introducing the disabletx message? What are the downsides, if any?
When a node makes an outgoing block-relay-only connection, it will send a disabletx message after this PR. Will there be other changes in behavior for the sender beyond this (if yes, which ones)?
When a node receives both fRelay=false and a disabletx message from an incoming peer, will it behave differently after this PR? If yes, how?
Earlier discussions in the PR revolved around the interaction of disabletx and address relay. Why is it, after PR #21528, no longer necessary to change any address-relay related code in this PR?
The new m_disable_tx field of the Peer struct is of type std::atomic<bool>. Why is an std::atomic used here instead of a simple bool, or a bool guarded by a lock?
There have also been suggestions to achieve some of the goals of this PR without introducing a new message, as in PR #22778. Do you prefer an indirect approach to the explicit approach of this PR?

Meeting Log

17:00

<lightlike> #startmeeting

17:00

<lightlike> Hi!

17:00

<emzy> hi

17:00

<stickies-v> hi

17:00

<tarun> hi

17:00

<svav> Hi

17:01

<tim1> hi

17:01

<effexzi> Hi

17:01

<kouloumos> hi

17:01

<larryruane> hi

17:01

<sipa> hi

17:01

<b10c> hi

17:01

<lightlike> Welcome to PR Review Club - today's session is about PR 20726 - Add DISABLETX message for negotiating block-relay-only connections

17:01

<lightlike> https://github.com/bitcoin/bitcoin/pull/20726

17:01

<lightlike> Before we start, is anybody here for the first time?

17:01

<michaelfolkson> hi

17:02

<glozow> hi

17:03

<lightlike> Seems to be not the case - who had the chance to review this week's PR (y/n)?

17:03

<stickies-v> 0.5y

17:03

<tarun> y

17:03

<emzy> n

17:03

<larryruane> 0.5

17:03

<sipa> y, a long time ago

17:04

<michaelfolkson> 0.5y

17:04

<jnewbery> hi

17:04

<lightlike> cool - and what was your initial impression?

17:05

<stickies-v> seems very sensible, but also that I think it maybe would've been nice to have fRelay be more expressive as an integer so we could capture more states, including the one expressed by disabletx

17:05

<stickies-v> (but that's always easy in hindsight)

17:06

<tarun> I have perhaps a background question--is block only an all or nothing approach--that a node is block only for all its outbound connections?

17:07

<glozow> it always seemed like a bit of a hack to be using fRelay to control tx relay, since it's a bloom filter-related field

17:07

<sipa> A node can run in blockonly mode, where it just doesn't ask for individual transactions from peers at all.

17:07

<sipa> That's different from block-relay-only connections, which are on a per-connection basis.

17:08

<lightlike> yes, it sounds very similar, but is a different thing

17:08

<tarun> ok-that distinction is helpful-thanks

17:08

<lightlike> ok - let's move to the first question:

17:08

<michaelfolkson> tarun: If I understand your question the answer is no. A node may be block only for some of its peers and a normal peer for other of its peers

17:08

<lightlike> What are the benefits of introducing the disabletx message? What are the downsides, if any?

17:08

<glozow> that's perhaps the most-asked question in pr review club

17:08

<sipa> michaelfolkson: Then it's not blockonly. Blockonly means it has no mempool at all.

17:09

<larryruane> from the review club notes, "... and don’t have any logic attached to it" -- can someone elaborate on what that means?

17:09

<stickies-v> I think the benefit is mostly altruistic, in that your peer can optimize their local state to not have to prepare for sending you transactions in the future (e.g. keeping sketches with erlay)

17:09

<sipa> The purpose of blockonly mode is different from the purpose of block-relay-only peers.

17:09

<glozow> sipa: has mempool, but ignores incoming transactions that aren't from whitelisted peers, no?

17:09

<stickies-v> But there's also a small selfish benefit, in that you'd not have to receive other messages (e.g. ADDR) anymore after sending DISABLETX, so less bandwidth usage

17:10

<lightlike> stickies-v: yes to both!

17:10

<sipa> glozow: Fair, with whitelisted peers that forcibly give you transactions you could still have a mempool (though that's kind of weird)... the point of blockonly mode is just reducing bandwidth because you don't care about having a mempool.

17:10

<michaelfolkson> Blockonly versus block-relay-only oof. Ok that is a distinction I wasn't making

17:12

<sipa> stickies-v: Is erlay that related? If erlay isn't negotiated with a peer, no sketches have to be kept for it, whether it's disabletx or not.

17:12

<lightlike> yes, every default-configured node today has 2 block-relay-only connections, so it's probably much more common than "-blocksonly" mode.

17:12

<larryruane> Is it counterintuitive naming for DISABLETX to control addr messages?

17:13

<stickies-v> sipa ah yes, good point. Then I'm not sure what other local optimizations could currently be made by knowing that you won't have to send transactions in the future? Are there any?

17:13

<sipa> Yes, all other tx related data structures (e.g. the set of recently announced transactions for deduplication etc).

17:14

<sipa> (if that's still the case; I saw something about doing that lazily)

17:14

<sipa> My understanding may be a bit outdated.

17:14

<stickies-v> That's compact block related, right? Anticipating which tx's your peer would already have in mempool as to know what to include in blocks you send along?

17:14

<michaelfolkson> To go from a block-relay-only peer to a normal peer requires a disconnection and new handshake?

17:15

<glozow> not compact block related, but not announcing something that your peer already announced

17:15

<jnewbery> stickies-v: anything in the TxRelay struct in https://github.com/bitcoin/bitcoin/pull/22778/files can be omitted if transactions don't need to be relayed over the connection

17:15

<lightlike> yes, there is the TxRelay struct in net.h with different objects (notable a bloom filter) that is not needed on a block-relay-only connection

17:16

<sipa> I think the rationale is mostly knowing that a connection will never need these datastructures, which could help with e.g. deciding how many connections can be supported given local resource constraints.

17:16

<lightlike> michaelfolkson: yes, that is not something that can be changed during a connection

17:17

<lightlike> Any downsides to the new message?

17:17

<jnewbery> the biggest saving is the rolling bloom filter m_tx_inventory_known_filter which is >500k per connection

17:17

<michaelfolkson> I just wonder if you have a reliable block-relay-only peer and you're struggling for peers maybe you'd want to transition them to be a normal peer

17:17

<stickies-v> thx, I hadn't looked at TxRelay yet, will look into it a bit more. I'd assumed the bloom filter related stuff could already be optimized just based on fRelay, though

17:17

<jnewbery> stickies-v: you're right, and that's what #22778 does, although it needs rebase and comments addressed

17:17

<glozow> bloom filter m_tx_inventory_known_filter is distinct from BIP37 bloom filters fyi

17:18

<glozow> oh, facepalm, i misread the message, ignore me

17:18

<jnewbery> glozow: right, it's a rolling bloom filter, which is the same data structure we use for various purposes in p2p

17:19

<michaelfolkson> Maybe clutching at straws but is it a downside telling your peers what you want from them explicitly a bit of a privacy leak. I'm thinking that is a bit of a push

17:19

<sipa> You're already telling them you don't want transactions from them.

17:19

<sipa> DISABLETX is telling them you'll never want transactions from them.

17:21

<lightlike> I wonder if having a separate protocol message for each boolean of information you want to exchange during feature negotiation is ideal design in the long run - it could lead to a zoo of messages over time that might be hard to keep track with

17:21

<sipa> There aren't that many pieces of information to be negotiated really... transactions, blocks, and ip addresses.

17:22

<sipa> I guess headers is another one.

17:23

<lightlike> yes, currently the boolen-like messages are WTXIDRELAY, SENDADDRV2 and SENDHEADERS afaik, but there is probably potential for growth

17:23

<sipa> lightlike: If we had perfect foresight about what we'd ever want the protocol to be, I'm sure something cleaner/more efficient could be chosen, but the "new protocol version number to add support for X; new message to negotiate the feature" has proven pretty flexible so far.

17:23

<sipa> Don't forget that most P2P connections are pretty long-lived (hours to days/weeks or more even), so a few messages back and forth once at the start isn't a huge cost.

17:23

<sipa> https://www.dsn.kastel.kit.edu/bitcoin/ for numbers.

17:24

<lightlike> next question:

17:24

<lightlike> When a node makes an outgoing block-relay-only connection, it will send a disabletx message after this PR. Will there be other changes in behavior for the sender beyond this (if yes, which ones)?

17:24

<jnewbery> lightlike: I share that concern. I think it's tractable now, but it gets more complex as the number of features increases and there are features that can only be negotiated if other features have already been enabled (eg erlay requires wtxidrelay).

17:25

<lightlike> sipa: yes, true. I'm not sure if the protocol would be more simple if these weren't separate messages but bits in one message containing a bitfield or something like this.

17:28

<stickies-v> lightlike I don't think it's implemented in this PR yet, but my understanding is that in follow up PRs the sender of DISABLETX would also have to stop sending other messages like ADDR in order not to be disconnected?

17:29

<stickies-v> or is that logic implemented already?

17:29

<lightlike> stickies-v: actually, that won't be necessary, we'll get to that in a later question. addr relay works well together with this PR and BIP338 "out of the box"

17:30

<sipa> The BIP only states that it is recommended not to send ADDR/... messages to DISABLETX peers, but without any hard rule.

17:30

<lightlike> but yes, there are no other changes in behavior for the *sender*.

17:31

<lightlike> moving to the receiver:

17:31

<lightlike> When a node receives both fRelay=false and a disabletx message from an incoming peer, will it behave differently after this PR? If yes, how?

17:31

<Kaizen_K_> it wont send any transactions to that incoming peer?

17:32

<stickies-v> It will disconnect such peers that send now invalid messages like GETDATA

17:33

<lightlike> Kaizen_K_: not sending transaction is something it would already be done before, just using the old "fRelay=false" information, so it's not changed with this PR.

17:33

<Kaizen_K_> ty

17:34

<lightlike> stickies-v: correct! If you first send a disabletx message and then don't follow the BIP by following up with tx-related messages, you'll get disconnected.

17:35

<lightlike> note that the part where the receiver doesn't initialize the TxRelay objects is not part of this PR, it would probably make sense as a follow-up PR.

17:36

<michaelfolkson> So a possible downside to this DISABLETX approach is it formalizes the relationship as a block relay only and to get out of it you need to redo the handshake. Current behavior is a peer can later request to receive transactions again using filterload without redoing the handshake?

17:37

<sipa> Filterload is all but dead.

17:37

<sipa> It requires actively opting in to BIP37 support.

100

17:38

<michaelfolkson> Ok gotcha, thanks

101

17:38

<lightlike> I think the old behavior is not disallowed: If you intend to use BIP37, just don't send a disabletx message, and everything works as good (or bad) as before

102

17:38

<sipa> Indeed.

103

17:38

<lightlike> Next question:

104

17:39

<lightlike> Earlier discussions in the PR revolved around the interaction of disabletx and address relay. Why is it, after PR #21528, no longer necessary to change any address-relay related code in this PR?

105

17:40

<Kaizen_K_> I think that peer #21528 filters peers that have no other peers? Is that correct?

106

17:41

<michaelfolkson> lightlike: Is this the deferring initialization of m_addr_known?

107

17:41

<lightlike> michaelfolkson: yes, correct!

108

17:44

<lightlike> It basically changed the way address relay is setup: if you make an outgoing block-relay-connection, you know that you don't want addr related messages, and just don't send any such messages.

109

17:44

<lightlike> On getting an inbound connection, you'd only start sending addr related address after having received at least one.

110

17:45

<michaelfolkson> That is a neat side effect of #21528. Presumably it was deliberate with this advantage in mind?

111

17:45

<lightlike> So, everything plays nice already on a block-relay-only connection, without a need to disable something.

112

17:45

<Kaizen_K_> Can anyone explain what addr messages are please? Can't find anything on google.

113

17:45

<stickies-v> https://en.bitcoin.it/wiki/Protocol_documentation#addr

114

17:45

<lightlike> That's what https://github.com/bitcoin/bitcoin/pull/20726#issuecomment-1006005523 refers to.

115

17:45

<sipa> Kaizen_K_: They relay IP addresses of other nodes.

116

17:45

<Kaizen_K_> ty

117

17:47

<svav> I googled Bitcoin addr message and found this https://en.bitcoin.it/wiki/Protocol_documentation#addr

118

17:47

<sipa> Also see BIP155, for its successor, ADDRV2.

119

17:48

<lightlike> there was also a review club meeting on PR 21528 that I forgot to link: https://bitcoincore.reviews/21528

120

17:48

<lightlike> Next question is more C++-technical:

121

17:48

<lightlike> The new m_disable_tx field of the Peer struct is of type std::atomic<bool>. Why is an std::atomic used here instead of a simple bool, or a bool guarded by a lock?

122

17:50

<glozow> for thread safety

123

17:50

<sipa> A bool guarded by a lock would also be thread safe, but much slower on most platforms.

124

17:51

<glozow> and more code

125

17:51

<michaelfolkson> When would you use a bool guarded by a lock if it is much slower? :)

126

17:52

<lightlike> yes, I guess the basic thing is that the "Peer" struct is not guarded by cs_main as a whole, so something needs to be done for each of the members separately

127

17:52

<sipa> You wouldn't.

128

17:52

<sipa> But perhaps you need a lock that guards more than just a bool.

129

17:52

<sipa> An atomic is only usable if you want a guarded data structure that only contains a bool, and nothing more.

130

17:52

<lightlike> michaelfolkson, sipa: could there be situation where this makes sense because you care about enforcing a lock order?

131

17:53

<sipa> If lock order is relevant, it means you're not guarding just a bool.

132

17:53

<larryruane> it seems this boolean is set (to true) from only one thread (message processing), but read from other threads, and locking is needed even in this case

133

17:54

<sipa> An atomic<bool> is equivalent to a bool with a lock that's only held at the innermost level.

134

17:54

<stickies-v> sipa https://en.cppreference.com/w/cpp/atomic/atomic seems to also be defined on other types than bool?

135

17:54

<glozow> for every read/write to an atomic_bool, the lock will be taken right before and released right before that operation is over, so you wouldn't be taking multiple locks at a time in different orders?

136

17:54

<larryruane> can there be `atomic<someLargeStruct>`? we seem to see it only for simple types

137

17:55

<sipa> stickies-v: Oh, sorry, I didn't mean to imply it just worked for bools.

138

17:55

<sipa> larryruane: If there was, it'd almost certainly be implemented using a lock :D

139

17:55

<larryruane> yes, just wondering if it's even allowed by the compilter

140

17:55

<lightlike> could it be that operations such as setting a bool aren't atomic anyway? Or is this implemenation-specific and we jsut want to be on the safe side?

141

17:56

<sipa> larryruane: Pet peeve of mine: the compiler is irrelevant. The question you should ask if is if it's permitted by the *language* (and so far, I think the answer is no, but it could be expanded in future language versions of course).

142

17:56

<willcl_ark> Does a mutex also not get locked and unlocked each with an atomic operation, i.e. 2 atomic operations minimum?

143

17:57

<sipa> willcl_ark: In practice, yes.

144

17:58

<lightlike> larryruane: I think that std::atomic is currently limited to just few small, "integer-like" types. I don't think it exists for double, for example.

145

17:58

<sipa> lightlike: Unguarded concurrent access is undefined behavior.

146

17:58

<sipa> It might work, it might also send all your BTC in your wallet to me.

147

17:59

<stickies-v> Now that's an innovative source of dev funding

148

18:00

<lightlike> ok, time's up - thanks a lot everyone!

149

18:00

<sipa> In practice, x86 has a fairly strong consistency model, so it's not unreasonable that the resulting compiled code actually works there (but it certainly might not). On other hardware (non-Mac ARM, in particular) non-atomic operations are pretty likely to kill you.

150

18:00

<lightlike> #endmeeting