The PR branch HEAD was 28df701eb1 at the time of this review club meeting.
Notes
On block-relay-only connections (introduced in PR #15759),
transaction and address relay are disabled. Currently, Bitcoin Core nodes
per default make two outbound block-relay-only connections in addition to 8 regular
outbound connections.
Block-relay-only connections have been subject of multiple review club sessions,
among them #15759 on the PR introducing them.
Currently, block-relay-only connections are established
indirectly:
When making an outbound block-relay-only
connection, a node sets the boolean flag fRelay in the version message to false.
fRelay (introduced in the context of BIP37)
does not imply that transactions cannot be sent for the entire duration of the
connection - in its original use case with BIP37, relay of transactions can
be activated later on.
fRelay=false is also used in -blocksonly mode, a low-bandwidth
option in which a node does not want to receive transactions from any peer,
but does participate in address relay.
Therefore, nodes currently don’t have a notion which of their incoming peers
see the connection as block-relay-only and don’t have any logic attached to it.
PR #20726,
accompanied by the new BIP proposal BIP338,
introduces the new p2p message disabletx for block-relay-only connections, which
makes it explicit that no messages related to transaction relay should ever be
exchanged over the duration of the connection.
What are the benefits of introducing the disabletx message? What are the downsides, if any?
When a node makes an outgoing block-relay-only connection, it will send a
disabletx message after this PR. Will there be other changes in behavior for the
sender beyond this (if yes, which ones)?
When a node receives both fRelay=false and a disabletx message from an incoming peer, will it behave
differently after this PR? If yes, how?
Earlier discussions in the PR revolved around the interaction of disabletx and address
relay. Why is it, after PR #21528,
no longer necessary to change any address-relay related code in this PR?
The new m_disable_tx field of the Peer struct is of type std::atomic<bool>.
Why is an std::atomic used
here instead of a simple bool, or a bool guarded by a lock?
There have also been suggestions to achieve some of the goals of this PR without
introducing a new message, as in PR #22778.
Do you prefer an indirect approach to the explicit approach of this PR?
<stickies-v> seems very sensible, but also that I think it maybe would've been nice to have fRelay be more expressive as an integer so we could capture more states, including the one expressed by disabletx
<michaelfolkson> tarun: If I understand your question the answer is no. A node may be block only for some of its peers and a normal peer for other of its peers
<stickies-v> I think the benefit is mostly altruistic, in that your peer can optimize their local state to not have to prepare for sending you transactions in the future (e.g. keeping sketches with erlay)
<stickies-v> But there's also a small selfish benefit, in that you'd not have to receive other messages (e.g. ADDR) anymore after sending DISABLETX, so less bandwidth usage
<sipa> glozow: Fair, with whitelisted peers that forcibly give you transactions you could still have a mempool (though that's kind of weird)... the point of blockonly mode is just reducing bandwidth because you don't care about having a mempool.
<stickies-v> sipa ah yes, good point. Then I'm not sure what other local optimizations could currently be made by knowing that you won't have to send transactions in the future? Are there any?
<stickies-v> That's compact block related, right? Anticipating which tx's your peer would already have in mempool as to know what to include in blocks you send along?
<lightlike> yes, there is the TxRelay struct in net.h with different objects (notable a bloom filter) that is not needed on a block-relay-only connection
<sipa> I think the rationale is mostly knowing that a connection will never need these datastructures, which could help with e.g. deciding how many connections can be supported given local resource constraints.
<michaelfolkson> I just wonder if you have a reliable block-relay-only peer and you're struggling for peers maybe you'd want to transition them to be a normal peer
<stickies-v> thx, I hadn't looked at TxRelay yet, will look into it a bit more. I'd assumed the bloom filter related stuff could already be optimized just based on fRelay, though
<michaelfolkson> Maybe clutching at straws but is it a downside telling your peers what you want from them explicitly a bit of a privacy leak. I'm thinking that is a bit of a push
<lightlike> I wonder if having a separate protocol message for each boolean of information you want to exchange during feature negotiation is ideal design in the long run - it could lead to a zoo of messages over time that might be hard to keep track with
<sipa> lightlike: If we had perfect foresight about what we'd ever want the protocol to be, I'm sure something cleaner/more efficient could be chosen, but the "new protocol version number to add support for X; new message to negotiate the feature" has proven pretty flexible so far.
<sipa> Don't forget that most P2P connections are pretty long-lived (hours to days/weeks or more even), so a few messages back and forth once at the start isn't a huge cost.
<lightlike> When a node makes an outgoing block-relay-only connection, it will send a disabletx message after this PR. Will there be other changes in behavior for the sender beyond this (if yes, which ones)?
<jnewbery> lightlike: I share that concern. I think it's tractable now, but it gets more complex as the number of features increases and there are features that can only be negotiated if other features have already been enabled (eg erlay requires wtxidrelay).
<lightlike> sipa: yes, true. I'm not sure if the protocol would be more simple if these weren't separate messages but bits in one message containing a bitfield or something like this.
<stickies-v> lightlike I don't think it's implemented in this PR yet, but my understanding is that in follow up PRs the sender of DISABLETX would also have to stop sending other messages like ADDR in order not to be disconnected?
<lightlike> stickies-v: actually, that won't be necessary, we'll get to that in a later question. addr relay works well together with this PR and BIP338 "out of the box"
<lightlike> When a node receives both fRelay=false and a disabletx message from an incoming peer, will it behave differently after this PR? If yes, how?
<lightlike> Kaizen_K_: not sending transaction is something it would already be done before, just using the old "fRelay=false" information, so it's not changed with this PR.
<lightlike> stickies-v: correct! If you first send a disabletx message and then don't follow the BIP by following up with tx-related messages, you'll get disconnected.
<lightlike> note that the part where the receiver doesn't initialize the TxRelay objects is not part of this PR, it would probably make sense as a follow-up PR.
<michaelfolkson> So a possible downside to this DISABLETX approach is it formalizes the relationship as a block relay only and to get out of it you need to redo the handshake. Current behavior is a peer can later request to receive transactions again using filterload without redoing the handshake?
<lightlike> I think the old behavior is not disallowed: If you intend to use BIP37, just don't send a disabletx message, and everything works as good (or bad) as before
<lightlike> Earlier discussions in the PR revolved around the interaction of disabletx and address relay. Why is it, after PR #21528, no longer necessary to change any address-relay related code in this PR?
<lightlike> It basically changed the way address relay is setup: if you make an outgoing block-relay-connection, you know that you don't want addr related messages, and just don't send any such messages.
<lightlike> The new m_disable_tx field of the Peer struct is of type std::atomic<bool>. Why is an std::atomic used here instead of a simple bool, or a bool guarded by a lock?
<lightlike> yes, I guess the basic thing is that the "Peer" struct is not guarded by cs_main as a whole, so something needs to be done for each of the members separately
<larryruane> it seems this boolean is set (to true) from only one thread (message processing), but read from other threads, and locking is needed even in this case
<glozow> for every read/write to an atomic_bool, the lock will be taken right before and released right before that operation is over, so you wouldn't be taking multiple locks at a time in different orders?
<lightlike> could it be that operations such as setting a bool aren't atomic anyway? Or is this implemenation-specific and we jsut want to be on the safe side?
<sipa> larryruane: Pet peeve of mine: the compiler is irrelevant. The question you should ask if is if it's permitted by the *language* (and so far, I think the answer is no, but it could be expanded in future language versions of course).
<lightlike> larryruane: I think that std::atomic is currently limited to just few small, "integer-like" types. I don't think it exists for double, for example.
<sipa> In practice, x86 has a fairly strong consistency model, so it's not unreasonable that the resulting compiled code actually works there (but it certainly might not). On other hardware (non-Mac ARM, in particular) non-atomic operations are pretty likely to kill you.