The PR branch HEAD was 63103da at the time of this review club meeting.
Notes
Transaction relay is a three step process: inv -> getdata -> tx:
the relaying node sends an inv message to the receiving node to announce
a new transaction.
if the receiving node wants to retrieve the transaction that it learned
about in the inv message then it will send a getdata message to the
relaying node to request the full transaction. (The receiving node wont
send a getdata message to a peer for transactions that it has already
seen or if it has already sent a getdata message for the transaction to
a different peer.)
the relaying node delivers a tx message to the receiving node. If the
relaying node is no longer able to deliver the transaction, it responds
with notfound instead of the tx.
Currently if a tx message is received it will not be processed if:
The node is in blocks-only mode and the peer sending the message has
no relay permissions
The peer is a blocks-relay only peer
Otherwise, the node checks if it has already processed this transaction before
using the AlreadyHaveTx()
checks and if it not then the transaction is processed by AcceptToMemoryPool().
This is done even if it has not made a request for that transaction
(i.e: If it recieves a tx message without having first sent a getdata
message asking for the tx message then it will still process the transaction).
In other words the 3 step process described above is not enforced.
This PR aims to change that by requiring a transaction to first be requested
by a node before it processes an accociated tx message.
The PR author provides additional motivation for the PR and raises some of the
concerns that this change brings about in a mailing list thread.
What is the TxRequestTracker class in src/txrequest.h used for?
What are the different states that a peer/tx āannouncementā can be in? Where
in the code is an announcement shifted between the different states?
What does the author claim the problem is with the current way in which
transaction messages are handled?
What is the change that this PR is making and how does it solve the problem
mentioned in Q4?
When checking the TxRequestTracker to see if the node has requested a
transaction, why are both the transactionās txid and wtxid identifiers used?
(hint: See PR 19569 for details
on fetching orphan parents from wtxid peers)
There are a lot of switches on the PF_RELAY flag in code relevant to this
PR. What does the PF_RELAY flag mean? In what case(s) would you want to set
this flag for a peer?
Discussion: Since the inv -> getdata -> tx sequence has not been necessary
for communicating and receiving transaction data, some other clients donāt
bother with the sequence at all. This means that if this change was deployed
to Bitcoin Core nodes then other clients would not be able to relay transactions
to upgraded Bitcoin Core nodes. Eventually upgraded nodes would make up the
majority of the network, and so those clients would have to adapt and update.
Do the pros out weigh these cons? And if so, what is fair time frame to allow for
the other clients to adapt?
<elle> Thanks john! Hi everyone! Today we are looking at #21224: Halt processing of unrequested transactions. It was originally part of #20277 but has recently been split off to its own PR. The PR touches some interesting parts of the code that handles how we learn about new transactions. Letās dive in!
<elle> Also, feel free to ask questions at any time. We are all here to support each other and learn. If you are confused about something, chances are that others are confused about it too and so asking your question will benefit everyone :) Iām also a newbie so please correct me if I misspeak.
<elle> pinheadmz: dergoegge: indeed! it is used to keep track of the txs that our peers have notified us about and then we also use it to determine which peers we will request the txs from and when we will do so
<elle> 3.1: First of all: `TxRequestTracker` manages our announcements using a state machine. What are the different states that an announcement can be in?
<jnewbery> dergoegge: once we've received a transaction or the request state is COMPLETED for all peers (i.e. we don't have anyone else we can ask for the tx)
<abstract> eoin if thenomenclature of pull rather than push is confusing, despite the request originating from someone who's changing something (hence a push), it's a request to the repository maintainer to "pull" in the code to the repo
<emzy> DoS / DDoS possibility by opening many connectionions to a node and send junk transactions that are expensive to validate but. I think after the PR we will be limited by the in-flight transactions?
<ariard> dergoegge: when you're thinking about DoS, it's not a binary question, is a low-fee, expensive-to-validate tx a DoS in period of mempool congestion?
<glozow> does it solve the DoS problem? idk if it would stop an attacker that's sending us cpu-intensive transactions, they can still send plenty of invs and we'll ask for them. I like jnewbery's good highlight of aj's good point: "If what we want to do is rate-limit TX messages we receive from a peer IMO we should literally do that."
<jnewbery> it doesn't get added to the Peer objects. It just tracks individual announcements (and makes sure they're cleared up when a peer disconnect, for example)
<sdaftuar> glozow: this PR woudl not solve any DoS problems at the moment (though it would achieve ariard's original goal of ignoring unrequested transactions during IBD)
<felixweis> bitcoinj, after it sends a constructed transaction to a peer it does however check for INV messages for that hash on other peers to build confidence the wallet transaction has propagated trough the bitcoin network
<sdaftuar> my (mild) objection to ariard's original proposal is that i think it complicates our logic to special-case things for IBD, especially when it makes sense that they shoudl hold true generally
<ariard> glozow: yes but our utxo set might lead us to relay invalid transactions (their utxos has already been spent at tip), and feerate evaluation of such unrequested is really uncertain
<jnewbery> sdaftuar: harding shared that with me. I asked whether he thought it should be pointed out on the mailing list, but his understanding was that the author was already aware that clients were doing this
<sdaftuar> jnewbery: imo it's helpful to know how much we're breaking things (if at all) with changes like this. ie i don't know if it's just researchers doing tests that are the soruce of unrequested transactions, or wallets with widespread communities or what
<elle> 6. When checking the `TxRequestTracker` to see if the node has requested a transaction, why are both the transactionās txid and wtxid identifiers used? (hint: See PR 19569 for details on fetching orphan parents from wtxid peers)
<jnewbery> eoin: when we first start the node and we're catching up to the tip, we're in a state called IBD until we catch up. That effects various parts of our logic. I don't think all those differences are documented in a single place, but this comment from sipa in a different PR is a good overview: https://github.com/bitcoin/bitcoin/pull/21106#issuecomment-783676898
<felixweis> eoin: you start requesting/downloading all historical blocks from archival peers and verify their integrity. during that time you can't verify the integrity of almost all recent mempool transactions because they are spending coins created at a much later point in time anyway.
<felixweis> your node will use the memory allocated for the mempool (default 300MB) for the dbcache. that speeds up the generation of your local UTXO set because it needs to write to disk less often.
<elle> 6. When checking the `TxRequestTracker` to see if the node has requested a transaction, why are both the transactionās txid and wtxid identifiers used? (hint: See PR 19569 for details on fetching orphan parents from wtxid peers)
<elle> Cool lets skip 7 and move onto the discusssion. The question is specific but we can also talk more generally about the pros vs cons of this change:
<elle> but that would result in some duplication right? Cause the parent would be relayed and then when the child comes then the package would be relayed etc etc?
<elle> short on time. here is question 8: 8. Discussion: Since the `inv` -> `getdata` -> `tx` sequence has not been necessary for communicating and receiving transaction data, some other clients donāt bother with the sequence at all. This means that if this change was deployed to Bitcoin Core nodes then other clients would not be able to relay transactions to upgraded Bitcoin Core nodes.
<elle> Eventually upgraded nodes would make up the majority of the network, and so those clients would have to adapt and update. Do the pros out weigh these cons? And if so, what is fair time frame to allow for the other clients to adapt?
<ariard> elle: I think this question is really interesting because it underscores the fact that Core as a project doesn't have a process for p2p breaking changes
<jnewbery> I think the Andreas Schildback android bitcoin wallet was based on the same codebase, so I'd guess it does the same. I don't know of any others (but I wouldn't)
<sdaftuar> i think aj (and jnewbery) made arguments that this is undesirable regardless of whether this is breaking (at least, no discussion on the PR indicated that people thought this was breaking, though it's an open question)
<michaelfolkson> What was harding's rationale for not wanting to highlight a specific alternative implementation jnewbery? Not wanting to highlight something that is no longer well maintained?
<jnewbery> Yeah, lots of good discussion on the PR between aj and sdaftuar. I'd recommend reading it for a good lesson in "considerations when making p2p protocol changes"
<sdaftuar> glozow: right, you could imagine that if we thought this was a good change, but just didn't want to break existing software, that if that existing software doens't connect to default Bitcoin COre nodes anyway then maybe there's room to make a change, at least for users who aren't enabling NODE_BLOOM on their own node