The PR branch HEAD was ba99f37a at the time of this review club meeting.
Notes
The mempool keeps a set of unbroadcast
transactions
containing the txids of transactions that have not passed initial broadcast yet
(determined based on a heuristic, whether or not the node has received a
getdata for it). The unbroadcast set was introduced in PR #18038, which we
covered in a previous review club.
There are two (related) mechanisms for rebroadcasting transactions:
The wallet re-submits its own transactions to the node in
CWallet::ResendWalletTransactions()
(which calls into the node’s BroadcastTransaction() function).
Both of these mechanisms are executed on the scheduler thread.
Transactions can be referred to by txid (without witness) or by wtxid (with
witness, defined in
BIP141).
Multiple valid transactions can have the same non-witness data (same txid) but
different witnesses (different wtxid).
Transactions are announced to peers either by txid or by wtxid (since PR
#18044, which we also covered in a previous review club). Whether a
peer wishes to receive announcements using txid or wtxid is negotiated during
connection. We refer to peers that prefer to receive wtxid announcements as
wtxid-relay peers.
There are two unexpected behaviors in
BroadcastTransaction():
one is related to unbroadcast and the other is in wtxid-based transaction relay.
Unbroadcast: If BroadcastTransaction() is called with a transaction that
has the same txid as a transaction in the mempool (can be same witness,
different witness or even invalid witness), it causes the transaction to be
re-added to the unbroadcast set.
Relay: If BroadcastTransaction() is called with a
same-txid-different-wtxid transaction as something already in the mempool,
it will call RelayTransaction() with the wtxid of the argument tx’s wtxid
rather than the one in the mempool. This causes the relay to fail (INVs are
not sent) for wtxid-relay peers because SendMessages() queries the mempool by
wtxid, doesn’t find it, and drops the announcement.
How is BroadcastTransaction() invoked (i.e. what code paths)?
What happens when PeerManager::RelayTransaction() is called with the wtxid
of a transaction that isn’t in the mempool?
What does the unbroadcast set represent, conceptually? When should a
transaction be added and removed from the unbroadcast set?
In what scenario would the mempool have a transaction with the same txid but
different wtxid as a wallet transaction (feel free to give creative answers)?
What bugs are present prior to this PR? Can you think of a scenario in which
they cause a privacy leak?
How does PR #22261 fix these bugs?
Bonus: Why does the unbroadcast set contain txids instead of wtxids?
<LarryRuane> This is to send a transaction that originated in the current node (as opposed to being relayed, one that we recieved from a different node) ... (i'm not too confident in this answer)
<LarryRuane> could we get a little more background on "RPC" versus "wallet"? If I had to guess, I'd say `sendrawtransaction` is RPC, and generally has NOTHING to do with the local wallet ... where as "wallet RPC" would be something like `sendmany`
<glozow> LarryRuane: correct - `sendrawtransaction` lets you submit a raw transaction directly to the node, you could call that without having the wallet ocmpiled
<michaelfolkson> So is the BroadcastTransaction() function used for both transactions originated in user's wallet and transactions received from other peers?
<LarryRuane> The way I understand unbroadcast set is, our node originates a tx, sends it out, but isn't sure that any other node has it (maybe the P2P message got dropped?)
<michaelfolkson> LarryRuane: Right the message could have got dropped, the node might already know of the transaction or the node might not be interested in that transaction despite not having it?
<glozow> michaelfolkson: right, so we're naming scenarios for why a peer might not send us a GETDATA for a transaction (which would cause the tx to remain in the unbroadcast set)
<glozow> and you're correct - it's possible there's some connection issue, they might already know the transaction, and "might not be interested" could be that they already rejected the tx or heard it from someone else
<LarryRuane> so IIUC what makes the unbroadcast set special is that we don't retry those tx sends as quickly as we used to, we wait about up to one day
<lightlike> if that tx originates from your node, how could it happen that they know it already? (if they learnt about it from some other peer then *that peer or another* must have sent you a GETDATA at some time, so it would be cleared from the unbroadcast set already?)
<glozow> so our issue is: after we've already broadcast the tx, we don't want to re-add it to our unbroadcast set, because that will cause us to keep rebroadcasting it (and we won't remove it because our peer won't re-request it from us)
<LarryRuane> is this caused by what may be considered a user error? Do I have this flow right?: User initiates a tx (on our local node) ... gets broadcast (and also added to unbroadcast set) ... GETDATA happens, so we remove from unbroadcast set .. then user re-submits the same tx?
<svav> it will call RelayTransaction() with the wtxid of the argument tx’s wtxid rather than the one in the mempool. This causes the relay to fail (INVs are not sent) for wtxid-relay peers because SendMessages() queries the mempool by wtxid, doesn’t find it, and drops the announcement.
<glozow> so before this PR, if you called `BroadcastTransaction()` with a transaction that has same-txid-different-wtxid as a tx in the mempool, it would get re-added to the unbroadcast set without even going through validation
<LarryRuane> I'm trying to understand why there would be two tx with the same txid but different witnesses (different wtxid) ... is it because there's some random element in the witness data? so if I submit the same tx twice (but signing each separately), then this can happen?
<LarryRuane> sipa so was I correct in my earlier question, you can create an unlimited number of valid witnesses for a given tx (assuming you can create one)?
<glozow> ok the scenario is this: we have a transaction. the mempool has one with witness A and we call `BroadcastTransaction()` with the exact same tx, but different witness
<lightlike> If i understand it correctly, "transactions not being relayed" would rather be a downside of the first fix: if we happen to have bad peers, broadcast to them, get a GETDATA but they don't relay any further, then we'd currently try again after ~24hours (when we have better peers), after the fix we wouldn't do this anymore.
<LarryRuane> we're about out of time, so this would be too long of a discussion, but I notice there's no new or changed test code with this PR .. would be interesting to get into why, and generally, what kind of PRs need tests and what kind don't