Two small fixes to node broadcast logic (
mempool, p2p) Jun 23, 2021
The PR branch HEAD was ba99f37a at the time of this review club meeting.
The mempool keeps a set of
containing the txids of transactions that have not passed initial broadcast yet
(determined based on a heuristic, whether or not the node has received a
getdata for it). The unbroadcast set was introduced in PR #18038, which we
covered in a
previous review club.
There are two (related) mechanisms for rebroadcasting transactions:
Both of these mechanisms are executed on the scheduler thread.
Transactions can be referred to by txid (without witness) or by wtxid (with
witness, defined in
Multiple valid transactions can have the same non-witness data (same txid) but
different witnesses (different wtxid).
Transactions are announced to peers either by txid or by wtxid (since PR
#18044, which we also covered in a
previous review club). Whether a
peer wishes to receive announcements using txid or wtxid is negotiated during
connection. We refer to peers that prefer to receive wtxid announcements as
There are two unexpected behaviors in
one is related to unbroadcast and the other is in wtxid-based transaction relay.
BroadcastTransaction() is called with a transaction that
has the same txid as a transaction in the mempool (can be same witness,
different witness or even invalid witness), it causes the transaction to be
re-added to the unbroadcast set.
BroadcastTransaction() is called with a
same-txid-different-wtxid transaction as something already in the mempool,
it will call
RelayTransaction() with the wtxid of the argument tx’s wtxid
rather than the one in the mempool. This causes the relay to fail (
not sent) for wtxid-relay peers because
SendMessages() queries the mempool by
wtxid, doesn’t find it, and drops the announcement.
Did you review the PR?
Concept ACK, approach ACK, tested ACK, or
What was your review approach?
BroadcastTransaction() invoked (i.e. what code paths)?
What happens when
PeerManager::RelayTransaction() is called with the wtxid
of a transaction that isn’t in the mempool?
What does the unbroadcast set represent, conceptually? When should a
transaction be added and removed from the unbroadcast set?
In what scenario would the mempool have a transaction with the same txid but
different wtxid as a wallet transaction (feel free to give creative answers)?
What bugs are present prior to this PR? Can you think of a scenario in which
they cause a privacy leak?
How does PR #22261 fix these bugs?
Bonus: Why does the unbroadcast set contain txids instead of wtxids?
1 17:00 <glozow> #startmeeting
5 17:01 <dopedsilicon> Hey
6 17:01 <michaelfolkson> hola
10 17:01 <glozow> did anyone get a chance to look at the notes or review the PR?
11 17:02 <svav> Looked at the notes
12 17:02 <michaelfolkson> Yup, maybe can't answer all the questions though
13 17:03 <michaelfolkson> Took a while to get my head around why the fixes were needed
14 17:03 <glozow> okie dokie
15 17:03 <glozow> so this PR is making changes to the `BroadcastTransaction()` function - can anyone tell me what this function does?
17 17:04 <glozow> (guesses are fine too)
18 17:05 <michaelfolkson> So broadcasts a transaction :) Called by RPCs
19 17:06 <michaelfolkson> Too obvious?
20 17:06 <lightlike> adds transactions to the mempool and/or unbroadcast set and also relays them immediately
21 17:06 <LarryRuane> This is to send a transaction that originated in the current node (as opposed to being relayed, one that we recieved from a different node) ... (i'm not too confident in this answer)
22 17:06 <glozow> lightlike: wonderful, yes. invokes the mempool validation logic + relays
23 17:07 <glozow> LarryRuane: aha yes, it's interesting that it's in the src/node folder
24 17:07 <glozow> so these transactions would come from clients (e.g. RPC and wallet)
25 17:08 <glozow> where is `BroadcastTransaction()` called? is anyone able to link us to some call sites?
26 17:09 <svav> src/node/transaction.cpp
27 17:09 <LarryRuane> could we get a little more background on "RPC" versus "wallet"? If I had to guess, I'd say `sendrawtransaction` is RPC, and generally has NOTHING to do with the local wallet ... where as "wallet RPC" would be something like `sendmany`
28 17:09 <glozow> svav: that's where `BroadcastTransaction()` is defined yes
29 17:09 <sipa> LarryRuane: that sounds correct
30 17:10 <sipa> sendmany is instructing your local wallet to construct & broadcast a transaction
31 17:10 <svav> can be called by either sendrawtransaction RPC or wallet RPCs
32 17:10 <glozow> LarryRuane: correct - `sendrawtransaction` lets you submit a raw transaction directly to the node, you could call that without having the wallet ocmpiled
33 17:10 <sipa> sendrawtransaction is sending it yourself, no wallet involved
35 17:11 <LarryRuane> thanks sipa and glozow, very helpful
36 17:11 <michaelfolkson> So is the BroadcastTransaction() function used for both transactions originated in user's wallet and transactions received from other peers?
37 17:11 <glozow> ok, does anyone know exactly what bugs are being fixed in #22261 or should we walk through it together?
38 17:12 <LarryRuane> michaelfolkson I think not, not used for tx received from other peers
39 17:12 <glozow> michaelfolkson: it wouldn't come from other peers
40 17:12 <glozow> that invokes `AcceptToMemoryPool()` from the net processing layer
41 17:13 <michaelfolkson> LarryRuane glozow: Ok thanks
42 17:13 <michaelfolkson> I think I understand what bugs are being fixed but I had to get my head round some terminology
43 17:13 <glozow> michaelfolkson: feel free to give a guess of what the bugs are?
44 17:14 <michaelfolkson> Ok so summarizing John's description...
45 17:15 <michaelfolkson> Rebroadcasting a transaction that peers aren't interested in
46 17:15 <michaelfolkson> (as it has already been broadcast to them)
47 17:15 <michaelfolkson> Need to define what the unbroadcast set is here
48 17:15 <glozow> mm okay, i think this probably touches on the unbroadcast set
49 17:15 <glozow> that's 1 part of the PR
50 17:16 <glozow> good idea, anybody want to define unbroadcast set?
53 17:16 <michaelfolkson> I looked it up there, yeah
54 17:16 <michaelfolkson> "when the initial broadcast of a transaction hasn’t been deemed successful"
55 17:16 <michaelfolkson> And then you need to define what successful means :)
56 17:17 <michaelfolkson> receiving a single GETDATA
57 17:17 <LarryRuane> The way I understand unbroadcast set is, our node originates a tx, sends it out, but isn't sure that any other node has it (maybe the P2P message got dropped?)
58 17:17 <glozow> LarryRuane: right, the goal is to have an idea of whether our initial broadcast of transactions has succeeded.
59 17:18 <LarryRuane> other nodes *may* have it, but we're not sure
60 17:18 <michaelfolkson> LarryRuane: Right the message could have got dropped, the node might already know of the transaction or the node might not be interested in that transaction despite not having it?
61 17:18 <michaelfolkson> Is that right glozow?
62 17:19 <glozow> michaelfolkson: right, so we're naming scenarios for why a peer might not send us a GETDATA for a transaction (which would cause the tx to remain in the unbroadcast set)
63 17:19 <glozow> and you're correct - it's possible there's some connection issue, they might already know the transaction, and "might not be interested" could be that they already rejected the tx or heard it from someone else
64 17:20 <LarryRuane> so IIUC what makes the unbroadcast set special is that we don't retry those tx sends as quickly as we used to, we wait about up to one day
65 17:20 <glozow> i don't think we reject tx invs for no reason 🤔
66 17:20 <michaelfolkson> So it could be "unsuccessful" from your perspective even though all nodes already know and have that transaction
67 17:20 <glozow> LarryRuane: right
68 17:20 <glozow> michaelfolkson: yes. that's unlikely if this is the first time broadcasting that transaction, though
69 17:20 <LarryRuane> for better privacy (sorry if i'm being obvious :) )
70 17:21 <glozow> this is only intended for the initial broadcast
71 17:21 <michaelfolkson> glozow: Right if you are originating the transaction, gotcha
72 17:21 <glozow> LarryRuane: not at all, i'm sure this is new information for lots of people! or good to review
73 17:22 <lightlike> if that tx originates from your node, how could it happen that they know it already? (if they learnt about it from some other peer then *that peer or another* must have sent you a GETDATA at some time, so it would be cleared from the unbroadcast set already?)
74 17:23 <glozow> lightlike: right, it shouldn't be the case that your unbroadcast set has a transaction that has already gone through initial broadcast
75 17:23 <glozow> but one of the bugs here is you could re-add a transaction to your unbroadcast set
76 17:23 <LarryRuane> ok but if it's a *set* ... sets can't have duplicates, right?
77 17:24 <glozow> yes, but once you've removed it, the set won't remember that it's already seen it before
78 17:24 <LarryRuane> ah, ok
79 17:25 <glozow> so our issue is: after we've already broadcast the tx, we don't want to re-add it to our unbroadcast set, because that will cause us to keep rebroadcasting it (and we won't remove it because our peer won't re-request it from us)
80 17:25 <glozow> does that make sense?
82 17:25 <lightlike> would we remove it though after it makes it into a block in that situation?
83 17:26 <michaelfolkson> What was the rationale for adding it to the unbroadcast set before this PR?
84 17:26 <michaelfolkson> Just an oversight?
85 17:26 <LarryRuane> is this caused by what may be considered a user error? Do I have this flow right?: User initiates a tx (on our local node) ... gets broadcast (and also added to unbroadcast set) ... GETDATA happens, so we remove from unbroadcast set .. then user re-submits the same tx?
86 17:26 <glozow> lightlike: i thiiiink so
87 17:27 <glozow> LarryRuane: correct, it would be because the user called it again
88 17:28 <LarryRuane> so all what we've discussed so far here, is independed of the txid / wtxid distinction? I'm unclear how that relates to all this
89 17:28 <michaelfolkson> Yeah that is coming now
90 17:28 <glozow> LarryRuane: that's relevant here too
91 17:29 <LarryRuane> is the map key for the unbroadcast set the txid or the wtxid? (guess i could just look it up!)
92 17:29 <michaelfolkson> glozow: Relevant to the first fix?
93 17:29 <svav> Are there any diagrams for all this?
94 17:29 <LarryRuane> yes i was asking about the first fix
95 17:30 <michaelfolkson> That's the second fix? There's no overlap right?
96 17:30 <glozow> well, they're intertwined
97 17:30 <michaelfolkson> Ohh
98 17:30 <glozow> so what happens if you call `BroadcastTransaction()` with a transaction that has same-txid-different-wtxid as a tx in the mempool?
99 17:30 <glozow> (on master, before this PR)
100 17:30 <glozow> svav: i'm not sure, i think the review club for 18038 would be best place for info
101 17:31 <glozow> (do we know what I mean when i say same-txid-different-wtxid?)
102 17:31 <svav> it will call RelayTransaction() with the wtxid of the argument tx’s wtxid rather than the one in the mempool. This causes the relay to fail (INVs are not sent) for wtxid-relay peers because SendMessages() queries the mempool by wtxid, doesn’t find it, and drops the announcement.
103 17:31 <LarryRuane> glozow yes I do get that distinction
104 17:31 <glozow> svav: yes! well said
105 17:32 <svav> I copied that lol
108 17:32 <glozow> so before this PR, if you called `BroadcastTransaction()` with a transaction that has same-txid-different-wtxid as a tx in the mempool, it would get re-added to the unbroadcast set without even going through validation
109 17:33 <glozow> not a huge issue, but not exactly the behavior we want
110 17:33 <glozow> so the first fix in the PR is to move adding to unbroadcast set after a successful submission to mempool
111 17:33 <svav> What is the fix then?
112 17:34 <glozow> the other problem is that we would call `RelayTransaction()` with the wtxid of this transaction, rather than the one in the mempool
113 17:35 <glozow> ("this" transaction = the argument to the function)
114 17:35 <svav> For beginners can someone explain how two transaction can have the same txid but different wtxids?
115 17:35 <michaelfolkson> I do agree with svav that a diagram would be nice here :)
116 17:35 <svav> I love diagrams :)
117 17:35 <michaelfolkson> The interaction between the mempool, the unbroadcast set etc
118 17:35 <LarryRuane> I'm trying to understand why there would be two tx with the same txid but different witnesses (different wtxid) ... is it because there's some random element in the witness data? so if I submit the same tx twice (but signing each separately), then this can happen?
119 17:35 <LarryRuane> svav yes you beat me to that question!
120 17:36 <michaelfolkson> It was discussed yesterday in the L2 onchain workshop right? You find a cheaper witness
121 17:36 <glozow> mm so the txid of a transaction doesn't commit to the witness data
122 17:37 <glozow> i'm trying to find a good diagram
123 17:37 <LarryRuane> right so IIUC the txid commits to the *effects* of a tx only, the wtxid commits also to the authorization
124 17:37 <glozow> LarryRuane: correct
126 17:38 <michaelfolkson> At the bottom
127 17:38 <michaelfolkson> The wtxids get collected up to the coinbase transaction and are independent of the txids
128 17:38 <glozow> nono, that's the coinbase witness commitment
129 17:38 <glozow> we're talking about what transaction data gets serialized to get the wtxid vs the txid
131 17:39 <glozow> this is the simplest diagram i can find
132 17:40 <glozow> basically, you don't include witness data in txid haha
133 17:40 <michaelfolkson> Right the coinbase tx commits to the witnesses but not the transaction attached to that witness
134 17:40 <sipa> yes they do
135 17:40 <glozow> er, we're not talking about the coinbase commitment
136 17:41 <sipa> the coinbase tx commitments contains the merkle root of the wtxid tree
138 17:41 <sipa> wtxids commit to the full transaction, witness and non-witness data
139 17:41 <michaelfolkson> But you can change the wtxid without changing the txid of that transaction
140 17:41 <sipa> yes, by changing the witness
141 17:42 <michaelfolkson> The witness data is not committed to in the txid. Otherwise that would reintroduce malleability
143 17:42 <LarryRuane> sipa so was I correct in my earlier question, you can create an unlimited number of valid witnesses for a given tx (assuming you can create one)?
144 17:42 <michaelfolkson> Ok thanks. I thought I understood that lol
145 17:43 <sipa> LarryRuane: yes
146 17:43 <sipa> LarryRuane: well, technically speaking there is only a finite number of valid witnesses, due to block weight being finite :)
149 17:44 <michaelfolkson> Ok sorry glozow let's get back to your question
150 17:44 <glozow> so our peermanager's `RelayTransaction()` function takes a txid and wtxid parameter
151 17:44 <glozow> (can anyone tell me why?)
152 17:45 <glozow> what happens if we're trying to relay a transaction to our peer and can't find it in our mempool?
153 17:46 <michaelfolkson> We can't find the right wtxid or we literally can't find the txid?
154 17:46 <glozow> we can't find the transaction
155 17:46 <glozow> we know the id
156 17:47 <michaelfolkson> We can't relay it? We don't have it?
158 17:47 <michaelfolkson> You mean like it returns an error?
159 17:47 <glozow> no, we just skip it
160 17:47 <svav> wtxid parameter also used because some peers that prefer to receive wtxid announcements, i.e. wtxid-relay peers
161 17:48 <glozow> it's fully possible that we want to announce a transaction but it gets evicted from mempool by the time we go to announce it
163 17:48 <glozow> but in this case, it could be because we called `RelayTransaction()` with a wtxid of a transaction that isn't in our mempool
164 17:48 <glozow> svav: exactly
165 17:48 <michaelfolkson> What's the scenario here. The wallet doesn't know what the node is doing? That causes the confusion?
166 17:49 <michaelfolkson> If they were perfectly in sync this wouldn't happen
167 17:50 <glozow> ok the scenario is this: we have a transaction. the mempool has one with witness A and we call `BroadcastTransaction()` with the exact same tx, but different witness
168 17:50 <lightlike> why would you malleate your own transactions?
169 17:50 <glozow> idk 🤷 maybe there are other parties in it
170 17:50 <glozow> it's a multisig, idk
171 17:51 <michaelfolkson> lightlike: It is finding a cheaper witness after previously broadcasting a more expensive witness I think
172 17:51 <glozow> maybe it has multiple spending paths and you/a counterparty both made transactions
173 17:51 <LarryRuane> michaelfolkson cheaper meaning smaller (so lower fee)?
174 17:51 <michaelfolkson> It isn't deliberate malleation. And obviously the txid isn't malleated post SegWit
175 17:52 <michaelfolkson> LarryRuane: Smaller, so yeah lower fee
176 17:52 <glozow> this isn't a show-stopping bug, it's just not the behavior we'd expect
177 17:53 <glozow> either way, we should try to relay the transaction in our mempool
178 17:53 <michaelfolkson> "Prior to this commit, if BroadcastTransaction() is called with
179 17:53 <michaelfolkson> relay=true, then it'll call RelayTransaction() using the txid/wtxid of
180 17:53 <michaelfolkson> the new tx, not the txid/wtxid of the mempool tx."
181 17:53 <michaelfolkson> And the mempool doesn't have the new transaction?
183 17:54 <michaelfolkson> Ok
184 17:54 — michaelfolkson sweats
185 17:54 <LarryRuane> what's the worst effect (at a high level) of not fixing these bugs? transactions not being relayed around the network?
186 17:55 <glozow> good question. anybody have ideas?
187 17:55 <michaelfolkson> 2) just means less effective relaying right? More failures
188 17:55 <michaelfolkson> 1) is wasteful. Unnecessary relaying
189 17:55 <lightlike> for the first one, it would seem the opposite: tx beinged inv'ed too much (although the network already knows about it)
191 17:58 <glozow> no consensus failures. but worth fixing, ya?
192 17:58 <lightlike> If i understand it correctly, "transactions not being relayed" would rather be a downside of the first fix: if we happen to have bad peers, broadcast to them, get a GETDATA but they don't relay any further, then we'd currently try again after ~24hours (when we have better peers), after the fix we wouldn't do this anymore.
193 17:58 <LarryRuane> we're about out of time, so this would be too long of a discussion, but I notice there's no new or changed test code with this PR .. would be interesting to get into why, and generally, what kind of PRs need tests and what kind don't
194 17:58 <michaelfolkson> LarryRuane: If you read the review comments tests were discussed
195 17:59 <LarryRuane> ok thanks (hadn't done that, will do)
196 17:59 <michaelfolkson> duncandean is working on tests it appears
197 17:59 <michaelfolkson> I think everyone agrees it should have tests
199 18:00 <LarryRuane> often the tests are harder to write than the code being tested, I've noticed!
200 18:00 <glozow> that's all we have time for :)
201 18:00 <glozow> #endmeeting