Rework rebroadcast logic to improve privacy (mempool, p2p, wallet)

Nov 27, 2019

https://github.com/bitcoin/bitcoin/pull/16698

Host: amitiuttarwar - PR author: amitiuttarwar

Notes

Currently, a node only rebroadcasts a transaction if it is the originating wallet, which is very bad for privacy. These changes propose extracting the rebroadcast logic from the wallet into the mempool and having each node rebroadcast a set of transactions it thinks should have been mined by now.
Should have been mined is a tricky concept to define. At a high level, this implementation determines what should have been mined as follows:
1. Every ~20 mins, run a fee rate cache that computes the top of the mempool and stores the min package fee rate for a transaction to be included.
2. Rebroadcast events are triggered by a random poisson timer. When the timer pops, check if there has been a new block since the cache last ran. If there hasn’t been a new block, bump the timers to try and hit those conditions in the next run. If the conditions have been met this round, continue with the rebroadcast logic
3. Calculate the set of transactions that are at the top of the mempool and >30 mins old.
4. Filter out any transactions with fee rate < cached fee rate.
5. Rebroadcast the remaining set.
To minimize rebroadcast noise, we want the order of events to be -> cache run | block | rebroadcast. If we run the cache and then rebroadcast (without a block in between), the cache is not as meaningful. Thus we store the block height & compare to chain tip before rebroadcasting.
With these changes, the division of responsibility between the wallet & the node (mempool) are as follows:
- The wallet ensures the (local) mempool knows about the transaction by periodically attempting to resubmit that transaction.
- The mempool applies rebroadcast logic to ensure that the network knows about any transactions it believes should have already been mined.
This new logic could result in a problem: if a transaction is submitted locally and not immediately relayed, it could take a long time to be included in the rebroadcast set (for example, if it’s a low fee rate transaction), and thus a long time until it’s first seen on the network. To mitigate this, the mempool keeps track of transactions submitted locally via m_unbroadcast_txids. When it’s time to rebroadcast, any unbroadcast transactions will be added to the set to relay. A transaction is removed from this set when a peer requests is via a GETDATA message.
Reviewer notes: A good place to get started is by reading the commit messages. A great way to add value would be to think through or observe the interactions between the different mechanisms introduced.

Questions

Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK? (Don’t forget to put your PR review on GitHub.)
Did you take any steps beyond reading the code?
This PR claims to improve privacy. Do you agree? What are privacy leaks that would remain after these changes? Can you describe the attack vectors?
What are some of the tradeoffs / costs incurred in this implementation?
What mechanisms prevent extreme bandwidth spikes on the network?
- Why is the cache less meaningful if theres no block between cache & rebroadcast?
- How does the cache interact with the recency filter to reduce rebroadcast set?
How are the functional tests? What is not tested? What are other ways of testing these changes?

Meeting Log

13:00

<jnewbery> #startmeeting

13:00

<b10c> hi

13:00

<jnewbery> hi everyone!

13:00

<michaelfolkson> Hi

13:00

<lightlike> hi

13:00

<mattcruzz> hi

13:00

<emzy> HI

13:00

<nehan> hi

13:00

<fjahr> Hi

13:00

<fanquake> hi

13:00

<jonatack> hi

13:00

<ariard_> hi

13:00

<sebastianvstaa> hi

13:00

<jnewbery> Quick show of hands: who's here for the first time o/

13:00

<mattcruzz> My first time

13:00

<sanoj> hi

13:00

<jnewbery> welcome mattcruzz!

13:00

<amiti> hi all! welcome :)

13:01

<mattcruzz> Thanks!

13:01

<jonatack> mattcruzz: nice!

13:01

<jnewbery> Today's PR is 16698: Rework rebroadcast logic to improve privacy

13:01

<jnewbery> Notes and questions are in the normal place: https://bitcoincore.reviews/16698.html

13:02

<jnewbery> amiti, who authored the PR is hosting the meeting today. Thanks amiti!

13:02

<andrewtoth_> hi

13:02

<jnewbery> I'll hand straight over to you

13:02

<amiti> cool, thanks everyone for coming

13:02

<amiti> who got a chance to review the pr? how about a quick round of y / n from everyone

13:03

<sanoj> y

13:03

<ariard_> yes still on it

13:03

<jnewbery> y (high-level review. Not detailed)

13:03

<sebastianvstaa> still on it

13:03

<andrewtoth_> y

13:03

<jonatack> y, just for an hour though, need more time on it

13:03

<emilengler> A bit

13:03

<nehan> some, still working on it

13:03

<michaelfolkson> y (close to a concept ACK but have a few questions to get me over the line)

13:03

<b10c> n

13:03

<lightlike> short look today, longer look some months ago.

13:03

<mattcruzz> n, just wanted to sit in

13:03

<fjahr> N

13:03

<emzy> n

13:04

<amiti> ok awesome!

13:04

<amiti> one thing before we get started is a quick reminder about etiquette- all questions are welcome and feel free to just jump in anytime to ask your question. we'll get to answering it :)

13:04

<amiti> so to kick us off- can someone give a quick summary of the PR?

13:05

<emilengler> Previously, a node would only rebroadcast a transaction if it is the originating wallet, which is very bad for privacy. These changes propose extracting the rebroadcast logic from the wallet into the mempool and having all nodes rebroadcast a set of transactions it thinks should have been mined by now.

13:05

<emilengler> From the notes on bitcoincore.reviews

13:06

<sebastianvstaa> so what is the definition of "should have been mined by now"?

13:06

<michaelfolkson> HIgh enough fee

13:06

<andrewtoth_> there have been blocks mined since it's been seen

13:06

<emilengler> sebastianvstaa: That they have at least 1 confirmation I believe

13:06

<pinheadmz> hi! didnt get to reviw, just gonna lurk today

13:07

<sebastianvstaa> ok thats 3 suggestions

13:07

<nehan> reworded: nodes will rebroadcast all transactions which match a certain set of conditions and have not been mined yet. the wallet will no longer rebroadcast "my" txns, instead they will be passed to the node to handle.

13:07

<michaelfolkson> In theory the transactions with the highest fees per vbyte should be mined in the next block

13:07

<sebastianvstaa> michaelfolkson: ok. high enough fee compared to what?

13:07

<amiti> emilengler, nehan: yup!

13:07

<emilengler> michaelfolkson: There are miners who mine empty blocks

13:08

<amiti> sebastianvstaa: the majority of this implementation is trying to answer the question of what "should have been mined" - can anyone give an overview of the mechanisms implemented?

13:08

<emilengler> Last time I looked at the BC there were quite a few but this was ~1 year ago

13:08

<michaelfolkson> emilengler: Indeed which is one of my later questions

13:09

<michaelfolkson> If there was a league table of all the transactions in your mempool from highest fee per vbyte to lowest fee per vbyte you would expect the highest transactions to be mined first until the block is full

13:09

<andrewtoth_> the PR caching a fee rate, and looks at the top 3/4 of the mempool that are above that feerate, and if a block has been mined since it's been seen. I believe that's how the PR answers the question

13:09

<sebastianvstaa> ah i see.

13:10

<amiti> andrewtoth_: yup! that is one of the main mechanisms

13:11

<amiti> taking a step back, as michaelfolkson is saying, you conceptually expect the txns at the top of your mempool (highest fee rates) to be mined into blocks

13:11

<jnewbery> emilengler: I see you've just left a code-style comment on the PR. All review is welcome, but I think at this stage higher-level conceptual review is more useful (the code is likely to move around during the review process so style nits add a lot of noise)

13:12

<amiti> the rebroadcast mechanism is meant to help reconcile if thats not happening. maybe there was a network failure or the transaction was at one point dropped from other mempools

13:12

<emilengler> jnewbery: Sure, it was just catching my eye ;)

13:12

<nehan> amiti: do you know if this happens in practice?

13:13

<emilengler> Also is it a good idea to rely on fees? It is in the complete control of the miners and like michaelfolkson also noted, what is with empty blocks?

13:14

<nehan> anyone: i'm trying to learn a bit more about user-rebroadcast expectations to get a feel for if the timing makes sense. i'm not sure what users expect from the wallet in terms of rebroadcast

13:14

<andrewtoth_> if your node is configured to accept longer ancestor chains, for instance, then this will ensure that longer chains are also rebroadcast after descendents are mined. That way peers with different configurations will get the other txs after some time.

13:14

<amiti> nehan: I'm currently working on running a node with my patch to observe the current network conditions, but there are definite cases where this could happen. Eg. in 2017 competitive fee rate market, transactions would get evicted from normal sized mempools, and if one of those transactions were yours, you would want to rebroadcast it so it could later be mined.

13:15

<amiti> emilengler: I dont understand your questions. Fees are in complete control of the miners? And what about empty blocks?

13:15

<nehan> andrewtoth_:could you explain what you mean by "ancestor" and "descendant"? how can a descendant be mined before an ancestor?

13:15

<emilengler> amiti: Oh sorry I don't meant fees, I meant the transactions that are included in a block

13:15

<nehan> amiti: thanks!

13:15

<emilengler> And therefore the problem with empty blocks

13:15

<andrewtoth_> emilengler: this PR actually constructs a block template, like miners do if they run core, so it uses the same logic to figure out what should be included

13:16

<ariard_> nehan: have a look -limitancestorcount and -limitdescendantscunt

13:16

<michaelfolkson> In the case of multiple empty blocks mined there could be a lot of unnecessary transaction rebroadcasting?

13:17

<ariard_> ancestors and descendants are more or less in-mempool unconfirmed parents/children

13:17

<amiti> michaelfolkson: yeah, I think thats true. Can anyone explain why?

13:17

<emzy> What if you would like to have the transaction time out from the mempool? You can't do that if everyone is rebroadcasting. Not sure if this is relevant.

13:18

<jnewbery> nehan: the default config is to only allow transaction packages up to size 25 transactions / 101Kb, but that is configuarble. If I change that to eg 50 transactions, I could have a long chain of transaction in my mempool that other nodes don't have (since they'll reject anything beyond the 25th in the chain).

13:18

<lightlike> one nice side effect of this PR might be that newly connected nodes get a good representation of the mempool faster (because other nodes will INV older high-fee transactions to them if they think that they should have been mined)

13:18

<ariard_> and so quicker running fee estimation

13:19

<amiti> emzy: once you broadcast a transaction, you should never expect it to "time out". Users could be running mempools of different sizes and still have that transaction. Or a random user could be rebroadcasting all transactions (running a patch). if you want to reliably "cancel" the transaction, you would have to spend the inputs in a different transaction.

13:19

<jnewbery> with this PR, once [some of] those first 25 transactions are mined and my node thinks the later transactions should be mined, my node will rebroadcast them

13:19

<emilengler> emzy: You mean that nodes could get down if they receive too much traffic because of so much rebroadcasting?

13:20

<andrewtoth_> amiti: i don't believe that would work if the tx is not signaling RBF. I recall a PR to introduce a removemempoolentry RPC. Perhaps that would be useful if this change is merged.

13:20

<michaelfolkson> lightlike: A newly connected node would receive the transactions before any periodic rebroadcasting?

13:20

<emzy> emilengler: no. amiti got my question right.

13:21

<nehan> jnewbery: got it, thanks

13:21

<andrewtoth_> perhaps there could be a limit to how many txs get rebroadcast, to limit worst case bandwidth use. Not sure what implications that would have though

13:21

<jnewbery> andrewtoth_: the scenario we're talking about is where we want to replace transactions in other node's mempools. removemempoolentry wouldn't have any effect here.

13:22

<andrewtoth_> jnewbery: yes, i understand. Then this seems like an issue if a user does want to have their tx canceled, say if they broadcast at 1 sat/byte but fee rates remain high for several weeks

13:22

<jnewbery> I think a change like https://github.com/bitcoin/bitcoin/pull/16409 (Remove mempool expiry, treat txs as replaceable instead) is more relevant here

13:23

<lightlike> michaelfolkson: I don't think so. Afaik, if the tx are older and most of the network have them already in their mempools, they currently would not periodically INV them, so the new node would only learn of them once they have been included in a block.

100

13:23

<jnewbery> (I'm not suggesting there's any dependency between this PR and that one - just that in the scenario you're describing, they're both relevant)

101

13:24

<andrewtoth_> jnewbery: thanks for bringing that to my attention

102

13:25

<amiti> andrewtoth_: trying to "cancel" your transaction is different than trying to bump the fee to get it to confirm. since many mempool policies are configurable, even if your transaction would have been evicted / dropped / expired from the standard mempool, there may be other mempools that still have it (for example, that of a miners)

103

13:25

<amiti> does that make sense?

104

13:25

<amiti> ok lets get started with the questions =P

105

13:26

<andrewtoth_> amiti: sure thanks

106

13:26

<amiti> This PR claims to improve privacy. Do you agree? What are privacy leaks that would remain after these changes? Can you describe the attack vectors?

107

13:26

<amiti> first part- do you agree that this PR would improve privacy? maybe some more quick y / n ?

108

13:26

<michaelfolkson> y

109

13:26

<emilengler> y

110

13:26

<andrewtoth_> y

111

13:27

<jnewbery> potentially y

112

13:27

<amiti> can anyone describe a privacy leak / attack vector that would remain after these changes?

113

13:27

<emzy> y

114

13:27

<ariard> hmmm I think there is an assumption than your tx has previously flooded

115

13:27

<michaelfolkson> Oh interesting jnewbery not convinced

116

13:27

<sebastianvstaa> y

117

13:27

<amiti> ariard: can you elaborate?

118

13:28

<jonatack> potentially an improvement, but I am wary of opening new privacy leaks or unexpected x-order effects

119

13:28

<ariard> because if you rebroadcast set <A,B,C> and let's say B hasn't been previously announced you may deduce rebroadcaster is near or is the origin node

120

13:28

<nehan> you are still the only node constantly relaying transactions that haven't confirmed yet from your wallet

121

13:28

<nehan> so i'm not sure how much this improves privacy

122

13:28

<ariard> in practice, I think the min age limit somewhere should avoid this issue

123

13:29

<amiti> ariard: hm, but if B wasn't previously announced, a peer wouldn't be able to identify that its a rebroadcast, so wouldn't it look like an initial relay?

124

13:29

<jnewbery> michaelfolkson: I haven't reviewed it yet. privacy is a slippery thing. This PR potentially could be a nice improvement, but the devil is always in the details.

125

13:29

<ariard> amiti: exactly that's my point it may look like an initial relay with all its pitfalls

126

13:30

<andrewtoth_> nehan: this PR explicity changes that so your node is *not* the only one constantly relaying your own txs

127

13:30

<amiti> ariard: ah, yes, this PR doesn't do anything to improve the privacy around initial relay.

128

13:30

<nehan> andrewoth_: only if they make it into the mempool

129

13:30

<nehan> no?

130

13:30

<ariard> amiti: yes initial relay isn't great in term of privacy

131

13:31

<nehan> if your wallet's txns are not accepted into other nodes' mempools, they will not rebroadcast them? shouldn't have used the word "confirmed"

132

13:31

<amiti> nehan: hm interesting, but if your txn isn't accepted into other nodes' mempools, it wouldn't be able to be mined?

133

13:31

<ariard> so yes this PR seems to improve privacy but it's bounded by initial relay leaks

134

13:32

<sebastianvstaa> this would be lot more private with the dandelion protocol

135

13:32

<nehan> amiti: it wouldn't be able to be mined right now, true.

136

13:32

<emilengler> What's an inital relay? The inital sync?

137

13:33

<andrewtoth_> emilengler: the first time a tx is broadcast from the original node

138

13:33

<emilengler> thx

139

13:33

<jonatack> An interesting paper just came out about txn clustering https://s-tikhomirov.github.io/transaction-clustering-bitcoin/

140

13:33

<amiti> nehan: so your point is true, if you are broadcasting a txn that you believe is valid but it gets rejected by other mempools, you would expect it to be mined and keep rebroadcasting it

141

13:34

<amiti> this has interesting implications for software upgrades that tighten the mempool acceptance criteria

142

13:34

<andrewtoth_> this PR is not addressing initial relay, it's addressing rebroadcasting after txs are already in other mempools

143

13:34

<nehan> amiti: exactly, if my mempool was larger, for example, I think?

144

13:34

<andrewtoth_> nehan: also for the case where your ancestor chain configuration is longer than others

145

13:34

<jnewbery> this PR doesn't claim to improve initial tx relay. I think it might be useful to just restate what this PR does improve. Currently, if you have a connection to a peer and you see the same transaction INVed from that peer more than once, you can have a very high confidence that the wallet attached to that peer owns the transaction. It's like a big flashing beacon saying "THIS TRANSACTION

146

13:34

<jnewbery> IS MINE". This PR improves that

147

13:35

<michaelfolkson> Hence the "re"broadcasting

148

13:35

<amiti> nehan: hm, if my mempool was larger I'd be more likely to rebroadcast some of those txns (bc they would be evicted from smaller ones), but I don't think that addresses the policy mismatch that you were talking about

149

13:35

<nehan> amiti: but once i get this txn that i think this mineable but others don't to another node that thinks it's mineable, it will also rebroadcast it.

150

13:36

<mattcruzz> jnewbery: what does "INVed" mean?

151

13:37

<ariard> jnewbery: agree but if your transaction didn't propagate on the network at initial relay, if it appears among a rebroadcast set from a node A I think you can assume that's a wallet tx issued by A

152

13:37

<jnewbery> mattcruz: when we learn about a transaction, we tell our peers about it using an INV message with the txid. If the peer wants the tx they'll reply with a GETDATA, and we'll then send them the TX message

153

13:37

<mattcruzz> jnewbery: got it, thanks for the explanation

154

13:38

<jnewbery> ariard: only if you're the only peer connected to node A

155

13:38

<amiti> nehan: yes. so I think the clear example of this is with a software upgrade. say nodes A, B, C are on an old policy, and nodes X, Y, Z have upgraded. node A forms a txn that gets rejected by the upgraded nodes, but B and C will accept them. When blocks come in and the fee rate indicates this txn should have been mined, all of the old nodes would rebroadcast it.

156

13:38

<sanoj> mattcruzz: more details here -> https://en.bitcoin.it/wiki/Network#Standard_relaying

157

13:38

<nehan> amiti: ah i see what you're saying. yes. is this bad?

158

13:39

<amiti> nehan: I mean. its not great....

159

13:39

<lightlike> I don't completely understand why the rebroadcast per se is so important, also in its current form. Would it lead to a significant error rate if we just trusted that the tx has flooded to the network after getting enough GETDATAs to our initial send, and forget about it (i.e. just remove the rebroadcasting)?

160

13:39

<amiti> nehan: but its a problem that already exists today

161

13:39

<mattcruzz> sanoj: thanks, I'll read up on it

162

13:39

<amiti> ariard: I don't follow. why would you be able to assume its a wallet tx issued by A?

163

13:39

<michaelfolkson> mattcruzz: short for inventory message

164

13:40

<ariard> jnewbery: if you're connected to neighboors of node A, you may know also what have been announced to A and make analysis from then

165

13:40

<amiti> ariard: you wouldn't know that it was specifically relayed because of being selected as a "rebroadcast". so it would just look like the initial relay.

166

13:40

<michaelfolkson> lightlike: It would impact your chances of getting your transaction mined in the next block (or an upcoming block)

167

13:41

<jnewbery> ariard: we'd expect node A's neighbours to also rebroadcast (after this change)

168

13:41

<amiti> ariard: I'm unclear if you are trying to highlight an issue with the rebroadcast logic or the initial broadcast logic. Are you claiming that the rebroadcast policy introduces an additional leak? Or that the guarantees are no better than the initial relay policy?

169

13:42

<amiti> ok! lets move onto the next question

170

13:42

<amiti> What are some of the tradeoffs / costs incurred in this implementation?

171

13:42

<nehan> i need to take off early -- thank you amiti! i'll try to read more later.

172

13:43

<ariard> amiti: would lean to second option and even further, this PR for sure improve rebroacast logic privacy, but you may still have leak due of rebroadcast being in fact an initial relay

173

13:43

<amiti> nehan: thanks for coming & your questions! let me know if you have more questions at any time :)

174

13:43

<michaelfolkson> amiti: Privacy, bandwidth, probability of getting a transaction mined in next block

175

13:43

<andrewtoth_> I think the main cost is rebroadcasting a lot of transactions that aren't rebroadcast with current behaviour. This could potentially be a large bandwidth increase.

176

13:43

<emzy> More network traffic that is redundant.

177

13:44

<ariard> amiti: IMO it's not worse just lower bounded by initial relay own leaks

178

13:44

<amiti> lightlike: GETDATAs can indicate that your transaction has initially been propagated out to the network, but there are other reasons that the txn might no longer be in other mempools. Eg. evicted based on fee rate. Or expired after 2 weeks. etc.

179

13:44

<jnewbery> It does seem problematic that if there's a softfork or tightening of policy rules, unupgraded nodes will continue to ping transactions that are no longer consensus- or policy- valid between themselves.

180

13:45

<amiti> yup! the biggest downside of this implementation is the potential bandwidth impact, which is something that needs to be evaluated very carefully before these changes could be accepted

181

13:45

<andrewtoth_> question - with erlay does this pr not become redundant? If we just use our entire mempool as the sketch, then we no longer have to send INVs for unconfirmed.

182

13:45

<michaelfolkson> I was waiting for the Erlay question :)

183

13:46

<jnewbery> andrewtoth_: erlay doesn't propose to use the entire mempool for the sketch. Just the transactions received over a recent window

184

13:46

<gleb> Sketching an entire mempool is not trivial, I don't have a solution for it now. It would be a separate project,

185

13:46

<andrewtoth_> jnewbery: ahh i see, so then the bandwidth problem will go away though, if we just include the should have been confirmed txs in the sketch every so often. no invs needed

186

13:46

<michaelfolkson> In a post Erlay world, nodes know which transactions to forward to which peers

187

13:47

<gleb> michaelfolkson: I'm not sure I understand you :)

188

13:47

<michaelfolkson> I don't understand you either lol. What do you mean by an "entire" mempool?

189

13:47

<gleb> By the way guys, I think Erlay is really orthogonal, so I think we should focus on something else here.

190

13:48

<michaelfolkson> Erlay will work out what transactions a peer is missing. So won't need to blindly rebroadcast to all peers?

191

13:48

<amiti> so, since bandwidth is the biggest concern.. what mechanisms are implemented to prevent huge rebroadcast sets?

192

13:48

<andrewtoth_> could there be a limit on how many txs get rebroadcasted each time, so the bandwidth would be limited? and not include the same txs for subsequent rebroadcasts?

193

13:48

<jnewbery> gleb: +1

194

13:48

<ariard> jnewbery: you think unupgraded nodes should prune old txn of their mempool to avoid this issue?

195

13:49

<michaelfolkson> I will ask you offline gleb

196

13:49

<amiti> ariard: can you describe more? I'm definitely interested in any ideas to improve the rebroadcasting no-longer-policy-valid-txns issue

197

13:50

<jnewbery> ariard: potentially. With the current logic, unless the originating wallet is rebroadcasting, then the tx will be expired from the mempool after two weeks. If all unupgraded nodes are rebroadcasting, they'll keep refreshing the tx in each other's mempools

198

13:50

<gleb> I think asking unupgraded nodes to do something is impractical. We should assume that those nodes might not have an active maintainer to accomplish such tasks

199

13:51

<lightlike> there is a fixed limit (3/4 block weight), plus the fee rate filter.

200

13:51

<amiti> thank you lightlike :)

201

13:51

<jnewbery> gleb: sorry. In this case, 'unupgraded' refers to nodes that have this PR, but don't have some future consensus or policy change

202

13:51

<ariard> gleb: it would be unupgraded node starting from this patch

203

13:51

<andrewtoth_> ahh i see

204

13:51

<andrewtoth_> is that a low enough limit?

205

13:52

<amiti> so, there's an initial fixed limit (currently 3/4 block weight, could potentially change)

206

13:52

<ariard> something like mapRebroadcastPeriod to track tx stucking in the mempool and being constantly rebroadcast

207

13:53

<ariard> like insert at first rebroadcast and prune every 2 weeks

208

13:53

<amiti> and actually, the way the fee rate filter is implemented means that it will skip over recent txns (less than 30 min old), but likely still end up with a set thats total 3/4 block weight .. assuming there are enough "old" txns in the mempool

209

13:53

<amiti> so the cache mechanism is really important

210

13:54

<amiti> if anyone is interested, I posted this gist to describe the two ways (that I see) that the cache helps reduce the rebroadcast set

211

13:54

<amiti> https://gist.github.com/amitiuttarwar/17ddf44e28e3de896b9be0139621f6f9

212

13:54

<michaelfolkson> Cool

213

13:55

<andrewtoth_> amiti: thanks!

214

13:55

<amiti> ok. it looks like we have just 5 minutes left

215

13:55

<amiti> there's so much more we can talk about, but if you have any pressing questions please speak up

216

13:55

<michaelfolkson> Go for another hour? ;)

217

13:55

<amiti> I can also stick around for a while after

218

13:55

<amiti> hahahahhahah

219

13:56

<ariard> last point I'm not sure about the GETDATA mechanism

220

13:56

<ariard> like it seems a weird ACK mechanism

221

13:56

<amiti> yes! I have concerns about that as well

222

13:56

<amiti> ariard: what are your hesitations?

223

13:57

<ariard> amiti: about the new m_unbroadcast_txids field

224

13:57

<ariard> what are you tryng to solve with this exactly ?

225

13:58

<amiti> ariard: from review notes- "This new logic could result in a problem: if a transaction is submitted locally and not immediately relayed, it could take a long time to be included in the rebroadcast set (for example, if its a low fee rate transaction), and thus a long time until it’s first seen on the network. To mitigate this, the mempool keeps track of transactions submitted locally via

226

13:58

<amiti> m_unbroadcast_txids. When it’s time to rebroadcast, any unbroadcast transactions will be added to the set to relay. A transaction is removed from this set when a peer requests is via a GETDATA message."

227

13:58

<ariard> it seems redudnant with wallet rebroadcast

228

13:59

<lightlike> there would also be a possibility to actively query for the tx (like asking a new peer for the tx after some time, to which we haven't INVed it before). Althought that would possibly also be less than perfect for privacy.

229

13:59

<ariard> amiti: "and thus a long time until it's first seen on the network"

230

13:59

<ariard> amiti: you mean wallet rebroadcast timer here of 24h?

231

14:00

<amiti> ariard: not quite. here's the scenario: you disable p2p. you form a low fee rate txn and submit to your mempool. you re-enable p2p. currently (before this PR) the rebroadcast mechanism would pick it up and send it out for initial broadcast. with these changes, it might be a long while before it hits the conditions to be rebroadcast to the network

232

14:00

<amiti> ariard: the 24h wallet timer is how frequently the wallet tries to submit it to your local mempool. it will be a no-op if the txn is already in the mempool.

233

14:00

<amiti> thats only useful if the txn got dropped from your own mempool.

234

14:01

<ariard> amiti: hmm but isn't the issue here is than your tx is low feerate

235

14:01

<amiti> lightlike: you are suggesting an active query to ensure the initial relay succeeded?

236

14:01

<amiti> ok! looks like the hour is up. I'll stick around and continue this convo, but thank you to everyone who came out :)

237

14:01

<amiti> (or stayed in at your computer I guess)

238

14:02

<jnewbery> Thanks amiti. Great meeting!

239

14:02

<amiti> ariard: why is it an issue to have a low fee rate txn? lol...

240

14:02

<jnewbery> #endmeeting

241

14:02

<ariard> amiti: because youre trying to prioritise this low feerate tx against your own rebroacast logic

242

14:02

<sanoj> Thanks amiti

243

14:02

<lightlike> amiti: "suggesting" is too much, just a spontaneous idea :-)

244

14:02

<lightlike> thanks amiti!

245

14:03

<amiti> lightlike: yeah totally :) along those lines, something I was considering is if you could use getting an INV for the txn to confirm that it has successfully propagated

246

14:03

<jonatack> Thanks amiti!

247

14:03

<emzy> Thanks all!

248

14:03

<sebastianvstaa> thanks amiti!

249

14:03

<emzy> Thanks amiti!

250

14:03

<ariard> lightlike: yes would be interesting to actively query rather than wait for a GETDATA which never comes for latency reason

251

14:04

<amiti> lightlike: but that hits some edge cases... like odd network graphs with poorly connected nodes

252

14:04

<amiti> ariard: if you never get any GETDATAs, your txn hasn't succesfully made it into any mempools... so you should keep sending it out. right?

253

14:05

<jesseposner> Thanks amiti!

254

14:06

<lightlike> ariard: if a GETDATA never comes (and we never sent the TX), there is 0% that the tx has made it, so that is a necessary condition. Actively querying an unrelated peer would give additional confidence that the tx has made it beyond our peers.

255

14:07

<ariard> lightlik: what's about seningd the tx with a BLOCKTXN?

256

14:08

<amiti> ariard: I'm not familiar with the BLOCKTXN message. I'll check it out.

257

14:08

<andrewtoth_> Thanks amiti!

258

14:08

<lightlike> ariard: wouldn't that mean that the tx has been mined?

259

14:08

<amiti> ok... yeah thats what I would expect

260

14:09

<amiti> but, your txn couldnt be mined if there were never any GETDATAs for it

261

14:10

<ariard> amiti: apart of you're the miner, it may be a weird edge case, I'm just not sure about relying on GETDATA as ACK

262

14:11

<michaelfolkson> I have a couple of quick questions too. 1) How do you prevent all the unnecessary rebroadcasting in the case of empty blocks being mined? How commonly do empty blocks (or half empty blocks for no apparent reason) occur these days?

263

14:11

<lightlike> yes. I guess if you have bad peers (e.g. if you are eclipsed), they could send GETDATAS to you but not relay the tx further.

264

14:11

<amiti> yeah. also the way the unbroadcast set works is you only need _one_ GETDATA

265

14:11

<amiti> which I don't love.

266

14:12

<amiti> I hash it out more here: https://github.com/bitcoin/bitcoin/pull/16698#discussion_r348840194

267

14:13

<amiti> michaelfolkson: the case of empty blocks is a good point. the cache is much less useful if fee rates are only increasing (aka no blocks come in, or empty block), but its not totally useless.

268

14:14

<amiti> but thanks for raising. I'll want to monitor on the network and observe what actually happens.

269

14:14

<ariard> amiti: yes was reading it rn, btw you may collect all concerns raised there and on the PR and propose the subject at a meeting to go forward faster :)

270

14:14

<michaelfolkson> Could you incorporate into the rebroadcast logic whether blocks are full or not?

271

14:15

<amiti> ariard: sorry, what?

272

14:15

<amiti> michaelfolkson: hm, I guess that could be possible. what is "full" though?

273

14:15

<andrewtoth_> weight of close to 4M

274

14:16

<amiti> right. we'd have to define how close is close enough.

275

14:16

<michaelfolkson> Yeah something like this. Only rebroadcast when the block(s) are 80-100% full

276

14:16

<amiti> def possible. I'll keep it in mind and share what I observe from the network

277

14:16

<amiti> I guess it also depends on how frequent empty blocks are

278

14:16

<michaelfolkson> The important thing is just not to regularly rebroadcast when blocks are empty or less than 50% full

279

14:17

<michaelfolkson> That's most likely the reason the transaction isn't in the block

280

14:17

<jonatack> amiti: what are your preferred ways or sites to observe the network?

281

14:17

<amiti> I mean, the main one I'm working on right now is running a node with my PR & extra verbose logging to see what the impact will be.

282

14:17

<andrewtoth_> if a block is less than 50% full but not close to empty, that just means that there isn't enough in miners' mempools. In those situations, everything in the mempool should be rebroadcasted

283

14:18

<michaelfolkson> You can't know that for sure?

284

14:18

<ariard> amiti: maybe ask to talk about your PR during a core-dev meeting to let people go through all issues at once

285

14:18

<andrewtoth_> michaelfolkson: i suppose not...

286

14:19

<ariard> The scope of the PR is really wide, you may track also all questions/answers raised in a gist

287

14:19

<amiti> ariard: ah, gotcha. good ideas

288

14:19

<michaelfolkson> Why is the rationale for an empty block or a block with one transaction or a block that is half full any different?

289

14:20

<andrewtoth_> it's just based on past behaviour of miners. Either there is only coinbase plus a few other txs, which usually happens when a new block comes in before a new template is formed, or the blocks have the entire mempool of the miner

290

14:20

<jonatack> amiti: further to ariard's suggestion, you could possibly propose the PR at the weekly irc for inclusion into "high-priority chasing concept acks"

291

14:20

<michaelfolkson> Ok we should let you go amiti. Thanks for staying longer. One last question. You said in a presentation you'd written lots and lots of tests. Did a lot of them not make it into the PR?

292

14:21

<amiti> ahahahhahaha

293

14:21

<amiti> michaelfolkson: these are not enough tests for you??

294

14:21

<amiti> most of the time I've spent on this PR is on those integration tests

295

14:21

<amiti> and oof based on some of the review comments still looks like there is some flakiness

296

14:21

<jonatack> amiti: e.g. https://github.com/bitcoin/bitcoin/projects/8

297

14:21

<michaelfolkson> I just wondered if you had a lot more that didn't make it in

298

14:21

<amiti> nope =P

299

14:22

<michaelfolkson> Ah ok :)

300

14:22

<amiti> they are just difficult & time consuming to get the conditions right

301

14:22

<amiti> esp since timing & p2p stuff

302

14:22

<amiti> so maybe saying "lots" was misleading

303

14:22

<michaelfolkson> Ok cool. Thanks a lot. Great session and great work

304

14:22

<amiti> thanks !

305

14:22

<amiti> jonatack: thanks for the suggestion

306

14:22

<jonatack> I believe you with the tests and plan to dig into them. Should learn a lot there given the work you've put into them.

307

14:24

<jonatack> Great job with the (well-attended!) session +1

308

14:24

<amiti> jonatack: thanks for the feedback around flakiness. I've been working really hard to have them reliable (& they are for me). I'll look into it, but can I recruit you for some debugging?

309

14:25

<jonatack> amiti: yep

310

14:27

<jonatack> (and fwiw, looking over your filters gist i could see it as docs in either the your code and/or your tests)

311

14:30

<amiti> oh interesting. docs in the code itself....

312

14:32

<jonatack> like suhas for instance, he does it well

313

14:32

<amiti> yeah I agree. I definitely love how suhas comments the code

314

14:32

<amiti> the gist feels less directly related to the code, cause its more of my interpretation of what I expect to happen.

315

14:32

<jonatack> and some of the functional tests have really great docs, ascii art binary trees etc

316

14:33

<amiti> only observing the real network can indicate what will actually happen

317

14:36

<instagibbs> <lightlike as-is the wallet can fail to get it into your *own* mempool, the rebroadcast is necessary to make sure it actually gets broadcast once

318

14:40

<jonatack> amiti: hmm mempool_rebroadcast.py passed twice, then failed at line 171 again on the 3rd try. please ping me anytime about your tests. happy to look and i'll learn things

319

14:40

<michaelfolkson> Can you provide link(s) jonatack to Suhas test docs?

320

14:40

<michaelfolkson> I wouldn't mind looking at them

321

14:40

<amiti> jonatack: thanks I'll check it out. I've been hunting down these sources of flakiness. Might ask you to send me some logs.

322

14:43

<jonatack> michaelfolkson: wrt suhas we were talking about code docs, e.g. in https://github.com/bitcoin/bitcoin/pull/16400

323

14:43

<jonatack> amiti: sure np, lmk

324

14:44

<michaelfolkson> "some of the functional tests have really great docs, ascii art binary trees etc"

325

14:45

<jonatack> amiti: (esp if i'm seeing test timeouts and you're not)

326

14:45

<jonatack> michaelfolkson: oh, those aren't by suhas afaik, and i don't recall offhand which ones but if you look at the functional tests you'll come across them

327

14:46

<michaelfolkson> Ok thanks

328

14:49

<jonatack> michaelfolkson: wallet_listsinceblock.py

329

14:50

<michaelfolkson> ?

330

14:50

<jonatack> michaelfolkson: e.g. the code comments in test_reorg() and test_double_spend()