Rework rebroadcast logic to improve privacy (mempool, p2p, wallet)

https://github.com/bitcoin/bitcoin/pull/16698

Host: amitiuttarwar  -  PR author: amitiuttarwar

Notes

  • Currently, a node only rebroadcasts a transaction if it is the originating wallet, which is very bad for privacy. These changes propose extracting the rebroadcast logic from the wallet into the mempool and having each node rebroadcast a set of transactions it thinks should have been mined by now.

  • Should have been mined is a tricky concept to define. At a high level, this implementation determines what should have been mined as follows:
    1. Every ~20 mins, run a fee rate cache that computes the top of the mempool and stores the min package fee rate for a transaction to be included.
    2. Rebroadcast events are triggered by a random poisson timer. When the timer pops, check if there has been a new block since the cache last ran. If there hasn’t been a new block, bump the timers to try and hit those conditions in the next run. If the conditions have been met this round, continue with the rebroadcast logic
    3. Calculate the set of transactions that are at the top of the mempool and >30 mins old.
    4. Filter out any transactions with fee rate < cached fee rate.
    5. Rebroadcast the remaining set.
  • To minimize rebroadcast noise, we want the order of events to be -> cache run | block | rebroadcast. If we run the cache and then rebroadcast (without a block in between), the cache is not as meaningful. Thus we store the block height & compare to chain tip before rebroadcasting.

  • With these changes, the division of responsibility between the wallet & the node (mempool) are as follows:
    • The wallet ensures the (local) mempool knows about the transaction by periodically attempting to resubmit that transaction.
    • The mempool applies rebroadcast logic to ensure that the network knows about any transactions it believes should have already been mined.
  • This new logic could result in a problem: if a transaction is submitted locally and not immediately relayed, it could take a long time to be included in the rebroadcast set (for example, if it’s a low fee rate transaction), and thus a long time until it’s first seen on the network. To mitigate this, the mempool keeps track of transactions submitted locally via m_unbroadcast_txids. When it’s time to rebroadcast, any unbroadcast transactions will be added to the set to relay. A transaction is removed from this set when a peer requests is via a GETDATA message.

  • Reviewer notes: A good place to get started is by reading the commit messages. A great way to add value would be to think through or observe the interactions between the different mechanisms introduced.

Questions

  1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK? (Don’t forget to put your PR review on GitHub.)

  2. Did you take any steps beyond reading the code?

  3. This PR claims to improve privacy. Do you agree? What are privacy leaks that would remain after these changes? Can you describe the attack vectors?

  4. What are some of the tradeoffs / costs incurred in this implementation?

  5. What mechanisms prevent extreme bandwidth spikes on the network?
    • Why is the cache less meaningful if theres no block between cache & rebroadcast?
    • How does the cache interact with the recency filter to reduce rebroadcast set?
  6. How are the functional tests? What is not tested? What are other ways of testing these changes?

Meeting Log

  113:00 <jnewbery> #startmeeting
  213:00 <b10c> hi
  313:00 <jnewbery> hi everyone!
  413:00 <michaelfolkson> Hi
  513:00 <lightlike> hi
  613:00 <mattcruzz> hi
  713:00 <emzy> HI
  813:00 <nehan> hi
  913:00 <fjahr> Hi
 1013:00 <fanquake> hi
 1113:00 <jonatack> hi
 1213:00 <ariard_> hi
 1313:00 <sebastianvstaa> hi
 1413:00 <jnewbery> Quick show of hands: who's here for the first time o/
 1513:00 <mattcruzz> My first time
 1613:00 <sanoj> hi
 1713:00 <jnewbery> welcome mattcruzz!
 1813:00 <amiti> hi all! welcome :)
 1913:01 <mattcruzz> Thanks!
 2013:01 <jonatack> mattcruzz: nice!
 2113:01 <jnewbery> Today's PR is 16698: Rework rebroadcast logic to improve privacy
 2213:01 <jnewbery> Notes and questions are in the normal place: https://bitcoincore.reviews/16698.html
 2313:02 <jnewbery> amiti, who authored the PR is hosting the meeting today. Thanks amiti!
 2413:02 <andrewtoth_> hi
 2513:02 <jnewbery> I'll hand straight over to you
 2613:02 <amiti> cool, thanks everyone for coming
 2713:02 <amiti> who got a chance to review the pr? how about a quick round of y / n from everyone
 2813:03 <sanoj> y
 2913:03 <ariard_> yes still on it
 3013:03 <jnewbery> y (high-level review. Not detailed)
 3113:03 <sebastianvstaa> still on it
 3213:03 <andrewtoth_> y
 3313:03 <jonatack> y, just for an hour though, need more time on it
 3413:03 <emilengler> A bit
 3513:03 <nehan> some, still working on it
 3613:03 <michaelfolkson> y (close to a concept ACK but have a few questions to get me over the line)
 3713:03 <b10c> n
 3813:03 <lightlike> short look today, longer look some months ago.
 3913:03 <mattcruzz> n, just wanted to sit in
 4013:03 <fjahr> N
 4113:03 <emzy> n
 4213:04 <amiti> ok awesome!
 4313:04 <amiti> one thing before we get started is a quick reminder about etiquette- all questions are welcome and feel free to just jump in anytime to ask your question. we'll get to answering it :)
 4413:04 <amiti> so to kick us off- can someone give a quick summary of the PR?
 4513:05 <emilengler> Previously, a node would only rebroadcast a transaction if it is the originating wallet, which is very bad for privacy. These changes propose extracting the rebroadcast logic from the wallet into the mempool and having all nodes rebroadcast a set of transactions it thinks should have been mined by now.
 4613:05 <emilengler> From the notes on bitcoincore.reviews
 4713:06 <sebastianvstaa> so what is the definition of "should have been mined by now"?
 4813:06 <michaelfolkson> HIgh enough fee
 4913:06 <andrewtoth_> there have been blocks mined since it's been seen
 5013:06 <emilengler> sebastianvstaa: That they have at least 1 confirmation I believe
 5113:06 <pinheadmz> hi! didnt get to reviw, just gonna lurk today
 5213:07 <sebastianvstaa> ok thats 3 suggestions
 5313:07 <nehan> reworded: nodes will rebroadcast all transactions which match a certain set of conditions and have not been mined yet. the wallet will no longer rebroadcast "my" txns, instead they will be passed to the node to handle.
 5413:07 <michaelfolkson> In theory the transactions with the highest fees per vbyte should be mined in the next block
 5513:07 <sebastianvstaa> michaelfolkson: ok. high enough fee compared to what?
 5613:07 <amiti> emilengler, nehan: yup!
 5713:07 <emilengler> michaelfolkson: There are miners who mine empty blocks
 5813:08 <amiti> sebastianvstaa: the majority of this implementation is trying to answer the question of what "should have been mined" - can anyone give an overview of the mechanisms implemented?
 5913:08 <emilengler> Last time I looked at the BC there were quite a few but this was ~1 year ago
 6013:08 <michaelfolkson> emilengler: Indeed which is one of my later questions
 6113:09 <michaelfolkson> If there was a league table of all the transactions in your mempool from highest fee per vbyte to lowest fee per vbyte you would expect the highest transactions to be mined first until the block is full
 6213:09 <andrewtoth_> the PR caching a fee rate, and looks at the top 3/4 of the mempool that are above that feerate, and if a block has been mined since it's been seen. I believe that's how the PR answers the question
 6313:09 <sebastianvstaa> ah i see.
 6413:10 <amiti> andrewtoth_: yup! that is one of the main mechanisms
 6513:11 <amiti> taking a step back, as michaelfolkson is saying, you conceptually expect the txns at the top of your mempool (highest fee rates) to be mined into blocks
 6613:11 <jnewbery> emilengler: I see you've just left a code-style comment on the PR. All review is welcome, but I think at this stage higher-level conceptual review is more useful (the code is likely to move around during the review process so style nits add a lot of noise)
 6713:12 <amiti> the rebroadcast mechanism is meant to help reconcile if thats not happening. maybe there was a network failure or the transaction was at one point dropped from other mempools
 6813:12 <emilengler> jnewbery: Sure, it was just catching my eye ;)
 6913:12 <nehan> amiti: do you know if this happens in practice?
 7013:13 <emilengler> Also is it a good idea to rely on fees? It is in the complete control of the miners and like michaelfolkson also noted, what is with empty blocks?
 7113:14 <nehan> anyone: i'm trying to learn a bit more about user-rebroadcast expectations to get a feel for if the timing makes sense. i'm not sure what users expect from the wallet in terms of rebroadcast
 7213:14 <andrewtoth_> if your node is configured to accept longer ancestor chains, for instance, then this will ensure that longer chains are also rebroadcast after descendents are mined. That way peers with different configurations will get the other txs after some time.
 7313:14 <amiti> nehan: I'm currently working on running a node with my patch to observe the current network conditions, but there are definite cases where this could happen. Eg. in 2017 competitive fee rate market, transactions would get evicted from normal sized mempools, and if one of those transactions were yours, you would want to rebroadcast it so it could later be mined.
 7413:15 <amiti> emilengler: I dont understand your questions. Fees are in complete control of the miners? And what about empty blocks?
 7513:15 <nehan> andrewtoth_:could you explain what you mean by "ancestor" and "descendant"? how can a descendant be mined before an ancestor?
 7613:15 <emilengler> amiti: Oh sorry I don't meant fees, I meant the transactions that are included in a block
 7713:15 <nehan> amiti: thanks!
 7813:15 <emilengler> And therefore the problem with empty blocks
 7913:15 <andrewtoth_> emilengler: this PR actually constructs a block template, like miners do if they run core, so it uses the same logic to figure out what should be included
 8013:16 <ariard_> nehan: have a look -limitancestorcount and -limitdescendantscunt
 8113:16 <michaelfolkson> In the case of multiple empty blocks mined there could be a lot of unnecessary transaction rebroadcasting?
 8213:17 <ariard_> ancestors and descendants are more or less in-mempool unconfirmed parents/children
 8313:17 <amiti> michaelfolkson: yeah, I think thats true. Can anyone explain why?
 8413:17 <emzy> What if you would like to have the transaction time out from the mempool? You can't do that if everyone is rebroadcasting. Not sure if this is relevant.
 8513:18 <jnewbery> nehan: the default config is to only allow transaction packages up to size 25 transactions / 101Kb, but that is configuarble. If I change that to eg 50 transactions, I could have a long chain of transaction in my mempool that other nodes don't have (since they'll reject anything beyond the 25th in the chain).
 8613:18 <lightlike> one nice side effect of this PR might be that newly connected nodes get a good representation of the mempool faster (because other nodes will INV older high-fee transactions to them if they think that they should have been mined)
 8713:18 <ariard_> and so quicker running fee estimation
 8813:19 <amiti> emzy: once you broadcast a transaction, you should never expect it to "time out". Users could be running mempools of different sizes and still have that transaction. Or a random user could be rebroadcasting all transactions (running a patch). if you want to reliably "cancel" the transaction, you would have to spend the inputs in a different transaction.
 8913:19 <jnewbery> with this PR, once [some of] those first 25 transactions are mined and my node thinks the later transactions should be mined, my node will rebroadcast them
 9013:19 <emilengler> emzy: You mean that nodes could get down if they receive too much traffic because of so much rebroadcasting?
 9113:20 <andrewtoth_> amiti: i don't believe that would work if the tx is not signaling RBF. I recall a PR to introduce a removemempoolentry RPC. Perhaps that would be useful if this change is merged.
 9213:20 <michaelfolkson> lightlike: A newly connected node would receive the transactions before any periodic rebroadcasting?
 9313:20 <emzy> emilengler: no. amiti got my question right.
 9413:21 <nehan> jnewbery: got it, thanks
 9513:21 <andrewtoth_> perhaps there could be a limit to how many txs get rebroadcast, to limit worst case bandwidth use. Not sure what implications that would have though
 9613:21 <jnewbery> andrewtoth_: the scenario we're talking about is where we want to replace transactions in other node's mempools. removemempoolentry wouldn't have any effect here.
 9713:22 <andrewtoth_> jnewbery: yes, i understand. Then this seems like an issue if a user does want to have their tx canceled, say if they broadcast at 1 sat/byte but fee rates remain high for several weeks
 9813:22 <jnewbery> I think a change like https://github.com/bitcoin/bitcoin/pull/16409 (Remove mempool expiry, treat txs as replaceable instead) is more relevant here
 9913:23 <lightlike> michaelfolkson: I don't think so. Afaik, if the tx are older and most of the network have them already in their mempools, they currently would not periodically INV them, so the new node would only learn of them once they have been included in a block.
10013:23 <jnewbery> (I'm not suggesting there's any dependency between this PR and that one - just that in the scenario you're describing, they're both relevant)
10113:24 <andrewtoth_> jnewbery: thanks for bringing that to my attention
10213:25 <amiti> andrewtoth_: trying to "cancel" your transaction is different than trying to bump the fee to get it to confirm. since many mempool policies are configurable, even if your transaction would have been evicted / dropped / expired from the standard mempool, there may be other mempools that still have it (for example, that of a miners)
10313:25 <amiti> does that make sense?
10413:25 <amiti> ok lets get started with the questions =P
10513:26 <andrewtoth_> amiti: sure thanks
10613:26 <amiti> This PR claims to improve privacy. Do you agree? What are privacy leaks that would remain after these changes? Can you describe the attack vectors?
10713:26 <amiti> first part- do you agree that this PR would improve privacy? maybe some more quick y / n ?
10813:26 <michaelfolkson> y
10913:26 <emilengler> y
11013:26 <andrewtoth_> y
11113:27 <jnewbery> potentially y
11213:27 <amiti> can anyone describe a privacy leak / attack vector that would remain after these changes?
11313:27 <emzy> y
11413:27 <ariard> hmmm I think there is an assumption than your tx has previously flooded
11513:27 <michaelfolkson> Oh interesting jnewbery not convinced
11613:27 <sebastianvstaa> y
11713:27 <amiti> ariard: can you elaborate?
11813:28 <jonatack> potentially an improvement, but I am wary of opening new privacy leaks or unexpected x-order effects
11913:28 <ariard> because if you rebroadcast set <A,B,C> and let's say B hasn't been previously announced you may deduce rebroadcaster is near or is the origin node
12013:28 <nehan> you are still the only node constantly relaying transactions that haven't confirmed yet from your wallet
12113:28 <nehan> so i'm not sure how much this improves privacy
12213:28 <ariard> in practice, I think the min age limit somewhere should avoid this issue
12313:29 <amiti> ariard: hm, but if B wasn't previously announced, a peer wouldn't be able to identify that its a rebroadcast, so wouldn't it look like an initial relay?
12413:29 <jnewbery> michaelfolkson: I haven't reviewed it yet. privacy is a slippery thing. This PR potentially could be a nice improvement, but the devil is always in the details.
12513:29 <ariard> amiti: exactly that's my point it may look like an initial relay with all its pitfalls
12613:30 <andrewtoth_> nehan: this PR explicity changes that so your node is *not* the only one constantly relaying your own txs
12713:30 <amiti> ariard: ah, yes, this PR doesn't do anything to improve the privacy around initial relay.
12813:30 <nehan> andrewoth_: only if they make it into the mempool
12913:30 <nehan> no?
13013:30 <ariard> amiti: yes initial relay isn't great in term of privacy
13113:31 <nehan> if your wallet's txns are not accepted into other nodes' mempools, they will not rebroadcast them? shouldn't have used the word "confirmed"
13213:31 <amiti> nehan: hm interesting, but if your txn isn't accepted into other nodes' mempools, it wouldn't be able to be mined?
13313:31 <ariard> so yes this PR seems to improve privacy but it's bounded by initial relay leaks
13413:32 <sebastianvstaa> this would be lot more private with the dandelion protocol
13513:32 <nehan> amiti: it wouldn't be able to be mined right now, true.
13613:32 <emilengler> What's an inital relay? The inital sync?
13713:33 <andrewtoth_> emilengler: the first time a tx is broadcast from the original node
13813:33 <emilengler> thx
13913:33 <jonatack> An interesting paper just came out about txn clustering https://s-tikhomirov.github.io/transaction-clustering-bitcoin/
14013:33 <amiti> nehan: so your point is true, if you are broadcasting a txn that you believe is valid but it gets rejected by other mempools, you would expect it to be mined and keep rebroadcasting it
14113:34 <amiti> this has interesting implications for software upgrades that tighten the mempool acceptance criteria
14213:34 <andrewtoth_> this PR is not addressing initial relay, it's addressing rebroadcasting after txs are already in other mempools
14313:34 <nehan> amiti: exactly, if my mempool was larger, for example, I think?
14413:34 <andrewtoth_> nehan: also for the case where your ancestor chain configuration is longer than others
14513:34 <jnewbery> this PR doesn't claim to improve initial tx relay. I think it might be useful to just restate what this PR does improve. Currently, if you have a connection to a peer and you see the same transaction INVed from that peer more than once, you can have a very high confidence that the wallet attached to that peer owns the transaction. It's like a big flashing beacon saying "THIS TRANSACTION
14613:34 <jnewbery> IS MINE". This PR improves that
14713:35 <michaelfolkson> Hence the "re"broadcasting
14813:35 <amiti> nehan: hm, if my mempool was larger I'd be more likely to rebroadcast some of those txns (bc they would be evicted from smaller ones), but I don't think that addresses the policy mismatch that you were talking about
14913:35 <nehan> amiti: but once i get this txn that i think this mineable but others don't to another node that thinks it's mineable, it will also rebroadcast it.
15013:36 <mattcruzz> jnewbery: what does "INVed" mean?
15113:37 <ariard> jnewbery: agree but if your transaction didn't propagate on the network at initial relay, if it appears among a rebroadcast set from a node A I think you can assume that's a wallet tx issued by A
15213:37 <jnewbery> mattcruz: when we learn about a transaction, we tell our peers about it using an INV message with the txid. If the peer wants the tx they'll reply with a GETDATA, and we'll then send them the TX message
15313:37 <mattcruzz> jnewbery: got it, thanks for the explanation
15413:38 <jnewbery> ariard: only if you're the only peer connected to node A
15513:38 <amiti> nehan: yes. so I think the clear example of this is with a software upgrade. say nodes A, B, C are on an old policy, and nodes X, Y, Z have upgraded. node A forms a txn that gets rejected by the upgraded nodes, but B and C will accept them. When blocks come in and the fee rate indicates this txn should have been mined, all of the old nodes would rebroadcast it.
15613:38 <sanoj> mattcruzz: more details here -> https://en.bitcoin.it/wiki/Network#Standard_relaying
15713:38 <nehan> amiti: ah i see what you're saying. yes. is this bad?
15813:39 <amiti> nehan: I mean. its not great....
15913:39 <lightlike> I don't completely understand why the rebroadcast per se is so important, also in its current form. Would it lead to a significant error rate if we just trusted that the tx has flooded to the network after getting enough GETDATAs to our initial send, and forget about it (i.e. just remove the rebroadcasting)?
16013:39 <amiti> nehan: but its a problem that already exists today
16113:39 <mattcruzz> sanoj: thanks, I'll read up on it
16213:39 <amiti> ariard: I don't follow. why would you be able to assume its a wallet tx issued by A?
16313:39 <michaelfolkson> mattcruzz: short for inventory message
16413:40 <ariard> jnewbery: if you're connected to neighboors of node A, you may know also what have been announced to A and make analysis from then
16513:40 <amiti> ariard: you wouldn't know that it was specifically relayed because of being selected as a "rebroadcast". so it would just look like the initial relay.
16613:40 <michaelfolkson> lightlike: It would impact your chances of getting your transaction mined in the next block (or an upcoming block)
16713:41 <jnewbery> ariard: we'd expect node A's neighbours to also rebroadcast (after this change)
16813:41 <amiti> ariard: I'm unclear if you are trying to highlight an issue with the rebroadcast logic or the initial broadcast logic. Are you claiming that the rebroadcast policy introduces an additional leak? Or that the guarantees are no better than the initial relay policy?
16913:42 <amiti> ok! lets move onto the next question
17013:42 <amiti> What are some of the tradeoffs / costs incurred in this implementation?
17113:42 <nehan> i need to take off early -- thank you amiti! i'll try to read more later.
17213:43 <ariard> amiti: would lean to second option and even further, this PR for sure improve rebroacast logic privacy, but you may still have leak due of rebroadcast being in fact an initial relay
17313:43 <amiti> nehan: thanks for coming & your questions! let me know if you have more questions at any time :)
17413:43 <michaelfolkson> amiti: Privacy, bandwidth, probability of getting a transaction mined in next block
17513:43 <andrewtoth_> I think the main cost is rebroadcasting a lot of transactions that aren't rebroadcast with current behaviour. This could potentially be a large bandwidth increase.
17613:43 <emzy> More network traffic that is redundant.
17713:44 <ariard> amiti: IMO it's not worse just lower bounded by initial relay own leaks
17813:44 <amiti> lightlike: GETDATAs can indicate that your transaction has initially been propagated out to the network, but there are other reasons that the txn might no longer be in other mempools. Eg. evicted based on fee rate. Or expired after 2 weeks. etc.
17913:44 <jnewbery> It does seem problematic that if there's a softfork or tightening of policy rules, unupgraded nodes will continue to ping transactions that are no longer consensus- or policy- valid between themselves.
18013:45 <amiti> yup! the biggest downside of this implementation is the potential bandwidth impact, which is something that needs to be evaluated very carefully before these changes could be accepted
18113:45 <andrewtoth_> question - with erlay does this pr not become redundant? If we just use our entire mempool as the sketch, then we no longer have to send INVs for unconfirmed.
18213:45 <michaelfolkson> I was waiting for the Erlay question :)
18313:46 <jnewbery> andrewtoth_: erlay doesn't propose to use the entire mempool for the sketch. Just the transactions received over a recent window
18413:46 <gleb> Sketching an entire mempool is not trivial, I don't have a solution for it now. It would be a separate project,
18513:46 <andrewtoth_> jnewbery: ahh i see, so then the bandwidth problem will go away though, if we just include the should have been confirmed txs in the sketch every so often. no invs needed
18613:46 <michaelfolkson> In a post Erlay world, nodes know which transactions to forward to which peers
18713:47 <gleb> michaelfolkson: I'm not sure I understand you :)
18813:47 <michaelfolkson> I don't understand you either lol. What do you mean by an "entire" mempool?
18913:47 <gleb> By the way guys, I think Erlay is really orthogonal, so I think we should focus on something else here.
19013:48 <michaelfolkson> Erlay will work out what transactions a peer is missing. So won't need to blindly rebroadcast to all peers?
19113:48 <amiti> so, since bandwidth is the biggest concern.. what mechanisms are implemented to prevent huge rebroadcast sets?
19213:48 <andrewtoth_> could there be a limit on how many txs get rebroadcasted each time, so the bandwidth would be limited? and not include the same txs for subsequent rebroadcasts?
19313:48 <jnewbery> gleb: +1
19413:48 <ariard> jnewbery: you think unupgraded nodes should prune old txn of their mempool to avoid this issue?
19513:49 <michaelfolkson> I will ask you offline gleb
19613:49 <amiti> ariard: can you describe more? I'm definitely interested in any ideas to improve the rebroadcasting no-longer-policy-valid-txns issue
19713:50 <jnewbery> ariard: potentially. With the current logic, unless the originating wallet is rebroadcasting, then the tx will be expired from the mempool after two weeks. If all unupgraded nodes are rebroadcasting, they'll keep refreshing the tx in each other's mempools
19813:50 <gleb> I think asking unupgraded nodes to do something is impractical. We should assume that those nodes might not have an active maintainer to accomplish such tasks
19913:51 <lightlike> there is a fixed limit (3/4 block weight), plus the fee rate filter.
20013:51 <amiti> thank you lightlike :)
20113:51 <jnewbery> gleb: sorry. In this case, 'unupgraded' refers to nodes that have this PR, but don't have some future consensus or policy change
20213:51 <ariard> gleb: it would be unupgraded node starting from this patch
20313:51 <andrewtoth_> ahh i see
20413:51 <andrewtoth_> is that a low enough limit?
20513:52 <amiti> so, there's an initial fixed limit (currently 3/4 block weight, could potentially change)
20613:52 <ariard> something like mapRebroadcastPeriod to track tx stucking in the mempool and being constantly rebroadcast
20713:53 <ariard> like insert at first rebroadcast and prune every 2 weeks
20813:53 <amiti> and actually, the way the fee rate filter is implemented means that it will skip over recent txns (less than 30 min old), but likely still end up with a set thats total 3/4 block weight .. assuming there are enough "old" txns in the mempool
20913:53 <amiti> so the cache mechanism is really important
21013:54 <amiti> if anyone is interested, I posted this gist to describe the two ways (that I see) that the cache helps reduce the rebroadcast set
21113:54 <amiti> https://gist.github.com/amitiuttarwar/17ddf44e28e3de896b9be0139621f6f9
21213:54 <michaelfolkson> Cool
21313:55 <andrewtoth_> amiti: thanks!
21413:55 <amiti> ok. it looks like we have just 5 minutes left
21513:55 <amiti> there's so much more we can talk about, but if you have any pressing questions please speak up
21613:55 <michaelfolkson> Go for another hour? ;)
21713:55 <amiti> I can also stick around for a while after
21813:55 <amiti> hahahahhahah
21913:56 <ariard> last point I'm not sure about the GETDATA mechanism
22013:56 <ariard> like it seems a weird ACK mechanism
22113:56 <amiti> yes! I have concerns about that as well
22213:56 <amiti> ariard: what are your hesitations?
22313:57 <ariard> amiti: about the new m_unbroadcast_txids field
22413:57 <ariard> what are you tryng to solve with this exactly ?
22513:58 <amiti> ariard: from review notes- "This new logic could result in a problem: if a transaction is submitted locally and not immediately relayed, it could take a long time to be included in the rebroadcast set (for example, if its a low fee rate transaction), and thus a long time until it’s first seen on the network. To mitigate this, the mempool keeps track of transactions submitted locally via
22613:58 <amiti> m_unbroadcast_txids. When it’s time to rebroadcast, any unbroadcast transactions will be added to the set to relay. A transaction is removed from this set when a peer requests is via a GETDATA message."
22713:58 <ariard> it seems redudnant with wallet rebroadcast
22813:59 <lightlike> there would also be a possibility to actively query for the tx (like asking a new peer for the tx after some time, to which we haven't INVed it before). Althought that would possibly also be less than perfect for privacy.
22913:59 <ariard> amiti: "and thus a long time until it's first seen on the network"
23013:59 <ariard> amiti: you mean wallet rebroadcast timer here of 24h?
23114:00 <amiti> ariard: not quite. here's the scenario: you disable p2p. you form a low fee rate txn and submit to your mempool. you re-enable p2p. currently (before this PR) the rebroadcast mechanism would pick it up and send it out for initial broadcast. with these changes, it might be a long while before it hits the conditions to be rebroadcast to the network
23214:00 <amiti> ariard: the 24h wallet timer is how frequently the wallet tries to submit it to your local mempool. it will be a no-op if the txn is already in the mempool.
23314:00 <amiti> thats only useful if the txn got dropped from your own mempool.
23414:01 <ariard> amiti: hmm but isn't the issue here is than your tx is low feerate
23514:01 <amiti> lightlike: you are suggesting an active query to ensure the initial relay succeeded?
23614:01 <amiti> ok! looks like the hour is up. I'll stick around and continue this convo, but thank you to everyone who came out :)
23714:01 <amiti> (or stayed in at your computer I guess)
23814:02 <jnewbery> Thanks amiti. Great meeting!
23914:02 <amiti> ariard: why is it an issue to have a low fee rate txn? lol...
24014:02 <jnewbery> #endmeeting
24114:02 <ariard> amiti: because youre trying to prioritise this low feerate tx against your own rebroacast logic
24214:02 <sanoj> Thanks amiti
24314:02 <lightlike> amiti: "suggesting" is too much, just a spontaneous idea :-)
24414:02 <lightlike> thanks amiti!
24514:03 <amiti> lightlike: yeah totally :) along those lines, something I was considering is if you could use getting an INV for the txn to confirm that it has successfully propagated
24614:03 <jonatack> Thanks amiti!
24714:03 <emzy> Thanks all!
24814:03 <sebastianvstaa> thanks amiti!
24914:03 <emzy> Thanks amiti!
25014:03 <ariard> lightlike: yes would be interesting to actively query rather than wait for a GETDATA which never comes for latency reason
25114:04 <amiti> lightlike: but that hits some edge cases... like odd network graphs with poorly connected nodes
25214:04 <amiti> ariard: if you never get any GETDATAs, your txn hasn't succesfully made it into any mempools... so you should keep sending it out. right?
25314:05 <jesseposner> Thanks amiti!
25414:06 <lightlike> ariard: if a GETDATA never comes (and we never sent the TX), there is 0% that the tx has made it, so that is a necessary condition. Actively querying an unrelated peer would give additional confidence that the tx has made it beyond our peers.
25514:07 <ariard> lightlik: what's about seningd the tx with a BLOCKTXN?
25614:08 <amiti> ariard: I'm not familiar with the BLOCKTXN message. I'll check it out.
25714:08 <andrewtoth_> Thanks amiti!
25814:08 <lightlike> ariard: wouldn't that mean that the tx has been mined?
25914:08 <amiti> ok... yeah thats what I would expect
26014:09 <amiti> but, your txn couldnt be mined if there were never any GETDATAs for it
26114:10 <ariard> amiti: apart of you're the miner, it may be a weird edge case, I'm just not sure about relying on GETDATA as ACK
26214:11 <michaelfolkson> I have a couple of quick questions too. 1) How do you prevent all the unnecessary rebroadcasting in the case of empty blocks being mined? How commonly do empty blocks (or half empty blocks for no apparent reason) occur these days?
26314:11 <lightlike> yes. I guess if you have bad peers (e.g. if you are eclipsed), they could send GETDATAS to you but not relay the tx further.
26414:11 <amiti> yeah. also the way the unbroadcast set works is you only need _one_ GETDATA
26514:11 <amiti> which I don't love.
26614:12 <amiti> I hash it out more here: https://github.com/bitcoin/bitcoin/pull/16698#discussion_r348840194
26714:13 <amiti> michaelfolkson: the case of empty blocks is a good point. the cache is much less useful if fee rates are only increasing (aka no blocks come in, or empty block), but its not totally useless.
26814:14 <amiti> but thanks for raising. I'll want to monitor on the network and observe what actually happens.
26914:14 <ariard> amiti: yes was reading it rn, btw you may collect all concerns raised there and on the PR and propose the subject at a meeting to go forward faster :)
27014:14 <michaelfolkson> Could you incorporate into the rebroadcast logic whether blocks are full or not?
27114:15 <amiti> ariard: sorry, what?
27214:15 <amiti> michaelfolkson: hm, I guess that could be possible. what is "full" though?
27314:15 <andrewtoth_> weight of close to 4M
27414:16 <amiti> right. we'd have to define how close is close enough.
27514:16 <michaelfolkson> Yeah something like this. Only rebroadcast when the block(s) are 80-100% full
27614:16 <amiti> def possible. I'll keep it in mind and share what I observe from the network
27714:16 <amiti> I guess it also depends on how frequent empty blocks are
27814:16 <michaelfolkson> The important thing is just not to regularly rebroadcast when blocks are empty or less than 50% full
27914:17 <michaelfolkson> That's most likely the reason the transaction isn't in the block
28014:17 <jonatack> amiti: what are your preferred ways or sites to observe the network?
28114:17 <amiti> I mean, the main one I'm working on right now is running a node with my PR & extra verbose logging to see what the impact will be.
28214:17 <andrewtoth_> if a block is less than 50% full but not close to empty, that just means that there isn't enough in miners' mempools. In those situations, everything in the mempool should be rebroadcasted
28314:18 <michaelfolkson> You can't know that for sure?
28414:18 <ariard> amiti: maybe ask to talk about your PR during a core-dev meeting to let people go through all issues at once
28514:18 <andrewtoth_> michaelfolkson: i suppose not...
28614:19 <ariard> The scope of the PR is really wide, you may track also all questions/answers raised in a gist
28714:19 <amiti> ariard: ah, gotcha. good ideas
28814:19 <michaelfolkson> Why is the rationale for an empty block or a block with one transaction or a block that is half full any different?
28914:20 <andrewtoth_> it's just based on past behaviour of miners. Either there is only coinbase plus a few other txs, which usually happens when a new block comes in before a new template is formed, or the blocks have the entire mempool of the miner
29014:20 <jonatack> amiti: further to ariard's suggestion, you could possibly propose the PR at the weekly irc for inclusion into "high-priority chasing concept acks"
29114:20 <michaelfolkson> Ok we should let you go amiti. Thanks for staying longer. One last question. You said in a presentation you'd written lots and lots of tests. Did a lot of them not make it into the PR?
29214:21 <amiti> ahahahhahaha
29314:21 <amiti> michaelfolkson: these are not enough tests for you??
29414:21 <amiti> most of the time I've spent on this PR is on those integration tests
29514:21 <amiti> and oof based on some of the review comments still looks like there is some flakiness
29614:21 <jonatack> amiti: e.g. https://github.com/bitcoin/bitcoin/projects/8
29714:21 <michaelfolkson> I just wondered if you had a lot more that didn't make it in
29814:21 <amiti> nope =P
29914:22 <michaelfolkson> Ah ok :)
30014:22 <amiti> they are just difficult & time consuming to get the conditions right
30114:22 <amiti> esp since timing & p2p stuff
30214:22 <amiti> so maybe saying "lots" was misleading
30314:22 <michaelfolkson> Ok cool. Thanks a lot. Great session and great work
30414:22 <amiti> thanks !
30514:22 <amiti> jonatack: thanks for the suggestion
30614:22 <jonatack> I believe you with the tests and plan to dig into them. Should learn a lot there given the work you've put into them.
30714:24 <jonatack> Great job with the (well-attended!) session +1
30814:24 <amiti> jonatack: thanks for the feedback around flakiness. I've been working really hard to have them reliable (& they are for me). I'll look into it, but can I recruit you for some debugging?
30914:25 <jonatack> amiti: yep
31014:27 <jonatack> (and fwiw, looking over your filters gist i could see it as docs in either the your code and/or your tests)
31114:30 <amiti> oh interesting. docs in the code itself....
31214:32 <jonatack> like suhas for instance, he does it well
31314:32 <amiti> yeah I agree. I definitely love how suhas comments the code
31414:32 <amiti> the gist feels less directly related to the code, cause its more of my interpretation of what I expect to happen.
31514:32 <jonatack> and some of the functional tests have really great docs, ascii art binary trees etc
31614:33 <amiti> only observing the real network can indicate what will actually happen
31714:36 <instagibbs> <lightlike as-is the wallet can fail to get it into your *own* mempool, the rebroadcast is necessary to make sure it actually gets broadcast once
31814:40 <jonatack> amiti: hmm mempool_rebroadcast.py passed twice, then failed at line 171 again on the 3rd try. please ping me anytime about your tests. happy to look and i'll learn things
31914:40 <michaelfolkson> Can you provide link(s) jonatack to Suhas test docs?
32014:40 <michaelfolkson> I wouldn't mind looking at them
32114:40 <amiti> jonatack: thanks I'll check it out. I've been hunting down these sources of flakiness. Might ask you to send me some logs.
32214:43 <jonatack> michaelfolkson: wrt suhas we were talking about code docs, e.g. in https://github.com/bitcoin/bitcoin/pull/16400
32314:43 <jonatack> amiti: sure np, lmk
32414:44 <michaelfolkson> "some of the functional tests have really great docs, ascii art binary trees etc"
32514:45 <jonatack> amiti: (esp if i'm seeing test timeouts and you're not)
32614:45 <jonatack> michaelfolkson: oh, those aren't by suhas afaik, and i don't recall offhand which ones but if you look at the functional tests you'll come across them
32714:46 <michaelfolkson> Ok thanks
32814:49 <jonatack> michaelfolkson: wallet_listsinceblock.py
32914:50 <michaelfolkson> ?
33014:50 <jonatack> michaelfolkson: e.g. the code comments in test_reorg() and test_double_spend()
33114:50 <jonatack> michaelfolkson: in reply to your question
33214:51 <michaelfolkson> What PR is this?
33314:51 <michaelfolkson> http://wallet_listsinceblock.py
33414:52 <jonatack> michaelfolkson: it's a functional test in the repo, you can git blame it to see authors and commits
33514:53 <jonatack> gotta go
33614:53 <michaelfolkson> URL doesn't work for me
33714:53 <jonatack> it's in the repo with the other functional tests in: test/functional/wallet_listsinceblock.py
33814:55 <michaelfolkson> Ah ok thanks! See ya