Currently, a node only rebroadcasts a transaction if it is the originating
wallet, which is very bad for privacy. These changes propose
extracting the rebroadcast logic from the wallet into the mempool and having each node
rebroadcast a set of transactions it thinks should have been mined by now.
Should have been mined is a tricky concept to define. At a high
level, this implementation determines what should have been mined as follows:
Every ~20 mins, run a fee rate cache that computes the top of the mempool
and stores the min package fee rate for a transaction to be included.
Rebroadcast events are triggered by a random poisson timer.
When the timer pops, check if there has been a new block since the cache last
ran. If there hasn’t been a new block, bump the timers to try and
hit those conditions in the next run. If the conditions have been met this
round, continue with the rebroadcast logic
Calculate the set of transactions that are at the top of the mempool and >30 mins old.
Filter out any transactions with fee rate < cached fee rate.
Rebroadcast the remaining set.
To minimize rebroadcast noise, we want the order of events to be ->
cache run | block | rebroadcast. If we run the cache and then rebroadcast
(without a block in between), the cache is not as meaningful. Thus we store
the block height & compare to chain tip before rebroadcasting.
With these changes, the division of responsibility between the wallet & the
node (mempool) are as follows:
The wallet ensures the (local) mempool knows about the transaction by periodically
attempting to resubmit that transaction.
The mempool applies rebroadcast logic to ensure that the network knows
about any transactions it believes should have already been mined.
This new logic could result in a problem: if a transaction is submitted locally
and not immediately relayed, it could take a long time to be included in the
rebroadcast set (for example, if it’s a low fee rate transaction), and thus a long
time until it’s first seen on the network. To mitigate this, the mempool keeps
track of transactions submitted locally via m_unbroadcast_txids. When it’s
time to rebroadcast, any unbroadcast transactions will be added to the set to
relay. A transaction is removed from this set when a peer requests is via a
Reviewer notes: A good place to get started is by reading the commit messages.
A great way to add value would be to think through or observe the
interactions between the different mechanisms introduced.
<emilengler> Previously, a node would only rebroadcast a transaction if it is the originating wallet, which is very bad for privacy. These changes propose extracting the rebroadcast logic from the wallet into the mempool and having all nodes rebroadcast a set of transactions it thinks should have been mined by now.
<nehan> reworded: nodes will rebroadcast all transactions which match a certain set of conditions and have not been mined yet. the wallet will no longer rebroadcast "my" txns, instead they will be passed to the node to handle.
<michaelfolkson> If there was a league table of all the transactions in your mempool from highest fee per vbyte to lowest fee per vbyte you would expect the highest transactions to be mined first until the block is full
<andrewtoth_> the PR caching a fee rate, and looks at the top 3/4 of the mempool that are above that feerate, and if a block has been mined since it's been seen. I believe that's how the PR answers the question
<jnewbery> emilengler: I see you've just left a code-style comment on the PR. All review is welcome, but I think at this stage higher-level conceptual review is more useful (the code is likely to move around during the review process so style nits add a lot of noise)
<andrewtoth_> if your node is configured to accept longer ancestor chains, for instance, then this will ensure that longer chains are also rebroadcast after descendents are mined. That way peers with different configurations will get the other txs after some time.
<amiti> nehan: I'm currently working on running a node with my patch to observe the current network conditions, but there are definite cases where this could happen. Eg. in 2017 competitive fee rate market, transactions would get evicted from normal sized mempools, and if one of those transactions were yours, you would want to rebroadcast it so it could later be mined.
<jnewbery> nehan: the default config is to only allow transaction packages up to size 25 transactions / 101Kb, but that is configuarble. If I change that to eg 50 transactions, I could have a long chain of transaction in my mempool that other nodes don't have (since they'll reject anything beyond the 25th in the chain).
<lightlike> one nice side effect of this PR might be that newly connected nodes get a good representation of the mempool faster (because other nodes will INV older high-fee transactions to them if they think that they should have been mined)
<amiti> emzy: once you broadcast a transaction, you should never expect it to "time out". Users could be running mempools of different sizes and still have that transaction. Or a random user could be rebroadcasting all transactions (running a patch). if you want to reliably "cancel" the transaction, you would have to spend the inputs in a different transaction.
<lightlike> michaelfolkson: I don't think so. Afaik, if the tx are older and most of the network have them already in their mempools, they currently would not periodically INV them, so the new node would only learn of them once they have been included in a block.
<amiti> andrewtoth_: trying to "cancel" your transaction is different than trying to bump the fee to get it to confirm. since many mempool policies are configurable, even if your transaction would have been evicted / dropped / expired from the standard mempool, there may be other mempools that still have it (for example, that of a miners)
<jnewbery> this PR doesn't claim to improve initial tx relay. I think it might be useful to just restate what this PR does improve. Currently, if you have a connection to a peer and you see the same transaction INVed from that peer more than once, you can have a very high confidence that the wallet attached to that peer owns the transaction. It's like a big flashing beacon saying "THIS TRANSACTION
<amiti> nehan: hm, if my mempool was larger I'd be more likely to rebroadcast some of those txns (bc they would be evicted from smaller ones), but I don't think that addresses the policy mismatch that you were talking about
<ariard> jnewbery: agree but if your transaction didn't propagate on the network at initial relay, if it appears among a rebroadcast set from a node A I think you can assume that's a wallet tx issued by A
<jnewbery> mattcruz: when we learn about a transaction, we tell our peers about it using an INV message with the txid. If the peer wants the tx they'll reply with a GETDATA, and we'll then send them the TX message
<amiti> nehan: yes. so I think the clear example of this is with a software upgrade. say nodes A, B, C are on an old policy, and nodes X, Y, Z have upgraded. node A forms a txn that gets rejected by the upgraded nodes, but B and C will accept them. When blocks come in and the fee rate indicates this txn should have been mined, all of the old nodes would rebroadcast it.
<lightlike> I don't completely understand why the rebroadcast per se is so important, also in its current form. Would it lead to a significant error rate if we just trusted that the tx has flooded to the network after getting enough GETDATAs to our initial send, and forget about it (i.e. just remove the rebroadcasting)?
<amiti> ariard: I'm unclear if you are trying to highlight an issue with the rebroadcast logic or the initial broadcast logic. Are you claiming that the rebroadcast policy introduces an additional leak? Or that the guarantees are no better than the initial relay policy?
<amiti> lightlike: GETDATAs can indicate that your transaction has initially been propagated out to the network, but there are other reasons that the txn might no longer be in other mempools. Eg. evicted based on fee rate. Or expired after 2 weeks. etc.
<jnewbery> It does seem problematic that if there's a softfork or tightening of policy rules, unupgraded nodes will continue to ping transactions that are no longer consensus- or policy- valid between themselves.
<jnewbery> ariard: potentially. With the current logic, unless the originating wallet is rebroadcasting, then the tx will be expired from the mempool after two weeks. If all unupgraded nodes are rebroadcasting, they'll keep refreshing the tx in each other's mempools
<amiti> and actually, the way the fee rate filter is implemented means that it will skip over recent txns (less than 30 min old), but likely still end up with a set thats total 3/4 block weight .. assuming there are enough "old" txns in the mempool
<amiti> ariard: from review notes- "This new logic could result in a problem: if a transaction is submitted locally and not immediately relayed, it could take a long time to be included in the rebroadcast set (for example, if its a low fee rate transaction), and thus a long time until it’s first seen on the network. To mitigate this, the mempool keeps track of transactions submitted locally via
<amiti> m_unbroadcast_txids. When it’s time to rebroadcast, any unbroadcast transactions will be added to the set to relay. A transaction is removed from this set when a peer requests is via a GETDATA message."
<lightlike> there would also be a possibility to actively query for the tx (like asking a new peer for the tx after some time, to which we haven't INVed it before). Althought that would possibly also be less than perfect for privacy.
<amiti> ariard: not quite. here's the scenario: you disable p2p. you form a low fee rate txn and submit to your mempool. you re-enable p2p. currently (before this PR) the rebroadcast mechanism would pick it up and send it out for initial broadcast. with these changes, it might be a long while before it hits the conditions to be rebroadcast to the network
<lightlike> ariard: if a GETDATA never comes (and we never sent the TX), there is 0% that the tx has made it, so that is a necessary condition. Actively querying an unrelated peer would give additional confidence that the tx has made it beyond our peers.
<michaelfolkson> I have a couple of quick questions too. 1) How do you prevent all the unnecessary rebroadcasting in the case of empty blocks being mined? How commonly do empty blocks (or half empty blocks for no apparent reason) occur these days?
<andrewtoth_> if a block is less than 50% full but not close to empty, that just means that there isn't enough in miners' mempools. In those situations, everything in the mempool should be rebroadcasted
<andrewtoth_> it's just based on past behaviour of miners. Either there is only coinbase plus a few other txs, which usually happens when a new block comes in before a new template is formed, or the blocks have the entire mempool of the miner
<michaelfolkson> Ok we should let you go amiti. Thanks for staying longer. One last question. You said in a presentation you'd written lots and lots of tests. Did a lot of them not make it into the PR?