Allow tx replacement by smaller witness (tx fees and policy, validation)

https://github.com/bitcoin/bitcoin/pull/24007

Host: larryruane  -  PR author: LarryRuane

The PR branch HEAD was b15079ac7b at the time of this review club meeting.

Notes

  • The content of a transaction can be separated into two components:
    • non-witness data
    • witness data
  • Since Segwit activation, there are two ways a transaction can be serialized, with and without the witness data.

  • A transaction’s txid is the sha256d hash of its non-witness serialization.

  • A transaction’s wtxid is the sha256d hash of its full (including its witness) serialization. (Hashing the serialization of only the witness data isn’t useful.)

  • It’s possible for two transactions to have the same txid but different wtxids. The opposite is not possible (same wtxids but different txids). That is, the mapping from txid to wtxid is one-to-many.

  • The only place any type of transaction ID appears on the blockchain is within transaction inputs, each of which contains, among other things, a COutPoint. This object “points” to an output of the source transaction using the txid (note, not wtxid) of the source transaction and the index into its outputs array. This is the output that this input is “spending”.

  • Currently, when a transaction is submitted to the mempool and an existing mempool transaction has the same txid, the incoming transaction is immediately rejected.

  • Replace-by-fee (RBF), as described in BIP125, allows transactions that have already been accepted into the mempool to be replaced by a newly-arriving transaction. Previous to BIP125, an incoming transaction that spent any of the same outputs as an existing in-mempool transaction would be rejected as a double-spend attempt. This has been called the “first seen safe” policy.

  • If an RBF replacement does occur, any descendant (downstream) transactions in the mempool must be removed. These are transactions that (recursively) spend outputs of the transaction being replaced.

  • The replacement transaction can spend different inputs (except at least one, or else it wouldn’t conflict and replacement wouldn’t be needed), can have different outputs, and therefore will always have a different txid than the transaction(s) being replaced.

  • There are several conditions that must be met for an RBF replacement to occur, documented here, including a requirement that the replacement pay more fees than the original transaction(s).

  • PR 24007 implements something similar to RBF except the txid of the two transactions is the same (but the wtxids are different). As it’s not possible for a same-txid-different witness transaction to include a different absolute fee amount, the rules for witness replacement differ from that of regular RBF.

Questions

  1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK?

  2. A wtxid “commits to” an entire transaction, including its witness, while a txid does not. What does the phrase “commits to” mean in Bitcoin?

  3. Is the mempool “indexed” by txid or wtxid? Equivalently, we can ask: Can the mempool contain multiple transactions with the same txid (but different wtxids)? If not, would it make sense for it to do so?

  4. Does this PR change a consensus rule? Why or why not? What happens if some nodes are running this PR and their peers are not?

  5. Should Bitcoin Core policies be miner incentive-compatible?

  6. When an RBF replacement occurs, why is it necessary to remove the descendant transactions?

  7. When a witness replacement occurs, is it necessary to remove the descendant transactions? Why or why not?

  8. This PR allows replacement even if the existing transaction hasn’t signaled replaceability. Is this an oversight?

  9. Why would a witness-replacement transaction be broadcast? Why not broadcast the replacement transaction initially? What is this PR’s use case?

  10. How can a transaction have multiple possible witness data that are different sizes? (Hint: see the test!)

  11. The PR requires the replacement transaction’s size to be 95% or less than the size of the replaced transaction. Why can’t it just use the normal RBF rules?

  12. Why is this check written as (simplified) new_size * 100 >= old_size * 95 rather than the more obvious new_size >= old_size * 0.95 ?

  13. (Extra credit) How does witness replacement interact with packages?

Meeting Log

  117:01 <larryruane> #startmeeting
  217:01 <erik-etsuji-kato> hi
  317:01 <OliverOffing> hey everyone
  417:01 <tarun> g=hi
  517:01 <monlovesmango> hello
  617:01 <lightlike> hi
  717:01 <sipa> hi
  817:01 <larryruane> Hi everyone! Welcome to another PR Review Club
  917:02 <larryruane> Feel free to say hi to let everyone know you're here
 1017:02 <sanya> hi
 1117:02 <docallag> Hi.
 1217:02 <larryruane> anyone here for the first time?
 1317:02 <jaonoctus> gm
 1417:02 <monlovesmango> my first time :)
 1517:03 <larryruane> Today we're looking at #24007 "[mempool] allow tx replacement by smaller witness"
 1617:03 <larryruane> monlovesmango: welcome!
 1717:03 <monlovesmango> ty!!
 1817:03 <sdaftuar> hi
 1917:03 <larryruane> Here's the link to the notes and questions: https://bitcoincore.reviews/24007
 2017:04 <monlovesmango> I got it open thank you!
 2117:04 <larryruane> Who had a chance to review today's PR? y/n/partial and what was your approach?
 2217:04 <svav> n but read the notes
 2317:04 <ziggie> y
 2417:04 <brunoerg> 0.6y
 2517:04 <jaonoctus> y
 2617:04 <narcelio> y
 2717:04 <tarun> y
 2817:04 <erik-etsuji-kato> Code review only
 2917:04 <btckid> y/notes
 3017:05 <OliverOffing> n
 3117:05 <stickies-v> y, untested code&notes review
 3217:05 <docallag> y
 3317:05 <theStack_> 0.5y (only light conceptual review and code-reviewed the test so far)
 3417:06 <larryruane> mempool is a somewhat tricky area, would anyone like to summarize the PR? Feel free to add background if you'd like!
 3517:07 <ziggie> This PR tries to introduce a new replacement option for transactions with different witnesses
 3617:07 <theStack_> the PR enables to replace txs in the mempool if only the witness data changes, but the remaining parts of the transactions are unchanged (i.e. same txid, different wtxid)
 3717:08 <OliverOffing> this PR allows users to replace the witness data for a transaction in the mempool, as long as the new witness data is smaller (i.e. higher fee _rate_)
 3817:08 <stickies-v> If you want to change (and broadcast) the witness of an already broadcasted transaction (e.g. by spending a different path in the script), this is currently not allowed because the txid doesn't change when just the witness data changes. This PR allows such updated transactions to be acceptedin mempool and broadcasted, provided that the new witness is sufficiently small enough.
 3917:09 <larryruane> ziggie: theStack_: OliverOffing - correct, what's the current policy if a transaction arrives at the mempool but there's already a tx with the same txid?
 4017:09 <docallag> Rejected as an attempted double spend
 4117:09 <OliverOffing> discard it (first seen safe policy)
 4217:09 <theStack_> it is rejected
 4317:10 <larryruane> yes, good! maybe something basic now, what's the difference between a `txid` and a `wtxid`? Why are there two different IDs?
 4417:10 <svav> Current policy is that new transaction with same txid is immediately rejected
 4517:11 <OliverOffing> `txid` = hash(tx data), `wtxid` = hash(tx data + witness data)
 4617:11 <docallag> txid = hash of non-witness data, wtxid = hash of both
 4717:11 <stickies-v> txid is calculated as the hash of the serialized transaction without witness data (and thus malleable), wtxid is the hash of the serialized tx with witness data
 4817:11 <erik-etsuji-kato> TxId is the hash of legacy data only (all transaction data minus the witness stack an the flag), wTxId is the hash with the witness
 4917:11 <btckid> a txid hashes the non witness data while wtxid hashes both non witness data and witness data
 5017:12 <stickies-v> *not malleable instead of mallaeable
 5117:12 <jaonoctus> txid is a double sha256 of tx data (without the witness part)
 5217:12 <larryruane> all great answers ... What big change in (I think) 2017 introduced the concept of `wtxid`?
 5317:12 <brunoerg> segwit
 5417:12 <OliverOffing> segwit
 5517:12 <erik-etsuji-kato> segwit
 5617:12 <svav> Segregated witness
 5717:12 <btckid> segwit
 5817:13 <larryruane> hehe yes, too easy for you all! ... general cryptography term is "commited to" ... what does that mean?
 5917:13 <larryruane> (by the way, feel free to ask your own questions or lead the discussion elsewhere!)
 6017:13 <svav> Uniquely associated with?
 6117:13 <larryruane> *committed to
 6217:14 <ziggie> would it be possible to use the RBF signaling for replacing a tx with the same txid
 6317:14 <stickies-v> refer to a hash of 'something', so that when the 'something' in any way changes your reference becomes invalid
 6417:14 <larryruane> like, in this case, we say the wtxid "commits to" the witness, and you've all said the witness is included in computing wtxid, so why is it called that?
 6517:15 <larryruane> stickies-v: you got it, that's what I was looking for. If A commits to B, then you can't change B without changing A
 6617:16 <larryruane> (A usually (always?) being a hash)
 6717:16 <docallag> If the input and outputs have to be the same how would the witness data change? (please ignore if off topic)
 6817:17 <larryruane> No that's on-topic, good question. Anyone like to explain how that can happen?
 6917:17 <OliverOffing> a script can have multiple forks/paths, and different witness data can trigger different paths
 7017:17 <btckid> a change in the script path?
 7117:17 <docallag> Ah, tks!
 7217:18 <glozow> easiest example of a multi-spending path script is OP_DROP OP_TRUE
 7317:18 <glozow> and then the witness data can be literally anything
 7417:19 <glozow> er i guess, 1 item anything
 7517:19 <stickies-v> signatures are also malleable (hence segwit), so I think even without segwit that could be the case. It wouldn't hit the 5% witness size decrease requirement of this PR though, just for completeness
 7617:19 <larryruane> an tx output can be thought of as a lock, but the lock can possibly be unlockable with different "keys" ... each "key" corresponds to a different witness (very high level description)
 7717:19 <stickies-v> *even without P2SH, not "even without segwit". sorry
 7817:19 <docallag> Great examples. Thanks all.
 7917:20 <monlovesmango> ziggie: I don't think so bc RBF didn't change rule that duplicate txid is not allowed (bc RBF will always have new txid)...?
 8017:20 <glozow> docallag: see here for example code https://github.com/bitcoin/bitcoin/blob/e3699b71c46bf66cfe363fc76ab8761dc39555a7/src/test/txpackage_tests.cpp#L333
 8117:21 <larryruane> Okay, feel free to continue this line of thought, but if we're ready to move to the mempool ... the mempool is a collection of unconfirmed transactions, each represented by key-value pair (with unique keys, like a std::map) ... what is the key?
 8217:22 <brunoerg> txid?
 8317:22 <btckid> tcid
 8417:22 <btckid> txid
 8517:22 <OliverOffing> txid
 8617:22 <svav> txid
 8717:22 <narcelio> txid
 8817:22 <stickies-v> is there just a single key? if i understand correctly the mempool is indexed on 5 keys?
 8917:22 <lightlike> many keys, it's a multi index
 9017:22 <larryruane> yes, good, txid ... meaning, there can't be 2 transactions in the mempool with same txid ... but why? why don't we key on wtxid?
 9117:23 <larryruane> stickies-v: yes, exactly right, i'm at a very high level here :)
 9217:23 <glozow> we do key on wtxid though?
 9317:23 <OliverOffing> because indexing on wtxid would not prevent two txs from spending the same inputs
 9417:23 <larryruane> glozow: do we key on wtxid?
 9517:23 <stickies-v> https://github.com/bitcoin/bitcoin/blob/2935bd9d67e5a60171e570bde54a212a81d034e9/src/txmempool.h#L371-L376
 9617:23 <glozow> https://github.com/bitcoin/bitcoin/blob/e3699b71c46bf66cfe363fc76ab8761dc39555a7/src/txmempool.h#L184-L196
 9717:24 <glozow> https://github.com/bitcoin/bitcoin/blob/master/src/txmempool.h#L465
 9817:24 <larryruane> OliverOffing: perfect, that's exactly it! transactions in the mempool can _never_ be conflicting, and two transactions with the same txid are necessarily conflicting (spending the same inputs)
 9917:25 <glozow> mempool indexes by txid, wtxid, descendant feerate, ancestor feerate, and entry time
10017:25 <larryruane> glozow: stickies-v: this lets us *look up* transactions by wtxid, but there can still not be two transactions with the same txid, right? I hope I have that right!
10117:26 <svav> Am I right in thinking txid will always be unique in the mempool whereas wtxid may not be?
10217:26 <larryruane> svav: I don't think it's possible to have two (or more) tx with same wtxid but different txid
10317:27 <stickies-v> larryruane it doesn't get accepted in mempool because it wouldn't pass MemPoolAccept::PreChecks but I don't think the CTxMempool wouldn't technically be able to?
10417:27 <larryruane> unless you were extremely "lucky" (unlucky)
10517:27 <lightlike> i think both must be unique because they are both boost::multi_index::hashed_unique indexes
10617:27 <glozow> if the question is whether boost multi index container will let us have 2 entries, i'll need to go read the docs. if the question is whether our node will survive if we put 2 transactions with the same txid in the mempool, the answer is no
10717:28 <larryruane> stickies-v: good point, I'm not sure if anything actually enforces no duplicate txids (other than conflicting spends), like, if you happened to have a collision
10817:28 <svav> Can txid be regarded as a unique identifier as far as Bitcoin transactions are concerned?
10917:29 <svav> Can anyone link to the code where txid and wtxid is generated?
11017:30 <larryruane> svav: that's a great question, the answer is yes and no ... It's yes in the sense that two tx with the same txid must have the same *effect* (let's review, what is the effect of a tx? It's to destroy some UTXOs and create some new ones) ... but these two tx can be *enabled* by different witnesses, so they're different in that way ... Do I have this right? (I'm kind of new to this myself!)
11117:31 <stickies-v> larryruane I was typing something very similar, agreed
11217:32 <larryruane> we've covered more or less questions 1-3, let's quickly cover question 4, does this PR change a consensus rule?
11317:32 <glozow> no
11417:32 <erik-etsuji-kato> no
11517:32 <brunoerg> no, mempool is individual
11617:32 <jaonoctus> nope
11717:33 <larryruane> correct, excellent ... what happens if some nodes are updated to run this PR and others aren't? does that cause a problem?
11817:34 <stickies-v> no, nodes that aren't updated will just not forward these transactions
11917:34 <larryruane> (this is the type of thing we *always* have to worry about in bitcoin!)
12017:35 <stickies-v> you do need sufficient nodes to be upgraded in order to somewhat reliably expect your transaction would be propagated to most miners
12117:35 <larryruane> stickies-v: maybe, I think interestingly enough, if a node receives a tx with a txid that is already in its mempool, it re-broadcasts its *existing* tx - is that right glozow ?
12217:35 — docallag Couldn't my (old) node broadcast the old transaction but your node (new) broadcast the new transaction?
12317:36 <larryruane> stickies-v: yes (i was replying to your previous msg)
12417:36 <OliverOffing> different nodes would keep different transactions in the mempool but that's not really a problem for the network—only for the user trying to replace their tx's wit data as stickies-v mentioned
12517:36 <larryruane> docallag: yes, I think that's what does happen with this PR
12617:36 <glozow> larryruane: oh no, I think that was fixed in #22261 https://github.com/bitcoin/bitcoin/pull/22261
12717:37 <larryruane> glozow: +1 thanks
12817:38 <larryruane> Personally I think question 5 is really interesting ... Should Bitcoin Core mempool policy be miner-incentive compatible? If so, why?
12917:38 <jaonoctus50> wdym by miner-incentive compatible?
13017:39 <glozow> a mempool is as useful as it is an accurate reflection of what's in the miners mempools. being incentive-incompatible is a good way to deviate from what would be in a miner's mempool. so yes.
13117:39 <stickies-v> I think so. If not, you encourage miners to set up individual backchannels (e.g. sending tx directly to miner for a direct fee) to help get non-standard transactions mined, and that puts smaller/anonymous miners at a disadvantage
13217:39 <larryruane> jaonoctus50: should our mempool contain what (at least most) miners would *prefer* to have in their mempools?
13317:39 <OliverOffing> yes, miners secure the network. the mempool needs be compatible with miner incentives so that the txs get to miners' nodes
13417:39 <svav> Are these definitions still correct? /////// Definition of txid remains unchanged: the double SHA256 of the traditional serialization format:
13517:39 <svav>   [nVersion][txins][txouts][nLockTime] /////// A new wtxid is defined: the double SHA256 of the new serialization with witness data:
13617:39 <svav>   [nVersion][marker][flag][txins][txouts][witness][nLockTime]
13717:39 <theStack_> i think it should be, if i understand "miner-incentive compatible" correctly; we want miners to maximize their fees in order to increase the security
13817:39 <lightlike> yes, because otherwise miners are incentivised to accept transactions by other ways than the p2p network, such as their own endpoints.
13917:40 <larryruane> stickies-v: great answer, i hadn't thought of that!
14017:40 <brunoerg> i think so, because we want to be aligned with them, just because we want to see our transactions being mined.
14117:40 <larryruane> (and lightlike too, +1)
14217:41 <larryruane> here's a crazy thought, BTW (not in the notes), we sometimes have these policies that we're not sure miners are following - maybe we can look at the tx that arrive in each *mined block* and dynamically adjust our policies to match!
14317:42 <larryruane> to align better our mempool contents (and thus fee estimation, relay policies, etc.) with actual miner policies
14417:42 <brunoerg> larryruane: make sense
14517:42 <larryruane> probably too complicated, but just a thought
14617:42 <stickies-v> larryruane but then you have to reverse engineer those policies from the results you observe? which seems far from trivial?
14717:42 <erik-etsuji-kato> But this can be used by miners to, e.g, bump the fees artificilly
14817:43 <erik-etsuji-kato> They can add fake high-fee transactions in ther mined blocks
14917:43 <jaonoctus50> erik-etsuji-kato: good point
15017:43 <larryruane> erik-etsuji-kato: great point ... it may be too weird to set up this circular dependency
15117:43 <stickies-v> erik-etsuji-kato I'm not sure that's relevant here actually, because those transactions don't get propagated to the network (because then if someone else mines them, the miner loses money)
15217:44 <larryruane> stickies-v: that's true, if mining is fairly decentralized
15317:44 <OliverOffing> unless they collude
15417:45 <OliverOffing> which is something plausible given that fees would increase across the board for everyone
15517:45 <monlovesmango> does witness data size impact the fees that miners collect?
15617:45 <stickies-v> hmm they don't need to collude I think, any dishonest miner can do this, it's trivial to just construct some high fee tx's and inject them only into your own block. I just mean that this wouldn't affect policy, because they're not propagated out of block
15717:46 <larryruane> okay let's see (but again, feel free to keep going on any previous thread), question 6, quick detour to RBF, when RBF occurs, why is it necessary to remove mempool decendants? Is that necessary for witness-replacement?
15817:46 <stickies-v> monlovesmango yes it does, but it's discounted so each witness data byte is less expense than each non-witness data byte
15917:46 <brunoerg> because the txid is changed
16017:46 <stickies-v> 4 times less expensive, to be precise
16117:46 <theStack_> after RBF occurs, the descendants of the replaced txs are not valid anymore, as they (directly or indirectly) spend invalid inputs
16217:46 <erik-etsuji-kato> It changes the txid, invalidating the inputs in it's descendants
16317:47 <brunoerg> in the case of witness-replacement, it doesnt happen because the txid doesnt change
16417:47 <larryruane> monlovesmango: yes, because if a witness replacement occurs, the feerate improves, so that helps the miner
16517:47 <theStack_> brunoerg: +1
16617:47 <larryruane> brunoerg: can you elaborate, what does txid not changing mean?
16717:48 <larryruane> oh i see theStack_ kind of answered this already, thanks
16817:49 <larryruane> maybe we can cover 8 quickly, why does this PR allow witness-replacement even if the tx hasn't signaled replacability (which is required for RBF)?
16917:49 <ziggie> would this change also affect the constuction of scripts to foresee some kind of competing scenario ?
17017:49 <larryruane> (I think this has been answered already)
17117:50 <OliverOffing> because neither inputs nor outputs change so no harm no faul?
17217:50 <larryruane> (since txid isn't changing, downstream transactions' inputs aren't invalidated)
17317:50 <theStack_> i think it has to do with the fact that the recipient doesn't are about a potential witness-replacement, as the outputs aren't changed?
17417:50 <ziggie> so that you kind of padded script paths to make them equal size, just in case they will not meet this policy
17517:50 <theStack_> *care
17617:50 <erik-etsuji-kato> +larryruane: Any tx can be replaced by witness, without signaling
17717:52 <ziggie> oh forget my question, I was missing the point that the witness only changes sry
17817:52 <larryruane> theStack_: +1 ... RBF replacability signaling allows the downstream tx to worried.. but they're not worried in this case
17917:52 <stickies-v> I think this would break protocols that rely on unconfirmed wtxids, but I don't think those protocols currently exist and that seems like an unreliable thing to do anyway so we probably shouldn't care?
18017:53 <theStack_> though, even without RBF signalling there is a reason to be worried, as the tx could just have a too-low fee and eventually get kicked out of the mempool, and _then_ get replaced? (probably a different discussion though)
18117:54 <larryruane> I'd really love to get to question 9, I don't have good answers myself. There was a good comment on the PR just a few hours ago on these lines (what is the use case here?) - https://github.com/bitcoin/bitcoin/pull/24007#issuecomment-1022303102
18217:54 <stickies-v> theStack_ yeah absolutely, and like with any 0conf protocols, one can always pay a miner to include a newer tx version anyway so...
18317:56 <OliverOffing> larryruane that you ask it does seem a bit feature-creeping
18417:57 <larryruane> OliverOffing: fair point ... So there's that question (9), and there are a few more questions (11-13) but only a few minutes left, anyone like to choose anything else to discuss?
18517:57 <glozow> re: use cases, it seem like we would only be interested in this if there's a pinning attack. like, you have counterparties on your transaction and somebody bloats up the witness by thousands of vbytes or something. i can't imagine why a normal user is broadcasting their own transaction multiple times with various witnesses
18617:58 <jaonoctus50> glozow: +1
18717:58 <stickies-v> larryruane I can potentially see this being useful in protocols where one party signals they're willing to go on chain in a suboptimal path by broadcasting that larger transaction, maybe encouraging all parties to instead lower the fees and go for key spend (in taproot case) instead
18817:59 <stickies-v> a game of chicken, kind of
18917:59 <svav> Can anyone answer question 9)?
19017:59 <larryruane> makes sense ... I guess overall, maybe the main argument is that allowing witness replacement is compatible with miner incentive, and it's not very complex
19118:00 <larryruane> svav: I think that will have to happen, and is happening, int the PR
19218:00 <larryruane> ok we're at time, thanks everyone!
19318:00 <theStack_> for question 12, i think the reason for multiplying both sides with an integer rather than one side with a float is to avoid floating-point arithmetics?
19418:00 <theStack_> thanks for hosting larryruane!
19518:00 <docallag> tks all, lots to digest
19618:01 <larryruane> theStack_: +1
19718:01 <docallag> glozow thanks for the links
19818:01 <jaonoctus50> ty
19918:01 <monlovesmango> larryruane: for the attack mentioned in that post, wouldn't it make more send to require the witness to be smaller by a certain byte size rather than percentage to eliminate this attack vector? or is there a reason we are using percentage of data size?
20018:01 <erik-etsuji-kato> tks all
20118:01 <btckid> thank you all
20218:01 <tarun> thank you.
20318:01 <glozow> larryruane: do you know of any applications where you share an input with an untrusted counterparty who might be able to broadcast with a different witness to grief you?
20418:01 <brunoerg> thank you
20518:01 <stickies-v> ty for hosting and for the very detailed notes and questions, larryruane !
20618:01 <OliverOffing> thanks larryruane and thanks all
20718:01 <glozow> thanks for hosting larryruane!
20818:01 <larryruane> #endmeeting