Allow tx replacement by smaller witness (tx fees and policy, validation)

Jan 26, 2022

https://github.com/bitcoin/bitcoin/pull/24007

Host: larryruane - PR author: LarryRuane

The PR branch HEAD was b15079ac7b at the time of this review club meeting.

Notes

The content of a transaction can be separated into two components:
- non-witness data
- witness data
Since Segwit activation, there are two ways a transaction can be serialized, with and without the witness data.
A transaction’s txid is the sha256d hash of its non-witness serialization.
A transaction’s wtxid is the sha256d hash of its full (including its witness) serialization. (Hashing the serialization of only the witness data isn’t useful.)
It’s possible for two transactions to have the same txid but different wtxids. The opposite is not possible (same wtxids but different txids). That is, the mapping from txid to wtxid is one-to-many.
The only place any type of transaction ID appears on the blockchain is within transaction inputs, each of which contains, among other things, a COutPoint. This object “points” to an output of the source transaction using the txid (note, not wtxid) of the source transaction and the index into its outputs array. This is the output that this input is “spending”.
Currently, when a transaction is submitted to the mempool and an existing mempool transaction has the same txid, the incoming transaction is immediately rejected.
Replace-by-fee (RBF), as described in BIP125, allows transactions that have already been accepted into the mempool to be replaced by a newly-arriving transaction. Previous to BIP125, an incoming transaction that spent any of the same outputs as an existing in-mempool transaction would be rejected as a double-spend attempt. This has been called the “first seen safe” policy.
If an RBF replacement does occur, any descendant (downstream) transactions in the mempool must be removed. These are transactions that (recursively) spend outputs of the transaction being replaced.
The replacement transaction can spend different inputs (except at least one, or else it wouldn’t conflict and replacement wouldn’t be needed), can have different outputs, and therefore will always have a different txid than the transaction(s) being replaced.
There are several conditions that must be met for an RBF replacement to occur, documented here, including a requirement that the replacement pay more fees than the original transaction(s).
PR 24007 implements something similar to RBF except the txid of the two transactions is the same (but the wtxids are different). As it’s not possible for a same-txid-different witness transaction to include a different absolute fee amount, the rules for witness replacement differ from that of regular RBF.

Questions

Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK?
A wtxid “commits to” an entire transaction, including its witness, while a txid does not. What does the phrase “commits to” mean in Bitcoin?
Is the mempool “indexed” by txid or wtxid? Equivalently, we can ask: Can the mempool contain multiple transactions with the same txid (but different wtxids)? If not, would it make sense for it to do so?
Does this PR change a consensus rule? Why or why not? What happens if some nodes are running this PR and their peers are not?
Should Bitcoin Core policies be miner incentive-compatible?
When an RBF replacement occurs, why is it necessary to remove the descendant transactions?
When a witness replacement occurs, is it necessary to remove the descendant transactions? Why or why not?
This PR allows replacement even if the existing transaction hasn’t signaled replaceability. Is this an oversight?
Why would a witness-replacement transaction be broadcast? Why not broadcast the replacement transaction initially? What is this PR’s use case?
How can a transaction have multiple possible witness data that are different sizes? (Hint: see the test!)
The PR requires the replacement transaction’s size to be 95% or less than the size of the replaced transaction. Why can’t it just use the normal RBF rules?
Why is this check written as (simplified) new_size * 100 >= old_size * 95 rather than the more obvious new_size >= old_size * 0.95 ?
(Extra credit) How does witness replacement interact with packages?

Meeting Log

17:01

<larryruane> #startmeeting

17:01

<erik-etsuji-kato> hi

17:01

<OliverOffing> hey everyone

17:01

<tarun> g=hi

17:01

<monlovesmango> hello

17:01

<lightlike> hi

17:01

<sipa> hi

17:01

<larryruane> Hi everyone! Welcome to another PR Review Club

17:02

<larryruane> Feel free to say hi to let everyone know you're here

17:02

<sanya> hi

17:02

<docallag> Hi.

17:02

<larryruane> anyone here for the first time?

17:02

<jaonoctus> gm

17:02

<monlovesmango> my first time :)

17:03

<larryruane> Today we're looking at #24007 "[mempool] allow tx replacement by smaller witness"

17:03

<larryruane> monlovesmango: welcome!

17:03

<monlovesmango> ty!!

17:03

<sdaftuar> hi

17:03

<larryruane> Here's the link to the notes and questions: https://bitcoincore.reviews/24007

17:04

<monlovesmango> I got it open thank you!

17:04

<larryruane> Who had a chance to review today's PR? y/n/partial and what was your approach?

17:04

<svav> n but read the notes

17:04

<ziggie> y

17:04

<brunoerg> 0.6y

17:04

<jaonoctus> y

17:04

<narcelio> y

17:04

<tarun> y

17:04

<erik-etsuji-kato> Code review only

17:04

<btckid> y/notes

17:05

<OliverOffing> n

17:05

<stickies-v> y, untested code&notes review

17:05

<docallag> y

17:05

<theStack_> 0.5y (only light conceptual review and code-reviewed the test so far)

17:06

<larryruane> mempool is a somewhat tricky area, would anyone like to summarize the PR? Feel free to add background if you'd like!

17:07

<ziggie> This PR tries to introduce a new replacement option for transactions with different witnesses

17:07

<theStack_> the PR enables to replace txs in the mempool if only the witness data changes, but the remaining parts of the transactions are unchanged (i.e. same txid, different wtxid)

17:08

<OliverOffing> this PR allows users to replace the witness data for a transaction in the mempool, as long as the new witness data is smaller (i.e. higher fee _rate_)

17:08

<stickies-v> If you want to change (and broadcast) the witness of an already broadcasted transaction (e.g. by spending a different path in the script), this is currently not allowed because the txid doesn't change when just the witness data changes. This PR allows such updated transactions to be acceptedin mempool and broadcasted, provided that the new witness is sufficiently small enough.

17:09

<larryruane> ziggie: theStack_: OliverOffing - correct, what's the current policy if a transaction arrives at the mempool but there's already a tx with the same txid?

17:09

<docallag> Rejected as an attempted double spend

17:09

<OliverOffing> discard it (first seen safe policy)

17:09

<theStack_> it is rejected

17:10

<larryruane> yes, good! maybe something basic now, what's the difference between a `txid` and a `wtxid`? Why are there two different IDs?

17:10

<svav> Current policy is that new transaction with same txid is immediately rejected

17:11

<OliverOffing> `txid` = hash(tx data), `wtxid` = hash(tx data + witness data)

17:11

<docallag> txid = hash of non-witness data, wtxid = hash of both

17:11

<stickies-v> txid is calculated as the hash of the serialized transaction without witness data (and thus malleable), wtxid is the hash of the serialized tx with witness data

17:11

<erik-etsuji-kato> TxId is the hash of legacy data only (all transaction data minus the witness stack an the flag), wTxId is the hash with the witness

17:11

<btckid> a txid hashes the non witness data while wtxid hashes both non witness data and witness data

17:12

<stickies-v> *not malleable instead of mallaeable

17:12

<jaonoctus> txid is a double sha256 of tx data (without the witness part)

17:12

<larryruane> all great answers ... What big change in (I think) 2017 introduced the concept of `wtxid`?

17:12

<brunoerg> segwit

17:12

<OliverOffing> segwit

17:12

<erik-etsuji-kato> segwit

17:12

<svav> Segregated witness

17:12

<btckid> segwit

17:13

<larryruane> hehe yes, too easy for you all! ... general cryptography term is "commited to" ... what does that mean?

17:13

<larryruane> (by the way, feel free to ask your own questions or lead the discussion elsewhere!)

17:13

<svav> Uniquely associated with?

17:13

<larryruane> *committed to

17:14

<ziggie> would it be possible to use the RBF signaling for replacing a tx with the same txid

17:14

<stickies-v> refer to a hash of 'something', so that when the 'something' in any way changes your reference becomes invalid

17:14

<larryruane> like, in this case, we say the wtxid "commits to" the witness, and you've all said the witness is included in computing wtxid, so why is it called that?

17:15

<larryruane> stickies-v: you got it, that's what I was looking for. If A commits to B, then you can't change B without changing A

17:16

<larryruane> (A usually (always?) being a hash)

17:16

<docallag> If the input and outputs have to be the same how would the witness data change? (please ignore if off topic)

17:17

<larryruane> No that's on-topic, good question. Anyone like to explain how that can happen?

17:17

<OliverOffing> a script can have multiple forks/paths, and different witness data can trigger different paths

17:17

<btckid> a change in the script path?

17:17

<docallag> Ah, tks!

17:18

<glozow> easiest example of a multi-spending path script is OP_DROP OP_TRUE

17:18

<glozow> and then the witness data can be literally anything

17:19

<glozow> er i guess, 1 item anything

17:19

<stickies-v> signatures are also malleable (hence segwit), so I think even without segwit that could be the case. It wouldn't hit the 5% witness size decrease requirement of this PR though, just for completeness

17:19

<larryruane> an tx output can be thought of as a lock, but the lock can possibly be unlockable with different "keys" ... each "key" corresponds to a different witness (very high level description)

17:19

<stickies-v> *even without P2SH, not "even without segwit". sorry

17:19

<docallag> Great examples. Thanks all.

17:20

<monlovesmango> ziggie: I don't think so bc RBF didn't change rule that duplicate txid is not allowed (bc RBF will always have new txid)...?

17:20

<glozow> docallag: see here for example code https://github.com/bitcoin/bitcoin/blob/e3699b71c46bf66cfe363fc76ab8761dc39555a7/src/test/txpackage_tests.cpp#L333

17:21

<larryruane> Okay, feel free to continue this line of thought, but if we're ready to move to the mempool ... the mempool is a collection of unconfirmed transactions, each represented by key-value pair (with unique keys, like a std::map) ... what is the key?

17:22

<brunoerg> txid?

17:22

<btckid> tcid

17:22

<btckid> txid

17:22

<OliverOffing> txid

17:22

<svav> txid

17:22

<narcelio> txid

17:22

<stickies-v> is there just a single key? if i understand correctly the mempool is indexed on 5 keys?

17:22

<lightlike> many keys, it's a multi index

17:22

<larryruane> yes, good, txid ... meaning, there can't be 2 transactions in the mempool with same txid ... but why? why don't we key on wtxid?

17:23

<larryruane> stickies-v: yes, exactly right, i'm at a very high level here :)

17:23

<glozow> we do key on wtxid though?

17:23

<OliverOffing> because indexing on wtxid would not prevent two txs from spending the same inputs

17:23

<larryruane> glozow: do we key on wtxid?

17:23

<stickies-v> https://github.com/bitcoin/bitcoin/blob/2935bd9d67e5a60171e570bde54a212a81d034e9/src/txmempool.h#L371-L376

17:23

<glozow> https://github.com/bitcoin/bitcoin/blob/e3699b71c46bf66cfe363fc76ab8761dc39555a7/src/txmempool.h#L184-L196

17:24

<glozow> https://github.com/bitcoin/bitcoin/blob/master/src/txmempool.h#L465

17:24

<larryruane> OliverOffing: perfect, that's exactly it! transactions in the mempool can _never_ be conflicting, and two transactions with the same txid are necessarily conflicting (spending the same inputs)

17:25

<glozow> mempool indexes by txid, wtxid, descendant feerate, ancestor feerate, and entry time

100

17:25

<larryruane> glozow: stickies-v: this lets us *look up* transactions by wtxid, but there can still not be two transactions with the same txid, right? I hope I have that right!

101

17:26

<svav> Am I right in thinking txid will always be unique in the mempool whereas wtxid may not be?

102

17:26

<larryruane> svav: I don't think it's possible to have two (or more) tx with same wtxid but different txid

103

17:27

<stickies-v> larryruane it doesn't get accepted in mempool because it wouldn't pass MemPoolAccept::PreChecks but I don't think the CTxMempool wouldn't technically be able to?

104

17:27

<larryruane> unless you were extremely "lucky" (unlucky)

105

17:27

<lightlike> i think both must be unique because they are both boost::multi_index::hashed_unique indexes

106

17:27

<glozow> if the question is whether boost multi index container will let us have 2 entries, i'll need to go read the docs. if the question is whether our node will survive if we put 2 transactions with the same txid in the mempool, the answer is no

107

17:28

<larryruane> stickies-v: good point, I'm not sure if anything actually enforces no duplicate txids (other than conflicting spends), like, if you happened to have a collision

108

17:28

<svav> Can txid be regarded as a unique identifier as far as Bitcoin transactions are concerned?

109

17:29

<svav> Can anyone link to the code where txid and wtxid is generated?

110

17:30

<larryruane> svav: that's a great question, the answer is yes and no ... It's yes in the sense that two tx with the same txid must have the same *effect* (let's review, what is the effect of a tx? It's to destroy some UTXOs and create some new ones) ... but these two tx can be *enabled* by different witnesses, so they're different in that way ... Do I have this right? (I'm kind of new to this myself!)

111

17:31

<stickies-v> larryruane I was typing something very similar, agreed

112

17:32

<larryruane> we've covered more or less questions 1-3, let's quickly cover question 4, does this PR change a consensus rule?

113

17:32

<glozow> no

114

17:32

<erik-etsuji-kato> no

115

17:32

<brunoerg> no, mempool is individual

116

17:32

<jaonoctus> nope

117

17:33

<larryruane> correct, excellent ... what happens if some nodes are updated to run this PR and others aren't? does that cause a problem?

118

17:34

<stickies-v> no, nodes that aren't updated will just not forward these transactions

119

17:34

<larryruane> (this is the type of thing we *always* have to worry about in bitcoin!)

120

17:35

<stickies-v> you do need sufficient nodes to be upgraded in order to somewhat reliably expect your transaction would be propagated to most miners

121

17:35

<larryruane> stickies-v: maybe, I think interestingly enough, if a node receives a tx with a txid that is already in its mempool, it re-broadcasts its *existing* tx - is that right glozow ?

122

17:35

— docallag Couldn't my (old) node broadcast the old transaction but your node (new) broadcast the new transaction?

123

17:36

<larryruane> stickies-v: yes (i was replying to your previous msg)

124

17:36

<OliverOffing> different nodes would keep different transactions in the mempool but that's not really a problem for the network—only for the user trying to replace their tx's wit data as stickies-v mentioned

125

17:36

<larryruane> docallag: yes, I think that's what does happen with this PR

126

17:36

<glozow> larryruane: oh no, I think that was fixed in #22261 https://github.com/bitcoin/bitcoin/pull/22261

127

17:37

<larryruane> glozow: +1 thanks

128

17:38

<larryruane> Personally I think question 5 is really interesting ... Should Bitcoin Core mempool policy be miner-incentive compatible? If so, why?

129

17:38

<jaonoctus50> wdym by miner-incentive compatible?

130

17:39

<glozow> a mempool is as useful as it is an accurate reflection of what's in the miners mempools. being incentive-incompatible is a good way to deviate from what would be in a miner's mempool. so yes.

131

17:39

<stickies-v> I think so. If not, you encourage miners to set up individual backchannels (e.g. sending tx directly to miner for a direct fee) to help get non-standard transactions mined, and that puts smaller/anonymous miners at a disadvantage

132

17:39

<larryruane> jaonoctus50: should our mempool contain what (at least most) miners would *prefer* to have in their mempools?

133

17:39

<OliverOffing> yes, miners secure the network. the mempool needs be compatible with miner incentives so that the txs get to miners' nodes

134

17:39

<svav> Are these definitions still correct? /////// Definition of txid remains unchanged: the double SHA256 of the traditional serialization format:

135

17:39

<svav> [nVersion][txins][txouts][nLockTime] /////// A new wtxid is defined: the double SHA256 of the new serialization with witness data:

136

17:39

<svav> [nVersion][marker][flag][txins][txouts][witness][nLockTime]

137

17:39

<theStack_> i think it should be, if i understand "miner-incentive compatible" correctly; we want miners to maximize their fees in order to increase the security

138

17:39

<lightlike> yes, because otherwise miners are incentivised to accept transactions by other ways than the p2p network, such as their own endpoints.

139

17:40

<larryruane> stickies-v: great answer, i hadn't thought of that!

140

17:40

<brunoerg> i think so, because we want to be aligned with them, just because we want to see our transactions being mined.

141

17:40

<larryruane> (and lightlike too, +1)

142

17:41

<larryruane> here's a crazy thought, BTW (not in the notes), we sometimes have these policies that we're not sure miners are following - maybe we can look at the tx that arrive in each *mined block* and dynamically adjust our policies to match!

143

17:42

<larryruane> to align better our mempool contents (and thus fee estimation, relay policies, etc.) with actual miner policies

144

17:42

<brunoerg> larryruane: make sense

145

17:42

<larryruane> probably too complicated, but just a thought

146

17:42

<stickies-v> larryruane but then you have to reverse engineer those policies from the results you observe? which seems far from trivial?

147

17:42

<erik-etsuji-kato> But this can be used by miners to, e.g, bump the fees artificilly

148

17:43

<erik-etsuji-kato> They can add fake high-fee transactions in ther mined blocks

149

17:43

<jaonoctus50> erik-etsuji-kato: good point

150

17:43

<larryruane> erik-etsuji-kato: great point ... it may be too weird to set up this circular dependency

151

17:43

<stickies-v> erik-etsuji-kato I'm not sure that's relevant here actually, because those transactions don't get propagated to the network (because then if someone else mines them, the miner loses money)

152

17:44

<larryruane> stickies-v: that's true, if mining is fairly decentralized

153

17:44

<OliverOffing> unless they collude

154

17:45

<OliverOffing> which is something plausible given that fees would increase across the board for everyone

155

17:45

<monlovesmango> does witness data size impact the fees that miners collect?

156

17:45

<stickies-v> hmm they don't need to collude I think, any dishonest miner can do this, it's trivial to just construct some high fee tx's and inject them only into your own block. I just mean that this wouldn't affect policy, because they're not propagated out of block

157

17:46

<larryruane> okay let's see (but again, feel free to keep going on any previous thread), question 6, quick detour to RBF, when RBF occurs, why is it necessary to remove mempool decendants? Is that necessary for witness-replacement?

158

17:46

<stickies-v> monlovesmango yes it does, but it's discounted so each witness data byte is less expense than each non-witness data byte

159

17:46

<brunoerg> because the txid is changed

160

17:46

<stickies-v> 4 times less expensive, to be precise

161

17:46

<theStack_> after RBF occurs, the descendants of the replaced txs are not valid anymore, as they (directly or indirectly) spend invalid inputs

162

17:46

<erik-etsuji-kato> It changes the txid, invalidating the inputs in it's descendants

163

17:47

<brunoerg> in the case of witness-replacement, it doesnt happen because the txid doesnt change

164

17:47

<larryruane> monlovesmango: yes, because if a witness replacement occurs, the feerate improves, so that helps the miner

165

17:47

<theStack_> brunoerg: +1

166

17:47

<larryruane> brunoerg: can you elaborate, what does txid not changing mean?

167

17:48

<larryruane> oh i see theStack_ kind of answered this already, thanks

168

17:49

<larryruane> maybe we can cover 8 quickly, why does this PR allow witness-replacement even if the tx hasn't signaled replacability (which is required for RBF)?

169

17:49

<ziggie> would this change also affect the constuction of scripts to foresee some kind of competing scenario ?

170

17:49

<larryruane> (I think this has been answered already)

171

17:50

<OliverOffing> because neither inputs nor outputs change so no harm no faul?

172

17:50

<larryruane> (since txid isn't changing, downstream transactions' inputs aren't invalidated)

173

17:50

<theStack_> i think it has to do with the fact that the recipient doesn't are about a potential witness-replacement, as the outputs aren't changed?

174

17:50

<ziggie> so that you kind of padded script paths to make them equal size, just in case they will not meet this policy

175

17:50

<theStack_> *care

176

17:50

<erik-etsuji-kato> +larryruane: Any tx can be replaced by witness, without signaling

177

17:52

<ziggie> oh forget my question, I was missing the point that the witness only changes sry

178

17:52

<larryruane> theStack_: +1 ... RBF replacability signaling allows the downstream tx to worried.. but they're not worried in this case

179

17:52

<stickies-v> I think this would break protocols that rely on unconfirmed wtxids, but I don't think those protocols currently exist and that seems like an unreliable thing to do anyway so we probably shouldn't care?

180

17:53

<theStack_> though, even without RBF signalling there is a reason to be worried, as the tx could just have a too-low fee and eventually get kicked out of the mempool, and _then_ get replaced? (probably a different discussion though)

181

17:54

<larryruane> I'd really love to get to question 9, I don't have good answers myself. There was a good comment on the PR just a few hours ago on these lines (what is the use case here?) - https://github.com/bitcoin/bitcoin/pull/24007#issuecomment-1022303102

182

17:54

<stickies-v> theStack_ yeah absolutely, and like with any 0conf protocols, one can always pay a miner to include a newer tx version anyway so...

183

17:56

<OliverOffing> larryruane that you ask it does seem a bit feature-creeping

184

17:57

<larryruane> OliverOffing: fair point ... So there's that question (9), and there are a few more questions (11-13) but only a few minutes left, anyone like to choose anything else to discuss?

185

17:57

<glozow> re: use cases, it seem like we would only be interested in this if there's a pinning attack. like, you have counterparties on your transaction and somebody bloats up the witness by thousands of vbytes or something. i can't imagine why a normal user is broadcasting their own transaction multiple times with various witnesses

186

17:58

<jaonoctus50> glozow: +1

187

17:58

<stickies-v> larryruane I can potentially see this being useful in protocols where one party signals they're willing to go on chain in a suboptimal path by broadcasting that larger transaction, maybe encouraging all parties to instead lower the fees and go for key spend (in taproot case) instead

188

17:59

<stickies-v> a game of chicken, kind of

189

17:59

<svav> Can anyone answer question 9)?

190

17:59

<larryruane> makes sense ... I guess overall, maybe the main argument is that allowing witness replacement is compatible with miner incentive, and it's not very complex

191

18:00

<larryruane> svav: I think that will have to happen, and is happening, int the PR

192

18:00

<larryruane> ok we're at time, thanks everyone!

193

18:00

<theStack_> for question 12, i think the reason for multiplying both sides with an integer rather than one side with a float is to avoid floating-point arithmetics?

194

18:00

<theStack_> thanks for hosting larryruane!

195

18:00

<docallag> tks all, lots to digest

196

18:01

<larryruane> theStack_: +1

197

18:01

<docallag> glozow thanks for the links

198

18:01

<jaonoctus50> ty

199

18:01

<monlovesmango> larryruane: for the attack mentioned in that post, wouldn't it make more send to require the witness to be smaller by a certain byte size rather than percentage to eliminate this attack vector? or is there a reason we are using percentage of data size?

200

18:01

<erik-etsuji-kato> tks all

201

18:01

<btckid> thank you all

202

18:01

<tarun> thank you.

203

18:01

<glozow> larryruane: do you know of any applications where you share an input with an untrusted counterparty who might be able to broadcast with a different witness to grief you?

204

18:01

<brunoerg> thank you

205

18:01

<stickies-v> ty for hosting and for the very detailed notes and questions, larryruane !

206

18:01

<OliverOffing> thanks larryruane and thanks all

207

18:01

<glozow> thanks for hosting larryruane!

208

18:01

<larryruane> #endmeeting