index TxOrphanage by wtxid, allow entries with same txid (p2p)

May 1, 2024

https://github.com/bitcoin/bitcoin/pull/30000

Host: glozow - PR author: glozow

Notes

An orphan transaction is a transaction with missing inputs. The p2p code uses a TxOrphanage to store orphan transactions, to be reconsidered later if/when its parent transaction(s) are submitted to mempool. There are two ways this can happen:
- ProcessOrphanTx, which is called at the start of each ProcessMessages, pops orphans from a work set that is updated whenever a parent transaction is accepted to mempool.
- When a low-feerate parent is paired with its child in the orphanage to be submitted together as a package. This happens in two locations:
  - when a transaction fails for being low feerate
  - when a low feerate parent is downloaded again This “opportunistic 1-parent-1-child (1p1c) package submission” logic was added in PR #28970.
An orphan can be removed in a few different ways:
- When it is successfully submitted to mempool.
- If it is confirmed or conflicted in a block.
- If the peer that sent this orphan disconnects.
- After it has been in the orphanage for more than 20 minutes.
- If it is randomly selected for eviction when the orphanage reaches maximum capacity.
- If it is found to be invalid for some reason other than missing inputs. For example, a transaction may be missing two parents, and we have only accepted one of them to mempool so far. In that case, we keep the orphan and will reconsider it again after the second parent is submitted.
Different transactions can have the same txid but different witnesses, i.e. different wtxids. For example, a same-txid-different-witness transaction can have an invalid signature (and thus be invalid) or a larger witness (but same fee and thus lower feerate).
- In previous review clubs, we have covered same-txid-different-witness transactions in relation to transaction broadcasts and mempool replacements.
- We also covered adding Txid vs Wtxid type-safety to TxOrphanage in a previous review club.
Prior to this PR, the TxOrphanage is indexed by Txid and, when considering a new transaction in AddTx, immediately fails if the new transaction’s txid matches that of an existing entry.
- TxOrphanage::HaveTx takes a GenTxid to query the data structure by either txid or wtxid.
- HaveTx is primarily called by AlreadyHaveTx which also accepts a GenTxid.

Questions

Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK? What was your review approach?
Why would we want to allow multiple transactions with the same txid to exist in the TxOrphanage at the same time? What kind of situation does this prevent?
What are some examples of same-txid-different-witness orphans? (Bonus points if you can write a test case in the functional test for your example).
Let’s consider the effects of only allowing 1 entry per txid. What happens if a malicious peer sends us a mutated version of the orphan transaction, where the parent is not low feerate? What needs to happen for us to end up accepting this child to mempool? (There are multiple answers).
Let’s consider the effects if we have a 1p1c package (where the parent is low feerate and must be submitted with its child). What needs to happen for us to end up accepting the correct parent+child package to mempool?
Instead of allowing multiple transactions with the same txid (where we are obviously wasting some space on a version we will not accept), should we allow a transaction to replace an existing entry in the TxOrphanage? What would be the requirements for replacement?
Where in the code do we check whether the orphanage contains a transaction? Is the query done by wtxid, txid, or both? (Hint: there are at least 5).
This PR removes the ability to query the orphanage by txid, since the TxOrphanage no longer has an index by txid. Is that okay, and why or why not?

Meeting Log

17:01

<glozow> #startmeeting

17:01

<dergoegge> hi

17:01

<maxedw_> hi

17:01

<glozow> Welcome to PR Review Club!

17:01

<glozow> We're looking at #30000 today: https://bitcoincore.reviews/30000

17:02

<glozow> I know we posted the notes very very late, but did anybody get a chance to look at the PR or the notes?

17:02

<angusp> yep

17:02

<maxedw_> yes

17:02

<ion-> Very briefly

17:02

<glozow> angusp: maxedw_: ion-: ⭐ yay!

17:03

<glozow> Let's dive into the questions. Why would we want to allow multiple transactions with the same txid to exist in the TxOrphanage at the same time? What kind of situation does this prevent?

17:03

<maxedw_> When a parent comes in that a valid orphaned child can be combined with it to form a package. Prevents an attacker from preventing us getting our package in the mempool and confirmed

17:03

<stickies-v> hi

17:05

<stickies-v> mostly looked at the PR (well, mostly at TxOrphanage in general)

17:05

<angusp> I'm not super sure, my guess is malleability, if a messed-with tx was seen first and in the orphanage and you're indexing by txid, the honest one can't be included

17:05

<glozow> maxedw_: great! let's look at that a bit more closely. let's say an attacker has the parent tx and the child tx. what should they do / send to you?

17:05

<ion-> If an attacker constructs a malleable transaction to a valid one, and his version is received first?

17:05

<glozow> angusp: ion-: yep yep, malleated how?

17:06

<angusp> put `[b"garbage"]` in the witness ;) - or tweak the signature

17:06

<maxedw_> they should malleate the child and hold off sending the parent?

17:07

<angusp> you can't know it's an invalid tx if you've never seen the parent

17:07

<glozow> angusp: haha yes, was just about to link to the test as a hint. correct, they change the witness somehow

17:07

<ion-> the two transactions having different witness versions?

17:07

<ion-> as you say in the pr description?

17:08

<glozow> okay so the attacker sent you a malleated version of the child, and nobody has sent you the parent yet. what happens if an honest peer sends you the real child tx?

17:08

<angusp> in the current code?

17:08

<glozow> angusp: yes

17:08

<ion-> it will be rejected i guess

17:09

<maxedw_> before pr you wouldn't put it in the orphanage

17:09

<glozow> still walking through what exactly we're trying to fix in the PR

17:09

<angusp> the line `if (m_orphans.count(hash)) return false;` would prevent the honest/real child tx from being accepted

17:09

<glozow> maxedw_: angusp: yes exactly!

17:10

<angusp> so then the honest peer would have to rebroadcast the real child later when it's not an orphan? (Or is there a mechanism for my peer to re-request it later?)

17:10

<glozow> we'd call `AddTx` here https://github.com/bitcoin/bitcoin/blob/842f7fdf786fcbbdf3df40522945813404f8a397/src/net_processing.cpp#L4635-L4637 but it'd be dropped at the start of the function

17:10

<glozow> angusp: good question

17:11

<glozow> TLDR yes you would need to download it again from that peer or from somebody else

17:11

<glozow> however let's look at what the code does

17:12

<glozow> as you can see, after `AddTx`, we call `ForgetTxHash()` to forget about all the announcements we got for this tx: https://github.com/bitcoin/bitcoin/blob/842f7fdf786fcbbdf3df40522945813404f8a397/src/net_processing.cpp#L4635-L4642

17:13

<glozow> which means that we won't download it again until somebody sends us the inv or tx again

17:13

<glozow> and if, at that point, we still have the fake orphan in the orphanage, the same thing happens

17:14

<glozow> Let's say we never receive the parent transaction from anybody. When will we finally delete the fake tx from orphanage?

17:14

<maxedw_> 20 minute timeout?

17:14

<angusp> after time / random when orphanage is full?

17:14

<maxedw_> or when we know it's invalid

17:14

<glozow> maxedw_: angusp: yep!

17:14

<ion-> expiration 20 min

17:14

<maxedw_> (if it's invalid)

17:15

<angusp> > or when we know it's invalid

17:15

<angusp> Can we ever know an orphan is invalid?

17:15

<glozow> maxedw_: yes, that's correct. but we wouldn't figure that out until we recieved the parent

17:15

<glozow> angusp: only if we reconsider it after getting the missing parent(s)

17:15

<glozow> let's explore this as well: while we have the malleated child in the orphanage, what happens if we receive the (real) parent?

17:16

<maxedw_> low fee or not?

17:16

<ion-> Accpet it and invalidate the child

17:16

<glozow> maxedw_: let's start with not low fee

17:16

<glozow> (well, the answer is the same but the codepaths are slightly different)

17:17

<maxedw_> parent accepted, child kicked out?

17:18

<maxedw_> was there an example where it was malleated so it was still valid but larger and so fee less?

17:18

<glozow> maxedw_: yep. When we accept the parent, we queue up the orphan for reconsideration in `ProcessOrphanTx`: https://github.com/bitcoin/bitcoin/blob/842f7fdf786fcbbdf3df40522945813404f8a397/src/net_processing.cpp#L3366

17:18

<glozow> maxedw_: yeah, so let's answer the other option: what happens if the parent is low feerate (i.e. needs bumping from the child)?

17:20

<maxedw_> If it's really low feerate it could get dropped out of the mempool and if it's just too low to get in a block it could be delayed?

17:20

<glozow> maxedw_: it's so low feerate it doesn't get accepted to mempool

17:21

<maxedw_> then I spose it gets discarded?

17:21

<glozow> ok so the parent gets rejected, and we try to submit it with the orphan transaction as a package here: https://github.com/bitcoin/bitcoin/blob/842f7fdf786fcbbdf3df40522945813404f8a397/src/net_processing.cpp#L4669-L4673

17:21

<glozow> it's the same result - the child has an invalid signature, so it gets removed from the orphanage

17:22

<glozow> and, yes, we discard the parent after

17:22

<glozow> Ok! We covered the next question "What are some examples of same-txid-different-witness orphans?", we mentioned a bad signature and a really large witness for a lower feerate

17:23

<ion-> segwit v0 compared to v1 maybe?

17:24

<maxedw_> can an attacker make a witness larger knowing only the valid witness and transaction?

17:24

<ion-> Depending on the sighash used?

17:25

<glozow> maxedw_: certainly they can make the witness larger, though they might not make a valid tx

17:25

<maxedw_> with P2WSH there could be multiple different WitnessScripts that are possible, not sure if there is something that could be done there

17:25

<angusp> The code you linked to here https://github.com/bitcoin/bitcoin/blob/842f7fdf786fcbbdf3df40522945813404f8a397/src/net_processing.cpp#L4639-L4641 we forget both the txid and wtxid -- is that still OK in the attack case we've covered? You're 'forgetting' the txid which could be malleated

17:25

<glozow> in any case, the feerate would be checked before the script

17:28

<glozow> angusp: hm. the intention there is "ok we've already downloaded this tx so there's no point in continuing to try to download it"

17:28

<glozow> I suppose... anything with this txid is also going to be an orphan

17:29

<glozow> And we aren't going to add anything with the same txid to the orphanage anyway

17:30

<glozow> So I guess this is consistent with current behavior?

17:30

<angusp> yeah, makes sense

17:32

<glozow> Ok next question. Instead of allowing multiple transactions with the same txid (where we are obviously wasting some space on a version we will not accept), should we allow a transaction to replace an existing entry in the TxOrphanage?

17:32

<glozow> ion-: I don't think the version is in the witness?

17:33

<ion-> nope!

17:33

<glozow> so that is not a way to change the wtxid without changing the txid

17:34

<angusp> Also think no, because you know nothing about the parent tx, you can't really pick which will be valid or not (between an honest tx and a maleated one)

17:34

<maxedw_> the only reason I could think to replace would be if it was a higher fee but you would also have to know it was valid which you couldn't do without the parent

17:35

<angusp> maxedw_: Higher fee should change the txid because the amounts change?

17:36

<maxedw_> smaller witness could do it?

17:36

<glozow> angusp: maxedw_: nice, good answers. for "higher fee" you can go by size i suppose

17:36

<glozow> (going for feerate not fee)

17:37

<glozow> somebody gave a good suggestion to not have duplicate txids from the same peer

17:37

<angusp> do orphaned txs not get broadcast to other peers?

17:37

<glozow> (because presumably if a peer is going to send you duplicates, they're either replacing the previous one or sending garbage)

17:37

<glozow> angusp: no they don't. we only broadcast after we submit to mempool

17:38

<angusp> gotcha

17:38

<glozow> but other than that, I agree. there's not really a metric we can use to choose one tx over the other

17:39

<ion-> Could we use a sometimes change and sometimes not approach?

17:40

<glozow> ion-: you mean like flip a coin?

17:40

<ion-> yes, to make the attackers work more difficult

17:40

<glozow> doesn't that make it easier? an honest peer will only send 1 tx. an attacker can send many, and has a 1/2 chance of displacing the tx each time

100

17:41

<glozow> Next question: Where in the code do we check whether the orphanage contains a transaction? Is the query done by wtxid, txid, or both? (Hint: there are at least 5).

101

17:42

<maxedw_> TxOrphanage::AddTx - wtxid (txid only used for log messages)

102

17:42

<maxedw_> TxOrphanage::EraseTxNoLock - wtxid

103

17:42

<maxedw_> TxOrphanage::HaveTx - wtxid

104

17:42

<maxedw_> TxOrphanage::GetTxToReconsider - wtxid

105

17:42

<maxedw_> TxOrphanage::EraseForBlock - uses outpoint which is txid?

106

17:43

<glozow> Yes nice, lots of queries within txorphanage. There are at least 4 more (hint: I'm looking for lines in net_processing.cpp)

107

17:44

<angusp> Presumably it was done by both Txid and Wtxid as the existing code also had m_wtxid_to_orphan_it -- I'm not really familiar enough to know where other than doing a code search!

108

17:44

<maxedw_> I did think I should have looked in net_processing too!

109

17:44

<glozow> :) and to clarify the question, let's talk about queries in the code before this PR

110

17:45

<glozow> Hint: you can see that `m_orphanage.HaveTx()` is called in `AlreadyHaveTx` here: https://github.com/bitcoin/bitcoin/blob/842f7fdf786fcbbdf3df40522945813404f8a397/src/net_processing.cpp#L2298

111

17:45

<glozow> Where is `AlreadyHaveTx` called?

112

17:46

<maxedw_> PeerManagerImpl::ProcessInvalidTx looks to be one

113

17:46

<maxedw_> I checked with the PR..

114

17:46

<angusp> 1) When receiving a new tx (/package?) from a peer 2) getting a new block

115

17:47

<glozow> angusp: yes on the first one! that ones right here: https://github.com/bitcoin/bitcoin/blob/842f7fdf786fcbbdf3df40522945813404f8a397/src/net_processing.cpp#L4531

116

17:47

<glozow> (and we check only by wtxid)

117

17:47

<glozow> maxedw_ mentioned the getting a new block one, that's `EraseForBlock`

118

17:48

<angusp> 3) If removing rejecting a tx that's a parent of an orphan we have

119

17:49

<glozow> Actually to note "uses outpoint which is txid" note that that's the txid of the parent, not the tx in the orphanage

120

17:50

<glozow> angusp: aha, i assume you mean this line: https://github.com/bitcoin/bitcoin/blob/842f7fdf786fcbbdf3df40522945813404f8a397/src/net_processing.cpp#L4632

121

17:50

<maxedw_> glozow: that makes sense

122

17:50

<glozow> angusp: for that (3), is that by txid, wtxid, or both?

123

17:52

<glozow> hint: we're looking at parents using the prevouts of the tx

124

17:52

<angusp> I think both - or rather, Wtxid if has witness, else txid

125

17:52

<angusp> 50/50 haha

126

17:53

<glozow> it's by txid, but I'll give you partial credit for "both" because txid == wtxid sometimes

127

17:54

<glozow> There's 2 more `AlreadyHaveTx` callsites i'm looking for in net_processing.cpp. Can we find them?

128

17:55

<angusp> Hrm, so how does that work when you switch the orphanage to being Wtxid indexed?

129

17:55

<glozow> angusp: good question!!

130

17:55

<glozow> (that's also Q8, which is the only question left)

131

17:56

<glozow> "This PR removes the ability to query the orphanage by txid, since the TxOrphanage no longer has an index by txid. Is that okay, and why or why not?"

132

17:56

<angusp> (I can find the other `AlreadyHaveTx` calls but not sure what the code around them is doing!)

133

17:56

<glozow> angusp: ok no problem i'll just list them here

134

17:57

<glozow> https://github.com/bitcoin/bitcoin/blob/842f7fdf786fcbbdf3df40522945813404f8a397/src/net_processing.cpp#L4226 is when we receive an `inv` message for a transaction

135

17:57

<glozow> and https://github.com/bitcoin/bitcoin/blob/842f7fdf786fcbbdf3df40522945813404f8a397/src/net_processing.cpp#L6288 is when we are sending a `getdata` (in response to an `inv`) and we want to make sure that we aren't about to request a tx we already have