Use wtxid for transaction relay (mempool, p2p)

Feb 5, 2020

https://github.com/bitcoin/bitcoin/pull/18044

Host: jonatack - PR author: sdaftuar

The PR branch HEAD was c4a23a1 at the time of this review club meeting.

Notes

Context

The author of this PR, Suhas Daftuar, has been working to improve the privacy and resilience of Bitcoin’s peer-to-peer network. See his PR history.
This PR builds on PR #17951 “Use rolling bloom filter of recent block txs for AlreadyHave() check” by the same author, which was just merged.
Suhas wrote a BIP draft that this PR implements: WTXID-based transaction relay.
Postscript: see also the #bitcoin-core-dev IRC discussions of 05 Feb 2020 and 06 Feb 2020 below.

What is a `wtxid`?

BIP 141 (Segregated Witness) introduced the definition of wtxid:
- “A new wtxid is defined: the double SHA256 of the new serialization with witness data.”
- “If all txins (transaction inputs) are not witness program, a transaction’s wtxid is equal to its txid.”
PR #11203 added wtxid to the mempool entry output of entryToJSON() in src/rpc/blockchain.cpp, thereby exposing wtxids to callers in the output of RPCs getmempoolentry, getmempooldescendants, getmempoolancestors, and getrawmempool.

WTXID-based transaction relay

Using the txid (which does not include the witness) is problematic because the witness can be malleated without changing the txid. See #8279 for a full discussion of the issue.
The PR description contains a very full and clear description of the motivation and changes.

Questions

Did you review the PR? Concept ACK, approach ACK, ACK <commit>, or NACK? Don’t forget to put your PR review on GitHub or ask questions.
Describe recentRejects: what type of data structure is it, what data does it contain, and what is it used for? (Hint: git grep -ni rejects).
In your opinion, does this PR save bandwidth for older nodes talking to newer nodes? What about downloading from old and new peers alike?
According to commit 61e2e97, using both txid and wtxid-based relay with peers means that we could sometimes download the same transaction twice, if announced via 2 different hashes from different peers. What do you think of the heuristic of delaying txid-peer-GETDATA requests by 2 seconds, if we have at least one wtxid-based peer?
In this comment, Suhas mentions a possible race condition where a peer could send a txid-based INV to us before it receives the WTXIDRELAY message (changing the relay to be wtxid-based), which would cause relay of that transaction to fail. What do you think, and how best should this case be handled?
Do you see any other potential race conditions, DoS vectors, or backward incompatibilities of this change?
Overall, do you think the potential benefits of these changes merit the additional complexity and (map relay index) data size, if any?

Meeting Log

18:03

<jonatack> We'll get started in an hour -- hopefully everyone isn't at Advancing Bitcoin today :)

18:03

<jonatack> Some thoughts I wanted to share before the meeting:

18:03

<jonatack> A few people have been contacting me privately with questions about the PR or how to review.

18:04

<jonatack> That's great, but please ask any review questions here on this irc channel, which allows more people to help and also to benefit from the conversation.

19:00

<jonatack> #startmeeting

19:00

⚡ scone waves

19:00

<jonatack> Hi all! Welcome to this week's episode of the Bitcoin Core PR Review club!

19:00

<jonatack> #topic This week, we are looking at PR 18044 - "Use wtxid for transaction relay"(mempool, p2p) by sdaftuar

19:00

<jonatack> We usually start Bitcoin Core IRC meetings with a 'hi' so it's clear who's at keyboard. Feel free to say hi, even if you arrive in the middle of the meeting!

19:00

<SirRichard> hi

19:00

<jonatack> If you didn't have have a chance to read the session notes, the PR description, and the BIP draft, I encourage you to do so.

19:01

<scone> hi

19:01

<amiti> hi

19:01

<lightlike> hi

19:01

<fodediop> Hello, underworld!

19:01

<jonatack> They provide a good background for this discussion and for reviewing the PR.

19:01

<emzy> hi

19:01

<raj_> hi..

19:01

<jonatack> (It's a really interesting one.)

19:01

<jonatack> Thanks to raj_ who I believe put up a review!

19:01

<jonatack> Remember, the goal is to actually review the PR *and* publish it on GitHub, e.g. put the study work into practice and help Bitcoin Core!

19:02

<jonatack> Review is very important in Bitcoin Core. Consistent review is arguably the scarcest resource and the main bottleneck to progress.

19:02

<jonatack> There are currently 758 open issues and 341 open PRs, and the stack keeps growing.

19:02

<jonatack> Please jump in at any point with thoughts and questions. Don't be shy! This discussion is about your thoughts and input.

19:02

<kanzure> hi

19:02

<jonatack> Question 1: Did you review the PR?

19:02

<raj_> yes.

19:02

<emzy> no

19:03

<fodediop> no

19:03

<jonatack> review in progress for me

19:03

<lightlike> a bit

19:03

<jonatack> Question 2: Concept ACK, approach ACK, ACK, or NACK?

19:04

<jonatack> moving on... with this PR a Concept ACK atm is about the best that can be done

19:04

<raj_> Concept ACK.

19:04

<jonatack> Who read the BIP draft? It's short.

19:04

<emzy> yes

19:05

<amiti> yes

19:05

<fodediop> no

19:05

<raj_> read it. Its just on the communication protocol to activate wtxid based transaction exchange.

19:05

ℹ hardforkthis7 is now known as tylerlevine

19:05

<tylerlevine> hi

19:05

<jonatack> Note that there was a discussion in the last hour on #bitcoin-core-dev that concerns part of the draft

19:05

<scone> BIP https://github.com/sdaftuar/bips/blob/2020-02-wtxid-relay/bip-wtxid-relay.mediawiki

19:06

<jonatack> "2. The wtxidrelay message must be sent in response to a VERSION message from a peer whose protocol version is >= 70016, and prior to sending a VERACK."

19:07

<jonatack> which is contradicted by parts of the current codebase

19:07

<raj_> i am a little dizzy on the rationale behind this. Can you elaborate the issue a little?

19:07

<jonatack> Question: Describe `recentRejects`: what type of data structure is it, what data does it contain, and what is it used for? (Hint: `git grep -ni rejects`).

19:08

<jonatack> raj_: I think the PR, the BIP draft, and the notes provide the background?

19:09

<raj_> regarding this, there is only Suhas' comment i found in the PR saying this might create a race condition.

19:10

<scone> I'll take my chance at summarizing (warning may be inaccurate): In order to improve p2p transaction relay overhead, let's develop a more efficient way of keeping track of transactions. But because of some issues with how tx's are kept track of right now, if we switched to tracking with the Witness txid - better methods would be enabled

19:10

<jonatack> raj_: are you talking about the pre/post VERACK question?

19:10

<raj_> yes

19:11

<jonatack> raj_: I'd suggest git grepping VERACK

19:11

<raj_> noted.

19:11

<lightlike> recentRecjects: a bloom filter for hashes of txes that we have rejected (and won't send GETDATA for if other peers INV them to us)

19:11

<jonatack> scone: good start! of course, the devil is in the details

19:12

<scone> yep everything is simple at the 1000 m view :]

19:13

<scone> ok so what are we currently focused on in group discussion? ( i got lost)

19:14

<ajonas> So to build a little on what scone said - an attacker can send malleated witness data to the mempool which then corrupts AcceptedtoMempool()

19:14

<raj_> recentRejects is a bloom filter structure that keep tracks of rejected txs. Previously it only added non segwit txs which are not maleated. Now it adds either invalid segwit txs, or non segwit non maliated txs.

19:14

<jonatack> lightlike: yes, but careful -- there is bloomfilter and rollingbloomfilter

19:15

<ajonas> JamesOB left a great comment about recentrejects -> https://github.com/bitcoin/bitcoin/commit/b191c7dfb7ede3f74edb3a32b8ac6fa2f4d6b78a

19:15

<jonatack> some people who reviewed the previous PR about AlreadyHave() mixed the two

19:15

<jonatack> ajonas: agreed!

19:17

<jonatack> Also see net_processing.cpp:L2601-2612

19:17

<ajonas> The problem as it currently stands is that we don't punish peers who relay us invalid or DoSy transactions because we don't know for certain whether it's a misbehaving peer or a false positive on the bloom filter

19:18

<jonatack> ajonas: right! if the witness is malleated, the txid remains the same

19:19

<jonatack> (responding to your prev comment -- i'm slow!)

19:19

<jonatack> This PR covers a lot of domain areas in the codebase, which makes it compelling to study

19:20

<raj_> ajonas: This problem still stands even with wtxids right?

19:20

<jonatack> mempool, txn relay, recent rejects, the orphan map, the relay map, confirmed txns, network protocol

19:21

<ajonas> raj_: not for nodes passing wxtids between them. For the older nodes yes,

19:22

<ajonas> (meant . not ,)

19:23

<jonatack> Recent rejects are defined in the codebase in net_processing.cpp:L148. Let's move forward.

19:23

<jonatack> Question: In your opinion, does this PR save bandwidth for older nodes talking to newer nodes? What about downloading from old and new peers alike?

19:26

<ajonas> So Suhas mentions this is a concern of his which is why he is experimenting with the delaying download by 2 seconds of txs by txid if we have a wtxid peer

19:26

<raj_> It seems to me that old nodes will waste some bandwidth by asking the same tx both by txid and wtxid. New nodes will also face the same issue if they have any old nodes connected. This might be incomplete understanding.

19:26

<jonatack> hint: see the PR description

19:27

<jonatack> ajonas: yes. interestingly, this isn't the only 2-second delay used to favor actions

19:27

<jonatack> e.g. git grep std::chrono::seconds{2}

19:27

<lightlike> if one of the two nodes is old, nothing should change wrt status quo.

19:27

<raj_> sorry, i will go with no bandwidth waste for old nodes. Old nodes cant see wtxids. Only new nodes are in trouble in the transition period.

19:28

<jonatack> lightlike: right! more an issue WRT to future policy, IIUC

19:29

<jonatack> raj_: i think so, the intent is to remain backward compatible while adding support for newer peers

19:29

<jonatack> Question: Is it important that feature negotiation of wtxidrelay happen between VERSION and VERACK, to avoid relay problems from switching after a connection is up?

19:29

<jonatack> (see latest commit https://github.com/bitcoin-core-review-club/bitcoin/commit/c4a23a1)

19:30

<jonatack> or would it be better to do it post-VERACK?

19:31

<jonatack> (this is being discussed RN on bitcoin-core-dev)

19:31

<ajonas> yeah, this is the discussion happening in real time in #bitcoin-core-dev

19:31

<scone> Time Notice: half way through meeting :]

19:32

<jonatack> Question: According to commit 61e2e97 (https://github.com/bitcoin-core-review-club/bitcoin/commit/61e2e97), using both txid and wtxid-based relay with peers means that we could sometimes

19:32

<jonatack> download the same transaction twice, if announced via 2 different hashes from different peers.

19:32

<jonatack> What do you think of the heuristic of delaying txid-peer-GETDATA requests by 2 seconds, if we have at least one wtxid-based peer?

19:33

<jonatack> How would you test this?

19:33

<raj_> I would like to see some result and also variation of perfomance with variation of huristic parameters.

19:33

<jonatack> One of the issues with p2p changes is that things can be broken without unit or functional tests necessarily failing.

19:33

<jonatack> This requires spending time thinking hard about the changes. Or setting up new testing or logging.

19:33

<jonatack> Or benchmarking.

19:34

<raj_> we can create a test network of nodes in regtest. Program them to send around some predefined transactions. and see the heuristic at play.

100

19:35

<lightlike> could the delay slow down the effectiveness of message propagation in the initial phase where just a low percentage of nodes are new?

101

19:35

<jonatack> We can also try debug logging. See: bitcoin-cli help logging

102

19:35

<ajonas> jonatack: Ah yes -- issue #14210 would be great for someone to make progress on

103

19:35

<jonatack> e.g. bitcoin-cli logging '["net"]'

104

19:36

<jonatack> ajonas: agree! that's a compelling issue

105

19:37

<jonatack> ajonas: i didn't realise you were working on it :+1:

106

19:37

<ajonas> I am?

107

19:38

<jonatack> lightlike: you worked on that as well?

108

19:38

<jonatack> ajonas: oops. misread the last comment.

109

19:39

<jonatack> Question: In https://github.com/bitcoin/bitcoin/pull/18044#discussion_r373775416 sdaftuar mentions a possible race condition where a peer could send a txid-based INV to us before it gets this message, which would cause relay of that transaction to fail. Do you agree?

110

19:39

<ajonas> I've got to bow out for the rest of the session. Thanks jonatack.

111

19:40

<jonatack> cheers ajonas

112

19:40

<lightlike> jonatack: a little, but not recently

113

19:40

<raj_> jonatack: i am not seeing how this would fail relay of those txs.

114

19:42

<jonatack> lightlike: at any rate it seems like a potentially valuable contribution to make

115

19:43

<jonatack> raj_: I would need to dive into the code more on this

116

19:44

<jonatack> Question: Does anyone see any other potential race conditions (or DoS vectors)?

117

19:45

<jonatack> These are the kind of questions that the complex state of the p2p network require thinking about.

118

19:45

<jonatack> Don't be shy :)

119

19:45

<raj_> Not that any i can see. Need to dive deeper.

120

19:45

<jonatack> Right.

121

19:45

<jonatack> Question: Overall, do you think the potential benefits of these changes merit the additional complexity and data storage (if any)?

122

19:46

<lightlike> i'd guess it is important to check that the behavior is correct if peers break protocol and send us old txids when we agreed on new mode, and vice versa

123

19:47

<jonatack> lightlike: yes. My initial impression is that tests like p2p_leak and others go in the right direction but can be more exhaustive... I would need to try adding tests to confirm that.

124

19:47

<jonatack> The p2p_* suite in general.

125

19:47

<raj_> yes aggreed with the concept overall. It really doesnt make sense to only see txids while a major chunk of the txs remains un commited to. Changing to wtxids makes a lot of sense. The only issue is correct transition between two phases.

126

19:48

<jonatack> Overall, I'm a concept ack but the details need thinking, verifying, and testing

127

19:48

<jonatack> Does anyone have any comments? Questions?

128

19:49

<jonatack> Last ten minutes.

129

19:49

<fodediop> Thank you jonatack. It's my first time here, so I'm just soaking in the process.

130

19:50

<jonatack> fodediop: Good to see you here! This PR was a wide one.

131

19:50

<fodediop> Thank you!

132

19:51

<jonatack> One thing that makes me hesitate here is the larger data storage and increase in code complexity.

133

19:51

<lightlike> why is there a larger data storage?

134

19:51

<scone> any simulations of the p2p communication overhead reductions? could one argue that the domains of efficiency are critical for nodes?

135

19:52

<raj_> I would like to ask the same question here as i asked you personally jonatack. Its obvious that to make better contribution towards code review its essential to develop complete understanding of the different interactions between different structures that are happening, even sometimes asynchrobnnously. Which seems like a daunting hill to climb. So i would like to know how everyone else is going about it? Do

136

19:52

<raj_> you practise any methodical path to gain better understanding of the codebase?

137

19:52

<jonatack> lightlike: the map relay index iiuc

138

19:54

<jonatack> lightlike: WDYT?

139

19:54

<jonatack> scone: yes! better simulations of the p2p network gets back to the issue ajonas linked to above.

140

19:54

<lightlike> jonatack: oh, ok. I think you are right.

141

19:55

<jonatack> raj_: it's just time, really, and poking around

142

19:56

<jonatack> basically as described here: https://jonatack.github.io/articles/how-to-review-pull-requests-in-bitcoin-core#if-youre-not-sure-where-to-start

143

19:56

<jonatack> that's my understanding of it, at least

144

19:57

<jonatack> 3 minutes! any last remarks?

145

19:57

<jonatack> #action - Finish reviewing 18044 and put your review on GitHub!

146

19:58

<lightlike> thanks for hosting! will follow the pr (it will probably take some months until merged)

147

19:59

<fodediop> Thank you for hosting jonatack!

148

19:59

<jonatack> If anyone would like to host a review club session, or propose a PR, don't hesitate!

149

19:59

<emzy> thanks jonatack

150

19:59

<raj_> thanks, this was a nice pr to discuss. willing to work on it next to provide any further review if i can. I am also willing to work on simulating p2p testing for these type of changes. If someone is already working on such please let me know, would love to extend some helping hand.

151

19:59

<raj_> thanks jonatack

152

19:59

<jonatack> lightlike: agree, for now the PR and BIP need concept acks and critical thinking

153

19:59

<scone> ty jonatack

154

19:59

<jonatack> Thanks everyone!

155

19:59

<jonatack> #endmeeting

156

20:00

<jonatack> Will post the meeting log shortly. Don't forget to volunteer to host or to propose PRs you'd like to see.

157

20:02

<jonatack> raj_: that's great if you can work on/make progress on simulating p2p testing.

158

20:03

<scone> +1

05 Feb 2020 #bitcoin-core-dev IRC discussion

159

17:02

<sdaftuar> Is not sending any p2p message between VERSION and VERACK an important thing to do? i was going to add this new wtxidrelay feature negotiation there, but just realized I also would need to change some code that ignores messages received before VERACK, which gave me pause

160

17:43

<sipa> sdaftuar: how is this different than sendheaders etc?

161

17:44

<sdaftuar> sendheaders, feefilter, etc all are sent in response to VERACK

162

17:45

<sdaftuar> but the issue i wanted to avoid is there being a relay failure in between processing a peer's VERACK (allowing announcement of transactions to that peer, generally) and processing that peer's WTXIDRELAY message (changing the relay to be via wtxid)

163

17:45

<sdaftuar> (of course, with extra bookkeeping there would not necessarily be a failure, but it would be tedious to do that)

164

17:46

<sdaftuar> it's a pretty minor issue IMO, as it is a very short window that this could happen -- but i did notice a test failure due to this, so i thought it better to fix somehow

165

17:57

<jonatack> sdaftuar: right, looking at the latest commit at https://github.com/bitcoin-core-review-club/bitcoin/commit/c4a23a1. Was the test failure in p2p_tx_download.py?

166

18:04

<sipa> sdaftuar: but receiving a message before verack may be a problem for other applications, maybe?

167

18:09

<sdaftuar> sipa: yeah, i wasn't sure about that. we assign a misbheavior point to a node that gives us an unexpected message before VERACK

168

18:10

<sdaftuar> jonatack: the test failure i saw when doing this after VERACK was in p2p_permissions i think

169

18:10

<sdaftuar> (the intermediate commit, where i tried to move this to before VERACK, was totally busted by the way; i pushed a better version a little while ago)

170

18:14

<jonatack> thanks -- seems safer (in principle) to handle a post-VERACK time gap before WTXIDRELAY, if reasonably feasible

171

18:27

<jonatack> sdaftuar: in p2p_leak.py it actually states "A node should never send anything other than VERSION/VERACK/REJECT until it's received a VERACK" and test for that

172

18:27

<sipa> sdaftuar: in the past people have used appending data to the version message for this purpose

173

18:27

<jonatack> (non-exhaustively apparently)

174

18:28

<sipa> heh, REJECT was permitted?

175

18:32

<jonatack> seems so in the test

176

18:32

<jonatack> (if i'm reading it correctly and it's functioning properly)

177

19:01

<sdaftuar> cfields: I think you wrote that comment jonatack is referencing, any thoughts?

178

19:08

<sdaftuar> sipa: extending the version message would be nice and simple, but i assume no one likes these variable length messages that ensue from that approach?

179

19:09

<sipa> sdaftuar: exactly

180

19:10

<sipa> some libbitcoin people complained about BIP37 adding an optional field to version

181

19:13

<sipa> that was also 8 years ago

182

19:14

<sdaftuar> i guess if there's nothing intrinsically wrong with a design where we throw message in between VERSION and VERACK, that seems most extensible to me

183

19:15

<sdaftuar> but if software complains, then i am not sure what to do

184

19:16

<sdaftuar> one option could be to do a bunch more work to support txid or wtxid-based announcements with a peer, so that turning on wtxidrelay on a link is seamless, but that is not clearly worth the effort t ome

185

19:17

<sipa> sdaftuar: perhaps just do it with a message between version and verack (and gated by version number, i guess?), and when things are more ready, discuss on the ML whether that could cause problems

186

19:17

<sdaftuar> that is where it's at right now (including gating it on version number, which i bumped)

187

19:17

<sdaftuar> seems reasonable

188

19:18

<sipa> i agree that's actually cleanest

189

19:18

<sdaftuar> possibly i could also just move it to being after VERACK -- in some ways, transaction relay failing between VERACK and negotiation of wtxid relay is no different than transaction relay failing because the connection wasn't set up yet at the time the transaction was announced

190

19:18

<sdaftuar> eg the test failure i observed could have happened for other reasons

191

19:19

<sdaftuar> it just happened to have happened because of the thing i could sort of control

192

19:20

<sdaftuar> anyway i'll leave it for now and revisit, thanks for thinking about this

193

20:40

<luke-jr> sdaftuar: sipa: for reference https://github.com/bitcoin/bips/blob/master/bip-0060.mediawiki

194

20:40

<sipa> luke-jr: ah yes, exactly

195

20:42

<luke-jr> it's not entirely clear to me if this BIP got implemented or not

196

20:42

<luke-jr> if not, then the ship has sailed, and IMO we should be free to extend this way again :P

197

20:43

<sipa> it's not because the BIP didn't get adopted that the problem it was trying to address is solved

198

20:44

<luke-jr> ?

199

20:47

<sipa> BIP60 argued that adding optional fields to the version message was problematic, and suggested a solution; even if that solutions wasn't adopted, that does not imply that its concern (optional fields at the end of version) does not matter

200

21:04

<luke-jr> sipa: my point is that if the version message is already variable-length, adding another field doesn't chnage that

201

21:04

<sipa> it still has problems with serialization of deployments (what if two P2P extensions both want to add extra data?)... historically speaking that hasn't been a problem though :p

06 Feb 2020 #bitcoin-core-dev IRC discussion

202

20:54

<sdaftuar> hi - i'm back, if anyone has questions about wtxid-relay i can discus

203

20:55

<jeremyrubin> yay! please do

204

20:55

<MarcoFalke> sdaftuar: There was a question whether it was needed for package relay

205

20:55

<sdaftuar> i think it's a nice-to-have, but non-essential

206

20:56

<sdaftuar> nice-to-have only because any tx-relay protocol change we make in the future (like erlay, or dandelion, etc) should be done on wtxid-based relay

207

20:56

<jeremyrubin> I guess more concretely where it fits into the https://github.com/bitcoin/bitcoin/projects/14 workflow and where you think it belongs timeline wise

208

20:56

<sdaftuar> well, i'm probably personally gated on it, as i don't want to work on more p2p relay things based on txid-relay at this point

209

20:57

<jeremyrubin> Like if new rebroadcasting stuff like what amiti is working on should be done on wtxids then do we try to slot this before it

210

20:57

<sdaftuar> barring some reason that wtxid-relay is a problem

211

20:57

<jeremyrubin> ah ok; so it slots before further package relay work for you

212

20:57

<luke-jr> I never really understood why we didn't do wtxid-relay from the start

213

20:57

<sdaftuar> luke-jr: we shoudl have! the second best time is now

214

20:57

<jeremyrubin> we didn't have wtxids before segwit

215

20:57

<luke-jr> (or if I did, I forgot)

216

20:58

<sdaftuar> it was just more work, and we were busy

217

20:58

<sdaftuar> but i think it's pretty straightforward, and we should do it, ideally before we make a standardness change to segwit transactions

218

20:58

<jonatack> ack

219

20:58

<sipa> i think initially it wasn't that clear that it was needed in the first place

220

20:58

<sipa> and when segwit was further along, it got pushed back to "later"

221

20:58

<sipa> seems later is now

222

20:59

<sdaftuar> yeah i'm not sure how much anyone thought about it until petertodd pointed out the issues in #8279

223

20:59

<gribble> https://github.com/bitcoin/bitcoin/issues/8279 | Mempool DoS risk in segwit due to malleated transactions · Issue #8279 · bitcoin/bitcoin · GitHub

224

20:59

<jeremyrubin> My only concern looking at the code is that a new index in maptx kinda sucks

225

20:59

<sdaftuar> a bit more memory, but i don't see a way around it, and i think the tradeoff is well worth the benefit

226

21:00

<luke-jr> could we add both entries to the same index?

227

21:00

<sdaftuar> luke-jr: we need to look up by both txid and wtxid

228

21:00

<sdaftuar> so two keys

229

21:00

<sdaftuar> we use the boost multiindex already, which i think is pretty efficient?

230

21:01

<jeremyrubin> sdaftuar: you can actually reuse the saltedtxid hasher across both indexes I think

231

21:01

<sipa> sdaftuar: i need to revive my use-allocator-to-count-multiindex-memory-use stuff... i'm not sure how accurate we currently are

232

21:02

<sdaftuar> sipa: ah yes i have no idea how to do that, if you can advise on how to update the memory calculation better than i did in the PR, please let me know

233

21:02

<jonatack> ^ +1

234

21:02

<sdaftuar> jeremyrubin: i don't think i follow

235

21:02

<sipa> sdaftuar: i have some WIP code that i can probably use to verify whether or current heuristic is accurate... actually replacing it is probably harder

236

21:03

<aj> is #14895 really still chasing concept ack?

237

21:03

<gribble> https://github.com/bitcoin/bitcoin/issues/14895 | Package relay design questions · Issue #14895 · bitcoin/bitcoin · GitHub

238

21:03

<jeremyrubin> sdaftuar: I need to think about it a little bit. Fundamentally you want to be able to index by either TXID or WTXID

239

21:04

<jeremyrubin> But because it's a hash table there's a lot of extra overhead (idk what load the boost table works well till)

240

21:05

<jeremyrubin> Just trying to think if there's a way to be able to index by either

241

21:06

<jeremyrubin> Do you envision that we ever remove the txid index?

242

21:07

<sdaftuar> we can't do that very easily

243

21:07

<jeremyrubin> Or you think it's there forever for compat

244

21:07

<sdaftuar> because transactions reference inputs by txid

245

21:07

<jeremyrubin> right

246

21:07

<sipa> jeremyrubin: UTXOs are indexed by txid

247

21:07

<luke-jr> hmm

248

21:07

<sdaftuar> so unless someone gave you a hint for what wtxid to look for, you're screwed

249

21:07

<luke-jr> sdaftuar: well, only for mempool-spending txs?

250

21:08

<jeremyrubin> OK I'm OK with it

251

21:08

<jeremyrubin> BUT

252

21:08

<sdaftuar> i think in the case of package relay though, i might imagine that we'll get those hints, but not in a generic enough way that we could ever git rid of the index

253

21:08

<sdaftuar> luke-jr: right

254

21:08

<jeremyrubin> You have to review my next two PRs first

255

21:08

<jeremyrubin> Because I get rid of mapTxLinks

256

21:08

<luke-jr> could hypothetically use wtxids in the tx structure there, and continue to use just the txids for signatures?

257

21:08

<jeremyrubin> which can pay for this new index ;)

258

21:08

<sdaftuar> luke-jr: that would be a big relay change though

259

21:08

<luke-jr> maybe worth it long-term

260

21:08

<sdaftuar> and it just seems like a lot of edge cases would break

261

21:08

<sdaftuar> yeah, i can't rule it out

262

21:09

<sipa> i don't understand what the point is

263

21:09

<sipa> not having wtxids in transactions is exactly what segwit made possible

264

21:09

<sdaftuar> yes :)

265

21:09

<sdaftuar> i think saving a little memory is not worth the effort here

266

21:10

<kanzure> i think the point was something about rebroadcast logic or first-seen issues?

267

21:10

<kanzure> right, bad witnesses or something?

268

21:10

<jeremyrubin> Yeah, I think given that we're going to kill mapTxLinks it's going to be fine (I just don't want people to have a reason not to upgrade to wtxid index)

269

21:10

<jeremyrubin> kanzure: anyone can malleate witnesses

270

21:10

<sdaftuar> jeremyrubin: i don't think this has anything to do with mapTxLinks though? i didn't need to touch it for the wtxid-relay PR

271

21:11

<jeremyrubin> sdaftuar: I'm saying that one of the PRs I pinged you on kills mapTxLinks

272

21:11

<sdaftuar> (unless i am missing something!)

273

21:11

<sdaftuar> right, i imagine that should be fine

274

21:11

<jeremyrubin> so the memory/hashing overhead going away there is probably the same as a new index

275

21:12

<sdaftuar> one way to think about this is that only the net_processing layer needs to be able to look things up in the mempool by wtxid

276

21:12

<sdaftuar> as that's the only place in our code where we need to deteremine whether we already have a wtxid someone is offering

277

21:12

<jeremyrubin> So I'm OK with not introducing a regression

278

21:12

<sdaftuar> anything internal to the mempool is unaffected by this change

279

21:12

<jeremyrubin> Because we have a way to pay for it

280

21:12

<sdaftuar> oh, you're saying that the extra memory is a wash with those other changes? even better :)

281

21:12

<jeremyrubin> Yes

282

21:13

<jeremyrubin> https://github.com/JeremyRubin/bitcoin/pull/7

283

21:14

<sipa> jeremyrubin: seems completely unrelated; we need wtxid based relay i think, and even if the only way to do it is by adding memory, we should

284

21:14

<jeremyrubin> sdaftuar previously said it was nice ot have but not required

285

21:15

<sipa> and if we can get rid of mapTxLinks, we should, unrelated of wtxid based relay

286

21:15

<sdaftuar> jeremyrubin: that was for package relay

287

21:15

<jeremyrubin> ah

288

21:15

<sdaftuar> i think package relay could be done with or without wtxid relay

289

21:15

<sdaftuar> but wtxid relay is required to solve some bandwidth-waste issues

290

21:15

<jeremyrubin> I thought you meant in general

291

21:15

<jeremyrubin> sipa: I just don't want there to be a reason for economic nodes to not upgrade that's all

292

21:16

<jeremyrubin> the changes are obviously independent

293

21:16

<sdaftuar> the issue we have with txid relay is that when a peer announces a segwit transaction that doesn't pass your policy checks, then you don't know whether another peer announcing the same txid has the same transactionr or not

294

21:16

<sdaftuar> because maybe just the witness was malleated

295

21:16

<jeremyrubin> But that we're not increasing overheads overall means we can not worry at all

296

21:16

<sdaftuar> so you have to download it (otherwise, an attacker could malleate transactions to interfere with relay)

297

21:17

<sdaftuar> and this is wasteful, particularly after a policy change to segwit-transaction-acceptance (eg taproot, or any other policy change)

298

21:17

<sdaftuar> when you would expect old nodes to reject a certain category of new transaction

299

21:18

<sdaftuar> there's also a related CPU DoS issue with how we determine whether a transaction is witness-stripped, which will be alleviated in the future when we no longer need to worry about adding txid's for segwit transactions to our reject filter

300

21:18

<sdaftuar> so i think we definitely need wtxid relay, even if we support txid-based-relay indefinitely to support old software

301

21:21

<jeremyrubin> I guess it's not clear to me why this has to be a new mempool index rather than a fixed size separate cache only in net_proc

302

21:21

<jeremyrubin> will look more closely

303

21:21

<sdaftuar> jeremyrubin: i think code simplicity?

304

21:21

<sipa> jeremyrubin: for my current mempool, the extra index would be a 0.55% memory usage increase

305

21:22

<sdaftuar> maintaining a separate data structure just to shave a few bytes doesn't seem worth the effort to me. the mempool is probably already too big

306

21:22

<jeremyrubin> sipa: I'm concerned with hashing too, we're slowing down all inserts

307

21:22

<aj> yeah, extra indexes on multi_index are pretty cheap

308

21:22

<sdaftuar> inserts aren't in the critical path of block acceptance, i think they're small compared to transaction validation speeds

309

21:22

<sipa> yeah

310

21:23

<jeremyrubin> Cool -- these are all good things to document & measure in advocating this change

311

21:23

<sdaftuar> if you want to worry about CPU usage in transaction acceptance, we should reopen by parallel-script-check-thread PR for mempool acceptance

312

21:23

<sdaftuar> (i do worry about it, but i think the mempool data structures are far from our biggest concern)

313

21:35

<aj> sdaftuar: did you see https://github.com/bitcoin/bitcoin/pull/17303#issuecomment-581363980 ? there's a patch there that's a different approach for #15505 ; worth trying? (seems silly to add wtxid for mapRelay if we could just get rid of mapRelay first instead)

314

21:35

<gribble> https://github.com/bitcoin/bitcoin/pull/15505 | p2p: Request NOTFOUND transactions immediately from other outbound peers, when possible by sdaftuar · Pull Request #15505 · bitcoin/bitcoin · GitHub

315

21:37

<sdaftuar> aj: i would definitely prefer to get rid of mapRelay, but i think the additional memory i propose using in 18044 is very minor, it's just an extra key

316

21:37

<sdaftuar> and it should be easy to remove either before or after the wtxid-relay PR

317

21:38

<sdaftuar> (if we can get rid of mapRelay before, i can easily pull that commit out of my branch)

318

21:39

<aj> sdaftuar: oh, yeah, not a meaningful criticism

Use wtxid for transaction relay (mempool, p2p)

Notes

Context

What is a wtxid?

WTXID-based transaction relay

Questions

Meeting Log

05 Feb 2020 #bitcoin-core-dev IRC discussion

06 Feb 2020 #bitcoin-core-dev IRC discussion

What is a `wtxid`?