Track and use all potential peers for orphan resolution (p2p)

https://github.com/bitcoin/bitcoin/pull/31397

Host: glozow  -  PR author: glozow

The PR branch HEAD was 3e16a36959f70da59846621f099c1f1df4a210ed at the time of this review club meeting.

Notes

  • (Transaction) orphan resolution is a process that kicks off when we are missing UTXOs to validate an unconfirmed transaction.

  • There are a number of known orphan-handling problems which have existed for quite some time. While they represent inefficiencies when transactions are relayed individually, these problems are even more relevant in package relay, where the orphanage is part of a transaction’s “critical path.”

  • Currently, we only attempt to resolve orphans with the peer who provided them. If this doesn’t work out (e.g. they send a notfound or don’t respond), we do not try again. In fact, we can’t, because we’ve already forgotten who else could help.

  • The TxRequestTracker, introduced in PR #19988, is a data structure that tracks hashes of transactions to download and the candidate peers for requesting those transactions. It schedules requests to send, helps avoid duplicate in-flight transaction requests, and encodes prioritization of peers.

  • In transaction download, outbound peers are preferred over inbound peers, though the TxRequestTracker does not have internal knowledge of connection directions; the preference is expressed by adding a delay to the announcement entry. In orphan resolution, there are similar reasons for preferring outbound over inbound peers.

  • When a transaction is rejected from or accepted to mempool, we call ForgetTxHash on the TxRequestTracker, indicating that we have successfully downloaded the transaction. Downloading it from a different peer will not make the transaction valid, so we do not need to keep it around.

Questions

  1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK? What was your review approach?

  2. Define orphan resolution, in your own words.

  3. Prior to this PR, what are the steps for orphan resolution, starting from when we notice that the transaction has missing inputs? Where in the code does this happen?

  4. What are the ways we may fail to resolve an orphan with the peer we request its parents from? What are some reasons things this may happen, honest or otherwise? (Hint: one honest example is that the peer disconnects before responding to our request, perhaps because they are going offline)

  5. Can you come up with an attack to prevent a node from downloading a 1p1c package by exploiting today’s orphan resolution behavior? Temporary and/or probabilistic censorship are ok.

  6. What is the PR’s solution to this problem?

  7. Can you think of any alternatives - does BIP 331 fix any of this?

  8. In this PR, which peers do we identify as potential candidates for orphan resolution, and why?

  9. What does the node do if a peer announces a transaction that is currently a to-be-resolved orphan?

  10. Prior to this PR, what are all the TxOrphanage methods that remove an orphan, and where in the code are they called? (Hint: you should end up with 5 links to code)

  11. Take a look at the logic for adding a transaction to the orphanage and m_orphan_resolution_tracker. Is it possible that there are candidate peers but AddTx is not called - how? Why is that important?

  12. This PR edits some of the erasures to potentially keep the transaction in the orphanage and only modify the announcers list instead. Which places are these, and why? (Hint: for example, EraseForPeer is called when a peer disconnects)

  13. Commit 7badc73 stores missing parent txids in the TxOrphanage entry. Another approach could be to re-calculate the missing parents prior to sending out requests; the missing parents may change over time as well. Which approach do you prefer, and why?

  14. Should it be possible for a transaction to be in m_orphanage but not m_orphan_resolution_tracker? Why or why not? How might you test this?

  15. Why might we prefer to resolve orphans with outbound peers over inbound peers?

  16. What are the disadvantages of resolving orphans by requesting parent txids? (Hint: see BIP 331). Does this PR change that?

  17. Commit 3e16a36 edits the 1p1c logic to no longer try packages of transactions from different peers. What was the original motivation for this section, and why is removing it ok?

Meeting Log

Meeting 1

  117:00 <glozow> #startmeeting
  217:00 <monlovesmango> hey
  317:00 <Guest68> hi
  417:00 <halloy6647> hello everyone
  517:00 <danielabrozzoni> hi
  617:00 <CosmikDebris> hello
  717:00 <glozow> Welcome to PR review club! Feel free to say hi to let us know you're here. Any first-timers?
  817:01 <glozow> We're looking at #31397 today, notes and questions in the usual place: https://bitcoincore.reviews/31397
  917:01 <marcofleon> woo!
 1017:01 <glozow> Did anybody get a chance to review the PR and/or look at the notes?
 1117:01 <glozow> marcofleon: welcome!
 1217:02 <chinggg> second-timer here. I created txorphan fuzz target 2 year ago mentored by marco
 1317:02 <marcofleon> other marco
 1417:02 <marcofleon> Yeah I reviewed the PR, read through the commits
 1517:02 <glozow> chinggg: welcome back :) I think I remember your early package fuzz target
 1617:02 <glozow> Fantastic
 1717:02 <chinggg> Yeah I also reviewed the code
 1817:02 <danielabrozzoni> I reviewed the PR, focusing on header files to understand the high-level structure, didn't have time for an in-depth review
 1917:03 <instagibbs> hi reviewed all but still chewing on the major commit
 2017:03 <instagibbs> lot going on there
 2117:03 <glozow> danielabrozzoni: Great approach! I also often start with headers and tests
 2217:03 <glozow> Let's jump in with the questions then. If anybody has their own questions, please feel free to ask at any time.
 2317:04 <glozow> Define orphan resolution, in your own words.
 2417:04 <premitive2> It's when a node receives a transaction that contains inputs referencing a transaction it doesn't have and then performs actions to find the parent
 2517:05 <monlovesmango> when a node tries to request the missing parent transaction for a child transaction it has received
 2617:05 <chinggg> when downloading tx that missing parents, we need to query the peer again to confirm?
 2717:05 <marcofleon> getting txs that are orphans to possibly get into the mempool by finding their ancestors. process starts when you get a tx that has missing inputs
 2817:06 <glozow> Great answers! Yes, we're trying to find the missing inputs of an unconfirmed transaction. It's called an orphan because it's missing at least one parent.
 2917:06 <glozow> Prior to this PR, what are the steps for orphan resolution, starting from when we notice that the transaction has missing inputs? Bonus points if you can get code links
 3017:07 <luisschwab> We just ask the peer that gave us the orphan transaction for the parent. If he doesn't have it, we can't resolve it.
 3117:07 <marcofleon> https://github.com/bitcoin/bitcoin/blob/ae69fc37e4fff237a119225061d68f69e6cd61d7/src/node/txdownloadman_impl.cpp#L315C5-L392C10
 3217:07 <glozow> luisschwab: great start. Is it necessarily the case that we won't be able to resolve the orphan if the first peer doesn't respond?
 3317:08 <glozow> marcofleon: bingo
 3417:08 <instagibbs> if another peer advertises the parent via INV, we can still ask them for it, or if we get another orphan with the same missing parent, we will try them after a timeout?
 3517:09 <glozow> instagibbs: yep, that's what I was thinking of
 3617:09 <glozow> Hope is not lost if the first one fails, but we don't actively retry to resolve this specific orphan.
 3717:09 <instagibbs> 👍
 3817:10 <luisschwab> got it
 3917:10 <marcofleon> is that the GetChildrenFromDifferentPeer?
 4017:10 <chinggg> In MempoolRejectedTx, we check if (state.GetResult() == TxValidationResult::TX_MISSING_INPUTS), where we notice that the transaction has missing inputs. The link is already in notes?
 4117:11 <marcofleon> instagibbs: that part you were talking about i mean
 4217:12 <glozow> marcofleon: yes kind of! If we receive the parent somehow and it's low feerate, we might pick this orphan back up in `Find1P1CPackage` that way
 4317:12 <instagibbs> that's specifically a 1P1C relay thing
 4417:12 <marcofleon> got it
 4517:12 <instagibbs> I was more thinking about when do we ask for parents which isnt 1p1c per se
 4617:12 <glozow> But not necessarily - instagibbs is also referring to the case where we just accept the parent normally, and then schedule this orphan for processing https://github.com/bitcoin/bitcoin/blob/39950e148d80eec7ef18ff2858453d34a86c15cb/src/node/txdownloadman_impl.cpp#L299
 4717:13 <marcofleon> nice, thanks for the link
 4817:13 <glozow> And when we just happened to download the child before the parent
 4917:13 <glozow> What are the ways we may fail to resolve an orphan with the peer we request its parents from? What are some reasons things this may happen, honest or otherwise?
 5017:14 <glozow> I'm thinking of 3 different ways, if that's a helpful hint
 5117:15 <marcofleon> peer just doesn't have it so sends a notfound
 5217:15 <CosmikDebris> the peer may disconnect, the peer may have evicted the parent
 5317:15 <glozow> marcofleon: yes!
 5417:15 <glozow> CosmikDebris: yep!
 5517:15 <glozow> and a 3rd one?
 5617:15 <chinggg> malicious peer?
 5717:15 <marcofleon> announces a fake parent...? something malicious yeah
 5817:16 <glozow> marcofleon: yes!
 5917:16 <glozow> actually, there is a 4th
 6017:16 <glozow> 👀
 6117:17 <luisschwab> maybe the request gets blocked by the network along the way?
 6217:17 <glozow> marcofleon: specifically by "fake parent" I'm thinking the tx with a malleated witness. Given that the txid can be the same, they have responded to our request, but since it's invalid it doesn't help us resolve the orphan.
 6317:17 <glozow> luisschwab: close enough! I was thinking "you just don't get a response from them" so the request times out.
 6417:18 <glozow> Can you come up with an attack to prevent a node from downloading a 1p1c package by exploiting today’s orphan resolution behavior?
 6517:18 <marcofleon> makes sense
 6617:19 <marcofleon> The peer could just withhold the parent on purpose right? Because it's the only one we try
 6717:19 <monlovesmango> malicious actor filling up orphanage?
 6817:19 <marcofleon> But I guess you were saying it's still possible to find the parent later on?
 6917:20 <premitive2> Is there a way to overrun the orphan store with fake child txs from different peers? I know there's a limit to the orphaned txs kept around, but I don't know the conditions in which new orphans can replace old ones
 7017:20 <glozow> premitive2: we evict randomly. so yes, if you just send lots of orphans, most likely you can get other orphans evicted
 7117:20 <glozow> marcofleon: yes, that only works if you are the one who sends the orphan. is that something you can try to guarantee?
 7217:22 <instagibbs> the attacker INV-ing the parent wouldn't stop you from eventually asking the honest relayer, right?
 7317:22 <marcofleon> you'd have to target the node somehow. but yeah I'm not actually sure how a peer would ensure that they were the first ones to send you the orphan
 7417:23 <glozow> instagibbs: mhm, they can't stop you from asking
 7517:23 <glozow> for 1p1c with BIP133 though, people wouldn't announce the parent
 7617:23 <glozow> because of fee filter
 7717:24 <glozow> marcofleon: the answer I'm looking for is - they send you the orphan unsolicited :)
 7817:24 <glozow> since you already have the tx in orphanage, you ignore everyone else who announces it
 7917:25 <marcofleon> Got it yeah. Just don't go through the normal process of announcing and waiting for the request
 8017:25 <marcofleon> and could be an advantage
 8117:28 <glozow> What is the PR’s solution to this problem?
 8217:28 <monlovesmango> request parent tx from every peer that send the orphan
 8317:29 <glozow> let's try to be more specific!
 8417:30 <chinggg> instead of just querying and waiting for the peer who provided the tx, refactor the code and introduce m_orphan_resolution_tracker to query from multiple peer candidates with some scheduling
 8517:31 <glozow> yes, we're getting warmer! when we find that a tx has missing inputs, instead of adding the parent to our tx request tracker, what do we do?
 8617:31 <premitive2> There's also reasonable expirations for orphans and delays https://github.com/bitcoin/bitcoin/pull/31397/files#diff-3fc44df0a49b8a2fa4cb515b41fae470b794c32c93e632edb18ae14e8fcb159dR259
 8717:33 <instagibbs> we add the orphan itself to a new orphan tx request tracker, so when the timer goes off for that, we "immediately" add the missing parents to the regular request tracker
 8817:34 <glozow> instagibbs: bingo
 8917:34 <glozow> Does BIP 331 fix any of this?
 9017:34 <dergoegge> I was wondering why we need another txrequest tracker instance for this? couldn’t we just keep track of announcers in the orphanage (same as in the PR) and then add requests for the parents to the existing tracker on future announcements?
 9117:35 <glozow> dergoegge: that's an idea. How would you schedule the requests?
 9217:35 <dergoegge> ideally at the same time
 9317:35 <glozow> Is there a cost to having another txrequest tracker? It's not that different from adding another std::map
 9417:35 <glozow> No, I mean, how would you load-balance between peers, bake in preferences, etc.
 9517:36 <dergoegge> isn't that what the existing tracker does?
 9617:36 <glozow> Oh, you mean adding a new type to the tracker? So we'd have txid type, wtxid type, and orphan?
 9717:37 <glozow> also note that the parent requests are added to the txrequest tracker
 9817:39 <dergoegge> I guess I'm wondering why we need the concept of tracking the resolution by orphan id, as opposed to just putting the requests for the parents in the existing tracker
 9917:39 <glozow> we do put the requests for the parents in the existing tracker
10017:39 <glozow> Maybe we are crossing wires?
10117:39 <instagibbs> mmmm he's asking why the add to new tracker, then "immediately" add to old one, vs just add to old one, I think
10217:40 <dergoegge> yea
10317:40 <instagibbs> add to old one with "proper delays"
10417:40 <instagibbs> I didn't get far enough in my review to figure this out either 😅
10517:40 <glozow> We might have multiple candidates for orphan resolution
10617:41 <glozow> Oh I see what you're saying
10717:42 <dergoegge> "multiple candidates" as in same parents different orphan?
10817:42 <glozow> Perhaps that could work, where you're treating it as if all of them just announced the missing parents? I don't know how you'd add `ancpkginfo` orphan resolution easily this way though.
10917:43 <glozow> different peers same orphan
11017:43 <instagibbs> You'd also have to somehow track that you're no longer asking for any parents of an orphan in order to EraseOrphanOfPeer?
11117:43 <marcofleon> yeah i was thinking it made sense with GetCandidatePeers. Having another tracker to separate out the orphan reso process
11217:46 <glozow> will think about this idea some more
11317:47 <dergoegge> me too 👍
11417:47 <glozow> I think it's possible it works? My main questions would be (1) what is the cost of having a second tracker? Because it's the perfect data structure for this. (2) how do we make it extensible to package relay still.
11517:48 <instagibbs> imo the cost is mostly an additional structure lying around that needs to stay in sync with other things
11617:48 <dergoegge> 1) if we don't need it then it's just extra complexity 2) fair!
11717:48 <marcofleon> The fact that there are candidates that be added or not added to that new tracker is why it made sense to me in the first place I guess is what i was saying
11817:48 <marcofleon> can be*
11917:49 <glozow> (1) shoving it into the existing tracker but needing to have extra logic could also be complexity!
12017:49 <dergoegge> well in my mind it wouldn't need extra logic but I might be wrong, need to think more
12117:49 <instagibbs> proof of code for this I think..
12217:49 <glozow> but yeah, I don't like that we need to ensure m_orphanage and m_orphan_resolution_tracker are in sync. that's super fair
12317:49 <dergoegge> frfr
12417:50 <glozow> yeah let's see what the code would look like
12517:50 <marcofleon> fr
12617:50 <glozow> fr r r
12717:50 <glozow> In this PR, which peers do we identify as potential candidates for orphan resolution, and why?
12817:51 <glozow> btw, we're not even halfway through the questions. Lmk if y'all would like another session tomorrow. We can make it happen if there's 3+ people interested.
12917:51 <marcofleon> I'm down
13017:51 <monlovesmango> I'd be interested
13117:51 <instagibbs> +1 if same time
13217:51 <dergoegge> +1
13317:51 <chinggg> sounds cool. the same time?
13417:51 <luisschwab> +1
13517:52 <premitive2> +1
13617:52 <marcofleon> peers that are potential candidates are ones that are in any state but COMPLETED?
13717:53 <glozow> marcofleon: yes, which means all of the peers who announced the transaction (that we still remember)
13817:53 <instagibbs> I think it's anyone who INV'd us something we have in orphanage, or directly handed us a tx which ends up being an orphan
13917:53 <premitive2> Those that have announced they have an orphan's ancestor?
14017:53 <monlovesmango> dumb question but what does INV mean?
14117:53 <instagibbs> sorry, sending INV aka inventory message
14217:53 <instagibbs> short message "hey I have this txid or wtxid"
14317:53 <monlovesmango> youre good, thank you!
14417:53 <glozow> sending an INV is sending an "inventory" message containing the hash of something you have (e.g. tx), synonymous with announcing it
14517:54 <monlovesmango> perfect that helps a lot thanks!
14617:54 <glozow> premitive2: nope. We just presume that if you told us about a tx, you must know about all of its ancestors.
14717:54 <instagibbs> remember we tack orphans by wtxid in this PR
14817:54 <glozow> What does the node do if a peer announces a transaction that is currently a to-be-resolved orphan?
14917:54 <instagibbs> and in amster i guess
15017:56 <marcofleon> It checks if that peer is an orphan resolution candidate and if it is it adds it as an announcer
15117:56 <marcofleon> and adds the inv to the orphan_resolution_tracker
15217:56 <chinggg> "instead of adding it to m_txrequest, remember this peer as a potential orphan resolution candidate"
15317:56 <glozow> marcofleon: chinggg: yes!
15417:57 <glozow> why is that important? hint: we talked about a potential attack earlier....
15517:57 <instagibbs> and its a candidate as long as the peer is still connected, and doesn't have too much in flight, pretty much?
15617:57 <glozow> instagibbs: yeah. or until we resolve the orphan
15717:58 <instagibbs> right i meant at "OrphanResolutionCandidate" time
15817:58 <glozow> oh yes, yes
15917:58 <glozow> in the future we can also add logic like "and they haven't already given us 20 orphans"
16017:59 <glozow> I think it's really important to consider new announcers as orphan reso candidates, for the case where somebody frontruns and sends you an orphan unsolicited.
16118:00 <instagibbs> if honest peer announces first, then attacker sends directly, we won't track as a candidate, right?
16218:00 <marcofleon> Having multiple peers just makes it more likely to resolve the orphan it seems to me. And yeah less likely that that attack happens
16318:00 <glozow> instagibbs: we will track the honest peer as a candidate. because it was an announcer of the orphan
16418:00 <instagibbs> I'll re-read :)
16518:00 <dergoegge> but it announced before we knew it is an orphan?
16618:00 <glozow> yes
16718:00 <instagibbs> actually i can write a test for this
16818:01 <glozow> there should be tests for this in p2p_orphan_handling.py
16918:01 <dergoegge> 🚀
17018:01 <instagibbs> noice
17118:01 <glozow> I write tests!
17218:01 <glozow> Oh, we are out of time.
17318:01 <glozow> Let's do the next half of the questions tomorrow
17418:02 <glozow> #endmeeting

Meeting 2

17517:00 <glozow> #startmeeting
17617:00 <glozow> welcome back!
17717:00 <glozow> this is part 2 of https://bitcoincore.reviews/31397
17817:01 <marcofleon> woo!
17917:01 <dergoegge> hi
18017:01 <glozow> did anybody review the PR any further between yesterday and now?
18117:01 <glozow> thanks for the reviews btw dergoegge and instagibbs
18217:02 <marcofleon> A little but not much unfortunately
18317:03 <glozow> We can pick up where we left off and ofc bring up any questions you have: Prior to this PR, what are all the `TxOrphanage` methods that remove an orphan, and where in the code are they called?
18417:03 <marcofleon> Took a look at dergoegge's alternate solution
18517:03 <glozow> I'll give the first of 5 links: we remove an orphan when it's accepted to mempool by calling `EraseTx`, here https://github.com/bitcoin/bitcoin/blob/dbc8ba12f3b3548dd6955293c5d650320ca39c5b/src/node/txdownloadman_impl.cpp#L301
18617:03 <instagibbs> LimitOrphans, for timeouts and random eviction
18717:04 <chinggg> EraseTx, EraseForPeer, EraseForBlock
18817:05 <marcofleon> called in BlockConnected, DisconnectedPeer, mempoolAcceptedTx, and MempoolRejectedtx
18917:05 <glozow> instagibbs: yep, https://github.com/bitcoin/bitcoin/blob/dbc8ba12f3b3548dd6955293c5d650320ca39c5b/src/node/txdownloadman_impl.cpp#L375
19017:05 <glozow> chinggg: yep! those are the functions, you just missed `LimitOrphans`
19117:05 <glozow> marcofleon: yep, got em
19217:05 <dergoegge> there is a 6th: `~TxOrphanage()` 😎
19317:05 <marcofleon> LimitOrphans is called in MempoolRejected too
19417:06 <glozow> dergoegge: hah
19517:06 <glozow> marcofleon: yes, what are the 2 conditions within `MempoolRejectedTx`?
19617:09 <glozow> ok so one of them is when the orphan is found to be invalid for a reason other than missing inputs https://github.com/bitcoin/bitcoin/blob/dbc8ba12f3b3548dd6955293c5d650320ca39c5b/src/node/txdownloadman_impl.cpp#L444
19717:09 <marcofleon> One is to limit DoS yeah? LimitOrphans
19817:09 <glozow> marcofleon: correct, we call `LimitOrphans` each time we add a transaction to orphanage
19917:09 <marcofleon> which is in the missing inputs and parents aren't rejected
20017:09 <glozow> yep
20117:11 <marcofleon> nice, thanks
20217:11 <glozow> taking a look at the logic for adding a tx to the orphanage within this PR: [logic](https://github.com/bitcoin-core-review-club/bitcoin/commit/c04d1a876cdbcd159c53dde55d57a55675c41f38#diff-3fc44df0a49b8a2fa4cb515b41fae470b794c32c93e632edb18ae14e8fcb159dR462-R481)
20317:11 <glozow> Is it possible that there are candidate peers but `AddTx` is not called - how? Why is that important?
20417:13 <marcofleon> I wasn't too sure how to interpret the question but I thought no it's not possible. Whenver there's a peer that passes OrphanResolutionCandidate the orphan tx gets added
20517:13 <instagibbs> If you get an INV after tx is orphanage already you add announcer via wtxid?
20617:13 <marcofleon> AddTx won't be called basically if there are no candidate peers
20717:14 <glozow> marcofleon: is there any case where `OrphanResolutionCandidate` returns `std::nullopt`?
20817:14 <marcofleon> wait yeah ofc
20917:15 <glozow> instagibbs: right. this logic only operates on the existing announcers though, it's when we just found out the tx is an orphan
21017:15 <glozow> marcofleon: and what's the case?
21117:15 <marcofleon> if a peer without relay permission has too many tracked annoucements
21217:15 <marcofleon> and then if the peer doesn't exist in m_peer_info
21317:16 <instagibbs> hmm? I think im missing the actual question
21417:16 <chinggg> `if (m_txrequest.Count(nodeid) + m_orphan_resolution_tracker.Count(nodeid) >= MAX_PEER_TX_ANNOUNCEMENTS) return std::nullopt;`
21517:17 <glozow> instagibbs: maybe the link doesn't render well. we're looking at the code block in `MempoolRejectedTx` where we're `add_orphan_reso_candidate` ing
21617:17 <glozow> chinggg: marcofleon: yep!
21717:17 <glozow> so if a peer is flooding us with invs, we'll drop them as an orphan reso candidate
21817:18 <dergoegge> glozow: yea the link doesn't jump to the highlighted section for some reason
21917:18 <glozow> thank you github
22017:18 <marcofleon> I guess I was interpreting candidate to mean a peer that passes.... but that's not really the defintion of candidate
22117:18 <glozow> importantly, if they are the only announcer, we won't call `AddTx` at all
22217:19 <dergoegge> that changes behavior from before right?
22317:19 <glozow> yeah
22417:19 <glozow> I don't like the fact that a new orphan always gets added
22517:20 <dergoegge> we won't resolve the parents anyway so that seems fine
22617:20 <glozow> In the future we should modify `OrphanResolutionCandidate` to also drop if e.g. this peer is already taking up half our orphanage space
22717:20 <glozow> or 100KvB, or 10 orphans, or something
22817:20 <instagibbs> more like OrphanResolutionNominee
22917:20 <chinggg> another step to prevent DoS on orphan
23017:21 <glozow> This PR edits some of the erasures to potentially keep the transaction in the orphanage and only modify the announcers list instead. Which places are these, and why?
23117:22 <marcofleon> eraseForPeer and the new function EraseOrphanOfPeer
23217:23 <marcofleon> We only want to erase the orphan for the peer that announced, not erase the orphan entirely as we did before
23317:23 <premitive2> I'm thinking so that another candidate can be called in GetRequestsToSend if another peer is disconnected or otherwise unresponsive
23417:23 <glozow> premitive2: yes exactly
23517:25 <glozow> Commit [7badc73](https://github.com/bitcoin-core-review-club/bitcoin/commit/7badc7e004e288aef645404b1b09699be8b87d1e) stores missing parent txids in the `TxOrphanage` entry. Another approach could be to re-calculate the missing parents prior to sending out requests; the missing parents may change over time as well. Which approach do you prefer, and why?
23617:26 <instagibbs> IIUC passing in the missing parents means you can calculate it once, then delegate handling duplications etc inside txrequest module
23717:27 <premitive2> I was a little confused about what you meant with 'the missing parent may change over time as well, although I like the approach and the fact that GetChildrenFromDifferentPeer was dropped
23817:27 <glozow> instagibbs: right, seemed better to just calculate it once
23917:27 <instagibbs> premitive2 parents get resolved over time, you may only have one left but list says N left
24017:28 <glozow> Yeah. the main recalculation point would be if somebody sends an inv for an orphan.
24117:28 <instagibbs> ah hm
24217:28 <premitive2> @instagibbs Ahhh, now I understand. Makes sense, thanks!
24317:30 <glozow> Ah, we should be filtering for `AlreadyHaveTx` again when we add the parents to txrequest 🤔
24417:30 <instagibbs> txrequests get filtered before going out the door at GetRequestsToSend, so a request may get queued and dropped?
24517:30 <glozow> instagibbs: ah yeah good point
24617:31 <glozow> ok, that's not a problem then
24717:31 <instagibbs> txrequest doing some heavy lifting, you might think temp that your honest peer has more inflights but I guess that's very short lived
24817:31 <glozow> right
24917:32 <glozow> I haven't thought of ways that dergoegge's approach doesn't work
25017:32 <glozow> so I guess the next question is kinda moot if we do that instead
25117:33 <glozow> Why might we prefer to resolve orphans with outbound peers over inbound peers?
25217:33 <dergoegge> Storing the parent txids does change the memory usage assumption a bit (eyeball 2x increase?)
25317:33 <glozow> dergoegge: yeah, it's extra for sure
25417:33 <dergoegge> perhaps worth adjusting the comment, but otherwise not concerning
25517:34 <glozow> I am wondering if we should accompany this with new memory-based orphan limits, based on the current maximum memory usage
25617:34 <instagibbs> outbounds are selected by the node more directly, means it's harder to get sybil'ed by those connections
25717:34 <glozow> right now the limit is really large and we probably never hit it
25817:34 <marcofleon> Are outbound peers less likely to be malicious?
25917:35 <marcofleon> instagibbs: +1 becuase we choose them yes
26017:35 <glozow> what instagibbs said. I'm not sure the best way to phrase this but my mental model is to just assume that all inbounds are trying to attack you
26117:36 <premitive2> Malicious nodes *looking* for targets to attack will always be inbound mhm
26217:36 <chinggg> make sense. I wonder where the code implement the logic of preferring outbound peers over inbound peers
26317:36 <instagibbs> inbounds are very often spy services :)
26417:36 <instagibbs> outbounds generally are not, or if they are, they're doing useful things like handing you blocks and whatnot
26517:36 <marcofleon> gotta be adversarial out there
26617:38 <glozow> dergoegge: I don't think it's a 2x increase, but it can be large. it's a function of the tx's size. oh wow i just realized we could just store a bit vector that corresponds to its inputs.
26717:38 <chinggg> oh there is a functional test `test_orphan_handling_prefer_outbound`
26817:38 <glozow> instead of a `std::vector<Txid>`
26917:38 <glozow> duh *facepalm*
27017:39 <monlovesmango> so with dergoegge's method, is there a way to track total orphan resolution's in flight by peer?
27117:39 <glozow> monlovesmango: very good question
27217:40 <glozow> we can implement this in `TxOrphanage` by storing a map of number of orphans provided by each peer
27317:40 <glozow> I have this implemented and planned it for later, but maybe we bring this into the current PR
27417:41 <glozow> What are the disadvantages of resolving orphans by requesting parent txids?
27517:42 <premitive2> I haven't read all of BIP331 but a fake child tx with lots of inputs might be bad
27617:43 <monlovesmango> is there concern with having witness be malleable? like there could be multiple parents with same txid but different wtxid?
27717:43 <marcofleon> The witness could be malleated
27817:44 <chinggg> sounds like requesting parent txid will cause more attack surface
27917:44 <glozow> yeah, txid-based relay sucks
28017:45 <glozow> If you have multiple missing parents as well, downloading them one by one seems really inefficient as well. You'd re-evaluate the orphan over and over again (always still getting missing inputs) until the last parent gets in
28117:46 <dergoegge> since we can't really cache invalidity by txid, can policy differences cause us to re-download parents many times?
28217:46 <glozow> dergoegge: yeah
28317:46 <dergoegge> with the PR we'd retry with all the announcers
28417:47 <glozow> ah, well, we'd `ForgetTxHash` after the parent fails
28517:47 <glozow> wait no, that would be by wtxid
28617:47 <glozow> hah, yeah, we'd just keep downloading it
28717:48 <dergoegge> should be wtxid, otherwise that'd be censorship
28817:48 <glozow> yeah. if the first peer gives us a bad one, we can't just give up - that's the whole point
28917:48 <glozow> sorry misspoke
29017:48 <glozow> however we wouldn't re-validate the same tx, since it would be in recent rejects
29117:50 <dergoegge> hm right so only some bandwidth waste
29217:50 <glozow> yes
29317:50 <glozow> this is just the nature of any txid-based relay
29417:51 <glozow> so that's what ancpkginfo is for - tell me the wtxids of the ancestors, and then I can figure out how to download them
29517:52 <glozow> when you retry with other peers, you also first ask them for the wtxids, and only download if you think there's a chance it'll be valid
29617:52 <glozow> Last question
29717:52 <glozow> Commit [3e16a36](https://github.com/bitcoin-core-review-club/bitcoin/commit/3e16a36959f70da59846621f099c1f1df4a210ed) edits the 1p1c logic to no longer try packages of transactions from different peers. What was the original motivation for this section, and why is removing it ok?
29817:53 <premitive2> Parents are kept within the orphanage now so it seems like the orphan matching is no longer needed
29917:54 <glozow> premitive2: no, we still do matching for 1p1c. the difference is we require the originating peer be the same
30017:54 <marcofleon> It made sense before becuase we weren't keeping track of all the peers that are announcing
30117:54 <monlovesmango> bc orphan tracker is systematically doing what this is attempting to opportunistically do
30217:54 <marcofleon> so we would wnat to try a different peer if we had a parent that was reconsiderable?
30317:55 <marcofleon> try looking for a child that is
30417:55 <glozow> marcofleon: yeah, really we *only* want to try 1p1cs from the same peer
30517:55 <glozow> but since we didn't remember all peers before, we had this logic just in case
30617:55 <premitive2> The parent was used to find other peers from the children to request, like @monlovesmango it now can do everything it did without that code
30717:56 <premitive2> to validate for a package, not req*
30817:56 <glozow> we don't really need the orphan tracker for this either, it's more than `TxOrphanage` remembers more announcers
30917:56 <glozow> more that*
31017:58 <marcofleon> wait why do we only want it from the same peer?
31117:58 <glozow> marcofleon: why would you want it from different peers?
31217:58 <marcofleon> hmmm
31317:59 <glozow> You could imagine that a peer can give you an infinite number of children for any given transaction. So let's not sign up to 1p1c all of them haha
31417:59 <marcofleon> if we have the child from someone else and they don't have the parent yet... but we have the parent from another peer we just won't form a 1p1c package?
31517:59 <glozow> but there should only be 1 honest package
31617:59 <marcofleon> right yes got it
31717:59 <glozow> if they have the child, they must have the parent
31817:59 <glozow> otherwise how did they validate it?
31918:00 <marcofleon> Yes ofc, I missed that
32018:00 <glozow> that's time!
32118:00 <instagibbs> good timing
32218:00 <glozow> indeed!
32318:00 <marcofleon> got it, thanks glozow!
32418:00 <marcofleon> what a time
32518:00 <chinggg> Thanks glozow and everyone!
32618:00 <premitive2> Thanks again for hosting @glozow!
32718:00 <dergoegge> thanks glozow!
32818:00 <glozow> Thanks for doing 2 whole review clubs everybody! I feel like we had some really great discussions here
32918:00 <monlovesmango> thanks glozow!
33018:00 <instagibbs> fwiw dergoegge's commit is a lot shorter 😅
33118:01 <glozow> haha I think there's a small bug but I do think I will end up taking it
33218:01 <instagibbs> didn't validate correctness or anything
33318:01 <dergoegge> (i'm only convinced it works because it passes the tests, there might be bugs)
33418:01 <glozow> We should maybe spend some time thinking about what other parts of the orphan handling buffs we want to include in this first chunk
33518:01 <glozow> #endmeeting