Only allow getdata of recently announced invs (
p2p) Jul 8, 2020
The PR branch HEAD was 2da7ee37b at the time of this review club meeting.
If a spy is able to identify which node initially broadcast a transaction,
there’s a high probability that that node is the source wallet for the
To avoid that privacy leak, we try to be intentional about how we relay and request transactions. We
don’t want to reveal the exact contents of our mempool or the precise timing
when we received a transaction.
PR 18861 improved
transaction-origin privacy. The idea is that if we haven’t yet announced a
transaction to a peer, we shouldn’t fulfill any
GETDATA requests for that
transaction from that peer.
The implementation for that PR checks the list of transactions we are about to
announce to the peer (
setInventoryTxToSend), and if it finds the transaction
that the peer has requested, then responds with a
NOTFOUND instead of with the transaction. While this helps in many cases,
it is an imperfect heuristic. Think about why that is (see question 3 below).
This week’s PR further reduces the possible attack surface. It introduces a
per-peer rolling bloom filter (
m_recently_announced_invs) to track which
transactions were recently announced to the peer. When the peer requests a
transaction, we check the filter before fulfilling the request and relaying the
Did you review the PR?
Concept ACK, approach ACK, tested ACK, or
You’re always encouraged to put your PR review on GitHub, even after it has
What are possible techniques for an adversary to probe a node for a
particular mempool transaction? What changes with this PR?
What were drawbacks of using
setInventoryTxToSend as a heuristic?
What is the
UNCONDITIONAL_RELAY_DELAY? How is this delay constant being
used when deciding whether or not to fulfill a request?
What are some relevant considerations when reviewing/implementing a bloom
filter? What choices does this patch make?
After these changes, can you think of other ways a spy could probe for a
1 17:00 <amiti> #startmeeting
8 17:00 <michaelfolkson> hi
10 17:00 <amiti> hi everyone! welcome to this weeks review club. today, we'll chat about transaction privacy on the network
13 17:01 <amiti> who has gotten a chance to look at the proposed changes?
16 17:02 <michaelfolkson> y
19 17:02 <willcl_ark> n (sorry!)
21 17:03 <amiti> cool! let's get started with the questions
22 17:03 <amiti> What are possible techniques for an adversary to probe a node for a particular mempool transaction? What changes with this PR?
23 17:04 <andrewtoth> is sending a GETDATA the only real way?
24 17:04 <michaelfolkson> So this is Gleb's comment. Attacker connects to us right after we put the transaction in MapRelay
25 17:05 <amiti> andrewtoth: the only real way to solicit a TX response?
26 17:05 <andrewtoth> to try and probe a node for a particular mempool transaction?
27 17:05 <felixweis_k> you could give it a transaction that spends a certain input
28 17:06 <andrewtoth> ahh felixweis_k interesting
29 17:06 <felixweis_k> if it requests the input tx, it didn't have it
30 17:06 <amiti> michaelfolkson: yes, could you break that down further? we could also go into this more in the next question
31 17:07 <amiti> felixweis_k: yeah! using children to observe request patterns can help a spy deduce tx presence
32 17:07 <michaelfolkson> So my (limited) understanding is that the previous PR fixed a lot of the premature announcement logic. And so now we are working out what still needs to be fixed
33 17:08 <michaelfolkson> #18861 (previous PR, now merged)
34 17:08 <amiti> alright, lets jump into the next question and discuss the previous PR (18861) and how this PR (19109) addresses a leak
35 17:08 <amiti> from the notes: PR 18861 improved transaction-origin privacy. The idea is that if we haven’t yet announced a transaction to a peer, we shouldn’t fulfill any GETDATA requests for that transaction from that peer. The implementation for that PR checks the list of transactions we are about to announce to the peer (setInventoryTxToSend), and if it
36 17:08 <amiti> finds the transaction that the peer has requested, then responds with a NOTFOUND instead of with the transaction. While this helps in many cases, it is an imperfect heuristic.
37 17:08 <amiti> What were drawbacks of using setInventoryTxToSend as a heuristic?
38 17:08 <michaelfolkson> Sorry I think I misunderstood the first question :)
39 17:09 <overall> michaelfolkson premature or non-privacy preserving announcement logic?
40 17:09 <amiti> michaelfolkson: nah, the two questions overlap.
41 17:10 <jnewbery> sipa's original PR description answers this question...
42 17:11 <michaelfolkson> overall: Premature is non-privacy preserving. I don't understand distinction...
43 17:12 <jnewbery> so setInventoryTxToSend is a set of transactions that we haven't yet announced to that peer
44 17:12 <troygiorshev> The issue in sipa's original PR description is a "false negative" sort of problem, right? i.e. we'll tell a node that we don't have a transaction even when they know that we actually do?
45 17:12 <amiti> troygiorshev: that's one of the issues, pointed out in the convo with luke-jr & sipa
47 17:12 <overall> michaelfolkson just checking if there was another metric, besides privacy, that wasn't needed to be managed
48 17:13 <amiti> but from my understanding, it seems pretty unlikely
49 17:13 <jnewbery> we'll only add txids to a peer's setInventoryTxToSend if we're connected to that peer when we relay the transaction
50 17:13 <sipa> overall: well, memory usage - we could also just keep a list of all things we've ever announced to a peer, and only allow them to query exactly that... but that would easily run into memory problems
51 17:14 <jnewbery> so using the old heuristic, if a node connects to us and immediately asks for a transaction, it won't be in the setInventoryTxToSend, and we'll happily send the transaction
52 17:14 <michaelfolkson> overall: Yeah I think it is just privacy. Apart from privacy, announcing prematurely doesn't matter. Unless you literally don't want to be announcing it at all (not a concern)
53 17:15 <amiti> jnewbery: exactly. does this make sense to people?
54 17:15 <andrewtoth> jnewbery: makes sense thanks!
55 17:15 <sipa> michaelfolkson: it's not announcing; it is allowing requests
56 17:15 <troygiorshev> jnewbery amiti: it does now :)
57 17:15 <sipa> michaelfolkson: this is about not permitting requests of things that weren't announced
58 17:16 <sipa> announcement itself has a bunch of privacy protections already, since a few years ago, and they're pretty good
59 17:16 <sipa> this is making sure that it can't be bypassed by tx requests
60 17:17 <andrewtoth> still a little unclear on the second point though. how does using setInventoryTxToSend prevent some locally resubmitted invs from being requested?
61 17:19 <amiti> andrewtoth: the idea is if we are using setInventoryTxToSend to prevent fulfilling a txn request, if we put a txn on there for a _second_ time, then it would be valid for the peer to request (since we prev announced), but we would deny them the txn
62 17:19 <michaelfolkson> sipa: Not permitting the providing of what they request right? They are free to request whatever they want. Just don't want to give it to them
63 17:19 <andrewtoth> amiti: thanks!
64 17:19 <sipa> michaelfolkson: exactly
66 17:20 <felixweis_k> whats a mental model to aproximate what consitutes an acceptable amount of per peer memory?
67 17:21 <luke-jr> amiti: it seems like it could result in efforts to relay a tx, making it harder to relay
68 17:21 <amiti> sipa: my understanding of your comment there was that you were identifying the precise steps / timing that would have to happen to not serve the txn, is that correct?
69 17:21 <amiti> (aka not as simple as resubmit via sendrawtransaction)
70 17:22 <sipa> amiti: yeah, because it also requires the old submission to have been expired from filterAlreadyKnown
71 17:22 <amiti> luke-jr: I don't follow
72 17:22 <andrewtoth> sipa: thanks. both these cases seem pretty unlikely, right? but probably better to not have any edge cases
73 17:22 <sipa> andrewtoth: probability is the wrong metric when thinking about attacks
74 17:23 <sipa> andrewtoth: the question is how costly it is for an attacker to make it happen - they won't let it to chance
75 17:23 <jnewbery> felixweiss_k: that's a really good question! I don't know if there are any hard numbers
76 17:23 <luke-jr> amiti: unless I'm confusing my own comment (it's been a while), it seems sendrawtransaction (ie, the user explicitly wants to relay it) could end with it not being relayed even if it would have otherwise been?
77 17:23 <andrewtoth> interesting
78 17:23 <amiti> sipa: 👍🏽, and you were calculating the most feasible / quick way to expire (using mempool requests)
80 17:24 <sipa> andrewtoth: an analogy i like: a civilian engineer designs building resistant to natural causes (earthquakes, ...), and needs probability. a military engineer designs structures that are designed to protect against attacks; they don't go compute the probability that a missile will randomly hit it; they care about how hard it is for an attack that make that happen
81 17:25 <amiti> ok, so these are some of the identified drawbacks of using `setInventoryTxToSend` as a heuristic. what does PR 19109 implement as an alternative solution?
82 17:25 <andrewtoth> sipa: :)
83 17:25 <michaelfolkson> But probability still informs what should be prioritized to fix!
84 17:25 <sipa> michaelfolkson: how so?
85 17:26 <sipa> we care about probabilities of things _randomly_ going wrong absent attack scenarios of course
86 17:26 <michaelfolkson> Well if there are two possible attacks and one attack is more likely than the other, fix that one first
87 17:26 <sipa> michaelfolkson: no, you fix the cheapest to perform first
88 17:26 <sipa> attacks are not random
89 17:27 <michaelfolkson> I guess I am seeing the cost of the attack as informing how likely that attack is to happen
90 17:27 <amiti> felixweis_k: good question about acceptable amount of memory / peer, something I also wonder
91 17:27 <luke-jr> andrewtoth: if an attack is possible, it increases the possibility
92 17:28 <sipa> at a very level view, you could see it that way
93 17:28 <sipa> but it's misleading imo
94 17:29 <andrewtoth> makes sense
95 17:29 <troygiorshev> sipa: we would want to prioritize on the "product" of the "ease" of the attack and the "damage" of the attack, no?
96 17:29 <jnewbery> felixweiss_k: without having specific numbers, I think a good rule is that keeping sets of items per-peer isn't scalable, but keeping a probabilistic filter is
97 17:30 <sipa> i think we'd like to have some sort of upper bound of memory usage per peer
98 17:30 <jnewbery> and in many cases, a probabilistic filter is good enough for the specific function
99 17:30 <sipa> or, alternatively, actual acconting of memory, and a way to deprioritize/disconnect peer if we're using to much memory on behalf of a peer, and it gets tight
100 17:31 <jnewbery> sipa: I think deprioritizing based on memory usage would be difficult. How do you shed the memory? Deprioriritizing based on CPU/bandwidth usage seems much easier
101 17:32 <sipa> jnewbery: agree
102 17:32 <amiti> lets jump into this question: What are some relevant considerations when reviewing/implementing a bloom filter? What choices does this patch make?
103 17:32 <amiti> since we are talking about one of the considerations- memory usage
104 17:32 <jnewbery> (not saying they're exclusive - just that the latter is much more straightforward)
105 17:34 <a__jonas> amiti: Generally things you'd want to consider are the are number of items to keep track of and the acceptable false-positive rate
106 17:34 <amiti> a__jonas: yes!
107 17:35 <amiti> did you catch what values were chosen in this PR ?
108 17:35 <jnewbery> how are those parameters chosen?
109 17:35 <a__jonas> (3500 and 0.000001)
110 17:35 <troygiorshev> amiti: False positive rate is 0.000001
111 17:36 <amiti> a__jonas, troygiorshev: yup!
112 17:36 <andrewtoth> so number of items increases memory, false positive rate increases CPU time?
113 17:36 <amiti> let's dig into jnewbery's question- how were these numbers chosen? so as reviewers, we can evaluate that it makes sense
114 17:37 <sipa> andrewtoth: better false positive rate also increases memory usage
115 17:37 <sipa> a rolling bloom filter has around -log2(fprate)*3/log(2) bits usage per tracked item
116 17:39 <jnewbery> 'rolling bloom filter' is perhaps a bit of a misleading name because you might expect that you can remove items
118 17:39 <jnewbery> in fact, it's implemented as 3 separate bloom filters, each with capacity 1/2 of the desired capacity
119 17:39 <andrewtoth> amiti: cool site!
120 17:40 <amiti> (props to aj for sharing with me)
121 17:40 <felixweis_k> thanks, amiti!
122 17:40 <sipa> jnewbery: in general you can't delete things from bloom filters
123 17:40 <sipa> there are generalizations that can, but they need more memory
124 17:40 <jnewbery> and we roll through the different filters (ie put items into the first filter until it's at capacity, then the second, then the third, then delete the first filter and start filling it again)
125 17:40 <felixweis_k> yeah rolling bloom filters are cool i've been thinking recently about something like that. didn't know we had them in core
126 17:40 <sipa> jnewbery: exactly
127 17:41 <jnewbery> so that's where the *3 in sipa's memory usage comes from
128 17:41 <jnewbery> because we actually have 3 bloom filters
129 17:41 <jnewbery> (I think)
131 17:41 <sipa> though it's not the entire picture
132 17:42 <amiti> jnewbery: oh cool, I was wondering how it worked but didn't get the chance to dig in yet. thanks!
133 17:42 <sipa> we also get a factor 2 gain from having 2 active generations (each of the 3 generations is only half the window size)
134 17:42 <felixweis_k> when you add, you add to the latest 2, when you check you check for OR in any of the 3? then a counter as to when advance the "ring buffer"?
135 17:42 <sipa> but also a factor 2 again because we need 2 bits per entry (which can store a number from 0 to 3... 0 for unused; 1-3 for each of the 3 generations)
136 17:42 <felixweis_k> sorry im just going to read the code
137 17:43 <jnewbery> felixweiss_k: you only add to the latest generation I believe. And yes, there's a counter to determine which generation we should be adding to
138 17:43 <michaelfolkson> What is AJ doing re running the patch to find premature requests?
139 17:44 <michaelfolkson> This is to explore when premature requests still happen when someone is running Core and is not malicious?
140 17:44 <jnewbery> felixweis_k: oops, I've been inserting and extra s in your name!
141 17:45 <sipa> michaelfolkson: indeed
142 17:45 <sipa> to see if our logic isn't too strict in what it permits
143 17:46 <amiti> 15 minutes left
144 17:46 <michaelfolkson> You want to allow a certain number of premature requests?! Don't get it
145 17:46 <sipa> michaelfolkson: no?
146 17:46 <michaelfolkson> Surely better if there are zero premature requests unless doing testing
147 17:46 <sipa> michaelfolkson: yes
148 17:47 <sipa> i'm confused by your conclusion :)
149 17:47 <felixweis_k> no worries, i've written new berry probably too in the past ;-)
150 17:47 <michaelfolkson> The "logic isn't too strict" comment. Surely we want extreme strictness in this case (unless doing testing)
151 17:47 <sipa> michaelfolkson: but it's an approximation
152 17:48 <troygiorshev> michaelfolkson: we're seeing false positives, right?
153 17:48 <sipa> we don't know exactly what all legitimate (and safe) behaviors are
154 17:48 <sipa> so we want to know if this policy we have implemented isn't going to accidentally affect legitimate use
155 17:49 <sipa> because we might have forgotten some protocol interaction where someone ends up legitimately requesting something that the filter approach disallows
156 17:49 <sipa> aj's testing is to figure this by running it in the wild, and seeing if there are legitimately-looking requests that get blocked by it
158 17:50 <michaelfolkson> Ok makes sense, thanks. So some premature requests are legitimate. e.g. the example AJ found.
159 17:50 <sipa> michaelfolkson: well, they're not premature - we're just accidentally marking them as such
160 17:50 <jnewbery> is this new data structure threadsafe? If so, what synchronization method is used to guard it? Do you think it's in the right place?
161 17:51 <a__jonas> to answer your question amiti, I think these numbers could be adjusted (as AJ maths and Sipa's response suggests)
162 17:52 <overall> jnewbery nice question :)
163 17:52 <a__jonas> It looks like caching was a consideration (as noted in the comment on the assert on line 134)
164 17:52 <andrewtoth> jnewbery: looks like it's protected by cs_main
165 17:52 <jnewbery> overall: credit goes to neha for asking in the previous PR :)
166 17:53 <jnewbery> andrewtoth: yes, can you explain why/
167 17:53 <overall> nehan +1
168 17:53 <sipa> a__jonas: i don't think i've commented at all on those numbers or how they could be changed
169 17:54 <andrewtoth> i need to look closer at the code here :/
170 17:55 <amiti> a__jonas: are you talking about the convo early on in the PR?
172 17:55 <nehan> jnewbery: +1. Also, I don't see how cs_vRecv is taken out properly in the previous PR, but maybe that could be answered more quickly here.
173 17:55 <amiti> ah, I think that convo lead to introducing `UNCONDITIONAL_RELAY_DELAY`
174 17:55 <sipa> or maybe i'm just wrong! and clang didn't detect it
175 17:56 <amiti> lets chat about it in the last few minutes here... How is this delay constant being used when deciding whether or not to fulfill a request?
176 17:56 <jnewbery> andrewtoth: the answer is that it m_recently_announced_invs is added to CNode, which is guarded by cs_main.
177 17:56 <sipa> nehan: i believe i'm wrong!
178 17:56 <nehan> sipa: where does net lock cs_vRecv?
179 17:56 <nehan> sipa: yeah ok good i'm not going crazy :)
180 17:56 <a__jonas> troygiorshev, yes - that and AJ's previous comment
181 17:56 <jnewbery> sipa: wrong about what?
182 17:56 <sipa> jnewbery: about claiming that net grabs cs_vRecv
183 17:57 <nehan> jnewbery: vRecvGetData isn't protected by cs_vRecv. But as you both pointed out it's only accessed in one thread. That seems worth documenting to me though.
184 17:57 <overall> 3 minutes remaining
185 17:58 <sipa> same with orphan_work_set
186 17:58 <jnewbery> nehan: definitely document it at the very least
189 17:58 <sipa> or fix it
190 17:59 <jnewbery> vRecvGetData and orphan_work_set are similar - they're both work queues for an individual peer. As such they're only pushed to/popped from in ProcessMessages
191 17:59 <nehan> sipa: ok!
192 17:59 <amiti> a__jonas, troygiorshev: that conversation calculated the number of transactions in 15 minutes, because after 15 minutes is the time set by mapRelay. after that review, sipa introduced `UNCONDITIONAL_RELAY_DELAY` as 2 mins. so now, if a peer requests a txn thats >2 minutes old, we will serve
193 17:59 <jnewbery> but agree it should be fixed!
194 17:59 <sipa> net_processing.cpp:1653:43: warning: reading variable 'vRecvGetData' requires holding mutex 'pfrom.cs_vRecv' [-Wthread-safety-analysis] std::deque<CInv>::iterator it = pfrom.vRecvGetData.begin();
195 18:00 <sipa> i recompiled and just missed the warning, silly me
196 18:00 <amiti> but if the txn is < 2 minutes old, it must be in the `m_recently_announced_invs` filter
197 18:00 <amiti> okay! it looks like thats time
198 18:00 <sipa> and the filter is sized such that it can store as many transactions as we can possibly announce in 2 minutes
199 18:01 <amiti> thanks for coming all :)
200 18:01 <troygiorshev> thanks amiti!
201 18:01 <felixweis_k> thanks all, and amiti for hosting! very informative as always.
203 18:01 <andrewtoth> thanks amiti! and everyone else for the discussion
204 18:01 <overall> thanks amiti!
205 18:01 <nehan> thanks amiti!
206 18:01 <jnewbery> thanks amiti. Great discussion!
207 18:02 <a__jonas> Thanks amiti
208 18:02 <amiti> #endmeeting
209 18:02 <michaelfolkson> Thanks amiti! New IRC nick?
210 18:03 <sipa> jnewbery: so, you're right - the memory usage of the rolling bloom filter (which i'll completely unambiguously abbreviate to RBF) is indeed exactly equal to having 3 completely separate bloom filters
211 18:03 <sipa> and even operationally it's very similar to that
212 18:03 <amiti> hah, my normal irc client is down (I use irccloud) , hopefully it will be back soon
213 18:03 <felixweis_k> using irccloud too usually. lol
214 18:04 <troygiorshev> sipa: unambiguously :D
215 18:04 * sipa suggests ssh + vps + screen + irssi :)
216 18:04 <felixweis_k> maybe someone should create a decentralized IRC client
218 18:04 <felixweis_k> (im kidding)
219 18:04 <sipa> jnewbery: the difference is that you'd need to access 3x as many memory locations
220 18:04 <overall> on a time chain
221 18:05 <jnewbery> sipa: so a 3x constant factor in checking for membership
222 18:05 <overall> sipa and connect irssi to freenet over tor? the sasl registration over clearnet bugs me :(
223 18:06 <sipa> overall: indeed
224 18:06 <a__jonas> thanks sipa
225 18:07 <overall> I understand that abuse / rate limiting demands that require sasl but still wish there was another way to handle it (surety bond)
226 18:07 <sipa> jnewbery: with a rolling cuckoo filter you only need 28ish bits per elememt (at 20 bits fprate), instead of 86ish
229 18:09 <troygiorshev> jnewbery: thx
230 18:09 <sipa> happy to do a review session about it, if/when it's done :)
231 18:09 <jnewbery> sipa: that'd be great!
232 18:09 <amiti> sipa: +1
233 18:15 <jnewbery> sipa: I haven't looked at the implementation recently, but do the 3 bloom filters use the same hash functions? If so, my comment ("3x constant factor in checking for membership") isn't correct because you'd only do the hashing once
234 18:15 <sipa> jnewbery: they dk
236 18:16 <jnewbery> ok, so checking 3x memory locations but that doesn't equate to 3x lookup times
237 18:16 <sipa> right; number of hashes would be identical across them
238 18:17 <sipa> in practice i expect the filter data to not be hot in the cache, so memory accesses dominate
239 18:17 <sipa> (which is annoyingly hard to measure in microbenchmarks)