Immediately disconnect on invalid net message checksum (p2p)
May 20, 2020
The PR branch HEAD was 1d9bc6a8c at the time of this review club meeting.
Note: an earlier version of this PR moved hash finalization from the Message
Handler thread to the Socket Handler thread. That actually happened in PR
16202, so you can ignore
discussion about moving the hash calculation.
Notes
Peer-to-peer message headers include a
checksum for message integrity. If
any of the message gets corrupted, then the calculated checksum from the
payload won’t match the checksum in the header. See the developer
documentation for full details of the message header format.
The checksum is the first four bytes of the double-SHA256 of the message
payload. SHA256() is a Merkle-Damgård construction.
In this hash function construction, the pre-image is split into blocks, which
are fed one-by-one into a compression function.
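As a minimal sketch of the checksum rule described above (not Bitcoin Core's implementation, which is C++), the following Python computes the first four bytes of the double-SHA256 of a payload. The function name `message_checksum` is chosen here for illustration only.

```python
import hashlib


def message_checksum(payload: bytes) -> bytes:
    """Return the first four bytes of SHA256(SHA256(payload))."""
    first_pass = hashlib.sha256(payload).digest()
    return hashlib.sha256(first_pass).digest()[:4]


# A message with an empty payload (for example `verack`) has checksum 5df6e0e2.
print(message_checksum(b"").hex())
```

Any change to the payload in transit would, with overwhelming probability, produce a different four-byte value than the one carried in the header.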
Because of the serial way that the hash is constructed, we can do some of the
hashing work in the Socket Handler thread as we receive the bytes off the
wire. See the
function, which calls into .
The final round of hashing is done in
which calls through to
Before PR 16202, GetMessageHash() was called by
ProcessMessages(), which is executed by
the Message Handler thread. After that PR, the hash finalization is done
in the Socket Handler thread, and the Message Handler thread just checks
the hash data member.
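The split between incremental hashing and finalization can be illustrated with a small Python sketch. The class `IncrementalChecksum` and its `write`/`finalize` methods are hypothetical names that only mirror the idea; they are not Bitcoin Core's classes. Chunks are hashed as they arrive, and only the cheap final step (closing the first SHA256 and running the second) happens once the whole payload is in.

```python
import hashlib


def checksum_one_shot(payload: bytes) -> bytes:
    """Reference: double-SHA256 over the complete payload, truncated to 4 bytes."""
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]


class IncrementalChecksum:
    """Hypothetical helper: hash payload bytes as they come off the wire."""

    def __init__(self) -> None:
        self._inner = hashlib.sha256()

    def write(self, chunk: bytes) -> None:
        # Bulk of the work: feed each received chunk into the first SHA256.
        self._inner.update(chunk)

    def finalize(self) -> bytes:
        # Final step: close the first hash, then hash the 32-byte digest again.
        return hashlib.sha256(self._inner.digest()).digest()[:4]


payload = bytes(range(256)) * 10
hasher = IncrementalChecksum()
for start in range(0, len(payload), 100):  # pretend the socket delivers 100-byte reads
    hasher.write(payload[start:start + 100])

assert hasher.finalize() == checksum_one_shot(payload)
```

Because the result is independent of how the payload is chunked, which thread calls the final step is purely a scheduling and layering decision.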
If the calculated checksum does not match the checksum in the message header,
then the message is dropped.
This PR proposes removing any checksum logic from net_processing, and moving
it to net. It also proposes immediately disconnecting peers that send us
a message with a bad checksum.
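The proposed behaviour can be sketched as a transport-level check that either hands the payload up or signals a disconnect. This is an illustrative Python sketch under stated assumptions, not the PR's code: `Action`, `build_message` and `receive_message` are hypothetical names, the input is assumed to hold exactly one complete message, and only the checksum is checked (the magic bytes are parsed but not acted on). The 24-byte header layout (4-byte magic, 12-byte null-padded command, 4-byte little-endian length, 4-byte checksum) follows the v1 P2P message format.

```python
import hashlib
import struct
from enum import Enum
from typing import Tuple

MAINNET_MAGIC = bytes.fromhex("f9beb4d9")
HEADER_FORMAT = "<4s12sI4s"  # magic, command, payload length, checksum
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 24 bytes


class Action(Enum):
    DELIVER = "deliver"        # pass the message up to message processing
    DISCONNECT = "disconnect"  # drop the peer at the transport layer


def checksum(payload: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]


def build_message(command: str, payload: bytes) -> bytes:
    """Serialize a header plus payload (struct pads the command with NUL bytes)."""
    header = struct.pack(HEADER_FORMAT, MAINNET_MAGIC, command.encode("ascii"),
                         len(payload), checksum(payload))
    return header + payload


def receive_message(raw: bytes) -> Tuple[Action, str, bytes]:
    """Verify the checksum before anything reaches the processing layer."""
    if len(raw) < HEADER_SIZE:
        raise ValueError("sketch assumes one complete message")
    _magic, command, length, expected = struct.unpack(HEADER_FORMAT, raw[:HEADER_SIZE])
    payload = raw[HEADER_SIZE:HEADER_SIZE + length]
    if checksum(payload) != expected:
        return Action.DISCONNECT, "", b""
    return Action.DELIVER, command.rstrip(b"\0").decode("ascii"), payload


good = build_message("inv", b"\x01\x02")
corrupted = good[:-1] + b"\xff"  # flip the last payload byte
print(receive_message(good)[0], receive_message(corrupted)[0])
```

With the check done here, the layer above never needs to know that a checksum exists, which is the layering argument discussed in the meeting.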
Did you review the PR?
Concept ACK, approach ACK, tested ACK, or NACK?
(Don’t forget to put your PR review on GitHub.)
What are the benefits of handling bad checksums in the net layer instead of
in net processing?
Are there any other checks that should be moved from net processing into net?
What do you think about the behaviour change in this PR? Is it better to simply
drop the message or should we disconnect from the peer?
1 13:00 <jnewbery> #startmeeing
9 13:00 <jnewbery> greetings earthlings (and other universe dwellers)
12 13:00 <michaelfolkson> hi
19 13:01 <jnewbery> Normal reminders: all questions are welcome! If you're confused about something, other people probably are too, so ask! No need to ask to ask, just ask!
21 13:01 <jnewbery> who had a chance to review this week? (y/n)
35 13:02 <jnewbery> Great, anyone want to summarize the PR? Are you concept ACKs/NACKs?
37 13:03 <pinheadmz> it moves the p2p message checksum check up earlier in the message processing prrocess
38 13:04 <troygiorshev> Concept ACK. Two parts: 1) Move checksum check to net layer, 2) change behavior upon check failure to disconnecting the peer, from ignoring the message
39 13:04 <pinheadmz> and the reviewers are trying to decide if it should ban a peer that sends a malformed message
40 13:04 <jonasschnelli> Hi (only partially here)
41 13:04 <jnewbery> troygiorshev pinheadmz: exactly. This isn't a pure refactor, it does change behaviour
42 13:04 <raj_149> Move hashing op into network layer. Move it from message processing to socket Handler. .
43 13:04 <jonasschnelli> The current behavior of the PR is not a ban
44 13:05 <jnewbery> we'll go into that in the next questions
45 13:05 <jnewbery> hi jonasschnelli! Thanks for joining us
46 13:05 <nehan> also changes which thread does the Finalize, I think?
47 13:05 <sipa> only the finalization moves
48 13:06 <sipa> hashing for bulk of messages was already in net
49 13:06 <jonasschnelli> Only the finalize operation changes...
50 13:06 <jonasschnelli> Though we have a lot of small messages
51 13:06 <jnewbery> I don't think this PR actually moves which thread does the finalize
52 13:06 <pinheadmz> yeah looked to me like originally the checksum was checked and stored in a bool but not /acted/ upon until way later
53 13:06 <jonasschnelli> I think it does... but can’t say for sure. It been a while
54 13:07 <jnewbery> When the PR was opened it did, but since then, PR16202 was merged, which moves the finalize operation into the socket handler thread
56 13:07 <pinheadmz> oh interesting. yeah i had a pre-game quesiton about the timing of this one
57 13:07 <pinheadmz> has there been a recent flood of malformed messages or something?
58 13:07 <pinheadmz> what rises this PR from its year of sleep?
59 13:07 <jonasschnelli> Oh. Thanks for that info jnewbery
60 13:08 <michaelfolkson> I think some of the review comments touched upon this but what are the likely reasons for sending an incorrect checksum (other than maliciousness)?
61 13:08 <jonasschnelli> Some BIp324 discussion
62 13:08 <nehan> jnewbery: oh right, that's the first sentence of the review notes
63 13:08 <jnewbery> pinheadmz: I just saw it was rebased recently and started receiving some review attention and thought net/net_processing separation would be interesting to talk about at review club
64 13:09 <jnewbery> ok, so first question. What are the benefits of handling bad checksums in the net layer instead of in net processing?
65 13:09 <vasild> michaelfolkson: implementation bug, memory corruption, also memory corruption at the receiving end.
66 13:09 <sipa> just broken network infrastructure
67 13:10 <vasild> we can rely on TCP to not corrupt the data?
68 13:10 <nehan> jnewbery: get to it sooner, use fewer resources processing bad messages
69 13:10 <jnewbery> (I'll be a bit loose with language and use "net layer" == "socket handler thread" and "net processing layer" == "message handler thread")
70 13:10 <michaelfolkson> vasild: Which could all be honest failures right? We're disconnecting for our own benefit as a node rather than because we think the incorrect checksum was sent due to maliciousness
71 13:11 <emzy> For network problems there is l a TCP checksum. Ok could be manipulatet bei firewall stuff.
72 13:11 <jnewbery> vasild: great question. TCP should take care of errors for us. So how else could header checksum be wrong?
73 13:11 <theStack> jnewbery: i'd say it makes generally sense to detect failures at the lowest layer possible
74 13:11 <vasild> I mean we can rely that TCP will detect and re-transmit corrupted data. So there is no way to get corrupted data due to a malfuctioning router in between?
75 13:11 <vasild> michaelfolkson: yes
76 13:12 <ariard> vasild: TCP does integrity check ?
77 13:12 <emzy> vasild: yes, normaly.
78 13:12 <jonasschnelli> Would firewall randomly manipulate a single message? Or constant?
79 13:12 <sipa> jonasschnelli: intentionally it wouldn't change anything at all
80 13:12 <jonasschnelli> (Sry phone typing)
81 13:12 <lightlike> While reviewing I wondered if the changed disconnect behavior is a consequence of a change done mostly for architectural reasons (not so easy to imitate the old behavior after moving from net_processing to net) or something desirable in itself.
82 13:12 <vasild> ariard: yes
83 13:13 <jonasschnelli> Is the firewall a hypothetical issue or did someone report / monitor that?
84 13:13 <emzy> jonasschnelli: you never know. there is deep packet inspection.
85 13:13 <sipa> but routers can reassemble TCP packets
86 13:13 <jnewbery> nehan theStack: in the original version of this PR, I'd agree with you. I expect there was a performance benefit from moving the finalize operation to the socket handler thread, since the message handler thread is our bottleneck. Since 16202 was merged, there isn't a performance benefit ...
87 13:13 <felixweis> tcp has a 16 bit checksum
88 13:13 <sipa> so the TCP checksum is only an point-to-point checksum, not end-to-end
89 13:13 <jonasschnelli> Which is relatively weak
90 13:13 <jnewbery> ... but I think from a architecture/layering perspective, this makes sense
91 13:13 <MarcoFalke> lightlike: I posted a patch that keeps the check in net_processing, so no
92 13:13 <ariard> right but your NAT firewall may recompute the checksum after modification IIRC
93 13:13 <sipa> ariard: yes
95 13:14 <troygiorshev> jnewbery: agreement on the architecture/layering reason
96 13:14 <nehan> jnewbery: what is the architecture/layering reason?
97 13:15 <jnewbery> anyone want to answer nehan's question about layering?
98 13:15 <sipa> it's ugly that the checksum checking is in net_processing, because it's a network protocol level thing
99 13:15 * MarcoFalke move the checksum check to the wallet. *hides
100 13:16 <troygiorshev> nehan: The header wraps around some data. (Just like how, say, an ethernet header wraps around some data). Ideally you do the checksums, check the header, and then hand the data to the layer above you. Then that layer doesn't have to worry about anythign
101 13:16 <sipa> in particular when facing BIP324, where the checksum because dependent on the transport used... it's strange that the network processing layer would even know which transport is used
102 13:16 <nehan> maybe you could share a bit about how you see hte split between net/net_processing
103 13:16 <nehan> net is network protocol; net_processing is more application (node) semantics?
104 13:17 <felixweis> stream vs message handling?
105 13:17 <nehan> and you consider the "header" part of the network protocol? that's application-specific, right? it's not the packet header
106 13:17 <troygiorshev> nehan: yes exactly
107 13:17 <jnewbery> nehan: exactly
108 13:17 <sipa> net is interaction with socket layer, and P2P transport layer
109 13:17 <lightlike> the PR description gives another reason: It would be desirable to implement alternative transport layer protocols such as BIP155 in net, without having to change anything in net_processing
110 13:17 <sipa> the P2P transport layer transports messages, which are handled by net_processing
111 13:18 <jnewbery> The distinction on our software is not perfect. Take a look a CNode in net.h and CNodeState in net_processing.cpp. Ideally, CNode would just be about the connection to another node and CNodeState would just be about application-level stuff (ie inventory of transactions and blocks, address gossip, etc)
113 13:19 <sipa> nehan: i guess you'd have to see the bitcoin p2p protocol as consisting of two layers itself
114 13:19 <jnewbery> In a previous life I worked in telecoms, and we used to have lots of similar issues, where firewalls would mess around with IP addresses inside SIP and SDP messages and things would stop working.
115 13:19 <gzhao408> jnewbery When you say that "message handler thread is our bottleneck" it seems like we want to protect that thread from DoS/wasted resources. Is this a concern here/in general?
116 13:20 <jnewbery> gzhao408: that's a concern everywhere!
117 13:20 <sipa> it depends whether you're talking about honest overload or DoS attack
118 13:20 <gzhao408> jnewbery er sorry, I mean is it something we care /especially/ about e.g. because it's a common dos vector?
119 13:20 <jnewbery> but yes, here is a good example. In general, like nehan and theStack said earlier, it's best to deal with failures and bad messages as early and as low in the stack as possible.
120 13:21 <ariard> sipa: do we have a way to dissociate both right now ?
121 13:21 <lightlike> if these corruption issues can happen spuriously to otherwise good peers in some cases, I would prefer not to disconnect.
122 13:21 <ariard> like when downloading too much blocks become a DoS ?
123 13:21 <theStack> would a proper analogy for net vs. net_processing be be ethernet (layer 2) vs ip (layer 3)? the ethernet frame has a checksum, which is also checked on layer 2, and layer3 never gets to see it
124 13:21 <sipa> ariard: sure it can, but we have no protection against that
125 13:22 <jnewbery> gzhao408: probably not. This isn't a big dos concern. We always have to do the checksum calculations, so a node trying to waste our resources gains no advantage by sending a message with a bad checksum.
126 13:22 <ariard> I think there is another point by moving this from net_processing to net, it's happen in 2 different threads right now
127 13:22 <emzy> What about Erlay BIP 330? Will it also bettter to move the checksum out to net for that?
128 13:22 <felixweis> could ASMAP be used to limit upload rates per network?
129 13:22 <ariard> and lets say you have a single-threaded system, you may have some buffer growing quickly before switch
130 13:22 <troygiorshev> theStack: I think so. An interesting parellel is that just as the ethernet header contains a section saying "IP", the bitcoin header contains a sections specifying the command name
131 13:22 <jnewbery> theStack: I'm not sure that analogy is useful. You can point out some similarities, but I don't think it's particularly illuminating
132 13:22 <sipa> emzy: completely orthogonal; erlay is entirely net_processing
133 13:23 <ariard> felixweis: I would fear someone in your ASN wasting resources for you, like some kind of impersonation
134 13:24 <sipa> felixweis: the general solution to resource wasting attack is just keep track of how much resources every peer is using; if things get tight, slow down processing of the worst one
135 13:24 <sipa> felixweis: that's the first, and hard, step
136 13:24 <jnewbery> lightlike: To understand whether this is actually a concern, it'd be useful to look at real-world data.
137 13:24 <sipa> doing it per asmap seems like a nice optimization on top of that
138 13:24 <jnewbery> Has anyone grepped their debug log for "CHECKSUM ERROR"?
139 13:24 <sipa> felixweis: sorry, how many resources you're consuming on behalf of every peer
140 13:24 <sipa> jnewbery: yes
141 13:25 <MarcoFalke> net_processing is similar to the rpc server I guess. They both can send data structures to validation, and they both read raw bytes from a socket or so
142 13:25 <troygiorshev> sipa: What did you generally find?
143 13:25 <felixweis> ariad, sipa: interesting points!
144 13:25 <MarcoFalke> I found one error
145 13:25 <sipa> troygiorshev: only one completely broken peer, which sends bogus messages with checksum all 0
146 13:25 <MarcoFalke> 2020-03-03T23:14:54Z ProcessMessages(inv, 397 bytes): CHECKSUM ERROR peer=1849951
147 13:25 <troygiorshev> sipa: bogus message with SMBr as the type?
149 13:25 <MarcoFalke> 2020-03-03T22:49:52Z receive version message: /Satoshi:0.18.0/: version 70015, blocks=620054, us=126.96.36.199:8333, peer=1849951
150 13:25 <sipa> or 0-byte or -1 messages
151 13:26 <troygiorshev> Looks like that's spam from some group advertizing their coin
152 13:26 <jnewbery> I found none (although I only had a couple of weeks' worth of debug logs)
153 13:26 <MarcoFalke> So according to the user agent, it is a Bitcoin Core node
154 13:26 <jonasschnelli> MarcoFalke: what does that peer does during its session... can you grep?
155 13:26 <MarcoFalke> Normal relay, it was an inv message
156 13:26 <jnewbery> sipa: how long does your debug log go back?
157 13:27 <sipa> only 15 days
158 13:27 <sipa> i have an older one though
159 13:28 <sipa> i don't think this means much though - people on broken network infrastructure will see checksum errors and others won't
160 13:28 <jnewbery> So it seems that checksum errors are very rare.
162 13:29 <sipa> jnewbery: i don't think we can conclude that (i agree it's likely the case, but it's hard to know)
163 13:29 <emzy> can't find a "CHECKSUM ERROR" on like 5 nodes/
164 13:30 <jnewbery> sipa: right, there's at least one cow in scotland, and one side of it appears to be brown
166 13:30 <nehan> jnewbery: gmaxwell's response seemed reasonable
167 13:30 <sipa> jnewbery: my point is that you wouldn't see problems if you're on a non-broken network
169 13:30 <sipa> that doesn't mean it's not a problem for others
170 13:31 <sipa> it may be fairly common even, for users we care about, but we wouldn't know as long as we are using good networks
171 13:31 <sipa> (especially old home routers...)
172 13:31 <lightlike> emzy: all nodes with debug=net enabled?
173 13:31 <ariard> and there is no way to measure which nodes are on good-vs-bad network
174 13:31 <troygiorshev> should we try and ensure that bitcoin can be used over an unreliable transport layer?
175 13:32 <raj_149> I didn't find any in my node..
176 13:32 <emzy> lightlike: no.
177 13:32 <sipa> troygiorshev: that is a better question, i think
178 13:32 <vasild> If I am sitting on a broken network infrastructure, checksum errors will be common for me.
179 13:32 <sipa> to what extent can be guarantee reliable operation in such cases
180 13:32 <lightlike> emzy: then it wouldn't appear in the logs even if it happened
181 13:32 <MarcoFalke> Wouldn't a bad router cause disconnects anyway because the network magic is corrupted? I mean it is less likely because the magic is short compared to the message, but still
182 13:32 <sipa> BIP324 would break these nodes anyway
183 13:32 <troygiorshev> it certainly seems, uh, romantic at least...
185 13:33 <emzy> lightlike: I see. I will enable it on one of my nodes.
186 13:33 <jnewbery> If the network infrastructure is breaking the version message, then those nodes won't be able to connect to peers
187 13:33 <sipa> jnewbery: agree
188 13:33 <troygiorshev> sipa: good point bringing up BIP324. Maybe it overrides this discussion
189 13:33 <sipa> further, the protocol is stateful
190 13:34 <sipa> if something happens in the middle of say a tx response, the peer may keep waiting, and time out
191 13:34 <jnewbery> so there seems to be no downside to disconnecting peers that have a bad checksum in a version message (since the alternative is that they'll timeout their connection)
192 13:34 <troygiorshev> ariard: i love it :D
193 13:34 <sipa> jnewbery: why would they time out?
194 13:34 <jnewbery> And if the version message is the one most likely to be corrupted by the firewall (which seems likely since it contains IP addresses), then if it's uncorrupted, then there are unlikely to be problems with subsequent messages
195 13:35 <jonasschnelli> Good point
196 13:35 <jnewbery> Do we not drop a connection if we don't receive a version within a time limit?
197 13:35 <sipa> jnewbery: sure
198 13:35 <sipa> but there are cases where a checksum failure would not result in any problems at all
199 13:35 <jonasschnelli> MarcoFalke: is that checksum error you witnessed in a version message?
200 13:35 <ariard> jnewbery: can you still have a bad checksum for version message, i.e switch IP address but this IP address being valid
201 13:35 <ariard> for receiver
203 13:35 <sipa> the message would just be ignored and skipped, and that'd be it
204 13:35 <jonasschnelli> (Or did you say in a inv)
205 13:35 <MarcoFalke> It was in an inv message
206 13:36 <jnewbery> so the peer would send us a version with a bad checksum, we'd drop it, and then we'd eventually time out the connection because we never saw a version
207 13:36 <troygiorshev> sipa: (Other than a single bit flip in the checksum field I assume?)
208 13:36 <ariard> if it's a NATed address it's routable
209 13:36 <sipa> jnewbery: if it's in the version message, sure
210 13:36 <MarcoFalke> I was connected to the node for days
211 13:36 <troygiorshev> jonasschnelli: that's a good point
212 13:37 <MarcoFalke> No, only 15 minutes :sweat-smile:
213 13:37 <jnewbery> sipa: right that was my point. If the version message is corrupted, then there's no downside to disconnecting
214 13:37 <sipa> jnewbery: agree!
215 13:37 <jnewbery> right, next question. Are there any other checks that should be moved from net processing into net?
216 13:38 <jnewbery> (feel free to continue discussing/asking questions about disconnect. I just want to make sure we have a chance to discuss all the questions before the end of the hour)
217 13:38 <troygiorshev> jnewbery: MarcoFalke brought up a great point in the PR that the checksum should be treated the same as the header check and netmagic. Maybe all of them should be moved to net?
218 13:39 <jnewbery> troygiorshev: I agree!
219 13:40 <MarcoFalke> I think both should live in the same place (whether that is net_processing due to historic reasons or net because we decided that is the better place)
220 13:41 <MarcoFalke> But splitting them up felt a bit weird
221 13:41 <jnewbery> MarcoFalke: I don't think it's all-or-nothing. There are plenty of examples where things live in the wrong layer because they haven't moved yet
222 13:42 <ariard> its far better in net IMO, its really confusing while reviewing bip324 where these values are changed by deserializer and you have to go in net_processing to understand semantic
223 13:42 <jonasschnelli> jnewbery: we might want to disconnect nodes on invalid netmagic (v1 only).
224 13:42 <pinheadmz> jnewbery how about m_valid_netmagic ?
226 13:42 <jonasschnelli> (in the socket thread)
227 13:42 <pinheadmz> i was looking for something like message size etc
228 13:42 <troygiorshev> pinheadmz: look at m_valid_header too
229 13:42 <jonasschnelli> Isn't #15197 doing that? (netmagic)
231 13:43 <pinheadmz> troygiorshev ah good call for some reason i saw that and thought block header, but ofc that was wrong contet
232 13:44 <jnewbery> jonasschnelli: ah, I hadn't seen 15197. Thanks
233 13:45 <MarcoFalke> Is there software that uses the net magic to seek to the next message in the stream?
234 13:45 <jonasschnelli> MarcoFalke: that would be super weak
235 13:46 <MarcoFalke> (Asking because btcinformation claimed so)
236 13:46 <jonasschnelli> what if those 4 bytes match a hash of something sent over the wire?
237 13:46 <MarcoFalke> jonasschnelli: Agree
238 13:46 <jonasschnelli> we can't help someone doing that
239 13:47 <MarcoFalke> I am asking because the meeting notes linked to btcinformation, which suggest this is good practice
240 13:47 <MarcoFalke> Someone should probably edit that
241 13:47 <troygiorshev> Is it maybe historical?
242 13:48 <vasild> The decision whether to disconnect from a peer from whom we receive a bad checksum maybe should be based on how many other checksum errors am I seeing from other peers. For example if I get bad checksums from 1 peer but have plenty of other healthy peers from which I never get a bad checksum, then I would disconnect from the "bad" peer. But if I am getting checksum errors every few minutes
243 13:48 <vasild> from e.g 80% of my peers then maybe I wouldn't be so eager to disconnect (from everybody).
244 13:48 <jnewbery> MarcoFalke: you've missed out a part of the sentence "used to seek to next message _when stream state is unknown._"
245 13:49 <jnewbery> I'm still not sure whether that's true, but it makes a bit more sense
246 13:49 <jonasschnelli> vasild: that would be an option. But is it worth the effort?
247 13:49 <vasild> dunno :)
248 13:49 <MarcoFalke> But why would the stream state be unknown? (We disconnect when the header can't be deserialized)
249 13:50 <vasild> jonasschnelli: I think definitely out of the scope of that PR, maybe not worth the effort at all.
250 13:50 <michaelfolkson> Agree vasild. The danger here is that majority of your peers are sending you bad checksums and you rotate through iterations of new peers?
251 13:50 <felixweis> is it known how much CPU time is spent in SHA256 in a typical bitcoin node during normal operation (not IBD)?
252 13:51 <nehan> vasild: hmm this makes me wonder if this changes threatmodels at all. now an attacker could corrupt all incoming headers and force me to disconnect from all nodes
253 13:51 <troygiorshev> nehan: this is discussed in the PR
254 13:51 <vasild> michaelfolkson: yes, needlessly disconnecting from good peers due to bricked network router, but you also use the same router to connect to new peers...
255 13:51 <nehan> but could an attacker with network control essentially do that anyway?
256 13:51 <sipa> nehan: that kind of attacker can already just disconnect you
257 13:51 <troygiorshev> nehan: yup that's it :)
258 13:51 <nehan> troygiorshe: ah, must have missed it. thanks!
259 13:51 <MarcoFalke> felixweis: Depends whether you use hardware hashing or do it in software, I guess
260 13:51 <jnewbery> felixweis: I'm also curious about this. Does anyone know?
261 13:52 <troygiorshev> felixweis: Do you mean in this checksum or SHA256 in general?
262 13:52 <sipa> none of the changes in this PR change anything about attack models, as far as i can tell
263 13:52 <emzy> There is also an attack vector. If you can insert one TCP packet into the stram with wrong checksum. Then you can disconnect all peers. Seeing from a network level attacker.
264 13:52 <sipa> the only question is whether disconnecting honest but broken peers is preferable
265 13:52 <jonasschnelli> well... there is a CVE.
266 13:52 <felixweis> well all i know is there's a lot of sh256 hashing going, only 2 of them are used to verify the next block header.
267 13:53 <michaelfolkson> The scenario where honest peers are sending you bad checksums. A dishonest node identifies this, sends you good checksums and controls all your connections
268 13:53 <michaelfolkson> Unlikely but possible?
269 13:53 <jnewbery> is the concern about bad firewalls about my local firewall or the remote peer's firewall?
270 13:53 <ariard> emzy: network level attacker with such capabilities can just feed invalid blocks and get ban all your peers
271 13:54 <ariard> jnewbery: good question, I think both
272 13:54 <ariard> like can you know which one is faultive ?
273 13:54 <sipa> felixweis: txid and merkle root computation are probably the majority
274 13:54 <jnewbery> if it's about the remote's firewall, then I'd expect to see CHECKSUM ERROR messages in all node's debug logs
275 13:54 <sipa> and p2p checksums...
276 13:54 <pinheadmz> jonasschnelli CVE for what ?
277 13:55 <jnewbery> because we'd probably connect to at least one peer with a bad firewall
278 13:55 <sipa> jnewbery: it could be rare
279 13:55 <emzy> ariard: You only have to hit the right TCP SEQ. number. That's not sp hard anymore.
280 13:55 <jonasschnelli> There is a publicly revealed CVE that the PR fixes... but I don't think that CVE has reasonable weight
281 13:55 <jnewbery> 5 minutes left!
282 13:55 <MarcoFalke> jonasschnelli: Pretty sure anyone can come up with a DOS vector that is more severe and does not depend on the checksum disconnect logic
283 13:55 <jonasschnelli> The PR focuses on layering issues and it "also fixes that CVE"
284 13:55 <jnewbery> If you've been shy so far, now's your chance to ask your question
285 13:56 <ariard> emzy: what do you mean by sp hard ? IIRC there is ban on mutated consensus data, you don't need hashrate for this
286 13:57 <pinheadmz> jonasschnelli if possible id love to see the cve
287 13:57 <lightlike> emzy, ariard: maybe there are types of attackers who could manage to flip a bit every now and then, but cannot carefully replace whole msgs with constructed ones? Or does that make no sense?
288 13:57 <nehan> well, now i want to know about the CVE
289 13:57 <troygiorshev> +!
290 13:57 <troygiorshev> +1
291 13:57 <jonasschnelli> MarcoFalke: probably... I guess the difference on attacking the checksum is, that the attacker doesn't have to calculate a valid SHA256 hash where the victim needs to do for the validation
292 13:57 <ariard> jnewbery: it's a private node it may not have that much connections and not being seen that much
293 13:57 <emzy> ariard: sorry typo. It's not so hard to hit the right TCP sequence number.
294 13:57 <jonasschnelli> pinheadmz: the CVE has been publicly announced by the Bitcoin SV people...
296 13:57 <jonasschnelli> I'm only revealing it because its already public available on the web
297 13:58 <nehan> ah. it's a pretty obvious one
299 13:59 <jonasschnelli> It is public since a year and it seems that no-one could explot it
300 13:59 <sipa> heh, they can just send valid double spending transactions too
301 13:59 <nehan> jonasschnelli: why not?
302 13:59 <emzy> lightlike: yes, It would make an attack more easy, if you only have to get in the TCP stream and send someting random. Insted of getting the hash right.
303 14:00 <jnewbery> That's time. Thanks everyone! Special thanks to jonasschnelli for dropping in.
304 14:00 <jonasschnelli> nehan: don't know... because its probably an inefficient attack?
305 14:00 <ariard> lightlike: I can't think about real-world infrastructure with this kind of capabilites, see Erebus paper for discussion on infrastructure attacker model
306 14:00 <jnewbery> I have a call now so I have to run.
309 14:00 <troygiorshev> thanks jnewbery!
310 14:00 <jnewbery> #endmeeting
311 14:00 <sipa> jonasschnelli: there are dozens of ways to achieve the same outcome
312 14:00 <jonasschnelli> thanks, Thanks for calling me in jnewbery!
313 14:00 <pinheadmz> ty all!
314 14:00 <jonasschnelli> sipa: Yes. I think so.
315 14:00 <emzy> tnx everyone!
316 14:00 <lightlike> thanks!
317 14:00 <theStack> thanks!
318 14:00 <vasild> Thanks everyone!
319 14:00 <sipa> jonasschnelli: sending N bytes to cause the victim to waste N bytes of network + N bytes of hashing...
320 14:01 <felixweis> thanks everyone!
321 14:01 <pinheadmz> sipa jonasschnelli you mean just sending any type of invalid mesasge
322 14:01 <jonasschnelli> indeed... yeah
323 14:01 <thomasb06> thanks!
325 14:01 <pinheadmz> is as severe as bad checksum
326 14:01 <sipa> pinheadmz: right
328 14:01 <pinheadmz> i.e. a ddos where entire 4MB block is valid.... except the last tx
329 14:01 <nehan> pinheadz: that requires a bunch of pow
330 14:01 <pinheadmz> ah good point
331 14:01 <jonasschnelli> but for an invalid message (to pass the checksum test), the attacker needs a valid sha256 hash?
332 14:01 <pinheadmz> well then a TX max-size where the last sig is invalid
333 14:02 <MarcoFalke> jonasschnelli: Only needs to calculate once
334 14:02 <jonasschnelli> But I guess you can send invalid blocks to make the victim hash-check that
335 14:02 <nehan> don't nodes drop peers that send a lot of invalid messages/
336 14:02 <jonasschnelli> MarcoFalke: yeah. We don't ignore dups
337 14:02 <jonasschnelli> I guess that CVE is total bullshit
338 14:02 <MarcoFalke> right
339 14:02 <pinheadmz> even if its not severe, I do appreciate that altcoins look at the code from a different perspective and have an opportunity to contribute back to bitcoin
340 14:03 <jonasschnelli> However, the optimization in layering (checksum belong to message transport and not to message processing) is real
341 14:03 <jonasschnelli> I guess the SV people where happy they could create some CVE at all. :)
342 14:04 <raj_149> Curious on estimate of performance optimization by this layering.
343 14:05 <jonasschnelli> raj_149: near to zero probably
344 14:06 <emzy> Again. I think the checksum prevents that an attacker could inject just one packet into the TCP stream to ban a node. I'ts far to easy to spoof the IP address of a peer and flood with TCP packets (random SEQ. number) the target node.
345 14:07 <emzy> Would be better to just dissconnet the node not ban it.
346 14:08 <sipa> who says anything about banning?
347 14:08 <raj_149> Are we also banning the node along with disconnection?
349 14:08 <raj_149> Ya. Right.
350 14:08 <emzy> I think I got it wrong. It is only disconnect.
351 14:08 <emzy> Sorry, that's fine then.
352 14:10 <raj_149> Also is it possible to apply some kind of heuristics to assertain resource draining intention from a node? May be that way we can safeguard against disconecting honest nodes just because of bad transport?
353 14:11 <sipa> raj_149: that's hard, as it's probably a cat and mouse game
354 14:11 <sipa> there are certainly some behaviours that are easily detectable, but if someone really wants to exploit things, there are dozens of ways
355 14:12 <sipa> i believe the real solution is just keeping track of resources used on behalf of every peer, and slow down/disconnect the worst offenders if you run low
356 14:12 <desolate> raj_149 does the heuristic determine how much resources the attack is draining on the heuristic function? :P
357 14:12 <sipa> with perhaps exceptions like giving you a valid block
358 14:20 <desolate> I second sipa's KISS approach when receiving incompatible data: "turtle" strategy rather than attempting to build a highly optimized, hopefully omnipotent strategy