Add BIP324 encrypted p2p transport de-/serializer (only used in tests) (p2p)

Host: ariard  -  PR author: jonasschnelli

The PR branch HEAD was 7aa91cd1 at the time of this review club meeting.

Note: in this PR Review club we will review the first three commits only (Expose MAC length in chacha_poly_ahead_h, Add BIP324 short-IDs to protocol.h/.cpp and Add BIP324 v2 transport serializer and deserializer).



  1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK? (Don’t forget to put your PR review on GitHub.)

  2. What’s your process to review a PR implementing a new BIP? Did you read the BIP or the code first? How can you ensure that BIP is correct? How can you ensure that code implements the BIP correctly?

  3. BIP 324 introduces a new message structure, notably with short command ID. What do you think about those new short command ID?

  4. Why was the Chacha20/Poly1305 construction chosen? Have you read Bitcoin Core implementations of these primitives?

  5. Beyond code review, how can testing be improved? What failure cases should be tested?

  6. Could the serialization/deserialization code be better documented or simplified? Consider the choices of data structures and algorithms.

Meeting Log

  113:00 <jnewbery> #startmeeting
  213:00 <jnewbery> hi
  313:00 <SirRichard> hi
  413:00 <andrewtoth> hi
  513:00 <willcl_ark> hi
  613:01 <thomasb06> hi
  713:01 <lightlike> hi
  813:01 <ariard> hi
  913:01 <raj_149> hi
 1013:01 <emzy> hi
 1113:01 <jnewbery> today's notes and questions are here:
 1213:01 <troygiorshev> hi
 1313:01 <jnewbery> ariard is our host. Over to you, ariard!
 1413:01 <ariard> thanks jnewbery!
 1513:02 <ariard> so to get started, usual question who has reviewed the PR ?
 1613:02 <raj_149> y
 1713:02 <willcl_ark> y
 1813:02 <SirRichard> y
 1913:02 <ariard> that's great! have you left any comment on it ?
 2013:02 <raj_149> y
 2113:02 <willcl_ark> no :(
 2213:03 <emzy> n
 2313:03 <nehan> hi
 2413:03 <SirRichard> n
 2513:03 <nehan> y
 2613:03 <jnewbery> n
 2713:04 <ariard> okay what was your process to review a PR implementing a BIP ?
 2813:04 <ariard> did you read the BIP or code first ? or dig into context around the BIP, like reading old mail threads ?
 2913:05 <willcl_ark> The context to me seems pretty rational from the BIP
 3013:05 <raj_149> 1. read the bip. 2. read on missing gaps that i dont understand yet 3. read the code. 4. tally the code with the BIP.
 3113:05 <willcl_ark> (i read the BIP first)
 3213:05 <ariard> raj_149: quick reply, it's hard to be sure you have enough pointer without a second implementation not hurting wrinkles
 3313:06 <ariard> willcl_ark: what seems you rational here ? what this BIP is trying to achieve?
 3413:06 <emzy> I read the BIP first.
 3513:06 <ariard> raj_149: what were the missings gaps?
 3613:07 <willcl_ark> ariard: we want to prevent kinds of MITM attacks which are easy on unencrypted traffic.
 3713:07 <raj_149> ariard: depends on individual's previous knowledge. for me i didn't knew chacha20, poly1305 and AEAD details. so had to read them up first.
 3813:09 <raj_149> willcl_ark: it seems this doesn't necessarily solves all forms MITM, i guess the draft also mentions this somewhere. if i am not mistaken.
 3913:09 <ariard> willcl_ark: in Bitcoin context, why it makes sense to encrypt traffic ? Most of data is already public like blocks, headers
 4013:10 <ariard> raj_149: yes generally you have to, it's pretty lengthy reviewing this kind of PR when it involves cryptographic primitives
 4113:10 <ariard> have you been through the process to assert this pick up of cryptographic primitives compare to alternative ?
 4213:10 <jnewbery> (anyone can feel free to answer any of these questions)
 4313:11 <pinheadmz> encrypted traffic obscures the fact that you even have abitcoin node running
 4413:11 <raj_149> ariard: for start it stops traffic analysis and IP to publickey linkage. In general all sorts of privacy leaks that can occur by monitoring network traffic of a node.
 4513:11 <pinheadmz> and withou, people can snoop on the network connectivity graph, TX propogation and maybe resolve the origin of a ne TX
 4613:12 <ariard> pinheadmz: obscures to whom? not if you're running on port 8333
 4713:12 <pinheadmz> who says I am? :-)
 4813:12 <pinheadmz> I dunno, oppressive regimes
 4913:12 <willcl_ark> if you can tamper with messages, you can get legitimate (truthful) peers banned
 5013:12 <ariard> no I mean your ISP can still learn you're running bitcoin if you're using port 8333
 5113:13 <pinheadmz> ariard I understand but you dont have to use port 8333 and if my ISP sees a bunch of encrypted blobs coming into port 9000, that is at least plausible denyability
 5213:13 <jnewbery> and even if you're not using port 8333, bitcoin p2p traffic has an unmistakable pattern
 5313:13 <ariard> pinheadmz: with tx propogation for sure it obfuscates tx origin for on-path attacs, now spy peers can still observe origin by connecting to you
 5413:14 <pinheadmz> certainly for SPV or Neutrino, there is privacy leaked with plaintext mesages
 5513:14 <pinheadmz> but that privacy is leaked to the full node you connect to anyway
 5613:14 <ariard> raj_149: which publickey you're pointing to ? what kind of privacy leaks you can think of beyond tx origin ?
 5713:15 <pinheadmz> Im not sure if theres an attack vector around this, but you mightbe abe to tell where a node is in the sync process by reading their traffic
 5813:15 <ariard> jnewbery: that's a huge concern, even tx propagation may still leak being interactive and we don't do padding
 5913:16 <emzy> it's not only good to secure the p2p trafic from your ISP eyes, it's also good agains state level surveillance.
 6013:16 <pinheadmz> ariard do you know - i asked this on the BIP draft - why not use Noise Protocol for the key exchange the way Lightning does?
 6113:16 <ariard> pinheadmz: if you assume attacker knows about 1) block announcement 2) listen for transactions flooding 3) can map encrypted blob size to transactions received?
 6213:17 <raj_149> ariard: i am not exactly sure on the kind of possible leaks that can happen over clear text data. I read somewhere by monitoring network traffic and some kind of triangulation observer can link origin transactions with IP addresses, without connecting to me as a node. Would like to know about other possible leaks that i dont know about. clearly there should be some over encrypted data.
 6313:17 <jnewbery> pinheadmz: you surely can tell if a node is in IBD, even if you used a stream cipher to encrypt the messages
 6413:17 <populate> regarding the entropy of encrypted information with strong time entropy:
 6513:18 <ariard> pinheadmz: I'm not sure but I would say historical reasons, Noise Protocol wasn't fully spec out when BIP 151 was first submitted
 6613:18 <emzy> also in flight modification of the data is easy posible without MAC
 6713:19 <sipa> pinheadmz, ariard: there is some interaction with development of a private authentication protocol we've been working on
 6813:19 <sipa> noise's authentication relies on revealing identities, which is something we want to avoid
 6913:19 <willcl_ark> if you wanted to eclipse someone, tampering with their current peers' messages (and having them disconnected) seems like a good start
 7013:19 <willcl_ark> maybe this PR doesn't quite solve that though
 7113:19 <ariard> raj_149: see on in-protocol leaks
 7213:20 <ariard> emzy: but what about a MitM attacker intercepting communications from both side? there is no authentication with BIP324
 7313:21 <sipa> the full functionality obviously needs encryption + authentication
 7413:21 <willcl_ark> ONe thing I noticed it this BIP (and lightning) vs "standard" ChaCha20 Poly1305 is that we encrypt the packet length, why is this?
 7513:21 <sipa> but authentication is very easy to add in a modular way once we have encryption
 7613:21 <jnewbery> willcl_ark: if you can tamper with traffic you can get a peer disconnected, but not banned. The terminology is a bit confusing currently, but the disconnected peer is allowed to reconnect
 7713:21 <willcl_ark> jnewbery: ok, thanks for the clarification
 7813:22 <emzy> ariard: Right, but it's only posible at first connection attempt. A running session is save from MitM attack. So it is better then not.
 7913:22 <jnewbery> of course, if you're able to tamper with messages, you could just continue to do that to get them disconnected again, or just block all traffic
 8013:22 <willcl_ark> even a "ban" is always temporary (some length of time) also I think, right?
 8113:22 <raj_149> ariard: thanks.
 8213:22 <ariard> sipa: but we could have reuse a Noise pattern with ephemeral key pairs? It doesn't mandate how you exchange key pairs?
 8313:22 <jnewbery> willcl_ark: 24 hours by default
 8413:22 <willcl_ark> yes
 8513:22 <sipa> ariard: right, that could work, using noise purely with ephemeral keys
 8613:23 <ariard> willcl_ark: good question, to make dumb traffic analysis harder?, see real_and_random and my posts on the BIP
 8713:23 <pinheadmz> would it make sense to have static keys for "precious" nodes ?
 8813:23 <sipa> though i expect even in that case we'd want our own "vendored" version, with secp256k1 keys etc
 8913:23 <pinheadmz> for example, connecting my personal SPV to my remote Full Node
 9013:23 <sipa> pinheadmz: yes, that's called authentication!
 9113:23 <sipa> pinheadmz: and of course that's the eventual goal
 9213:23 <willcl_ark> ariard: Oh i didn't see the comments at the end of the BIP!
 9313:23 <pinheadmz> i thought you meant authentication like message integirty
 9413:24 <sipa> pinheadmz: ah, no
 9513:24 <sipa> pinheadmz: history is that we original had BIP150/BIP151 for auth and enc
 9613:24 <sipa> BIP151 was replaced with BIP324
 9713:24 <pinheadmz> eevntual goal, so nodes would have <pubkey>@IP:port kind of hostnmaes like lightning ?
 9813:24 <ariard> jnewbery: I was verifying just before is this really what we do in AcceptConnection
 9913:24 <ariard> pinheadmz: please not permanent pubkeys like in LN
10013:25 <sipa> pinheadmz: perhaps that's getting off-topic, but the idea is that there wouldn't be any *observable* identities (for obvious reasons)
10113:25 <sipa> but if you *know* the identity of a peer you're connecting to, you can private determine whether it's them
10213:25 <pinheadmz> excellent
10313:25 <sipa> (without them even knowing that authentication succeeds)
10413:25 <sipa> (a bit outdated, but the writeup is mostly right still)
10513:26 <ariard> emzy: right, but in practice if you assume ISP-capabilities like attackers they would be in place for intercepting at any moment of the session
10613:27 <willcl_ark> ariard: so am I right after reading the comments in thinking we only encrypt the packet length, but still don't MAC it (not re-read the code changes)?
10713:27 <sipa> willcl_ark: that's right
10813:27 <ariard> sipa: right does Noise make it mandatory to use Curve25519?
10913:27 <sipa> ariard: i honestly don't know
11013:27 <emzy> ariard: If you have a long running node, the time window is very narrow.
11113:28 <ariard> willcl_ark: yes we don't MAC it, BOLT8 does it, I spent few hours yesterday trying to find even theoritical attacks on modifying a length field MAC
11213:29 <sipa> ariard: secp256k1 isn't supported by the noise framework, but it'd be possible to of course use a noise protocol that we instantiate ourselves
11313:29 <willcl_ark> slightly off topic, but I appreciated the short command ID improvement here
11413:29 <sipa> though at that point i think the advantages are rather small
11513:29 <ariard> and found nothing, so likely too minor
11613:29 <sipa> at a high conceptual level, bip324 is effectively that
11713:29 <willcl_ark> ariard: sipa, thanks
11813:30 <raj_149> sipa: can you through some pointer on what do you mean by noise framework?
11913:30 <raj_149> *throw
12013:30 <sipa> raj_149:
12113:30 <raj_149> thanks..
12213:30 <ariard> used by whatsapp or lightning among others
12313:32 <raj_149> feels like kinda same thing BIP324 is doing?
12413:32 <ariard> sipa: yes I slightly agree, does anyone have done at least of properties comparison between a Noise pattern with ephemeral keys vs BIP324 ?
12513:32 <sipa> raj_149: bip324 is encryption only, no auth
12613:33 <sipa> (it's authenticated encryption, but that's detecting tampering, not identity of the peer)
12713:33 <ariard> emzy: I'm not sure the time window matters here, internet infrastructure is quite static
12813:33 <ariard> okay let's move to next question
12913:33 <raj_149> sipa: got it. ariard: no but seems interesting exercise.
13013:34 <ariard> BIP324 introduces a new message structure, notably with short command ID, what do you think about it ?
13113:34 <raj_149> loved it. :D
13213:34 <ariard> especially the new short command ID? is it worth the complexity vs bandwidth saving argued ?
13313:34 <ariard> raj_149: haha please explain
13413:34 <raj_149> thougb currently it seems hardcode. Not p2p agrrement. is that correct?
13513:35 <ariard> raj_149: yes no p2p agreement for now
13613:35 <ariard> which makes sense bip324 is already big enough, it can be done in future specifications
13713:35 <willcl_ark> I saw some discussion over Cap'n'proto (for the GUI), was there a concious choice not to also leverage that here?
13813:35 <troygiorshev> At this level it seems just fine to have the compact IDs
13913:35 <willcl_ark> (but very happy to see bandwidth reductions!)
14013:35 <raj_149> ariard: well if we only have a finite set of command lists, it makes sense to map them into numbers instead of transmiting them as chars.
14113:36 <sipa> willcl_ark: i don't see how those are related at all
14213:36 <sipa> the P2P protocol is not an RPC protocol (it's message based, not function call based), and capnproto doesn't do encryption
14313:36 <willcl_ark> well the short IDs are basically a protocol spec?
14413:36 <willcl_ark> hmmm ok, seemed like defining a protocol for messages to me
14513:37 <sipa> you could probably route capnproto over the encrypted stream bip324 defines
14613:37 <sipa> but they're fundamentally a very different layer
14713:37 <willcl_ark> as in, analagous to how Protobufs encode messages according to a .proto file
14813:37 <willcl_ark> sipa: OK
14913:37 <sipa> willcl_ark: it's like asking why not use HTTP instead of SSL
15013:38 <ariard> raj_149: can you envision what nodes would negotiate a different set of mappings?
15113:38 <ariard> *why
15213:39 <raj_149> ariard: honestly i couldn't come up with any reason. maybe for some they just need a subset of those so dont wanna assign all the numbers? not sure if its worth the effort though.
15313:39 <willcl_ark> sipa: it just seemed to me that e.g. Protobuf kind of does what that massive `if else` statment does, but with a well-defined proto file
15413:40 <willcl_ark> but will research it more :)
15513:40 <sipa> willcl_ark: bitcoin already has a p2p protocol, we're not proposing to replace it with something else entirely
15613:40 <ariard> raj_149: well we may have experimentation with new P2P messages, like LN custom messages, I can see people trying to communicate through P2P to their wallets
15713:40 <sipa> (or at least, no reason to do that simultaneously with introducing encryption)
15813:41 <ariard> especially if we add authentication
15913:41 <willcl_ark> sipa: yes I gues you can't just do it for a single part (command)
16013:41 <willcl_ark> anyway I'm very much in favour of some bandwidth reductions on this front!
16113:41 <ariard> next question: why ChaCha20/Poly1305 were chosen ? have you read implementation
16213:41 <lightlike> it could lead to confusion when there are several bips that introduce new p2p messages (that need new numbers). e.g. BIP 157 messages that are merged are not assigned a number yet.
16313:42 <sipa> lightlike: a new bip can introduce short numbers for newly introduced messages
16413:42 <raj_149> ariard: yes they can use negotiations to add extra *special* messages between them. But these messages still needs to be defined in the protocol right? So they can as well be assigned with short_id there?
16513:42 <sipa> so that there doesn't need to be coordination
16613:42 <troygiorshev> lightlike: agreed. Might not be too hard to say "just give this the next available number" though
16713:42 <willcl_ark> ariard: faster than AES on most CPUs
16813:44 <willcl_ark> I think it's better for chips without AES-NI, right?
16913:44 <ariard> raj_149: if they are negotiated and protocol is custom, your bandwidth savings priorities may not be equivalent to the default ones
17013:45 <ariard> willcl_ark: right for software implementations
17113:45 <raj_149> ariard: I am not seeing how bandwidth saving is different in a negotiated short_id than a hardcoded one.
17213:46 <raj_149> ariard: on poly1305, why are we using goto in the code? isn't there any better way to do it? for example the donna implementation doesn't seem use it.
17313:46 <luke-jr> hi
17413:48 <ariard> raj_149: let's say I have my own range of custom messages which make 80% of my traffic, there is no room left in the default short command ID table, by negotiating I can overrules with my own maps and thus save traffic?
17513:48 <willcl_ark> ariard: is there another reason that better (faster) performance on average across different CPUs?
17613:48 <willcl_ark> than*
17713:49 <sipa> raj_149: i assume it's based on the openssh code
17813:49 <sipa>
17913:49 <sipa> (i haven't checked)
18013:49 <raj_149> ariard: why there wont be any room left in default table if the protocol itself defines the short_ids while defining the command?
18113:49 <ariard> willcl_ark: yes it's design have been vetted as more conservative than AES and less prone to cryptographic breakage
18213:49 <ariard> *its
18313:50 <sipa> ariard: chacha more conservative than aes?
18413:50 <ariard> see cryptanalysis table at the end of Salsa20 paper, though I've not compare to ones on AES
18513:50 <sipa> i wouldn't say that
18613:50 <sipa> it's just faster and more modern, and well-studied
18713:50 <sipa> (aes being a block cipher has inherent complexity, which is unnecessary for stream encryption)
18813:51 <ariard> sipa: it's claimed by Bernstein in the ChaCha paper, "Salsa20/20 is amore conservative design than AES, and the community seems to have rapidlygained confidence in the security of the cipher."
18913:51 <ariard> which should be taken with carefulness
19013:51 <sipa> ariard: such claims by the author should be taken with a grain of salt, i think :)
19113:51 <sipa> but i guess i see what he means
19213:52 <ariard> I completly agree and why I said I've not read cryptanalysis AES to compare
19313:52 <sipa> but i wouldn't claim that chacha is more well-analyzed than AES
19413:52 <willcl_ark> perhaps, "good enough" (and faster)?
19513:53 <troygiorshev> modern and well-analyzed (if not-quite-as-well-analyzed) seems notable
19613:53 <willcl_ark> was the alternative AES + CBC? becuase it would seem prefereable over that
19713:53 <sipa> AES-GCM would be the typical alternative
19813:54 <willcl_ark> ah ok
19913:54 <sipa> or AES-CTR-GCM more precisely
20013:54 <sipa> AES-CTR is a stream cipher like ChaCha20; GCM is a MAC like Poly1305
20113:54 <ariard> raj_149: which protocol are you talking about ? If I experiment with my custom one, it won't be supported by the default implementation
20213:55 <willcl_ark> sipe: thanks. I have much cryptography to learn!
20313:55 <ariard> next question: how testing can be improved ? what failure cases should be tested?
20413:55 <jnewbery> 5 minutes left!
20513:55 <raj_149> ariard: ah i see that now.
20613:57 <troygiorshev> it's great that fuzzing is already in
20713:57 <troygiorshev> a parallel implementation in python could be nice
20813:57 <raj_149> ariard: we definitely need functional tests once its fully getched. So far it only implements en/decryption, which seems to be adequately tested in unit test.
20913:58 <raj_149> *fetched
21013:58 <ariard> troygiorshev: have you looked on fuzz scope ? what kind of cases could be missed by fuzzer?
21113:58 <ariard> \
21213:58 <willcl_ark> I couldn't get the fuzz tests to work on MacOS 10.15 :(
21313:59 <troygiorshev> ariard: not deeply enough, no. definitely worth doing!
21413:59 <willcl_ark> I couldn't see any tests for the re-keying process, but maybe missed them!
21513:59 <ariard> last question: could the serialization/deserialization code be better documented or simplified ? have you look on data struct and algorithm choice ?
21614:00 <raj_149> i have taken up an issue which will force me to look deeper into fuzzing. Planning to fuzz this once i get there.
21714:00 <ariard> willcl_ark: I think the re-keying is still WIP, or at least we need to think about some of its implications
21814:00 <willcl_ark> ariard: ah ok
21914:01 <jnewbery> that's time!
22014:01 <willcl_ark> thanks ariard very interesting discussion
22114:01 <jnewbery> Next week's meeting is on 18468, hosted by sipa. Notes will be posted soon:
22214:01 <raj_149> thanks ariard, nice review session.
22314:01 <thomasb0`> thanks ariard
22414:01 <troygiorshev> thanks ariard!
22514:01 <jnewbery> See you all there!
22614:01 <ariard> great, I invite anyone who hasn't review yet the PR to do it with questions in mind :)
22714:01 <emzy> thanks ariard!
22814:01 <ariard> yw
22914:01 <populate> thanks ariard
23014:02 <SirRichard> Thanks ariard great discussion
23114:02 <andrewtoth> thanks ariard! very interesting discussion
23214:02 <furunodo> thanks!
23314:03 <thomasb0`> sipa: what I was thinking about was to pick a random and compute either Euclide, theStack's pow, or the exponential ladder. What do you think?
23414:03 <sipa> ... why?
23514:03 <thomasb0`> to use several algorithms
23614:03 <sipa> why wouldn't you always use the fastest
23714:04 <thomasb0`> to track down bugs
23814:04 <sipa> the tests already do both
23914:04 <thomasb0`> then, no need for change
24014:07 <furunodo> what's a good place to start for getting into the python test code, beyond bitcoin/test/functional/ say if I wanted to start (or work on) a parallel test implemtation in python?
24114:08 <furunodo> (as suggested by troygiorshev)
24214:08 <thomasb0`> sipa: btw I was talking about python, only...
24314:10 <thomasb0`> furunodo: it looks like you need to be confident to make relevant modifications by yourself
24414:10 <sipa> furunodo: that would be the place to start
24514:11 <sipa> all the functional python tests are in test/functional (with lots of utilities in test/functional/test_framework)