Implement BIP 340-342 validation - Support for Schnorr Signatures and integration in SignatureCheckers (consensus, taproot)

https://github.com/bitcoin/bitcoin/pull/17977

Host: jnewbery  -  PR author: sipa

The PR branch HEAD was ff9b7e6 at the time of this review club meeting.

This is the fourth in a series of review club meetings on the (work in progress) implementation of BIP 340-342.

This week, we’ll look at another commit from PR 17977 - Support for Schnorr signatures and integration in SignatureCheckers.

Notes

  • This is the first commit from PR 17977 that uses the new functionality and interfaces in libsecp256k1. You may need to make clean && ./configure && make for the build to succeed.

  • Schnorr signatures are a digital signature scheme that can be constructed over any group where the discrete log problem is hard. It’s a different signature scheme from ECDSA, which is used in Bitcoin today, but there are certain similarities between the two schemes.

  • We won’t go into the construction of Schnorr signatures or the motivation for adding them to Bitcoin, but it’d be useful to have a high-level understanding of those things to motivate why we’re making these changes. BIP 340 explains both the motivation and the construction of these signatures.

  • We also won’t go into the implementation of signing and verifying the signatures themselves. That’s all hidden away inside the libsecp library. The first two commits from PR 17977 pull in the required changes to libsecp. If you want to dig deeper into the implementation, lipsecp PR 558 is where the new cryptographic code is implemented.

  • Instead, we’re going to look at the new interface provided by libsecp for verifying Schnorr signatures, and how it’s integrated into Bitcoin Core.

  • The first thing to note about the Schnorr signature interface is that pubkeys in BIP340 are 32 bytes. That’s unlike pubkeys in our ECDSA implementation, which are 33 or 65 bytes (depending on whether they’re compressed or uncompressed). BIP 340 public keys can be 32 bytes because the y-value or the point is implicit. See the Implicit Y coordinates and Public Key Generation sections of BIP 340 for full details.

  • The new XOnlyPubKey object is the lowest-level object we’ll use for signature validation. It’s constructed with a uint256, and then the function VerifySchnorr() is called with the 32 bytes hash of the message that was signed (this hash can sometimes just be called the ‘message’, or in the case of signing a Bitcoin transaction the ‘sighash’), and the 64 byte signature.

  • The XOnlyPubKey object is used within a hierarchy of signature checker classes, which are used by the higher-level code to check signatures:

    • At the base is BaseSignatureChecker, which provides three (virtual) public methods: CheckSig(), CheckLockTime() and CheckSequence(). This PR adds CheckSchnorrSig().

    • Above that is GenericTransactionSignatureChecker (which is actually a template that’s instantiated as TransactionSignatureChecker or MutableTransactionSignatureChecker depending on whether it’s being used with a CTransaction or CMutableTransaction). GenericTransactionSignatureChecker overrides the virtual CheckSig() and CheckSchnorrSig() methods with implementations of signature verification.

    • And above TransactionSignatureChecker is a CachingTransactionSignatureChecker, which uses the signature cache (calling CachingTransactionSignatureChecker::Verify____Signature() more than once on the same transaction won’t result in the same signatures being validated multiple times).

  • The main functional changes in this PR are in GenericTransactionSignatureChecker::CheckSigSchnorr() and the CSignatureHash constructor.

Questions

  1. Why does BIP 340 use 32 byte public keys? How do we get away with not including the y-coordinate in the public key?

  2. What other derived classes use the BaseSignatureChecker as their base class?

  3. What is this code doing? When do we expect a signature to be 64 bytes, and when is it 65 bytes?

  4. Why do we have a signature cache? How does it help performance?

  5. Why is this change necessary?

Meeting Log

  119:00 <jnewbery> #startmeeting
  219:00 <jnewbery> hi folks!
  319:00 <emzy> Hi
  419:00 <willcl_ark> hi
  519:00 <troygiorshev> hi
  619:00 <pinheadmz_> Hi! On mobile. This'll be weird
  719:00 <michaelfolkson> hi
  819:01 <fjahr> hi
  919:01 <jnewbery> welcome to Bitcoin Core PR Review Club
 1019:01 <jnewbery> This week we're talking about .....
 1119:01 <jnewbery> ... TAPROOT
 1219:01 <jnewbery> (again)
 1319:01 <fjahr> Taprrrrrrrrrroot
 1419:01 <troygiorshev> woot!
 1519:01 âš¡ graveyard party hat
 1619:01 <evanlinjin> Hello everyone!
 1719:01 <jnewbery> notes and questions are where you'd expect them: https://bitcoincore.reviews/17977-2
 1819:01 <evanlinjin> This is my first time :)
 1919:01 <graveyard> hi evanlinjin
 2019:01 <nehan> hi
 2119:01 <willcl_ark> hi evanlinjin
 2219:02 <jnewbery> hi evanlinjin. Welcome. We love new participants :)
 2319:02 <evanlinjin> Hello, thank you for the welcome!
 2419:02 <jnewbery> who had a chance to review this week's commit?
 2519:02 <jonatack> taproooooooo oooo ooooot (hi)
 2619:02 <jnewbery> y/n (no worries if you didn't have time)
 2719:02 <sipa> hi
 2819:02 <evanlinjin> Sorry, not me
 2919:02 <pinheadmz_> Y
 3019:02 <troygiorshev> n
 3119:02 <michaelfolkson> y
 3219:02 <sipa> i've looked at it once or twice
 3319:02 <fjahr> y
 3419:02 <emzy> n
 3519:02 <michaelfolkson> Have you reviewed this week's commit sipa?
 3619:02 <fjahr> sipa: probably when you wrote it :D
 3719:03 <jnewbery> sipa: that's good to know
 3819:03 <jonatack> i've looked at it less than sipa has
 3919:03 <jnewbery> ok, lots of notes this week. Not so many questions, but that means there's more time for _your_ questions
 4019:03 <jnewbery> q1: Why does BIP 340 use 32 byte public keys? How do we get away with not including the y-coordinate in the public key?
 4119:04 <michaelfolkson> BIP 340 laid out three options and plumped for the third
 4219:04 <pinheadmz_> Every x has 2 y. We just require the y to be odd (or square) or negative. Or whichever it ended up being. ;/)
 4319:04 <fjahr> The "Implicit Y coordinates" section in BIP340 explains it: Every X coordinate has 2 two Y coordinates. There are different options on how to determine which Y to use but the Y which is a quadratic residue was chosen.
 4419:04 <graveyard> since it's mirrored we can set the zone we want to look at and it's inferred
 4519:04 <jnewbery> michaelfolkson pinheadmz_ fjahr graveyard: exactly correct!
 4619:05 <pinheadmz_> And pubkey Y isn't the same as R value Y right?
 4719:05 <pinheadmz_> It's quad reissue for one and positive ness for the other ?
 4819:05 <nehan> n
 4919:05 <pinheadmz_> *quadratic residue
 5019:06 <jnewbery> a public key in elliptic curve cryptography is a point, but because the curve is mirrored in the x-axis, as long as we all agree a scheme to determine which of the 2 possible y values to choose, then we don't actually need to explicitly state which y value we're using
 5119:06 <fjahr> pinheadmz_: i think they both use quad residue but had different explainations why
 5219:07 <nehan> deterministic symmetry breaking to choose the y
 5319:07 <sipa> i guess this is as good a time to mention this as any: i believe our original reasoning for picking quadratic residue for R was flawed, and we should consider making both use even
 5419:07 <sipa> i'm about to send a mail to list about that, today
 5519:07 âš¡ michaelfolkson looking up the original reasoning
 5619:07 <pinheadmz_> sipa interesting. I thought there were performance reasons why R should be one or the other
 5719:08 <emzy> lol
 5819:08 <sipa> the reasoning was "it's faster for single verification" - turns out it isn't, and in fact it's probably slower in practice
 5919:08 <fjahr> I'm wrong. P uses even.
 6019:08 <jnewbery> sipa: so the final final final version of BIP 340 will have both using implicit even y?
 6119:08 <sipa> jnewbery: well, we could decide not to change things at this point anymore
 6219:08 <graveyard> should be an interesting read
 6319:08 <pinheadmz_> jnewbery: using the "F" word
 6419:08 <sipa> but i owe an explanation, since our justification was based on incorrect assumptions
 6519:09 <jnewbery> pinheadmz_: i didn't say "final final final final"
 6619:09 <pinheadmz_> Ha
 6719:09 <felixweis> now i learned what a quadratic residue is for no reason :(
 6819:09 <felixweis> jk
 6919:09 <michaelfolkson> How does one conclude it is faster and then realize it isn't? Yeah looking forward to the read too
 7019:09 <jnewbery> ok, next question: What other derived classes use the BaseSignatureChecker as their base class?
 7119:09 <nehan> michaelfolkson: +1
 7219:10 <jnewbery> (apart from GenericTransactionSignatureChecker)
 7319:10 <sipa> michaelfolkson: i'm happy to elaborate, but i think probably not during this meeting
 7419:10 <michaelfolkson> Yeah no problem, thanks
 7519:10 <fjahr> I found DummySignatureChecker and SignatureExtractorChecker
 7619:10 <willcl_ark> SignatureExtractorChecker
 7719:11 <jnewbery> sipa: thanks. Let's focus on Bitcoin Core's use of the signature verification in this meeting, and maybe circle back to the cryptography at the end if we have time
 7819:11 <sipa> jnewbery: sgtm
 7919:12 <jnewbery> fjahr willcl_ark: yep. Did you look into what those are doing?
 8019:12 <graveyard> evalscript?
 8119:13 <jnewbery> I also found one other class that inherits from BaseSignatureChecker
 8219:13 <fjahr> Dummy is a what is it's name says and just accept all signatures
 8319:13 <michaelfolkson> TransactionSignatureChecker, MutableTransacstionSignatureChecker, CacheTransactionSignatureChecker too.... were they derived from Base? (just checking)
 8419:14 <jnewbery> michaelfolkson: yes, see this week's notes from "The XOnlyPubKey object is used within a hierarchy of signature checker classes..."
 8519:15 <jnewbery> The other derived class I found is called FuzzedSignatureChecker
 8619:15 <jnewbery> (used in fuzz testing)
 8719:16 <thomasb06> what number is 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F ?
 8819:16 <pinheadmz_> thomasb06: is that the curve order?
 8919:16 <jnewbery> ok, we have this hierarchy of classes for signature checking. Let's look into how it's actually doing that verification.
 9019:17 <thomasb06> pinheadmz_: yes
 9119:17 <jnewbery> what does this code do: https://github.com/bitcoin-core-review-club/bitcoin/blob/125318b6/src/script/interpreter.cpp#L1558-L1564 ?
 9219:17 <pinheadmz_> Checks the sighash
 9319:17 <thomasb06> pinheadmz_: han... Thanks
 9419:17 <michaelfolkson> Checks if the signature is 64 bytes or not
 9519:18 <felixweis> if its 65 bytes but still SIGHASH?DEFAULT its illegal bc its implicit
 9619:18 <fjahr> 65 bytes includes a different sighash type (not SIGHASH_DEFAULT) in the last byte. It gets stripped and the 64 bytes are used to check sig. If its 64 bytes the default is used automatically. 65 bytes with SIGHASH_DEFAULT is not allowed to protect against malleability.
 9719:18 <gzhao408> u can't have a 65B sig if it's hashtype == SIGHASH_DEFAULT?
 9819:18 <michaelfolkson> And then if it isn't checks that it isn't SIGHASH_DEFAULT as that is only defined for Taproot
 9919:18 <felixweis> to mitigate malleability
10019:19 <pinheadmz_> gzhao408: you can
10119:19 <pinheadmz_> gzhao408: but if you leave off sighash byte it it implicitly sighash all
10219:19 <gzhao408> o
10319:19 <pinheadmz_> You can optionally have 0x01 as well for explicit ALL
10419:19 <pinheadmz_> I think
10519:20 <sipa> indeed
10619:20 <pinheadmz_> Or perhaps for backwards compatibility ? sipa?
10719:20 <pinheadmz_> Why allow both options
10819:20 <willcl_ark> we covered this in a previous meeting but I don't quite recall now
10919:20 <sipa> some things may rely on a predictable signature size, and account for 65 byte anyway
11019:21 <jnewbery> willcl_ark: indeed, we talked about this a couple of weeks ago
11119:21 <jnewbery> https://bitcoincore.reviews/17977#l-86
11219:22 <willcl_ark> <sipa> pinheadmz: you can do either; have a 64-byte one with implicit sighash 0, or explicitly make a 65-byte sig with hashtype 1; their semantics are the same... but the signature will still differ because the hash commits to the actual hashtype value
11319:22 <pinheadmz_> Lol is that from the last meeting?
11419:22 <jnewbery> if there's no sighash byte (and the sig is therefore 64 bytes), then the sighash is SIGHASH_DEFAULT, which hashes the same transaction data as SIGHASH_ALL
11519:22 <michaelfolkson> 2 weeks ago
11619:23 <pinheadmz_> But the question of design... sounds like for interop or backwards compat
11719:23 <jnewbery> so gzhao408 is right to say you can't have a 65B if it's hashtype == SIGHASH_DEFAULT
11819:24 <jnewbery> you can have SIGHASH_ALL, which is the same transaction data (but the hash also includes the sighash to avoid malleability between SIGHASH_DEFAULT and SIGHASH_ALL)
11919:24 <thomasb06> p = 2^256 - 2^32 - 2^9 - 2^8 - 2^7 - 2^6 - 2^4 - 1 = 115792089237316195423570985008687907853269984665640564039457584007908834671663
12019:24 <jnewbery> If any of this is confusing or unclear, then I recommend rereading the notes and logs from two weeks ago
12119:25 <jnewbery> Next question: Why do we have a signature cache? How does it help performance?
12219:26 <fjahr> tx sigs can be evaluated multiple times by a node for example at entry of mempool and when included in a block. with the cache it doesn't have to which si saving computation resources.
12319:26 <pinheadmz_> I had a few Qs a about this
12419:26 <gzhao408> bc signature verification takes SOOOO LONG and u usually have to do it at least twice, atmp and in block
12519:26 <michaelfolkson> Performance optimization and DoS protection
12619:26 <pinheadmz_> Ok yeah I thought mempool/Block. But also don't understand the purpose of the salt
12719:27 <gzhao408> le salt is DoS protection
12819:27 <sipa> more defense in depth
12919:28 <sipa> if there somehow is a collision in the cache's hashing, it would be on an isolated machine, rather than consistently on every node in the network
13019:29 <willcl_ark> that's pretty smart!
13119:29 <sipa> note that there is also a full validation cache in validation.cpp
13219:30 <sipa> since the introduction of that, the sigcache's role is primarily dos resistance
13319:31 <jnewbery> where 'full validation cache' means caching the result of validating a single script with specified flags
13419:31 <sipa> exactly
13519:31 <jnewbery> I wrote a bit about that a few years ago if you want to read a bit more: https://johnnewbery.com/post/whats-new-in-bitcoin-core-v0.15-pt5/
13619:31 <sipa> rather than individual signature validations
13719:32 <michaelfolkson> Apparently you had a discussion on this in 2012 with Jeff Garzik sipa ;) https://bitslog.com/2013/01/23/fixed-bitcoin-vulnerability-explanation-why-the-signature-cache-is-a-dos-protection/
13819:33 <jnewbery> the script cache is here: https://github.com/bitcoin/bitcoin/blob/bd00d3b1f2036893419d1e8c514a8af2c4e4b1fb/src/validation.cpp#L1472
13919:33 <jnewbery> pinheadmz_: did you have any other questions about signature cache?
14019:33 <michaelfolkson> That jnewbery blog post is very good, I remember that
14119:33 <pinheadmz__> Jsut about why salt and not key by the entire sig
14219:34 <sipa> pinheadmz__: it is
14319:34 <sipa> Entries are SHA256(nonce || signature hash || public key || signature)
14419:34 <pinheadmz__> And that won't be collision resistant without nonce?
14519:34 <sipa> sure it would be
14619:34 <sipa> there is no technical reason why the nonce is needed
14719:34 <sipa> but it also comes at 0 cost
14819:35 <pinheadmz__> Ah
14919:35 <sipa> (the midstate of "SHA256(nonce ||" is precomputed)
15019:35 <pinheadmz__> Ok that is a cool design feature
15119:36 <gzhao408> jnewbery: when you check against the script cache, do you go by exact verify flags or any superset is fine?
15219:36 <jnewbery> gotta love them midstate precomputations
15319:37 <jnewbery> gzhao408: I think it's by the exact verify flags, but I'm not 100% sure
15419:38 <jnewbery> yes, the exact flags: https://github.com/bitcoin/bitcoin/blob/13c4635a3ecfbc6759301fb3c94bd5293c49388c/src/validation.cpp#L1518-L1525
15519:38 <jnewbery> ok, final question. Why is the change here necessary: https://github.com/bitcoin-core-review-club/bitcoin/commit/125318b68a#diff-8a3cc5f1d2678a348e95e4884d1827f1R38-R45
15619:40 <fjahr> We want nonce of ecdsa and schnorr to be different, to prevent collisions as well I assume? This also has zero cost I guess...
15719:42 <sipa> exactly
15819:42 <jnewbery> fjahr: exactly, we don't want ecdsa and schnorr signatures in the cache to collide (slight correction: s/nonce/key)
15919:43 <sipa> there *should* be no need for that either, as the sighashes should never collide
16019:43 <jnewbery> ok, those were the only questions I had prepared for this. What did you all think of that commit? Pretty easy to follow? Any questions?
16119:44 <michaelfolkson> Sorry I don't quite get this last part. A 64 byte signature collision?
16219:45 <jnewbery> michaelfolkson: take a look at the signature cache. The structure is a cuckoocache, which has a key and a value
16319:46 <michaelfolkson> Ok
16419:46 <jnewbery> the key we use is a hash of various things, and we don't want those keys to collide
16519:46 <sipa> it's really a set
16619:47 <jnewbery> sipa: right, thanks
16719:47 <michaelfolkson> Any questions evanlinjin? You still there?
16819:47 <evanlinjin> I'm still here!
16919:48 <evanlinjin> No questions so far. Learning a lot
17019:48 <evanlinjin> I should probably look into the PR beforehand for next time though
17119:48 <michaelfolkson> Otherwise can we ask about sipa about the u turn?
17219:49 <gzhao408> i have question about script cache, not related to this though
17319:49 <jnewbery> sure, if sipa's around and wants to talk about quadratic residues
17419:49 <jnewbery> gzhao408: go ahead!
17519:50 <sipa> actually let me just share the mail i'm planning to send: https://0bin.net/paste/TYkaOevuGeiiMdhv#7qExtowsogSYGHGUOSFQ+gVA5Nl9Ss78lpNx-LQPyBc
17619:50 <sipa> if anyone feels like reading and pointing things out that aren't clear or so, go ahead
17719:51 <michaelfolkson> evanlinjin: Cool feel free to reach out or post on this channel during the week if you have general questions
17819:52 <evanlinjin> michaelfolkson: Thank you!
17919:53 <thomasb06> for real numbers, the equivalent of quadratic residue is "is the number positive, or negative". If the number is positive, it has a square root. If it's negative, it hasn't. In F_p, half numbers are square roots and half are not. The Legendre symbole says if it does or not.
18019:53 <felixweis> how does ed25519 get away with 32 bytes pubkeys?
18119:53 <gzhao408> jnewbery: well, the reason i asked if the script cache uses exact verify flags is according to https://github.com/bitcoin/bitcoin/blob/e16718a8b3db8bf9c9715f28f4dc6080bf609776/src/script/interpreter.h#L31 they are meant to be soft forks, i.e. if a script passes for a set of flags A, then it would definitely pass for any subset of A right? hopefully i don't have that backwards. and usually if we were to verify
18219:53 <gzhao408> script in atmp, we use more flags (e.g. standardness) than when we are validating a block. but what i mean to say is... i feel like it doesn't need to be the exact verify flags?
18319:53 <michaelfolkson> "The benchmark was repeatedly testing the same constant input, which apparently was around 2.5x faster than the average speed. It is a variable-time algorithm, so a good variation of inputs matters."
18419:54 <jnewbery> sipa: lots to digest there. My very rudimentary understanding was that to lift an x co-ordinate onto a point, the last step was squaring, so you'd always end with a quadratic residue. Is that relevant here?
18519:54 <sipa> jnewbery: that's correct, but not relevant i think
18619:54 <sipa> it matters for batch validation where the R.x coordinate is lifted explicitly to a point
18719:55 <sipa> but batch validation is not impacted by this (as it needs a sqrt anyway)
18819:55 <sipa> invididual validation does not need a sqrt, it recomputes R, and then verifies it against the signature
18919:55 <jnewbery> sipa: ok, I'll go away and read the post. Thanks for writing it up!
19019:55 <michaelfolkson> For what it is worth I don't think an improvement like this shouldn't be made because it is late in the day. It doesn't have knock on impacts right?
19119:56 <sipa> i need to run now, but i'll read messages later here if anyone has comments
19219:57 <jnewbery> gzhao408: the flags used for validating are hashed into the cache entry, so it'd be quite a big redesign to make the change you're suggesting
19319:58 <jnewbery> if you're interested in this subject, take a look at where we call CheckInputScripts() multiple times in ATMP to explicitly fill the cache
19419:58 <jnewbery> I also need to run now. Thanks everyone. Great discussion today!
19519:59 <thomasb06> thanks
19619:59 <fjahr> jnewbery: Thanks for hosting!
19719:59 <troygiorshev> yeah thanks jnewbery!
19819:59 <michaelfolkson> Thanks jnewbery
19920:00 <evanlinjin> Thank you jnewbery!