Bitcoin Core PR Review Club

Implement BIP 340-342 validation - Support for Schnorr Signatures and integration in SignatureCheckers (consensus, taproot)

Aug 12, 2020

https://github.com/bitcoin/bitcoin/pull/17977

Host: jnewbery - PR author: sipa

The PR branch HEAD was ff9b7e6 at the time of this review club meeting.

This is the fourth in a series of review club meetings on the (work in progress) implementation of BIP 340-342.

This week, we’ll look at another commit from PR 17977 - Support for Schnorr signatures and integration in SignatureCheckers.

Notes

This is the first commit from PR 17977 that uses the new functionality and interfaces in libsecp256k1. You may need to run make clean && ./configure && make for the build to succeed.
Schnorr signatures are a digital signature scheme that can be constructed over any group where the discrete log problem is hard. It’s a different signature scheme from ECDSA, which is used in Bitcoin today, but there are certain similarities between the two schemes.
We won’t go into the construction of Schnorr signatures or the motivation for adding them to Bitcoin, but it’d be useful to have a high-level understanding of those things to motivate why we’re making these changes. BIP 340 explains both the motivation and the construction of these signatures.
We also won’t go into the implementation of signing and verifying the signatures themselves. That’s all hidden away inside the libsecp library. The first two commits from PR 17977 pull in the required changes to libsecp. If you want to dig deeper into the implementation, libsecp PR 558 is where the new cryptographic code is implemented.
Instead, we’re going to look at the new interface provided by libsecp for verifying Schnorr signatures, and how it’s integrated into Bitcoin Core.
The first thing to note about the Schnorr signature interface is that pubkeys in BIP 340 are 32 bytes. That’s unlike pubkeys in our ECDSA implementation, which are 33 or 65 bytes (depending on whether they’re compressed or uncompressed). BIP 340 public keys can be 32 bytes because the y-value of the point is implicit. See the Implicit Y coordinates and Public Key Generation sections of BIP 340 for full details.
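As a rough illustration of the implicit y coordinate, the sketch below recovers a full point from an x value on secp256k1 by picking the even y, in the spirit of the lift_x step described in BIP 340. It is written in Python for brevity and is not taken from Bitcoin Core or libsecp256k1; the function name and the choice of the generator's x coordinate as sample input are assumptions made for this example.

```python
# Field prime of secp256k1: p = 2^256 - 2^32 - 977.
P = 2**256 - 2**32 - 977


def lift_x(x: int):
    """Return the even y with y^2 = x^3 + 7 (mod p), or None if none exists."""
    if not 0 <= x < P:
        return None
    c = (pow(x, 3, P) + 7) % P
    # Because p % 4 == 3, a square root of c (if one exists) is c^((p+1)/4).
    y = pow(c, (P + 1) // 4, P)
    if (y * y) % P != c:
        return None  # x is not the x coordinate of any point on the curve
    # The two candidates are y and p - y; exactly one of them is even.
    return y if y % 2 == 0 else P - y


# x coordinate of the secp256k1 generator, used here only as sample input.
GX = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
y = lift_x(GX)
```

Since both parties can run the same deterministic rule, only the 32-byte x value needs to be transmitted.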
The new XOnlyPubKey object is the lowest-level object we’ll use for signature validation. It’s constructed with a uint256, and then the function VerifySchnorr() is called with the 32-byte hash of the message that was signed (this hash is sometimes just called the ‘message’, or, in the case of signing a Bitcoin transaction, the ‘sighash’), and the 64-byte signature.
The XOnlyPubKey object is used within a hierarchy of signature checker classes, which are used by the higher-level code to check signatures:
- At the base is BaseSignatureChecker, which provides three (virtual) public methods: CheckSig(), CheckLockTime() and CheckSequence(). This PR adds CheckSigSchnorr().
- Above that is GenericTransactionSignatureChecker (which is actually a template that’s instantiated as TransactionSignatureChecker or MutableTransactionSignatureChecker depending on whether it’s being used with a CTransaction or CMutableTransaction). GenericTransactionSignatureChecker overrides the virtual CheckSig() and CheckSigSchnorr() methods with implementations of signature verification.
- And above TransactionSignatureChecker is a CachingTransactionSignatureChecker, which uses the signature cache (calling CachingTransactionSignatureChecker::Verify____Signature() more than once on the same transaction won’t result in the same signatures being validated multiple times).
The main functional changes in this PR are in GenericTransactionSignatureChecker::CheckSigSchnorr() and the CSignatureCache constructor.
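The layering of the checker classes can be sketched in a few lines. This is a simplified Python analogue, not the C++ code from the PR: the method names are snake_case renderings chosen for this example, the sighash is passed in directly rather than computed from a transaction, and verify_fn is a hypothetical stand-in for the real libsecp256k1 verification call.

```python
class BaseSignatureChecker:
    """Base of the hierarchy: every check fails unless a subclass overrides it."""

    def check_sig_schnorr(self, sig: bytes, pubkey: bytes, sighash: bytes) -> bool:
        return False


class GenericTransactionSignatureChecker(BaseSignatureChecker):
    """Overrides the check with a real verification step."""

    def __init__(self, verify_fn):
        self._verify_fn = verify_fn  # hypothetical stand-in for the crypto call
        self.verify_calls = 0        # counts how often the expensive step runs

    def verify_schnorr_signature(self, sig, pubkey, sighash) -> bool:
        self.verify_calls += 1
        return self._verify_fn(sig, pubkey, sighash)

    def check_sig_schnorr(self, sig, pubkey, sighash) -> bool:
        # 32-byte x-only pubkey and 64-byte signature, as in the notes above.
        if len(pubkey) != 32 or len(sig) != 64:
            return False
        return self.verify_schnorr_signature(sig, pubkey, sighash)


class CachingTransactionSignatureChecker(GenericTransactionSignatureChecker):
    """Remembers successful verifications so they are not repeated."""

    def __init__(self, verify_fn):
        super().__init__(verify_fn)
        self._valid = set()

    def verify_schnorr_signature(self, sig, pubkey, sighash) -> bool:
        entry = (sighash, pubkey, sig)
        if entry in self._valid:
            return True
        ok = super().verify_schnorr_signature(sig, pubkey, sighash)
        if ok:
            self._valid.add(entry)  # only valid signatures are cached
        return ok


# Fake verifier for demonstration: accepts only the all-zero signature.
checker = CachingTransactionSignatureChecker(lambda sig, pk, sh: sig == bytes(64))
first = checker.check_sig_schnorr(bytes(64), bytes(32), bytes(32))
second = checker.check_sig_schnorr(bytes(64), bytes(32), bytes(32))
```

In this sketch the second call is answered from the cache, so the underlying verification function runs only once.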

Questions

Why does BIP 340 use 32 byte public keys? How do we get away with not including the y-coordinate in the public key?
What other derived classes use the BaseSignatureChecker as their base class?
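What is this code doing: https://github.com/bitcoin-core-review-club/bitcoin/blob/125318b6/src/script/interpreter.cpp#L1558-L1564 ? When do we expect a signature to be 64 bytes, and when is it 65 bytes?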
Why do we have a signature cache? How does it help performance?
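Why is this change necessary: https://github.com/bitcoin-core-review-club/bitcoin/commit/125318b68a#diff-8a3cc5f1d2678a348e95e4884d1827f1R38-R45 ?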

Meeting Log

19:00

<jnewbery> #startmeeting

19:00

<jnewbery> hi folks!

19:00

<emzy> Hi

19:00

<willcl_ark> hi

19:00

<troygiorshev> hi

19:00

<pinheadmz_> Hi! On mobile. This'll be weird

19:00

<michaelfolkson> hi

19:01

<fjahr> hi

19:01

<jnewbery> welcome to Bitcoin Core PR Review Club

19:01

<jnewbery> This week we're talking about .....

19:01

<jnewbery> ... TAPROOT

19:01

<jnewbery> (again)

19:01

<fjahr> Taprrrrrrrrrroot

19:01

<troygiorshev> woot!

19:01

⚡ graveyard party hat

19:01

<evanlinjin> Hello everyone!

19:01

<jnewbery> notes and questions are where you'd expect them: https://bitcoincore.reviews/17977-2

19:01

<evanlinjin> This is my first time :)

19:01

<graveyard> hi evanlinjin

19:01

<nehan> hi

19:01

<willcl_ark> hi evanlinjin

19:02

<jnewbery> hi evanlinjin. Welcome. We love new participants :)

19:02

<evanlinjin> Hello, thank you for the welcome!

19:02

<jnewbery> who had a chance to review this week's commit?

19:02

<jonatack> taproooooooo oooo ooooot (hi)

19:02

<jnewbery> y/n (no worries if you didn't have time)

19:02

<sipa> hi

19:02

<evanlinjin> Sorry, not me

19:02

<pinheadmz_> Y

19:02

<troygiorshev> n

19:02

<michaelfolkson> y

19:02

<sipa> i've looked at it once or twice

19:02

<fjahr> y

19:02

<emzy> n

19:02

<michaelfolkson> Have you reviewed this week's commit sipa?

19:02

<fjahr> sipa: probably when you wrote it :D

19:03

<jnewbery> sipa: that's good to know

19:03

<jonatack> i've looked at it less than sipa has

19:03

<jnewbery> ok, lots of notes this week. Not so many questions, but that means there's more time for _your_ questions

19:03

<jnewbery> q1: Why does BIP 340 use 32 byte public keys? How do we get away with not including the y-coordinate in the public key?

19:04

<michaelfolkson> BIP 340 laid out three options and plumped for the third

19:04

<pinheadmz_> Every x has 2 y. We just require the y to be odd (or square) or negative. Or whichever it ended up being. ;/)

19:04

<fjahr> The "Implicit Y coordinates" section in BIP340 explains it: Every X coordinate has two Y coordinates. There are different options on how to determine which Y to use but the Y which is a quadratic residue was chosen.

19:04

<graveyard> since it's mirrored we can set the zone we want to look at and it's inferred

19:04

<jnewbery> michaelfolkson pinheadmz_ fjahr graveyard: exactly correct!

19:05

<pinheadmz_> And pubkey Y isn't the same as R value Y right?

19:05

<pinheadmz_> It's quad reissue for one and positive ness for the other ?

19:05

<nehan> n

19:05

<pinheadmz_> *quadratic residue

19:06

<jnewbery> a public key in elliptic curve cryptography is a point, but because the curve is mirrored in the x-axis, as long as we all agree a scheme to determine which of the 2 possible y values to choose, then we don't actually need to explicitly state which y value we're using

19:06

<fjahr> pinheadmz_: i think they both use quad residue but had different explanations why

19:07

<nehan> deterministic symmetry breaking to choose the y

19:07

<sipa> i guess this is as good a time to mention this as any: i believe our original reasoning for picking quadratic residue for R was flawed, and we should consider making both use even

19:07

<sipa> i'm about to send a mail to list about that, today

19:07

⚡ michaelfolkson looking up the original reasoning

19:07

<pinheadmz_> sipa interesting. I thought there were performance reasons why R should be one or the other

19:08

<emzy> lol

19:08

<sipa> the reasoning was "it's faster for single verification" - turns out it isn't, and in fact it's probably slower in practice

19:08

<fjahr> I'm wrong. P uses even.

19:08

<jnewbery> sipa: so the final final final version of BIP 340 will have both using implicit even y?

19:08

<sipa> jnewbery: well, we could decide not to change things at this point anymore

19:08

<graveyard> should be an interesting read

19:08

<pinheadmz_> jnewbery: using the "F" word

19:08

<sipa> but i owe an explanation, since our justification was based on incorrect assumptions

19:09

<jnewbery> pinheadmz_: i didn't say "final final final final"

19:09

<pinheadmz_> Ha

19:09

<felixweis> now i learned what a quadratic residue is for no reason :(

19:09

<felixweis> jk

19:09

<michaelfolkson> How does one conclude it is faster and then realize it isn't? Yeah looking forward to the read too

19:09

<jnewbery> ok, next question: What other derived classes use the BaseSignatureChecker as their base class?

19:09

<nehan> michaelfolkson: +1

19:10

<jnewbery> (apart from GenericTransactionSignatureChecker)

19:10

<sipa> michaelfolkson: i'm happy to elaborate, but i think probably not during this meeting

19:10

<michaelfolkson> Yeah no problem, thanks

19:10

<fjahr> I found DummySignatureChecker and SignatureExtractorChecker

19:10

<willcl_ark> SignatureExtractorChecker

19:11

<jnewbery> sipa: thanks. Let's focus on Bitcoin Core's use of the signature verification in this meeting, and maybe circle back to the cryptography at the end if we have time

19:11

<sipa> jnewbery: sgtm

19:12

<jnewbery> fjahr willcl_ark: yep. Did you look into what those are doing?

19:12

<graveyard> evalscript?

19:13

<jnewbery> I also found one other class that inherits from BaseSignatureChecker

19:13

<fjahr> Dummy is what its name says and just accepts all signatures

19:13

<michaelfolkson> TransactionSignatureChecker, MutableTransactionSignatureChecker, CachingTransactionSignatureChecker too.... were they derived from Base? (just checking)

19:14

<jnewbery> michaelfolkson: yes, see this week's notes from "The XOnlyPubKey object is used within a hierarchy of signature checker classes..."

19:15

<jnewbery> The other derived class I found is called FuzzedSignatureChecker

19:15

<jnewbery> (used in fuzz testing)

19:16

<thomasb06> what number is 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F ?

19:16

<pinheadmz_> thomasb06: is that the curve order?

19:16

<jnewbery> ok, we have this hierarchy of classes for signature checking. Let's look into how it's actually doing that verification.

19:17

<thomasb06> pinheadmz_: yes

19:17

<jnewbery> what does this code do: https://github.com/bitcoin-core-review-club/bitcoin/blob/125318b6/src/script/interpreter.cpp#L1558-L1564 ?

19:17

<pinheadmz_> Checks the sighash

19:17

<thomasb06> pinheadmz_: han... Thanks

19:17

<michaelfolkson> Checks if the signature is 64 bytes or not

19:18

<felixweis> if it's 65 bytes but still SIGHASH_DEFAULT it's illegal bc it's implicit

19:18

<fjahr> 65 bytes includes a different sighash type (not SIGHASH_DEFAULT) in the last byte. It gets stripped and the 64 bytes are used to check sig. If its 64 bytes the default is used automatically. 65 bytes with SIGHASH_DEFAULT is not allowed to protect against malleability.

19:18

<gzhao408> u can't have a 65B sig if it's hashtype == SIGHASH_DEFAULT?

19:18

<michaelfolkson> And then if it isn't checks that it isn't SIGHASH_DEFAULT as that is only defined for Taproot

19:18

<felixweis> to mitigate malleability

19:19

<pinheadmz_> gzhao408: you can

19:19

<pinheadmz_> gzhao408: but if you leave off sighash byte it is implicitly sighash all

19:19

<gzhao408> o

19:19

<pinheadmz_> You can optionally have 0x01 as well for explicit ALL

19:19

<pinheadmz_> I think

19:20

<sipa> indeed

19:20

<pinheadmz_> Or perhaps for backwards compatibility ? sipa?

19:20

<pinheadmz_> Why allow both options

19:20

<willcl_ark> we covered this in a previous meeting but I don't quite recall now

19:20

<sipa> some things may rely on a predictable signature size, and account for 65 byte anyway

19:21

<jnewbery> willcl_ark: indeed, we talked about this a couple of weeks ago

19:21

<jnewbery> https://bitcoincore.reviews/17977#l-86

19:22

<willcl_ark> <sipa> pinheadmz: you can do either; have a 64-byte one with implicit sighash 0, or explicitly make a 65-byte sig with hashtype 1; their semantics are the same... but the signature will still differ because the hash commits to the actual hashtype value

19:22

<pinheadmz_> Lol is that from the last meeting?

19:22

<jnewbery> if there's no sighash byte (and the sig is therefore 64 bytes), then the sighash is SIGHASH_DEFAULT, which hashes the same transaction data as SIGHASH_ALL

19:22

<michaelfolkson> 2 weeks ago

19:23

<pinheadmz_> But the question of design... sounds like for interop or backwards compat

19:23

<jnewbery> so gzhao408 is right to say you can't have a 65B if it's hashtype == SIGHASH_DEFAULT

19:24

<jnewbery> you can have SIGHASH_ALL, which is the same transaction data (but the hash also includes the sighash to avoid malleability between SIGHASH_DEFAULT and SIGHASH_ALL)
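
The length rule discussed above can be sketched roughly as follows. This is a Python paraphrase of the behaviour described in the log, not the interpreter.cpp code itself; the function name is made up for this example, and it does not check whether the hash type is otherwise valid, which is assumed to happen elsewhere.

```python
SIGHASH_DEFAULT = 0x00
SIGHASH_ALL = 0x01


def split_schnorr_sig(sig: bytes):
    """Return (64-byte signature, hashtype), or None if the encoding is rejected."""
    hashtype = SIGHASH_DEFAULT  # implied when no sighash byte is present
    if len(sig) == 65:
        hashtype = sig[-1]
        sig = sig[:-1]
        # An explicit 0x00 byte is rejected: the 64-byte form already means
        # SIGHASH_DEFAULT, so allowing both would give two encodings.
        if hashtype == SIGHASH_DEFAULT:
            return None
    if len(sig) != 64:
        return None
    return sig, hashtype


implicit = split_schnorr_sig(bytes(64))
explicit_all = split_schnorr_sig(bytes(64) + bytes([SIGHASH_ALL]))
explicit_default = split_schnorr_sig(bytes(64) + bytes([SIGHASH_DEFAULT]))
```

Here implicit carries hash type 0x00, explicit_all carries 0x01, and explicit_default is rejected.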

19:24

<thomasb06> p = 2^256 - 2^32 - 2^9 - 2^8 - 2^7 - 2^6 - 2^4 - 1 = 115792089237316195423570985008687907853269984665640564039457584007908834671663

19:24

<jnewbery> If any of this is confusing or unclear, then I recommend rereading the notes and logs from two weeks ago

19:25

<jnewbery> Next question: Why do we have a signature cache? How does it help performance?

19:26

<fjahr> tx sigs can be evaluated multiple times by a node, for example at entry of mempool and when included in a block. with the cache it doesn't have to, which saves computation resources.

19:26

<pinheadmz_> I had a few Qs a about this

19:26

<gzhao408> bc signature verification takes SOOOO LONG and u usually have to do it at least twice, atmp and in block

19:26

<michaelfolkson> Performance optimization and DoS protection

19:26

<pinheadmz_> Ok yeah I thought mempool/Block. But also don't understand the purpose of the salt

19:27

<gzhao408> le salt is DoS protection

19:27

<sipa> more defense in depth

19:28

<sipa> if there somehow is a collision in the cache's hashing, it would be on an isolated machine, rather than consistently on every node in the network

19:29

<willcl_ark> that's pretty smart!

19:29

<sipa> note that there is also a full validation cache in validation.cpp

19:30

<sipa> since the introduction of that, the sigcache's role is primarily dos resistance

19:31

<jnewbery> where 'full validation cache' means caching the result of validating a single script with specified flags

19:31

<sipa> exactly

19:31

<jnewbery> I wrote a bit about that a few years ago if you want to read a bit more: https://johnnewbery.com/post/whats-new-in-bitcoin-core-v0.15-pt5/

19:31

<sipa> rather than individual signature validations

19:32

<michaelfolkson> Apparently you had a discussion on this in 2012 with Jeff Garzik sipa ;) https://bitslog.com/2013/01/23/fixed-bitcoin-vulnerability-explanation-why-the-signature-cache-is-a-dos-protection/

19:33

<jnewbery> the script cache is here: https://github.com/bitcoin/bitcoin/blob/bd00d3b1f2036893419d1e8c514a8af2c4e4b1fb/src/validation.cpp#L1472

19:33

<jnewbery> pinheadmz_: did you have any other questions about signature cache?

19:33

<michaelfolkson> That jnewbery blog post is very good, I remember that

19:33

<pinheadmz__> Just about why salt and not key by the entire sig

19:34

<sipa> pinheadmz__: it is

19:34

<sipa> Entries are SHA256(nonce || signature hash || public key || signature)

19:34

<pinheadmz__> And that won't be collision resistant without nonce?

19:34

<sipa> sure it would be

19:34

<sipa> there is no technical reason why the nonce is needed

19:34

<sipa> but it also comes at 0 cost

19:35

<pinheadmz__> Ah

19:35

<sipa> (the midstate of "SHA256(nonce ||" is precomputed)

19:35

<pinheadmz__> Ok that is a cool design feature

19:36

<gzhao408> jnewbery: when you check against the script cache, do you go by exact verify flags or any superset is fine?

19:36

<jnewbery> gotta love them midstate precomputations

19:37

<jnewbery> gzhao408: I think it's by the exact verify flags, but I'm not 100% sure

19:38

<jnewbery> yes, the exact flags: https://github.com/bitcoin/bitcoin/blob/13c4635a3ecfbc6759301fb3c94bd5293c49388c/src/validation.cpp#L1518-L1525

19:38

<jnewbery> ok, final question. Why is the change here necessary: https://github.com/bitcoin-core-review-club/bitcoin/commit/125318b68a#diff-8a3cc5f1d2678a348e95e4884d1827f1R38-R45

19:40

<fjahr> We want nonce of ecdsa and schnorr to be different, to prevent collisions as well I assume? This also has zero cost I guess...

19:42

<sipa> exactly

19:42

<jnewbery> fjahr: exactly, we don't want ecdsa and schnorr signatures in the cache to collide (slight correction: s/nonce/key)
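
The salted, domain-separated cache keys discussed above can be sketched as follows. This is an illustration in Python rather than the sigcache code: the class and method names are invented for this example, and the 32-byte padding values used to separate ECDSA from Schnorr entries are an assumption, chosen so that nonce plus padding fills one 64-byte SHA256 block.

```python
import hashlib
import os


class SigCacheKeys:
    """Builds cache keys of the form SHA256(nonce || padding || sighash || pubkey || sig)."""

    def __init__(self):
        nonce = os.urandom(32)  # per-node salt, different on every machine
        # Hash the common prefix once; later calls copy this precomputed state.
        # The padding bytes below are illustrative, not taken from the PR.
        self._ecdsa = hashlib.sha256(nonce + b"E" + bytes(31))
        self._schnorr = hashlib.sha256(nonce + b"S" + bytes(31))

    @staticmethod
    def _finish(prefix, sighash: bytes, pubkey: bytes, sig: bytes) -> bytes:
        h = prefix.copy()  # reuse the prefix state instead of rehashing it
        h.update(sighash + pubkey + sig)
        return h.digest()

    def ecdsa_key(self, sighash, pubkey, sig) -> bytes:
        return self._finish(self._ecdsa, sighash, pubkey, sig)

    def schnorr_key(self, sighash, pubkey, sig) -> bytes:
        return self._finish(self._schnorr, sighash, pubkey, sig)


keys = SigCacheKeys()
data = (bytes(32), bytes(32), bytes(64))  # sample sighash, pubkey, signature
ecdsa_entry = keys.ecdsa_key(*data)
schnorr_entry = keys.schnorr_key(*data)
```

With separate prefixes, identical input bytes map to different entries for the two signature types, and a second node with its own nonce would compute different entries again.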

19:43

<sipa> there *should* be no need for that either, as the sighashes should never collide

19:43

<jnewbery> ok, those were the only questions I had prepared for this. What did you all think of that commit? Pretty easy to follow? Any questions?

19:44

<michaelfolkson> Sorry I don't quite get this last part. A 64 byte signature collision?

19:45

<jnewbery> michaelfolkson: take a look at the signature cache. The structure is a cuckoocache, which has a key and a value

19:46

<michaelfolkson> Ok

19:46

<jnewbery> the key we use is a hash of various things, and we don't want those keys to collide

19:46

<sipa> it's really a set

19:47

<jnewbery> sipa: right, thanks

19:47

<michaelfolkson> Any questions evanlinjin? You still there?

19:47

<evanlinjin> I'm still here!

19:48

<evanlinjin> No questions so far. Learning a lot

19:48

<evanlinjin> I should probably look into the PR beforehand for next time though

19:48

<michaelfolkson> Otherwise can we ask sipa about the u turn?

19:49

<gzhao408> i have question about script cache, not related to this though

19:49

<jnewbery> sure, if sipa's around and wants to talk about quadratic residues

19:49

<jnewbery> gzhao408: go ahead!

19:50

<sipa> actually let me just share the mail i'm planning to send: https://0bin.net/paste/TYkaOevuGeiiMdhv#7qExtowsogSYGHGUOSFQ+gVA5Nl9Ss78lpNx-LQPyBc

19:50

<sipa> if anyone feels like reading and pointing things out that aren't clear or so, go ahead

19:51

<michaelfolkson> evanlinjin: Cool feel free to reach out or post on this channel during the week if you have general questions

19:52

<evanlinjin> michaelfolkson: Thank you!

19:53

<thomasb06> for real numbers, the equivalent of quadratic residue is "is the number positive, or negative". If the number is positive, it has a square root. If it's negative, it hasn't. In F_p, half the nonzero numbers have square roots and half do not. The Legendre symbol says whether a number does or not.

19:53

<felixweis> how does ed25519 get away with 32 bytes pubkeys?

19:53

<gzhao408> jnewbery: well, the reason i asked if the script cache uses exact verify flags is according to https://github.com/bitcoin/bitcoin/blob/e16718a8b3db8bf9c9715f28f4dc6080bf609776/src/script/interpreter.h#L31 they are meant to be soft forks, i.e. if a script passes for a set of flags A, then it would definitely pass for any subset of A right? hopefully i don't have that backwards. and usually if we were to verify

19:53

<gzhao408> script in atmp, we use more flags (e.g. standardness) than when we are validating a block. but what i mean to say is... i feel like it doesn't need to be the exact verify flags?

19:53

<michaelfolkson> "The benchmark was repeatedly testing the same constant input, which apparently was around 2.5x faster than the average speed. It is a variable-time algorithm, so a good variation of inputs matters."

19:54

<jnewbery> sipa: lots to digest there. My very rudimentary understanding was that to lift an x co-ordinate onto a point, the last step was squaring, so you'd always end with a quadratic residue. Is that relevant here?

19:54

<sipa> jnewbery: that's correct, but not relevant i think

19:54

<sipa> it matters for batch validation where the R.x coordinate is lifted explicitly to a point

19:55

<sipa> but batch validation is not impacted by this (as it needs a sqrt anyway)

19:55

<sipa> individual validation does not need a sqrt, it recomputes R, and then verifies it against the signature

19:55

<jnewbery> sipa: ok, I'll go away and read the post. Thanks for writing it up!

19:55

<michaelfolkson> For what it is worth I don't think an improvement like this shouldn't be made because it is late in the day. It doesn't have knock on impacts right?

19:56

<sipa> i need to run now, but i'll read messages later here if anyone has comments

19:57

<jnewbery> gzhao408: the flags used for validating are hashed into the cache entry, so it'd be quite a big redesign to make the change you're suggesting

19:58

<jnewbery> if you're interested in this subject, take a look at where we call CheckInputScripts() multiple times in ATMP to explicitly fill the cache

19:58

<jnewbery> I also need to run now. Thanks everyone. Great discussion today!

19:59

<thomasb06> thanks

19:59

<fjahr> jnewbery: Thanks for hosting!

19:59

<troygiorshev> yeah thanks jnewbery!

19:59

<michaelfolkson> Thanks jnewbery

20:00

<evanlinjin> Thank you jnewbery!