Implement BIP 340-342 validation - Implement Taproot signature hashing (
consensus) Jul 29, 2020
The PR branch HEAD was ad51604 at the time of this review club meeting.
This is the third Review Club meeting on the (work in progress) implementation
of BIP 340-342. We’ve previously looked at:
Those preliminary changes have been merged into the master branch in separate PRs.
This week, we’ll look at one commit from PR 17977 -
. Implement Taproot
Participants are encouraged to review the notes from the session on
PR18401 in particular, since we’ll be covering a lot of the same
topics this week. Notes
transaction digest that is used in signature creation and verification
is calculated from different parts of the transaction. There are extensive
notes covering this topic from our session on
BIP 341 contains
the specification of the new signature validation
Taproot BIP Review series
covered this topic in week
from that review session has more details.
This week, we’ll review the implementation of the Taproot transaction hashing
algorithm. We won’t look at full signature validation. It may be useful to
check out just up to the commit
Implement Taproot signature hashing (BIP 341).
Specifically, we’ll be looking at the
SignatureHashSchnorr() function in
It’s probably most instructive to compare the new
function to the
SignatureHash() function immediately below, which
calculates the signature hash for segwit v0 and legacy transactions.
SignatureHashSchnorr() is a templated function. Why? What types
the template function instantiated with?
Hint: look at how the existing
SignatureHash() function is called.
SignatureHashSchnorr() is passed a
What data is stored in this structure? Why?
SignatureHashSchnorr() is passed a
hash_type argument. How many valid
hash types are there? The
hash_type parameter is split into
// Hash type
const uint8_t output_type = ...
const uint8_t input_type = ...
How many valid values are there for
SignatureHash(), the new function creates a local
CHashWriter is a stream-like object. Other objects can be
serialized into it (using the
<< operator), and at the end,
CHashWriter::GetHash() is called to retrieve the digest.
One difference from
SignatureHash() is that the
HasherTapSighash. How is
that object constructed, and
what’s the difference from a regular
hash_type does not have the
ANYONECANPAY flag, certain parts of
the transaction are added to the
CHashWriter. What are those elements, and
how is that different from
spend_type byte is added the
CHashWriter. What are the component parts
spend_type, and what do they indicate?
Hint: refer back to BIP 341 and
SIGHASH_SINGLE, there’s a check here:
if (in_pos >= tx_to.vout.size()) return false;
What is that testing, and why?
1 19:00 <jnewbery> #startmeeting
2 19:00 <jnewbery> hi folks!
7 19:00 <jnewbery> welcome to bitcoin core PR review club :)
9 19:00 <michaelfolkson> hi
11 19:01 <jnewbery> today we're going to be looking at a commit from the taproot implementation PR
13 19:01 <troygiorshev> hi
15 19:01 <jnewbery> normal reminder: there are no stupid questions. We're all here to learn, so please speak up if something isn't clear to you. You'll be helping others too!
19 19:02 <jonatack> hi (afk -> sleep, will read tomorrow am)
20 19:02 <jnewbery> other reminder: you can ask questions at any time. You don't have to ask to ask. I'll guide the discussion with some prepared questions, but feel free to jump in at any point with your question
21 19:02 <jnewbery> ok, let's get started
22 19:02 <jnewbery> who had a chance to review the commit?
24 19:03 <jnewbery> (not necessarily the full PR, just the changes to signature hashing)
27 19:03 <michaelfolkson> y
29 19:03 <troygiorshev> y/n
32 19:03 <jnewbery> great!
35 19:03 <jnewbery> First question: SignatureHashSchnorr() is a templated function. Why? What types T is the template function instantiated with? Hint: look at how the existing SignatureHash() function is called.
36 19:04 <troygiorshev> CTransaction and CMutableTransaction, I think that's it
37 19:04 <pinheadmz> is it because the function can accept both mutabelTX and TX ?
38 19:04 <willcl_ark> so you don't have to copy MutableTransaction into a Transaction each time?
39 19:04 <fjahr> Different transaction types for the spending transaction. Could be CTransaction or CMutableTransaction afaict
40 19:04 <jnewbery> troygiorshev pinheadmz willcl_ark fjahr: correct! We can call it with either a CTransaction or CMutableTransaction, just like SignatureHash()
41 19:05 <jnewbery> any questions about that? Is everyone happy with templates?
42 19:06 <jnewbery> I'll move on, but feel free to come back and ask about a previous question if it doesn't make sense to you
44 19:06 <gzhao408> in what situations are we working with a CMutableTransaction? our own txns?
45 19:06 <jnewbery> SignatureHashSchnorr() is passed a PrecomputedTransactionData* argument. What data is stored in this structure? Why?
46 19:06 <pinheadmz> gzhao408 yeah, when the wallet (for ex) is constructing a tx
47 19:06 <jnewbery> great question gzhao408! Anyone know?
48 19:06 <willcl_ark> Depends on which BIP143 (Segwit v0) or BIP341 (taproot) features the tx uses.
49 19:06 <pinheadmz> you can add inputs and outputs and signatures to an MTX
50 19:06 <pinheadmz> but a TX is what comes out of a block for verification
51 19:07 <jnewbery> pinheadmz: how about transactions that aren't in a block?
52 19:07 <pinheadmz> mempool tx are also not mutable
53 19:07 <sipa> (for some historic context: the distinction between tx and mtx arose from wanting to cache the txid and later wtxid inside a tx)
54 19:07 <jnewbery> pinheadmz: yep
55 19:08 <jnewbery> thanks sipa!
56 19:08 <sipa> as a tx is immutable, it can be cached without ever needing to worry about inconsistency or locking or whatever
57 19:08 <sipa> it may also enable having different container structures for tx that are more efficient (e.g. no separate allocation for all the inputs, outputs, scripts, ...)
58 19:08 <michaelfolkson> The state switches from mutable to immutable on announcement? Should check code...
59 19:09 <sipa> michaelfolkson: there is no state; it's an entirely different class
60 19:09 <pinheadmz> michaelfolkson i thought the walletconverts to tx before broadcast
61 19:09 <jnewbery> willcl_ark: yes, what we cache depends on the transaction type. Can you expand?
62 19:09 <sipa> and yes, it's converted from mtx to tx after signing
63 19:09 <willcl_ark> BIP143 tx: hashPrevouts, hashSequence, hashOutputs
64 19:10 <willcl_ark> BIP341 tx: as BIP143 + spentAmounts + spentScripts
66 19:10 <gzhao408> thanks this is really helpful :)
67 19:11 <pinheadmz> willcl_ark and some are double-sha256 and some are single-sha256!
68 19:11 <gzhao408> why do we double-sha256 things?
69 19:11 <sipa> gzhao408: because that's what BIP143 prescribes :)
70 19:12 <jnewbery> willcl_ark: I think those are actually all cached for all witness transactions. It's just that m_spent_amounts_hash and m_spent_scripts_hash are only used if the tx is taproot
71 19:12 <sipa> if your question is why did BIP143's authors chose that option: it was aiming to change as little as possible compared to pre-segwit sighashing, iirc
72 19:12 <jnewbery> gzhao408: it's what satoshi would have wanted
73 19:12 <willcl_ark> jnewbery: ah ok :) that makes sense tbh
74 19:12 <sipa> jnewbery: the PR was just updated, it now computes only the necessaey things
75 19:13 <jnewbery> sipa: ah! I haven't looked at this week's changes
76 19:13 <troygiorshev> gzhao408: apparently maybe protection against length extention attacks (but don't take my word on that)
77 19:13 <willcl_ark> Ah i checked out the PR from bitcoin/bitcoin
78 19:13 <sipa> troygiorshev: length extension attacks are not relevant in this context
79 19:14 <troygiorshev> sipa: ok thx
80 19:14 <sipa> (they apply when you're using a hash as a MAC, which implies there is secret data)
81 19:14 <jnewbery> does anyone have questions about PrecomputedTransactionData or should we move on?
82 19:14 <sipa> however, that may be the (ill advised) reason why satoshi chose for double hashing
83 19:15 <jnewbery> oh wow. PrecomputedTransactionData::Init() has changed quite a lot
84 19:16 <jnewbery> this is what it looked like previously: ttps://github.com/bitcoin-core-review-club/bitcoin/commit/41d08f5d77f52bec0e31bb081d85fff2d67d0467#diff-be2905e2f5218ecdbe4e55637dac75f3R1312-R1331
86 19:16 <jnewbery> Next question. SignatureHashSchnorr() is passed a hash_type argument. How many valid hash types are there?
87 19:17 <michaelfolkson> 4
88 19:17 <jnewbery> any advances on 4?
90 19:17 <willcl_ark> and two masks
91 19:17 <pinheadmz> 7 total i think?
92 19:17 <willcl_ark> or is it 6 and one mask
93 19:17 <pinheadmz> well 6 + default 0x00
94 19:17 <willcl_ark> hmmm
95 19:17 <sipa> how many total combinations are there?
96 19:17 <willcl_ark> final answer: 7
98 19:17 <michaelfolkson> Are we including default and masks in that?
99 19:18 <jnewbery> that's right!
100 19:18 <jnewbery> How many valid values are there for output_type and input_type?
102 19:18 <willcl_ark> 2 output and 1 input?
103 19:19 <michaelfolkson> I don't know what the masks are doing. Can someone explain? :)
105 19:19 <willcl_ark> they're for a bitwise AND
106 19:19 <gzhao408> it's leftmost bit and 2 rightmost bits
107 19:19 <sipa> they extract information
108 19:19 <jnewbery> michaelfolkson: all will become clear shortly
109 19:19 <michaelfolkson> Ok I'll be patient
110 19:19 <jnewbery> willcl_ark: it can't be 1 for input, otherwise we wouldn't need a variable to store it :)
111 19:20 <jnewbery> any advances on 2 and 1?
112 19:20 <willcl_ark> jnewbery: oh, I see what I've done
113 19:21 <theStack> next try: 3 values for output, 2 values for input
114 19:21 <jnewbery> ok, the sighash type dictates what parts of the transaction we hash to go into the signature
115 19:21 <jnewbery> theStack: yes!
116 19:21 <willcl_ark> so input has 8? and output 7?
117 19:21 <willcl_ark> oh, no ok
118 19:22 <jnewbery> part of the sighash type indicates which parts of the transaction's outputs go into the signature hash
119 19:22 <jnewbery> that can take the values SIGHASH_ALL, SIGHASH_SINGLE and SIGHASH_NONE (3 different options)
120 19:22 <gzhao408> i read it as output = SIGHASH_ALL, SIGHHASH_SINGLE, or SIGHASH_NONE and input = SIGHASH_ANYONECANPAY or 0?
121 19:23 <jnewbery> the other part of the sighash indicates which parts of the transaction's inputs go into the signature hash
122 19:23 <sipa> gzhao408: correct
123 19:23 <jnewbery> that can take the values SIGHASH_ANYONECANPAY or not (2 different options)
124 19:24 <troygiorshev> isn't the check on line 1501 unneccesary?
125 19:24 <jnewbery> those can be combined in any way so we get 3*2 = 6 different possibilities
126 19:24 <jnewbery> so where does the 7th come from?
127 19:24 <pinheadmz> huh this is a much simper way to think about sighashing
128 19:24 <pinheadmz> i been trying to kinda memorize each type individually
129 19:25 <pinheadmz> jnewbery that 7th is the default - sighash all
130 19:25 <pinheadmz> if there is no sighash byte at the end of the sig
131 19:25 <willcl_ark> ok I see now
132 19:25 <sipa> pinheadmz: but 1 is sighash_all already
133 19:25 <sipa> how does 0 differ?
134 19:25 <pinheadmz> you can not actually add 0x00 literally, it woudl be invalid
136 19:25 <pinheadmz> its the default value assigned internaly if not sighash byte is present
137 19:26 <pinheadmz> and in fact 0x00 is invalid if it is present
138 19:26 <sipa> 0 is the implicit sighash type when the byte is missing
139 19:26 <sipa> so why is it an implicit 0 and not an implicit 1?
140 19:26 <pinheadmz> "because thats what is says in bip341" :-)
141 19:26 <willcl_ark> heh
142 19:26 <fjahr> jnewbery: you mean have_annex?
143 19:27 <sipa> pinheadmz: technically correct is the only kind of correct
145 19:27 <troygiorshev> sipa: because the mask is on the last two bits?
146 19:27 <jnewbery> fjahr: I don't understand what you mean
147 19:27 <troygiorshev> (no that's not it)
148 19:27 <pinheadmz> sipa i actually don tknow - thought sighash all was only, no byte added -> implict 0x00
149 19:27 <sipa> troygiorshev: both 0 and 1 result in the same value after masking (ALL for outputs, 0 for inputs)
150 19:28 <sipa> fjahr: that's something unrelated
151 19:28 <sipa> it's hashed as well, but elsewhere
152 19:28 <jnewbery> fjahr: oh, you mean the annex is the 7th option? No, that's unrelated here. We'll get to it in a bit
154 19:29 <sipa> so 0 is different from 1, because we don't want people to be able to malleate a 64-byte sig into a 65-byte one
155 19:29 <sipa> if it was just an implicit 1, you'd be able to add or remove it without affecting the validity of the signatire
156 19:30 <theStack> ah, that makes sense
157 19:30 <jnewbery> great point sipa. Thanks!
158 19:30 <jnewbery> ok, next question
159 19:30 <jnewbery> Just like SignatureHash(), the new function creates a local CHashWriter object. CHashWriter is a stream-like object. Other objects can be serialized into it (using the << operator), and at the end, CHashWriter::GetHash() is called to retrieve the digest.
160 19:30 <jnewbery> One difference from SignatureHash() is that the CHashWriter is copy-constructed from HasherTapSighash. How is that object constructed, and what’s the difference from a regular CHashWriter object?
161 19:30 <pinheadmz> but adding 0x01 at the end of the sig is not valid right? if you want sighash all you just leave it as a 64 byte sig with no explciit type?
162 19:31 <jnewbery> Oh, I see that it's now called HASHER_TAPSIGHASH in the latest push
163 19:32 <willcl_ark> sipa: Can't you malleate it in the opposite direction then? from 65 bytes to 64 bytes
164 19:32 <sipa> pinheadmz: you can do either; have a 64-byte one with implicit sighash 0, or explicitly make a 65-byte sig with hashtype 1; their semantics are the same... but the signature will still differ because the hash commits to the actual hashtype value
165 19:32 <sipa> willcl_ark: no
166 19:32 <troygiorshev> jnewbery: it starts with the double tag
167 19:33 <jnewbery> troygiorshev: exactly
168 19:33 ℹ me is now known as Guest86102
169 19:33 <jnewbery> ok, follow-up question: what are tagged hashes and why do we use them in taproot
170 19:34 <pinheadmz> jnewbery i think its to prevent hash collisions with hashing the same data for other purposes
171 19:34 <troygiorshev> which, if I'm reading 340 right, helps prevent a collision across different uses of the same hash function
172 19:34 <jnewbery> pinheadmz troygiorshev: precisely
174 19:35 <troygiorshev> gzhao408: a new meaning of "double SHA256" :D
175 19:35 <sipa> troygiorshev: so why HASHER_TAPSIGHASH and not just start by writing the double tag into the stream?
176 19:35 <pinheadmz> sipa to cache the prefix ?
177 19:36 <fjahr> It's 64 bytes which is the size of a block in sha256
178 19:36 <felixweis> midstate caching
179 19:36 <sipa> yup, exactly
180 19:36 <sipa> it's copying the midstate after hashing the tag, so it avoids redoing that for every sighash
181 19:37 <troygiorshev> and it's static const so we get some sort of speedup copy constructing it?
182 19:37 <jnewbery> pinheadmz fjahr felixweiss: great stuff. If anyone is confused by the words 'midstate' or 'block' in this context, look up the SHA256 or merkle-damgard hash construction wikipedia pages
183 19:38 <jnewbery> troygiorshev: I'm not sure if the static or const make a difference there. Can you explain?
184 19:39 <jnewbery> next question: If the hash_type does not have the ANYONECANPAY flag, certain parts of the transaction are added to the CHashWriter. What are those elements, and how is that different from SignatureHash()?
185 19:40 <willcl_ark> It adds prevouts_hash, spend_amounts_hash, spent_scripts_hash and sequences_hash to the cache if its not ANYONECANPAY
186 19:40 <pinheadmz> a hash of all the inputs' prevouts
187 19:41 <willcl_ark> but SignatureHash does `SHA256Uint256(GetPrevoutHash(txTo)`
189 19:43 <pinheadmz> so weitness v0 sighashing only commits to the value of the input being signed?
190 19:43 <troygiorshev> jnewbery: hmm I may be wrong, need to explore it a bit further
191 19:43 <pinheadmz> but v1 sighash we commit to value of ALL inputs? (unless ANYONECANPAY)
192 19:43 <pinheadmz> this is related to the recent hardware wallet fee-attack that was published i think
193 19:44 <felixweis> thats what cache->m_spent_amounts_hash is about
194 19:44 <jnewbery> pinheadmz: exactly. Can you explain a bit more about the fee-attack?
195 19:45 <pinheadmz> ok sure, so currently you can trick a hardware wallet into signing two separate transactions each with two inputs (say a big value and a small value input) then, create a new TX with just the two high value inputs, making the input to the tx unexpectedly large. the output value doesnt change, so the wallet loses a big fee
196 19:46 <sipa> troygiorshev: static just makes the variable inaccessible outside the compilation unit, and const prevents code from accidentally modifying the cached value; neither changes performance, they just prevent things we don't want to happen
198 19:46 <pinheadmz> although now that im trying to explain, it seems like sighash single must be used for the attack so the bip341 updated couldnt prevent that ?
199 19:46 <sipa> pinheadmz: it doesn't need sighash_single
200 19:47 <pinheadmz> ok (grokking)
201 19:47 <sipa> and yes, signing all inputs prevents it (in sighash_all)
202 19:47 <sipa> *signing all input amounts
204 19:48 <pinheadmz> oh the trick is lying about the value of an input being spent
205 19:48 <pinheadmz> which would normally make a tx invalid on the network
206 19:48 <jnewbery> it's been known about for some time, and adding a new segwit version allows us to resolve that class of issues by introducing a new sighash algorithm
207 19:48 <pinheadmz> but the attecker throws that input away for the final tx
208 19:48 <jnewbery> Next question: A spend_type byte is added the CHashWriter. What are the component parts of spend_type, and what do they indicate? Hint: refer back to BIP 341 and BIP 342.
209 19:49 <jnewbery> fjahr: you might be able to answer this one :)
210 19:49 <willcl_ark> Whether it's taproot or tapscript and with annex or not
211 19:49 <michaelfolkson> ext_flag and annex_present
212 19:49 <pinheadmz> script path vs key path
213 19:49 <pinheadmz> oh and a flag for has_annex
214 19:49 <jnewbery> willcl_ark michaelfolkson pinheadmz: exactly correct
216 19:50 <jnewbery> "spend_type (1): equal to (ext_flag * 2) + annex_present, where annex_present is 0 if no annex is present, or 1 otherwise (the original witness stack has two or more witness elements, and the first byte of the last element is 0x50)"
217 19:50 <jnewbery> Last question:
218 19:50 <jnewbery> If the hash_type indicates SIGHASH_SINGLE, there’s a check here:
219 19:50 <jnewbery> if (in_pos >= tx_to.vout.size()) return false;
220 19:51 <jnewbery> What is that testing, and why?
221 19:51 <pinheadmz> heh, the sighash single bug :-)
222 19:51 <jnewbery> pinheadmz: go on...
223 19:51 <felixweis> SIGHASH_SINGLE is a mode where for every input there is 1 output
224 19:51 <pinheadmz> if there is an input signed with sighash single but no corresponding output...
225 19:52 <pinheadmz> satoshis code returns an error code of 1
226 19:52 <pinheadmz> which gets interpreted as 32 byte hash value!
227 19:52 <pinheadmz> which means you can compute a signature and store in the output of the tx being spent
228 19:53 <pinheadmz> and this happened on mainnet and ive heard it refered to as "the coolese transaction ever"
229 19:53 <pinheadmz> and AFAIK, the only legit use for OP_CODESEPARATOR ?
230 19:53 <willcl_ark> good knowledge !
232 19:54 <jnewbery> hmmm I'm not sure about the part about OP_CODESEPARATOR. I know that people have tried to use that in protocols like tumblebit to allow different signing paths
233 19:54 <pinheadmz> ah ok, i was hoping to get correted on that
234 19:54 <sipa> pinheadmz: i vaguely recall that, but i don't remember the details
235 19:55 <jnewbery> There are some mailing list posts by roconnor about OP_CODESEPARATOR in the last few months if you're interested
236 19:55 <jeremyrubin> wait the tx's signature can be comitted to inside the output being spent?
237 19:55 <pinheadmz> sipa was it not possile to fix sighashsingle in witnessv0?
238 19:56 <pinheadmz> jeremyrubin yes in this case bc of sighash single bug. bc the data signed is 0x0000.....01
240 19:56 <pinheadmz> (not an actual digest of the spending tx)
241 19:56 <sipa> pinheadmz: i think we just chose not to
242 19:56 <sipa> (i don't think it's really a bug a such, just a major footgun)
243 19:56 <pinheadmz> jnewbery is excellent at finding things quickly! i knew it was a peter todd post but couldnt find it
244 19:57 <sipa> pinheadmz: if you run into it, you're already doing something wrong
245 19:57 <jnewbery> the idea behind SIGHASH_SINGLE is that only the output with the corresponding index to the input being spent is included in the hash. Obviously for that to work, there needs to be an output with the same index as the input being signed
246 19:57 <jnewbery> that's what the check is for
247 19:58 <jnewbery> ok, 3 minutes left. Any final questions?
249 19:58 <jnewbery> s/3/2/
250 19:59 <michaelfolkson> Wen moar Taproot
251 19:59 <willcl_ark> instagibbs: and a good writeup of the bug too
252 19:59 <willcl_ark> that's cleared it right up
253 20:00 <jnewbery> michaelfolkson: we'll do more taproot soon. Next week is multiprocess though, hosted by ryanofsky. You won't want to miss it!
254 20:00 <jnewbery> ok, times up. Thanks all. Great session
255 20:00 <jnewbery> #endmeeting
256 20:00 <troygiorshev> thanks jnewbery!
257 20:00 <willcl_ark> thanks jnewbery
258 20:00 <pinheadmz> good jam today! thanks everyone for being so smart and so polite! 🚀
259 20:00 <fjahr> thanks jnewbery
260 20:00 <Platapus> That was a good meeting
262 20:00 <luke-jr> (doh, off by an hour)
263 20:00 <willcl_ark> that was a fun one
264 20:00 <Platapus> Thank you
265 20:00 <jnewbery> off by one errors are the best errors
266 20:00 <felixweis> thanks everyone! jnebery for hosting
268 20:01 <michaelfolkson> Thanks jnewbery