Initialize PrecomputedTransactionData in CheckInputScripts and Move single-sig checking EvalScript code to EvalChecksig (
consensus) Apr 1, 2020
The PR branch HEAD was b409b611 at the time of this review club meeting.
This week, we’ll review two (small) PRs:
A few weeks ago,
we looked at PRs 16902 and
18002, which had both been pulled
out of PR 17977, the WIP schnorr/taproot
implementation. This week,
we’ll look at two more small refactor PRs that have also been pulled out of
The same general considerations apply to these PRs: ordinarily, these
refactor changes to consensus code wouldn’t meet the bar for review cost and
risk -vs- benefit. However, splitting them off from the main schnorr/taproot
PR makes reviewing the logical changes to consensus in the main PR easier and
PR 18401: Initialize PrecomputedTransactionData in CheckInputScripts
When a signature for a transaction input is created or verified, different
parts of the transaction are hashed into a
transaction digest. This is the
e value in
e = Hash(m) that is used in the
The transaction digest algorithm for pre-segwit transaction inputs is
documented here (in
particular, look at the
verifyThisStr string in step 9 of the
which is the data that is hashed).
For pre-segwit transaction inputs, the transaction digest has to be
recalculated from scratch for each signature. This means that in the worst
case the amount of data that needs to be hashed is quadratic in the size of
the transaction. See
Sergio Demian Lerner’s bitcoin talk
post for more details of this
quadratic hashing issue. The Bitcoin Core
and Rusty Russell’s blog also have good
To resolve the quadratic hashing problem, the way that the transaction digest
is calculated was changed for segwit inputs. See
143 for the
motivation and specification.
The segwit transaction digest is designed so that parts of the digest are
shared between all the signatures in a transaction. This means that the total
amount of data to be hashed for the transaction is linear in the size of the
This shared data between all signatures of a transaction is stored in the
PrecomputedTransactionData object, which was added in
Segwit v1 (taproot) makes some very minor changes to the transaction
digest algorithm. Specifically, the transaction digest commits to the
amounts of all inputs (instead of just the amount of the input being signed)
and to the scriptPubKey of the output being spent. See
for the full description of the transaction digest.
Because of these changes, more data needs to be stored in the
PrecomputedTransactionData object. This PR changes the way that the object
is constructed and initialized so that a
future commit in PR
can add that data to the object. PR 18422: Move single-sig checking EvalScript code to EvalChecksig
is the scripting language used for scripts in a taproot tree (taproot has
versioning for scripts, so a future soft fork could introduce a different
scripting version or language).
Tapscript has almost the same semantics as Bitcoin Script. One notable
difference is that
OP_CHECKSIGVERIFY both verify schnorr
signatures instead of ECDSA signatures.
This PR extracts most of the logic for
OP_CHECKSIGVERIFY from the very large
future commit in PR
modifies this function to switch on which flags have been passed into the
script interpreter in order to determine whether to do schnorr or ECDSA
signature verification. Questions
Did you review the PRs?
Concept ACK, approach ACK, tested ACK, or
(Don’t forget to put your PR review on GitHub.)
PrecomputedTransactionData default constructor is added which
doesn’t initialize the data members. Where do they get initialized instead?
What new data members are added to
PrecomputedTransationData in the
subsequent commit in the taproot PR 17977?
Is the old
PrecomputedTransactionData constructor still used anywhere?
Why is the transaction digest algorithm changed for taproot? What problems
does the new algorithm solve?
What is the difference between
false and setting
the return parameter
false? When would those different
failure modes be used?
Why isn’t the code for
extracted out in PR 18401?
1 13:00 <jnewbery> #startmeeting
2 13:00 <jnewbery> Hi folks! Welcome to Bitcoin Core Review club. Feel free to say hi to let everyone know you're here.
5 13:00 <josh-bushwick> hi
6 13:00 <michaelfolkson> Bitcoin Corrrr Review Club
7 13:00 <michaelfolkson> hi
11 13:00 <jnewbery> Special welcome to everyone who's at their first review club meeting.
15 13:01 <AlistairMann> hi
16 13:01 <jnewbery> Feel free to jump in at any time to ask questions. There are no stupid questions here. We're all here to learn.
19 13:02 <jnewbery> Who had a chance to review the PRs this week?
24 13:02 <michaelfolkson> y
30 13:03 <jnewbery> fantastic! The code changes are quite small and mechanical, but there's a _lot_ of context, so I thought it'd be fun to dig into the surrounding concepts a bit.
31 13:03 <jnewbery> I'll ask questions, but like I said earlier, feel free to jump in at any time with your own questions/comments
32 13:03 <jnewbery> first question: A new PrecomputedTransactionData default constructor is added which doesn’t initialize the data members. Where do they get initialized instead?
33 13:04 <jnewbery> (this is in PR 18401: Initialize PrecomputedTransactionData in CheckInputScripts)
34 13:04 <jkczyz> In it's Init method
35 13:04 <jnewbery> jkczyz: right, and where is that called?
36 13:04 <pinheadmz> in the script interpreter ?
37 13:04 <fjahr> As the PR title says, in CheckInputScripts :)
38 13:04 <theStack> it's called in the function CheckInputScripts()
39 13:04 <jnewbery> fjahr: exactly!
40 13:04 <jkczyz> CheckInputScripts
41 13:05 <michaelfolkson> That is it's Init method?
42 13:05 <jnewbery> why are we calling Init there instead of the ctor?
43 13:06 <michaelfolkson> Because the initialization needs to happen later...
44 13:06 <instagibbs> we need valid (uninitialized) caches in place for pointer validity?
45 13:06 <nehan_> to set things up for a later change which requires knowing about spent outputs
46 13:06 <jnewbery> nehan_: yes, that's right
48 13:07 <nehan_> how do we know it's ok to wait until then? Functions like PolicyScriptChecks() use the uninitialized txdata
50 13:08 <instagibbs> I answered the wrong question, but yeah
51 13:08 <jnewbery> That's unchanged by this PR (although I've expanded the comment a bit). Can you explain what's going on there?
52 13:08 <instagibbs> you asked why uninitialized then :)
53 13:09 <jnewbery> (or anyone else)
54 13:09 <pinheadmz> is it because some of the data isnt needed if its not taproot?
55 13:09 <nehan_> control might run later and relies on some of the data in txdata.
56 13:09 <instagibbs> the multithreaded part(control) needs access to the state
58 13:10 <nehan_> jnewbery: yeah your comment is clearer
59 13:10 <jnewbery> nehan_ instagibbs: exactly correct
60 13:11 <instagibbs> used to be that when control took over script validation jobs, you'd std::swap the pointers in
63 13:11 <nehan_> (this was in the original PR)
64 13:12 <jnewbery> nehan_: in the new code, they're constructed in place when the vector is created
65 13:12 <jnewbery> but the data inside them is initialized inside CheckInputScripts
66 13:12 <theStack> nit-question: i guess the definition moved up just for code readability reasons, to be closer to `control`, as the comment refers to both `control` and `txsdata`?
67 13:13 <nehan_> ok. previously the data inside them was initialized here, which was what confused me.
68 13:13 <jnewbery> theStack: yeah, it's a big red flag to me that control relies on txsdata but there's nothing in the code that enforces that
69 13:14 <jnewbery> someone might come along later and see that txsdata is only used inside the for loop and think "we don't need a vector. We can just create a new txdata in each loop iteration"
70 13:14 <jnewbery> so I moved it next to control and expanded the comment
71 13:14 <nehan_> I went down a rabbit hole trying to confirm that the fields inside the items were initialized somewhere.
72 13:15 <instagibbs> to be fair you'll notice it quite fast I think :)
73 13:15 <theStack> jnewbery: thanks for clarifying, that makes sense
74 13:15 <jnewbery> instagibbs: I guess lots of tests would start failing!
75 13:16 <jnewbery> ok, does anyone want to ask any more questions about control or CCheckQueueControl? What's happening is quite interesting
76 13:16 <jnewbery> (in terms of when the script validation is running and on which threads)
77 13:17 <jnewbery> ok, moving on (but feel free to continue asking questions on anything). What new data members are added to PrecomputedTransationData in the subsequent commit in the taproot PR 17977?
78 13:18 <fjahr> the spent outputs of the tx: m_spent_outputs
79 13:18 <theStack> there's a new member m_spent_outputs introduced
81 13:18 <jnewbery> fjahr theStack: yes. We cache the spent outputs
83 13:18 <jnewbery> hi sipa!
84 13:19 <pinheadmz> a vector of all tx outputs
85 13:19 <fjahr> because we need them for taproot sig message
86 13:19 <pinheadmz> er spent outputs
87 13:20 <fjahr> at least the amounts
88 13:20 <jnewbery> right. signatures in taproot commit to a couple of extra pieces of data. What are they?
89 13:21 <jkczyz> amount of all inputs and scriptPubKey of output being spent
90 13:21 <pinheadmz> the annex :-)
91 13:22 <jnewbery> jkczyz: right
92 13:22 <pinheadmz> and the scriptpubkey
93 13:22 <jnewbery> pinheadmz: ah yes, also the annex. Have a bonus point
94 13:22 <pinheadmz> ty, i could use a few
95 13:22 <michaelfolkson> Why do we need the single hashes in addition to the double hashes?
96 13:23 <sipa> the real question is: what things does the sighash commit to that is *not* part of the spending transaction
97 13:23 <pinheadmz> michaelfolkson: from bip341: "There is no expected security improvement by doubling SHA256 because this only protects against length-extension attacks against SHA256 which are not a concern for signature messages because there is no secret data. "
98 13:23 <jnewbery> (difficult question) what potential problems does this solve? Why are we committing to the scriptPubKey and all ammounts?
99 13:23 <pinheadmz> i think the extra commitments help offline signers
100 13:23 <michaelfolkson> Thanks pinheadmz
101 13:24 <jnewbery> pinheadmz: yes! How?
102 13:24 <fjahr> also nSequences but I guess that's less common: The signature message commits to all input nSequence if SIGHASH_NONE or SIGHASH_SINGLE are set (unless SIGHASH_ANYONECANPAY is set as well)
103 13:24 <nehan_> amounts are to eventually support a cold wallet. I still don't think it needs to commit to scriptPubKey but as sipa pointed out on stackoverflow it's to maintain legacy behavior
104 13:24 <nehan_> er, instead of maintain legacy behavior, maybe I should have said reduce changes
106 13:25 <pinheadmz> jnewbery: i thought it was posisble beofre this change to trick a hardware wallt into giving away a huge fee
107 13:25 <jonatack_> to prevent lying to offline signing devices about the output being spent?
108 13:25 <nehan_> jnewbery: yes
109 13:25 <sipa> nehan_: the current sighashes do not commit to the sPK being spent, actually
110 13:25 <sipa> scriptCode is not the same as scriptPubKey
111 13:25 <instagibbs> jonatack_, yes
112 13:25 <nehan_> sipa: true! I meant scriptCode
113 13:26 <jnewbery> nehan_: that's keeping the behaviour between pre-segwit and segwit v0 similar. In taproot, we also commit to the scriptPubKey, which is new
114 13:26 <nehan_> jnewbery: ah, thanks
115 13:26 <jnewbery> and yes, all of this is to prevent being able to lie to an offline signer about fees
116 13:27 <instagibbs> f.e. two inputs, attacker lies about value of one, gets one valid sig, attacker asks for sigs again, lying about the other input's value
118 13:27 <instagibbs> those two sigs combined is a valid tx with an unknown fee
120 13:28 <jnewbery> instagibbs: that's right
121 13:29 <sipa> nehan_: imagine you have a script that checks two signatures with the same key
122 13:29 <sipa> nehan_: could you satsify it with 2 identical signatures?
123 13:30 <sipa> so <key> OP_CHeCKSIGVERIFY <key> OP_CHECKSIG, say
124 13:30 <jnewbery> There's a (very miner) similar attack where an online host could lie to an offline signer about whether an output being spent is P2WPKH or P2WSH-P2PKH. The signer doesn't know the transaction output type and therefore the size, and a signature it provides would be valid for both.
125 13:31 <jnewbery> committing to the scriptPubKey prevents all attacks in this class
126 13:32 <sipa> nehan_: or maybe a better example: IF <key> CHECKSIG ELSE <key> CHECKSIG ENDIF
127 13:32 <r251d> In #18422 EvalChecksig communicates success by return value and by fSuccess bool ref. Are those success values only inconsistent when verifying scripts with null signatures?
128 13:33 <jnewbery> r251d: we'll get onto 18422 in just a moment
129 13:33 <nehan_> sipa: that last one requires one signature? and doesn't use CODESEPARATOR as indicated is necessary in that comment?
130 13:34 <sipa> nehan_: indeed
131 13:34 <sipa> nehan_: but imagine you're given a satisfying witness 1 <sig>
132 13:34 <sipa> can you change the 1 into a 0?
133 13:34 <theStack> so the reason that scriptPubKey of the output being spent was not included in the signature was just that those attacks were not taken into consideration? or is there also a drawback in doing so?
134 13:35 <sipa> theStack: i believe it was an oversight in P2SH that was maintained in P2WSH
135 13:36 <jnewbery> sipa: I think your point is that with OP_CODESEPARATOR, the signer can provide a signature that is only valid for a certain execution branch. Is that right?
136 13:37 <sipa> jnewbery: indeed
137 13:37 <sipa> as i think nehan_ believed the CODESEP was unnecessary for that purpose
138 13:37 <nehan_> sipa: i'm confused. to make sure we're on the same page: I think that committing to (txid, input index) implicitly commits to a scriptCode for someone who has that data (so, not an airgapped wallet, of course). therefore commiting to the actual scriptCode in the signature seems unnecessary.
139 13:38 <sipa> nehan_: ah!
140 13:38 <sipa> you are right, but we're talking about a number of different things
141 13:38 <nehan_> I think this is true EVEN IF you are using CODESEPARATOR (Russell, and you with your edit on that stackoverflow comment, imply otherwise)
142 13:38 <jnewbery> nehan_: if the sighash only committed to (txid, index), then both of those signatures in sipa's example would be signing the same digest, so the signatures would be the same
143 13:39 <sipa> first of all, committing isn't enough; we need to commit in a way that is cheap to prove to an offline signing device
144 13:39 <sipa> giving the entire previous transaction is not cheap
146 13:39 <nehan_> but outside of that
147 13:39 <jnewbery> if the signer only wanted to sign the IF branch, someone could take that signature and place it in the ELSE branch and it'd still be valid
148 13:39 <sipa> secondly, the "code path" referred to in the stack exchange answer is about code without the scriptCode
149 13:39 <sipa> say, what branch of an IF a particular checksig is executed in
150 13:40 <sipa> s/without/inside/
151 13:41 <jnewbery> I'm going to ask the next question, which is about 18422, but don't feel like you need to stop discussing CODESEPARATOR :)
152 13:41 <jnewbery> What is the difference between EvalCheckSig() returning false and setting the return parameter fSuccess to false? When would those different failure modes be used?
154 13:41 <pinheadmz> return false means the script threw an error.
155 13:42 <r251d> jnewbery: Thanks for asking that. That's the essence of my question above.
156 13:42 <pinheadmz> the fsuccess indicates the signature evalutaion
157 13:42 <theStack> i'd roughly say: if EvalCheckSig() returns false, then the input script is invalid (e.g. invalid signature format) in some way, where as fSuccess represents the result of the evaluated script
158 13:42 <jnewbery> pinheadmz: correct
159 13:42 <sipa> theStack: s/script/signature/ maybe
160 13:42 <nehan_> sipa: i think my misunderstanding was the following: i did not think that the scriptSig specified which code path in a scriptPubKey to take. I thought this was determined by execution.
161 13:43 <jnewbery> theStack: yes, except I think you meant to write signature instead of script at the end
162 13:43 <theStack> sipa: yes, thanks
163 13:44 <sipa> nehan_: you're right - it does not: signatures so far have never directly or indirectly committed to what execution path is taken in the script
164 13:44 <jnewbery> ok, so why would evaluating a signature sometimes cause a script to fail instantly, and sometimes not?
165 13:44 <jnewbery> in the case where it's not a valid signature
166 13:44 <sipa> nehan_: russell points out that OP_CODESEPARATOR can optionally be used for that purpose, because it modifies the scriptCode
167 13:44 <sipa> and the scriptCode is committed to
168 13:45 <michaelfolkson> In a multisig case providing an invalid signature shouldn't cause the script to fail?
169 13:45 <pinheadmz> jnewbery: if the sig or pubkey break rules set in bip340
170 13:45 <pinheadmz> size of pubkey and secp field size limits etc
171 13:45 <sipa> pinheadmz: not really
172 13:46 <theStack> jnewbery: with "fail instantly" you mean that false is returned, i.e. before CheckSig() is called?
173 13:46 <sipa> BIP340 treats the signature and public key as byte arrays
174 13:46 <jnewbery> theStack: I mean we fail the script instantly and the transaction is invalid
175 13:46 <sipa> so secp field limits are not really relevant; a signature is valid or invalid
176 13:46 <sipa> and there is no longer a distinction between invalid encoding for a signature vs signature just being wrong
177 13:47 <pinheadmz> did i get filed size mixed up with curve order?
180 13:47 <sipa> neither of those matter
181 13:47 <jnewbery> anyone want to guess what's going on here?
182 13:47 <sipa> a signature is valid or invalid according to BIP340
183 13:48 <jnewbery> sipa: we're still discussing pre-taproot sig validation
186 13:48 <sipa> ignore me
188 13:48 <sipa> pinheadmz: i confused you
189 13:48 <jnewbery> which in the taproot PR eventually will be called EvalChecksigPreTapscript()
190 13:48 <theStack> so pinheadmz was right? i think the names "CheckSignatureEncoding" and "CheckPubKeyEncoding" speak for themselves :)
191 13:49 <sipa> theStack: indeed
192 13:49 <sipa> he did mention bip340 though ;)
193 13:49 <pinheadmz> sipa do you mean that in taproot bc theres soft forkable pubkey versions that CheckPubKeyEncoding wont happen?
194 13:49 <sipa> pinheadmz: correct
195 13:49 <pinheadmz> yeah no i wasnt really confused i was getting ahead of the pr
196 13:50 <pinheadmz> ah ok
197 13:50 <sipa> nor is there a concept of an "incorrectly encoded signature"
198 13:50 <pinheadmz> right for the wrong reason perhaps 🏆
200 13:50 <sipa> a signature is a byte array, and it's valid or invalid
201 13:50 <sipa> this was a huge source of problems with ECDSA
202 13:50 <sipa> that ECDSA signatures are abstract objects, with some complex encoding, ...
203 13:51 <theStack> to answer the question (in pre-taproot), an invalid encoded signature or pubkey encoding could be reasons for an instant fail of the script, leading to an invalid transaction
204 13:51 <pinheadmz> so in the future EvalChecksigPreTapscript() will check pubkey and sig encoding, but EvalChecksigTapscript() doesnt
205 13:52 <sipa> pinheadmz: indeed
206 13:52 <pinheadmz> cheers
207 13:52 <sipa> nehan_: to continue..
208 13:52 <nehan_> sipa: I don't see how OP_CODESEPARATOR could be used for that purpose -- the entire scriptPubKey is committed to in the txid. to clarify: we are talking about pre-segwit Bitcoin, and not considering airgrapped wallets and access to data is not a concern.
209 13:52 <jnewbery> From a high level, I think the point is that we want a way to be able to fail a signature check, but continue execution, while avoiding malleability.
210 13:52 <sipa> nehan_: in pre-taproot, the scriptCode is committed to directly as well
211 13:53 <sipa> nehan_: and the scriptCode is modified by OP_CODESEP operators
212 13:53 <nehan_> sipa: yes. That seems redudant to me :)
213 13:53 <jnewbery> in segwit v0, we do that by allowing a zero-length signature to be an invalid sig, but not cause an instant failure
214 13:53 <jnewbery> that's what the SCRIPT_VERIFY_NULLFAIL flag is all about
215 13:53 <sipa> nehan_: russell's answer shows why it is not entirely redundant: it allows committing to the execution path
216 13:54 <jnewbery> Final question from me. Why isn’t the code for OP_CHECKMULTISIG and OP_CHECKMULTISIGVERIFY also extracted out in PR 18401?
217 13:54 <sipa> the sPK will be identical in all execution paths, but the scriptCode won't be
218 13:55 <pinheadmz> jnewbery: those are removed from tapscript
219 13:55 <sipa> (scriptCode is the script being executed, from the last executed OO_CODESEP until the end of the script; and the entire executed script if no CODESEP was ever executed)
220 13:55 <jnewbery> 5 minute warning. If you have a question but have been keeping quiet, now's the time to speak up
221 13:55 <amiti> jnewbery: I don't quite follow. whats an example of why we'd want to be able to continue execution after failing a signature check?
222 13:56 <michaelfolkson> jnewbery: So they are deprecated in BIP Taproot. The aim is to do these PRs piecemeal
223 13:56 <fjahr> they are disabled in taproot
224 13:56 <nehan_> sipa: I might need to work out an example, because I don't see how that adds a degree of flexibility. relevant OP_CODESEPARATORs are in the scripPubKey, and the scriptCode that results is deterministic from that. Certainly the signer could provide different scriptSigs that could cause different paths to execute, but that does not change the scriptPubKey and hence the scriptCode.
225 13:56 <sipa> amiti: say a 1-of-2 multisig
227 13:56 <jnewbery> amiti: good question! Any suggestions?
228 13:56 <sipa> nehan_: go back to my 0 <sig> vs 1 <sig> exame
229 13:57 <sipa> nehan_: it's a toy example, but it's a form of malleability
230 13:57 <jnewbery> sipa: in that case only one signature (plus dummy empty stack item) is provided
231 13:57 <instagibbs> amiti, if you want to do some big threshhold using script
232 13:57 <pinheadmz> amiti: jnewbery seems silly, but you could have a script be valid if the sig is false
233 13:58 <sipa> nehan_: i think i need to write this out in more detail
234 13:58 <amiti> cool. thanks :)
235 13:58 <pinheadmz> maybe especially in a checksigadd scneario ?
236 13:58 <instagibbs> in tapscript that'd be checksigadd construction, it's ok to have some failures, provided the failures don't actually make you spend the sig validation time
237 13:58 <jnewbery> pinheadmz: not silly at all. I think optimized lightning uses a similar construction.
238 13:58 <sipa> nehan_: you're right that in simple scenarios it is likely not useful
239 13:59 <michaelfolkson> Lightning uses a false sig construction jnewbery?!
240 13:59 <sipa> nehan_: but when reasoning through malleability in miniscript we had to introduce a rule "the same public key cannot occur more than once in a script", because otherwise it's near impossible to reason about - using CODESEP that could be made permitted
241 13:59 <jnewbery> a basic lightning script might be OP_IF <hashlock branch with checksig> OP_ELSE <timelock branch with checksig> OP_ENDIF
242 14:00 <sipa> there is no failing checksig there afaik
243 14:00 <sipa> just a branch that executes only one of the checksigs?
244 14:00 <jnewbery> it's maybe a byte or two more efficient if you do the checksig outside the if/else and then choose the branch based on whether the checksig is successful
246 14:00 <jnewbery> ok, that's time!
247 14:01 <jnewbery> Thanks everyone
248 14:01 <theStack> thanks, was interesting and instructive
249 14:01 <pinheadmz> thanks jnewbery and sipa !
250 14:01 <jonatack_> thanks jnewbery, sipa, and everyone!
252 14:01 <jkczyz> thanks!
254 14:01 <AlistairMann> Awesome - learned so much from just reading and researching as you went. Thanks!
256 14:01 <jnewbery> thanks for joining us, sipa!
257 14:01 <emzy> Thanks! And stay save.
259 14:01 <nehan_> thanks! sorry for derailing!
260 14:02 <michaelfolkson> Good one, thanks. Nice discussion neha and sipa
261 14:02 <nehan_> sipa: ok i see that someone could get me to sign a 0 <sig> and then replace it with a 1 <sig>. bad! but I don't see how committing to scriptCode changes that.
262 14:02 <michaelfolkson> nehan_
263 14:02 <jnewbery> nehan_: not derailing at all. OP_CODESEPARATOR is gnarly
264 14:02 <nehan_> jnewbery: but we haven't even talked about an example with that yet :/
265 14:02 <sipa> nehan_: if one of the branches has a OP_CODESEP it is no longer possible to do that mauling
266 14:03 <jnewbery> I know Nicolas Dorier was using it in his tumblebit implementation a couple of years ago
267 14:03 <sipa> because the two sigs will be verified against a different sighash