The same general considerations apply to these PRs: ordinarily, these
refactor changes to consensus code wouldn’t meet the bar for review cost and
risk -vs- benefit. However, splitting them off from the main schnorr/taproot
PR makes reviewing the logical changes to consensus in the main PR easier and
safer.
PR 18401: Initialize PrecomputedTransactionData in CheckInputScripts
When a signature for a transaction input is created or verified, different
parts of the transaction are hashed into a transaction digest. This is the
e value in e = Hash(m) that is used in the signing/verification
algorithm.
The transaction digest algorithm for pre-segwit transaction inputs is
documented here (in
particular, look at the verifyThisStr string in step 9 of the
diagram,
which is the data that is hashed).
For pre-segwit transaction inputs, the transaction digest has to be
recalculated from scratch for each signature. This means that in the worst
case the amount of data that needs to be hashed is quadratic in the size of
the transaction. See Sergio Demian Lerner’s bitcoin talk
post for more details of this
quadratic hashing issue. The Bitcoin Core
blog
and Rusty Russell’s blog also have good
explanations.
To resolve the quadratic hashing problem, the way that the transaction digest
is calculated was changed for segwit inputs. See BIP
143 for the
motivation and specification.
The segwit transaction digest is designed so that parts of the digest are
shared between all the signatures in a transaction. This means that the total
amount of data to be hashed for the transaction is linear in the size of the
transaction.
This shared data between all signatures of a transaction is stored in the
PrecomputedTransactionData object, which was added in PR
8524.
Segwit v1 (taproot) makes some very minor changes to the transaction
digest algorithm. Specifically, the transaction digest commits to the
amounts of all inputs (instead of just the amount of the input being signed)
and to the scriptPubKey of the output being spent. See BIP
341
for the full description of the transaction digest.
Because of these changes, more data needs to be stored in the
PrecomputedTransactionData object. This PR changes the way that the object
is constructed and initialized so that a future commit in PR
17977
can add that data to the object.
PR 18422: Move single-sig checking EvalScript code to EvalChecksig
Tapscript
is the scripting language used for scripts in a taproot tree (taproot has
versioning for scripts, so a future soft fork could introduce a different
scripting version or language).
Tapscript has almost the same semantics as Bitcoin Script. One notable
difference is that OP_CHECKSIG and OP_CHECKSIGVERIFY both verify schnorr
signatures instead of ECDSA signatures.
This PR extracts most of the logic for OP_CHECKSIG and
OP_CHECKSIGVERIFY from the very large EvalScript() function.
A future commit in PR
17977
modifies this function to switch on which flags have been passed into the
script interpreter in order to determine whether to do schnorr or ECDSA
signature verification.
A new PrecomputedTransactionData default constructor is added which
doesn’t initialize the data members. Where do they get initialized instead?
What new data members are added to PrecomputedTransationData in the
subsequent commit in the taproot PR 17977?
Is the old PrecomputedTransactionData constructor still used anywhere?
Why is the transaction digest algorithm changed for taproot? What problems
does the new algorithm solve?
What is the difference between EvalCheckSig() returning false and setting
the return parameter fSuccess to false? When would those different
failure modes be used?
Why isn’t the code for OP_CHECKMULTISIG and OP_CHECKMULTISIGVERIFY also
extracted out in PR 18401?
<jnewbery> fantastic! The code changes are quite small and mechanical, but there's a _lot_ of context, so I thought it'd be fun to dig into the surrounding concepts a bit.
<jnewbery> first question: A new PrecomputedTransactionData default constructor is added which doesn’t initialize the data members. Where do they get initialized instead?
<theStack> nit-question: i guess the definition moved up just for code readability reasons, to be closer to `control`, as the comment refers to both `control` and `txsdata`?
<jnewbery> someone might come along later and see that txsdata is only used inside the for loop and think "we don't need a vector. We can just create a new txdata in each loop iteration"
<jnewbery> ok, moving on (but feel free to continue asking questions on anything). What new data members are added to PrecomputedTransationData in the subsequent commit in the taproot PR 17977?
<pinheadmz> michaelfolkson: from bip341: "There is no expected security improvement by doubling SHA256 because this only protects against length-extension attacks against SHA256 which are not a concern for signature messages because there is no secret data. "
<fjahr> also nSequences but I guess that's less common: The signature message commits to all input nSequence if SIGHASH_NONE or SIGHASH_SINGLE are set (unless SIGHASH_ANYONECANPAY is set as well)
<nehan_> amounts are to eventually support a cold wallet. I still don't think it needs to commit to scriptPubKey but as sipa pointed out on stackoverflow it's to maintain legacy behavior
<jnewbery> There's a (very miner) similar attack where an online host could lie to an offline signer about whether an output being spent is P2WPKH or P2WSH-P2PKH. The signer doesn't know the transaction output type and therefore the size, and a signature it provides would be valid for both.
<r251d> In #18422 EvalChecksig communicates success by return value and by fSuccess bool ref. Are those success values only inconsistent when verifying scripts with null signatures?
<theStack> so the reason that scriptPubKey of the output being spent was not included in the signature was just that those attacks were not taken into consideration? or is there also a drawback in doing so?
<jnewbery> sipa: I think your point is that with OP_CODESEPARATOR, the signer can provide a signature that is only valid for a certain execution branch. Is that right?
<nehan_> sipa: i'm confused. to make sure we're on the same page: I think that committing to (txid, input index) implicitly commits to a scriptCode for someone who has that data (so, not an airgapped wallet, of course). therefore commiting to the actual scriptCode in the signature seems unnecessary.
<jnewbery> nehan_: if the sighash only committed to (txid, index), then both of those signatures in sipa's example would be signing the same digest, so the signatures would be the same
<jnewbery> What is the difference between EvalCheckSig() returning false and setting the return parameter fSuccess to false? When would those different failure modes be used?
<theStack> i'd roughly say: if EvalCheckSig() returns false, then the input script is invalid (e.g. invalid signature format) in some way, where as fSuccess represents the result of the evaluated script
<nehan_> sipa: i think my misunderstanding was the following: i did not think that the scriptSig specified which code path in a scriptPubKey to take. I thought this was determined by execution.
<theStack> to answer the question (in pre-taproot), an invalid encoded signature or pubkey encoding could be reasons for an instant fail of the script, leading to an invalid transaction
<nehan_> sipa: I don't see how OP_CODESEPARATOR could be used for that purpose -- the entire scriptPubKey is committed to in the txid. to clarify: we are talking about pre-segwit Bitcoin, and not considering airgrapped wallets and access to data is not a concern.
<jnewbery> From a high level, I think the point is that we want a way to be able to fail a signature check, but continue execution, while avoiding malleability.
<sipa> (scriptCode is the script being executed, from the last executed OO_CODESEP until the end of the script; and the entire executed script if no CODESEP was ever executed)
<nehan_> sipa: I might need to work out an example, because I don't see how that adds a degree of flexibility. relevant OP_CODESEPARATORs are in the scripPubKey, and the scriptCode that results is deterministic from that. Certainly the signer could provide different scriptSigs that could cause different paths to execute, but that does not change the scriptPubKey and hence the scriptCode.
<instagibbs> in tapscript that'd be checksigadd construction, it's ok to have some failures, provided the failures don't actually make you spend the sig validation time
<sipa> nehan_: but when reasoning through malleability in miniscript we had to introduce a rule "the same public key cannot occur more than once in a script", because otherwise it's near impossible to reason about - using CODESEP that could be made permitted
<jnewbery> it's maybe a byte or two more efficient if you do the checksig outside the if/else and then choose the branch based on whether the checksig is successful
<nehan_> sipa: ok i see that someone could get me to sign a 0 <sig> and then replace it with a 1 <sig>. bad! but I don't see how committing to scriptCode changes that.