Use `script_util` helpers for creating P2{PKH,SH,WPKH,WSH} scripts (tests, refactoring)

https://github.com/bitcoin/bitcoin/pull/22363

Host: glozow  -  PR author: theStack

The PR branch HEAD was 905d672b74 at the time of this review club meeting.

Notes

  • Bitcoin transactions encode spending conditions through a scriptPubKey in outputs and a witness and scriptSig in the inputs. You can read more about Bitcoin Script here. In the functional test framework, scripts are represented using the CScript class, and can be initialized using an array of opcodes and byte-encoded data.

  • PR #22363 replaces many manually-constructed default scripts in the functional tests with helper functions provided in script_util.py. It also corrects an error in the helper function, get_multisig, in which the P2SH-wrapped P2WSH script hadn’t hashed the witness script before putting it into the scriptSig. We’ll use this opportunity to review script and output types.

  • To test your understanding of scripts and output types, you can try to fill out this table (Hint: a few cells have been pre-filled, and some cells should remain blank):

Solutions
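As a rough companion to the four helper names in the notes, the sketch below spells out the byte layout of each standard scriptPubKey using plain `bytes`. The function names (`p2pkh_spk` and so on) are hypothetical and are not the `script_util.py` API. Unlike the real helpers, the three hash160-based functions here assume the caller already has the 20-byte hash, because RIPEMD-160 is not available in every `hashlib` build.

```python
import hashlib

# Opcode byte values used by the standard output templates.
OP_0 = 0x00
OP_DUP = 0x76
OP_EQUAL = 0x87
OP_EQUALVERIFY = 0x88
OP_HASH160 = 0xA9
OP_CHECKSIG = 0xAC


def push(data: bytes) -> bytes:
    """Direct push for 1..75 bytes: one length byte followed by the data."""
    assert 1 <= len(data) <= 75
    return bytes([len(data)]) + data


def p2pkh_spk(pubkey_hash160: bytes) -> bytes:
    """OP_DUP OP_HASH160 <20-byte hash> OP_EQUALVERIFY OP_CHECKSIG (hypothetical helper)."""
    assert len(pubkey_hash160) == 20
    return (bytes([OP_DUP, OP_HASH160]) + push(pubkey_hash160)
            + bytes([OP_EQUALVERIFY, OP_CHECKSIG]))


def p2sh_spk(script_hash160: bytes) -> bytes:
    """OP_HASH160 <20-byte hash> OP_EQUAL (hypothetical helper)."""
    assert len(script_hash160) == 20
    return bytes([OP_HASH160]) + push(script_hash160) + bytes([OP_EQUAL])


def p2wpkh_spk(pubkey_hash160: bytes) -> bytes:
    """OP_0 <20-byte hash> (hypothetical helper)."""
    assert len(pubkey_hash160) == 20
    return bytes([OP_0]) + push(pubkey_hash160)


def p2wsh_spk(witness_script: bytes) -> bytes:
    """OP_0 <32-byte single SHA-256 of the witness script> (hypothetical helper)."""
    return bytes([OP_0]) + push(hashlib.sha256(witness_script).digest())
```

With any 20-byte value `h`, `p2pkh_spk(h)` is 25 bytes, `p2sh_spk(h)` is 23 bytes and `p2wpkh_spk(h)` is 22 bytes; `p2wsh_spk(...)` is always 34 bytes. Those fixed lengths are what the interpreter's template matching relies on.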

Questions

  1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK?

  2. What do key_to_p2pkh_script, script_to_p2sh_script, key_to_p2wpkh_script and script_to_p2wsh_script in script_util.py do? In what cases would we want to use or not use them?

  3. Review of Terminology: Let’s define script code, witness script, redeem script, scriptPubKey, scriptSig, witness, and witness program (some of these terms are synonymous).

  4. What does the operation OP_HASH160 do? (Hint: what does the script interpreter do when it sees this opcode? What are the differences between the hashers?)

  5. Review of P2PKH: to send coins to someone by public key hash (pre-segwit), what is included in the scriptPubKey of the output? What needs to be provided in the input when the coin is spent?

  6. Review of P2SH: to send coins to someone with spending conditions encoded in a script, what is included in the scriptPubKey of the output? What needs to be provided in the input when the coin is spent? Why do we use Pay-To-Script-Hash instead of Pay-To-Script?

  7. Review of P2SH-P2WSH: What is the purpose of “P2SH wrapped segwit” outputs? When a non-segwit node validates a P2SH-P2WSH input, what does it do?

  8. Review of P2SH-P2WSH: When a node with segwit enabled validates a P2SH-P2WSH input, what does it do in addition to the procedure performed by a non-segwit node?

  9. What is wrong with the P2SH-P2WSH script here? (Hint: which variable holds the 2-of-3 multisig script itself? Which variable holds the scriptSig which will be included in the input?)

  10. How would you verify the correctness of helper functions like get_multisig()? Can we add tests for them?

  11. Can you find any other places in functional tests that could use the script_util.py helper functions instead of manually creating scripts?
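One possible angle on questions 9 and 10 is a shape check on the resulting script. The sketch below assumes (this is an assumption about the buggy line, not a quote of it) that the value pushed after OP_HASH160 was the unhashed 34-byte `OP_0 <sha256>` script rather than its 20-byte hash160. The public keys are dummy bytes, `is_p2sh` is a hypothetical helper, and 20 zero bytes stand in for the real hash160 so that no RIPEMD-160 implementation is needed.

```python
import hashlib

OP_0, OP_EQUAL, OP_HASH160 = 0x00, 0x87, 0xA9
OP_2, OP_3, OP_CHECKMULTISIG = 0x52, 0x53, 0xAE


def push(data: bytes) -> bytes:
    """Direct push for 1..75 bytes: one length byte followed by the data."""
    assert 1 <= len(data) <= 75
    return bytes([len(data)]) + data


def is_p2sh(spk: bytes) -> bool:
    """Match the exact 23-byte pattern OP_HASH160 <20 bytes> OP_EQUAL."""
    return (len(spk) == 23 and spk[0] == OP_HASH160
            and spk[1] == 20 and spk[22] == OP_EQUAL)


# Placeholder 2-of-3 multisig script; the "public keys" are dummy 33-byte strings.
dummy_keys = [bytes([0x02]) + bytes([i]) * 32 for i in (1, 2, 3)]
multisig = (bytes([OP_2]) + b"".join(push(k) for k in dummy_keys)
            + bytes([OP_3, OP_CHECKMULTISIG]))

# P2WSH script: OP_0 <sha256(multisig)>, always 34 bytes long.
p2wsh = bytes([OP_0]) + push(hashlib.sha256(multisig).digest())

# Assumed buggy shape: the 34-byte script is pushed without being hashed.
buggy = bytes([OP_HASH160]) + push(p2wsh) + bytes([OP_EQUAL])

# Corrected shape; bytes(20) is only a stand-in for hash160(p2wsh).
fixed_shape = bytes([OP_HASH160]) + push(bytes(20)) + bytes([OP_EQUAL])

print(len(buggy), is_p2sh(buggy))              # 37 False
print(len(fixed_shape), is_p2sh(fixed_shape))  # 23 True
```

Because OP_HASH160 always leaves a 20-byte value on the stack, comparing it with a 34-byte push can never succeed, so a length or template assertion of this kind would have been enough to flag the problem.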

Meeting Log

  17:00 <glozow> #startmeeting
  17:00 <jnewbery> hi!
  17:00 <glozow> Welcome to PR Review Club everyone!!! Feel free to say hi :)
  17:00 <b10c> hi
  17:00 <sipa> hi
  17:00 <glozow> lurkers welcome too
  17:00 <sipa> hi, lurking
  17:01 <stickies-v> hi
  17:01 <glozow> we're looking at #22363 today: Use `script_util` helpers for creating P2{PKH,SH,WPKH,WSH} scripts
  17:01 <glozow> Notes are here: https://bitcoincore.reviews/22363
  17:01 <glozow> Did anyone get a chance to review the PR? y/n
  17:02 <glozow> And did anyone get a chance to look at the notes (or fill out the table)?
  17:02 <b10c> y
  17:02 <jnewbery> concept y
  17:03 <remember> hi
  17:03 <stickies-v> concept y
  17:03 <glozow> First question: What do `key_to_p2pkh_script`, `script_to_p2sh_script`, `key_to_p2wpkh_script` and `script_to_p2wsh_script` in wallet_util.py do?
  17:04 <Azorcode> Hi everyone
  17:04 <svav> Hi
  17:05 <b10c> specifically in wallet_util.py?
  17:05 <stickies-v> I think you mean script_util.py, right?
  17:05 <glozow> oh wait sorry, they are in script_util.py
  17:05 <glozow> yes
  17:06 <glozow> was going off of an old version of my notes
  17:06 <jnewbery> They're helper functions that take a key/script and return a CScript object
  17:06 <b10c> they are used to create scripts for different types of script templates
  17:06 <glozow> jnewbery: b10c: yep!
  17:06 <stickies-v> I think they provide convenience wrappers around the CScript constructor, with default opcodes etc
  17:06 <jnewbery> where the CScript object is a P2PKH/P2SH/etc
  17:07 <glozow> stickies-v: yes!
  17:08 <glozow> let's define some terminology with the word "script" in them: script code, witness script, redeem script, scriptPubKey,
  17:08 <glozow> scriptSig
  17:08 <glozow> what do these mean?
  17:08 <LarryRuane> script code = sequence of operations (some of which may be just pushing values on the stack)
  17:09 <jnewbery> scriptPubKey - this is in the TxOut and encumbers the output with spending conditions
  17:09 <b10c> scriptSig = script part in the transaction input
  17:10 <remember> scriptSig predicate that satisfies the scriptPubKey
  17:10 <remember> a predicate *
  17:10 <glozow> jnewbery: b10c: remember: yep! those are our names for the scripts in inputs and outputs, which we can see in the code here: https://github.com/bitcoin/bitcoin/blob/master/src/primitives/transaction.h
  17:10 <LarryRuane> scriptSig = script code that is placed before the scripPubKey when evaluating if an input unlocks the output
  17:11 <sipa> not "placed before" since somewhere in 2010; it's evaluated first, and the resulting stack is fed as initial state for the scriptPubKey is evaluated
  17:11 <jnewbery> LarryRuane: scriptCode actually has a specific meaning
  17:11 <glozow> LarryRuane: I suppose we could use "script code" colloquially to mean the "code" evaluated in scripts, but scriptCode also has a meaning defined in BIP143
  17:12 <sipa> it has a meaning since long before segwit
  17:12 <glozow> oop 🤭
  17:13 <jnewbery> here's that change in 2010: https://github.com/bitcoin/bitcoin/commit/6ff5f718b6a67797b2b3bab8905d607ad216ee21#diff-27496895958ca30c47bbb873299a2ad7a7ea1003a9faa96b317250e3b7aa1fefR1114-R1124
  17:14 <glozow> okie so we still need definitions for witness script and redeem script, any takers?
  17:15 <stickies-v> redeem script I would think is the full script that satisfies a p2sh?
  17:16 <b10c> it's not the full scriptPubkey
  17:16 <b10c> only the last data push
  17:17 <glozow> stickies-v: ya i agree with that answer
  17:17 <glozow> and witness script?
  17:18 <b10c> a witness script belongs to an input spending a SegWit output
  17:18 <glozow> (does anyone want to answer: what's a witness?)
  17:18 <LarryRuane> Witness script is part of the tx input (but not included in the txid hash), and it's placed into the execution to-do list (probably using the wrong terms here) after the special segwit pattern is seen, 0,32-byte-hash
  17:19 <remember> whatabout "A witness script is to segwit txns as scriptSig is to non-segwit txn" ?
  17:19 <remember> accurate?
  17:20 <LarryRuane> remember: I think that's pretty close to my understanding
  17:20 <stickies-v> I think it's the segwit equivalent of a redeem script?
  17:21 <b10c> remeber: agree for native SegWit, when nesting the script in a P2SH construction you still have data in the scriptSig
  17:21 <stickies-v> so in other words, witness script is P2WSH and redeem script is P2SH?
  17:22 <remember> "segwit txn" is probably not specific enough :]
  17:22 <remember> in my analogy
  17:23 <sipa> taproot doesn't have a "witness script", so the term is kind of specific to P2WSH
  17:23 <sipa> (and P2SH-P2WSH)
  17:23 <b10c> oh witness script != witness
  17:24 <sipa> it is the script being actually executed in P2WSH
  17:24 <sipa> like the redeemscript is the actually executed script in P2SH
  17:24 <glozow> remember: I'd say witness: segwit txn as scriptSig: non-segwit txn
  17:24 <sipa> yeah ^
  17:24 <glozow> and witness script : segwit txn as redeemScript: non-segwit txn
  17:25 <sipa> s/segwit txn/P2WSH input/
  17:25 <glozow> in a P2WSH, witness = a stack of input data + witness script
  17:25 <remember> +1
  17:25 <sipa> and s/non-segwit txn/P2SH input/
  17:25 <jnewbery> or maybe "witness script is to P2WSH output as redeem script is to P2SH output"
  17:25 <jnewbery> There's a good summary here: https://bitcoin.stackexchange.com/a/95236/26940
  17:26 <glozow> oooh wonderful, thanks past sipa for providing the answer to Question 2!
  17:26 <glozow> and jnewbery for sharing the link :D
  17:27 <LarryRuane> I'm confused about this part of p2wpkh: once the stack has the special pattern 20-byte-hash,0, then magically the command set (to-do list) becomes the standard p2pkh sequence, sig, pubkey, OP_DUP, OP_HASH160, 20-byte-hash, OP_EQUALVERIFY, OP_CHECKSIG .... my question is, is THAT the witness? Or is this sequence "manufactured" on the fly, and the
  17:27 <LarryRuane> witness has only the signature and pubkey?
  17:28 <sipa> the witness is what is encoded in the input
  17:28 <sipa> so the pubkey and signature
  17:28 <LarryRuane> got it, thanks
  17:28 <sipa> and it's not the stack that has a special pattern; it is the scriptPubKey or redeemScript that has to be in the form "OP_0 <20 byte push>"
  17:29 <sipa> for P2WPKH validation rules to trigger
  17:29 <LarryRuane> i see, that's very helpful thanks
  17:30 <glozow> and the interpreter sees that pattern and knows to use the script code OP_DUP OP_HASH160 20Bhash OP_EQUALVERIFY OP_CHECKSIG with the witness yeah?
  17:30 <sipa> righty
  17:30 <sipa> -y
  17:30 <glozow> woot! next question is a light one: What does the opcode OP_HASH160 do?
  17:31 <stickies-v> it first hashes with SHA-256 and then RIPEMD-160
  17:31 <glozow> stickies-v: correct!
  17:32 <glozow> ok now let's start going over the script output types table
  17:32 <glozow> Review of P2PKH: to send coins to someone by public key hash (pre-segwit), what is included in the scriptPubKey of the output? What is included in the scriptSig?
  17:33 <jnewbery> LarryRuane: here's the P2WPKH execution constructing that sequence, which later gets fed into EvalScript: https://github.com/bitcoin/bitcoin/blob/4129134e844f78a89f8515cf30dad4b6074703c7/src/script/interpreter.cpp#L1906-L1911
  17:34 <LarryRuane> and OP_SHA256 and OP_RIPEMD160 are also opcodes, so (IIUC) OP_HASH160 is just a convenient shortcut
  17:34 <b10c> scriptPubKey: OP_DUP OP_HASH160 OP_PUSHBYTES_20 20-byte-hash OP_EQUALVERIFY OP_CHECKSIG
  17:34 <b10c> scriptSig: signature and pubkey
  17:34 <glozow> LarryRuane: righto
  17:35 <glozow> b10c: winner!
  17:35 <glozow> Same question for P2SH: to send coins to someone with spending conditions encoded in a script, what is included in the scriptPubKey of the output? What needs to be provided in the scriptSig when the coin is spent?
  17:36 <b10c> scriptPubKey: OP_HASH160 OP_PUSHBYTES_20 20-byte-hash OP_EQUAL
  17:37 <LarryRuane> scriptPubKey: hash160, hash, EQUAL ....... scriptSig: pubkey, sig, redeemscript
  17:37 <LarryRuane> the redeem script itself is: pubkey, OP_CHECKSIG
  17:38 <b10c> scriptSig: <stuff needed for redeemscript> redeemscript
  17:38 <glozow> LarryRuane: ah i suppose that's a specific script
  17:38 <glozow> I like b10c's answer, which is for a generic redeemScript
  17:39 <b10c> I'm not sure on my terminology though :D
  17:40 <LarryRuane> yes, guess I was only giving the simplest version (single-sig), but it's much more general, as b10c said
  17:40 <glozow> both good answers :)
  17:40 <glozow> And Why do we use Pay-To-Script-Hash instead of Pay-To-Script?
  17:41 <b10c> privacy
  17:41 <LarryRuane> I think the TXO is smaller (and when it's still a UTXO, that's very helpful for resource use), and also it's more secure (in some future where ECDSA is broken)
  17:42 <stickies-v> it pushes the cost burden of having complex scripts to the receiver, who designed the script in the first place
  17:42 <remember> some reduction to chain bloat, some privacy, some marginal QC benefits
  17:42 <b10c> privacy (until we spend it)*
  17:42 <remember> stickies-v good point about block-space cost alignment
  17:42 <glozow> b10c: LarryRuane: stickies-v: remember: great answers!
  17:42 <jnewbery> b10c: ha! was about to say "until it gets spent". I don't think privacy is the reason here
  17:42 <LarryRuane> Oh, and especially it's good with multisig, because the address that you have to give to the payer is much smaller (right?)
  17:43 <b10c> spender pays for it's own large script, not the one who pays him
  17:43 <glozow> I hadn't thought about the small scriptPubKey part before
  17:43 <LarryRuane> (i mean, smaller than multisig without P2SH)
  17:43 <glozow> I imagine P2SH predates ultra prune but idk
  17:43 <jnewbery> The "motivation" section for BIP16 is very short, but it contains the key point: "The purpose of pay-to-script-hash is to move the responsibility for supplying the conditions to redeem a transaction from the sender of the funds to the redeemer."
  17:43 <LarryRuane> probably but the UTXO set has to be maintained by all full nodes
  17:44 <remember> I think we would design P2SH differently today given what we know
  17:44 <b10c> jnewbery: yeah, agree after thinking about it :)
  17:44 <jnewbery> And the second point: "The benefit is allowing a sender to fund any arbitrary transaction, no matter how complicated, using a fixed-length 20-byte hash that is short enough to scan from a QR code or easily copied and pasted."
  17:44 <jnewbery> https://github.com/bitcoin/bips/blob/master/bip-0016.mediawiki#motivation
  17:44 <stickies-v> jnewbery: arguably it's still good for privacy though? e.g. not exposing that you have timelocks in your script until after the outputs are spent is a privacy benefit, no?
  17:45 <stickies-v> although maybe that's more security than privacy
  17:45 <remember> stickies-v I'd say it's both (though the privacy benefit expires at spending)
  17:46 <stickies-v> agreed!
  17:47 <glozow> Okie dokie let's continue with the questions.
  17:47 <glozow> Review of P2SH-P2WSH: What is the purpose of “P2SH wrapped segwit” outputs? When a non-segwit node validates a P2SH-P2WSH input, what does it do?
  17:48 <glozow> And the other part of the question is: When a node with segwit enabled validates a P2SH-P2WSH input, what does it do in addition to the procedure performed by a non-segwit node?
  17:48 <LarryRuane> purpose is, in case you are asking for a payment from someone with an old wallet, so the segwit address you'd like to give the person won't work ... so you can give the payer what looks exactly like a P2SH address
  17:49 <b10c> P2WH wrapped segwit in general: the sender doesn't need to add segwit-sending support on his side if the recipient wants to use segwit
  17:49 <LarryRuane> so the TXO is *not* segwit, but the corresponding (later) input *is*
  17:49 <jnewbery> stickies-v: I'm not sure that's how we usually think about privacy. If it needs to be revealed in future, then you could argue that it's not really private.
  17:50 <stickies-v> thanks for clearing that up, jnewbery, makes sense!
  17:51 <glozow> LarryRuane: b10c: right!
  17:52 <glozow> So we have a scriptPubKey that looks like a P2SH, and both a scriptSig and a witness. What does a nonsegwit node do to validate it? And what does a segwit node do?
  17:54 <b10c> a non-segwit node just hashes the 22 bytes and compares them (OP_EQUAL) to the hash in the scriptPubKey
  17:55 <glozow> b10c: yep! they don't know how to deal with the witness stuff, but they'll verify the hash matches
  17:55 <LarryRuane> the nonsegwit node verifies that the redeem script hash is correct, then runs the redeem script, however, it's just OP_0 and a 20-byte-hash, so push those on the stack, and since top element is nonzero, done, success
  17:55 <glozow> LarryRuane: *chefs kiss 😗 👌
  17:56 <b10c> a segwit node verifies the signature+pubkey (for Nested P2WPKH) or the witness script (for Nested P2WSH)
  17:56 <glozow> b10c: yep!
  17:56 <LarryRuane> of course the segwit node then goes on to notice this special pattern, and then it does the usual segwit verification
  17:56 <glozow> Ok so #22363 fixes a bug in here: https://github.com/bitcoin/bitcoin/blob/091d35c70e88a89959cb2872a81dfad23126eec4/test/functional/test_framework/wallet_util.py#L109
  17:56 <LarryRuane> mind-bending but brilliant
  17:56 <glozow> what's the bug? :)
  17:58 <LarryRuane> forgot to hash the witness_script, so the OP_EQUAL will never return 1 (true)
  17:58 <b10c> should be `hash160(witness_script)` and not `witness_script`
  17:58 <glozow> LarryRuane: b10c: bingo!
  17:58 <jnewbery> who would do something like that?!
  17:59 <glozow> gotta pull out the `git blame`
  17:59 <LarryRuane> JOHHHHHHNNNN! but i guess this bug wasn't operative, because this part of the test wasn't used (?)
  17:59 <jnewbery> 😳
  18:00 <glozow> yep! we've run out of time for the last 2 questions, but they'd be good to include in your review (hopefully everyone will be posting a review after this!)
  18:00 <jnewbery> peep peep peeeeeep. That's full time. Let's not go to penalties.
  18:01 <glozow> yep! we've run out of time for the last 2 questions, but they'd be good to include in your review (hopefully everyone will be posting a review after this!)
  18:01 <glozow> #10: Can you think of test vectors for `get_multisig`?
  18:01 <glozow> #11: Can you find any other places in functional tests that could use the script_util.py helper functions instead of manually creating scripts?
  18:01 <glozow> #endmeeting
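The non-segwit behaviour described in the log around 17:55 (after the hash check, the redeem script is just OP_0 plus one data push, so the stack ends with a non-zero top element) can be sketched with a toy evaluator. This is not the Bitcoin Core interpreter: `eval_pushes` and `cast_to_bool` are hypothetical names, only OP_0 and direct pushes of 1 to 75 bytes are handled, and the P2SH hash comparison itself is left out.

```python
def eval_pushes(script: bytes) -> list:
    """Run a script made only of OP_0 and direct pushes (1..75 bytes)."""
    stack, i = [], 0
    while i < len(script):
        op = script[i]
        i += 1
        if op == 0x00:
            stack.append(b"")  # OP_0 pushes an empty byte vector
        elif 1 <= op <= 75:
            if i + op > len(script):
                raise ValueError("push runs past the end of the script")
            stack.append(script[i:i + op])
            i += op
        else:
            raise ValueError("opcode not supported by this sketch")
    return stack


def cast_to_bool(item: bytes) -> bool:
    """False for empty, all-zero, or negative-zero values; True otherwise."""
    for idx, byte in enumerate(item):
        if byte != 0:
            # Negative zero: only the sign bit of the last byte is set.
            return not (idx == len(item) - 1 and byte == 0x80)
    return False


# 22-byte P2SH-P2WPKH style redeem script: OP_0 <20 dummy bytes>.
redeem_script = bytes([0x00, 0x14]) + bytes(range(1, 21))
stack = eval_pushes(redeem_script)
print(len(stack), cast_to_bool(stack[-1]))  # 2 True
```

The same reasoning applies to the 34-byte P2SH-P2WSH case, where the push is a 32-byte SHA-256 digest. In both cases a pre-segwit node treats the input as valid on this basis alone, while a segwit node goes on to check the witness.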