bitcoin-inquisition #39: Re enable OP_CAT (consensus)

https://github.com/bitcoin-inquisition/bitcoin/pull/39

Host: EthanHeilman  -  PR authors: 0xBEEFCAF3 , EthanHeilman

Notes

The following pull request (PR) reinstates the usage of OP_CAT in accordance with the specifications outlined in the draft BIP, which can be found here. Reviewers are encouraged to familiarize themselves with the BIP before examining the code.

  • The primary objective of this PR is to re-enable OP_CAT utilizing OP_SUCCESS semantics, replacing OP_SUCCESS126.
    • OP_CAT is designed to concatenate two elements on the stack.
    • If the resulting element is smaller than the MAX_SCRIPT_ELEMENT_SIZE, it is placed on the stack.
  • OP_CAT should fail if there are less than two values on the stack or if a concatenated value would have a combined size greater than MAX_SCRIPT_ELEMENT_SIZE.
  • The PR also introduces activation parameters for the Bitcoin Signet network, aligning with the BIN-2024-0002 standard.
  • Additionally, this PR introduces new semantics for testing taproot-related script tests.

Motivation for OP_CAT

OP_CAT aims to expand the toolbox of the tapscript developer with a simple, modular, and useful opcode in the spirit of Unix. To demonstrate the usefulness of OP_CAT below we provide a non-exhaustive list of some usecases that OP_CAT would enable:

  • Bitstream, a protocol for the atomic swap (fair exchange) of bitcoins for decryption keys, that enables decentralized file hosting systems paid in Bitcoin. While such swaps are currently possible on Bitcoin without OP_CAT they require the use of complex and computationally expensive Verifiable Computation cryptographic techniques. OP_CAT would remove this requirement on Verifiable Computation, making such protocols far more practical to build in Bitcoin.
  • Vaults which are a specialized covenant that allows a user to block a malicious party who has compromised the user’s secret key from stealing the funds in that output. The first CAT vault has been developed by Rijndael. Find more details on CAT vaults in practice here
  • Replicating CheckSigFromStack which would allow the creation of simple covenants and other advanced contracts without having to presign spending transactions, possibly reducing complexity and the amount of data that needs to be stored. Originally shown to work with Schnorr signatures, this result has been extended to ECDSA signatures. For more usecases please checkout the BIP

Questions

  1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK? What was your review approach?

  2. What are the various conditions under which the execution of OP_CAT may result in failure?

  3. OP_CAT is defined as 0x7e. Even after replacing an OP_SUCCESS opcode, libraries can continue to use 0x7e to represent concatenation operations. Why is this the case?

  4. In deploymentinfo.cpp, there are both an OP_CAT flag and a DISCOURAGE_OP_CAT flag. What is the rationale behind having both of these?

  5. When does consensus consider OP_SUCCESS126 replaced by OP_CAT?

  6. What is the expected behavior when neither flag is set?

  7. Why is it important to verify if OP_CAT is being executed in a non-segwitv0 or base-script context at L474:interpreter.cpp rather than inside the opcode definition?

  8. This PR introduces new semantics for taproot-related script tests in script_tests.json. For example, this test. What issues or inefficiencies existed with the previous testing strategy?

  9. Are there any additional test cases you would like to see implemented that are not covered by the functional tests or the script tests in script_tests.json?

Meeting Log

  117:01 <EthanHeilman> #startmeeting
  217:01 <EthanHeilman> Hi
  317:01 <stickies-v> hi
  417:01 <arminsdev> Hi
  517:01 <maxedw> hi
  617:01 <reardencode> hi!
  717:01 <kouloumos> hi
  817:01 <Guest21> hello
  917:01 <kevkevin> hi
 1017:01 <Al79> hI
 1117:01 <dergoegge> hi
 1217:01 <monlovesmango> hello
 1317:01 <effexzi> Hi every1
 1417:01 <cbergqvist> hi
 1517:02 <EthanHeilman> Feel free to post any questions you might have about the PR https://github.com/bitcoin-inquisition/bitcoin/pull/39 to reenable OP_CAT.
 1617:03 <EthanHeilman> Did everyone get a chance to read the PR?
 1717:03 <LarryRuane> hi
 1817:03 <glozow> Spent some time, but I'm new to script interpreter code
 1917:03 <stickies-v> same here!
 2017:04 <instagibbs> ignored the activation code, mostly, but read the script interpreter stuff which im more familiar with
 2117:04 <reardencode> My one question so far is why in the test_framework/script.py the separate block for marking CAT not-success vs. changing the 2nd-to-last character in line 938 to `d`?
 2217:04 <kevkevin> have not but I will be mostly lurking today
 2317:04 <hernanmarino> Hi !
 2417:04 <alfonsoromanz> hi
 2517:04 <maxedw> skimmed through PR and BIP
 2617:05 <reardencode> BTW, not specific to the PR, but to the BIP and PR club notes, but CAT cannot emulate CSFS - that was a common misconception based on many of us too quickly reading Poelstra's old blog post.
 2717:05 <kouloumos> a quick look at the PR, read the BIP and mostly context-gathering around OP_CAT
 2817:05 <EthanHeilman> If you reviewed the PR Concept ACK, approach ACK, tested ACK, or NACK?
 2917:06 <arminsdev> @reardencode  Thanks for pointing that out! I need to revert that change. In general the changes in this PR should not affect OP_SUCCESS checks
 3017:06 <hernanmarino> approach ACK here. I'll test it later , i really like this :)
 3117:06 <instagibbs> approach ACK, would have to dive more into the activation logic to think clearly about how OP_SUCCESSX stuff is handled
 3217:08 <EthanHeilman> Let's get it started, it sounds like arminsdev addressed reardencode's question
 3317:08 <LarryRuane> i have a very basic question, what's the difference between the various OP_NOPs and OP_SUCCESS?
 3417:08 <LarryRuane> do they both just do nothing (successfully)?
 3517:09 <instagibbs> LarryRuane all NOPs are same, all OP_SUCCESSX are same, but the two classes are different
 3617:09 <reardencode> presence of SUCCESS skips script execution entirely and makes the tx success.
 3717:09 <reardencode> er the _input_ success, not the tx
 3817:10 <LarryRuane> i see thanks
 3917:10 <instagibbs> interpreter steps over all opcodes, if it sees an op_successx, it immediately succeeds
 4017:10 <instagibbs> before running the actual script
 4117:10 <glozow> do repurposed nops have restrictions on what they can do to the stack?
 4217:10 <EthanHeilman> I'm going to go down the list of questions, but throw any additional questions in
 4317:11 <instagibbs> glozow and how does that effect re-enabling of something like OP_CAT 🧐
 4417:11 <LarryRuane> glozow: i would think they must leave the stack unchanged (like a NOP does)
 4517:11 <glozow> it doesn't, i'm just trying to answer LarryRuane's question
 4617:12 <EthanHeilman> 2. What are the various conditions under which the execution of OP_CAT may result in failure?
 4717:12 <reardencode> it means we can only reenable OP_CAT as a SUCCESS replacement in tapscript, but not in legacy/witnessv0 script.
 4817:12 <glozow> fewer than 2 items, resulting size too big, or usage not allowed?
 4917:12 <LarryRuane> I could only see 2, not enough items on the stack, or the resulting element too large
 5017:12 <instagibbs> reardencode without some ... interesting engineering :)
 5117:13 <arminsdev> So far so good! Fewer than 2 items on the stack, resulting item is too large and script verify flags
 5217:13 <arminsdev> There is one more
 5317:14 <Ayelen> OP_CAT fails if there are less than two values on the stack or if the size of the result is more than size of 520 bytes.
 5417:14 <Guest21> is MAX_SCRIPT_ELEMENT_SIZE = 520 a consensus rule or policy rule ?
 5517:15 <arminsdev> Hint that we re-enable CAT by replacing OP_SUCCESS126
 5617:15 <reardencode> Guest21: consensus (but prior to CAT we didn't have an example of it needing to be enforced on an element pushed back to the stack IIUC)
 5717:16 <reardencode> arminsdev: when it's executed in witnessv0/legacy script?
 5817:16 <arminsdev> reardencode correct!
 5917:17 <Guest21> so its not consensus rule for tapscript?
 6017:17 <rot13maxi> I saw that there is a new `SCRIPT_VERIFY_DISCOURAGE_OP_CAT` flag being added. Why is there also `SCRIPT_VERIFY_OP_CAT`? Is that to have something to toggle on when it gets activated
 6117:17 <EthanHeilman> OP_FALSE IF OP_SUCCESS126 END_IF OP_FALSE OP_VERIFY --> success
 6217:17 <EthanHeilman> vs
 6317:17 <EthanHeilman> OP_FALSE IF OP_CAT END_IF OP_FALSE OP_VERIFY --> failure
 6417:17 <arminsdev> rot13maxi thats one of the questions we're going to cover
 6517:17 <EthanHeilman> rot13maxi thats question 4, lets jump to that: In deploymentinfo.cpp, there are both an OP_CAT flag and a DISCOURAGE_OP_CAT flag. What is the rationale behind having both of these?
 6617:17 <reardencode> Guest21: it is a consensus rule for all incoming stack elements in witnessv0 and tapscript, and with the addition of CAT also for elements pushed back to the stack by script.
 6717:18 <rot13maxi> arminsdev cool
 6817:19 <arminsdev> I suppose we can jump into that question now. Any ideas?
 6917:20 <EthanHeilman> reardencode "with the addition of CAT also for elements pushed back to the stack by script" how does OP_CAT add this rule? Is there something different about the OP_PUSHDATA logic?
 7017:20 <LarryRuane> rot13maxi: I believe the discourage flags are in case later we want to deactivate the softfork (I notice that the early SFs like P2SH don't have discourage flags because we know they'll never be undone)
 7117:21 <reardencode> EthanHeilman: because CAT directly manipulates the stack it has to separately implement the length restriction, vs. PUSHDATA already has the restriction is what I meant
 7217:21 <EthanHeilman> reardencode Oh, I get what you are saying
 7317:22 <instagibbs> OP_CAT didn't have the restriction, hence part of th reason it was disabled
 7417:22 <hernanmarino> +1 to LarryRuane but i thinks it is only related to bitcoin inquisition because forks are activated differently, and can be desactivated
 7517:22 <EthanHeilman> IIRC OP_CAT did have the restriction
 7617:23 <instagibbs> only for like... a day, one sec
 7717:23 <rot13maxi> there is also a policy limit on taproot witness stack item size of 80 bytes (https://github.com/bitcoin/bitcoin/blob/ab5dfdbec1143f673f4d83acd4e335bb2c51034e/src/policy/policy.h#L45) which I ran into when playing with CAT on regtest. Doesn't directly affect this PR, but is another limit that comes into play for using cat in practice
 7817:24 <EthanHeilman> OP_CAT was restricted to 5000 bytes before it was disabled, but interestingly in the commit that disabled it, Satoshi also changed OP_CAT to restrict it to 520 Bytes
 7917:24 <EthanHeilman> https://github.com/bitcoin/bitcoin/commit/4bd188c4383d6e614e18f79dc337fbabe8464c82#diff-27496895958ca30c47bbb873299a2ad7a7ea1003a9faa96b317250e3b7aa1fefL390
 8017:24 <instagibbs> 757f0769d8360ea043f469f3a35f6ec204740446 satoshi adds a result restiction
 8117:25 <kouloumos> Regarding the MAX_SCRIPT_ELEMENT_SIZE, Steven Roose [propose yesterday](https://github.com/bitcoin/bips/pull/1525#issuecomment-1979297869) a total stack size limit instead of per-item size limit. Any thoughts on that? Does it allow for any other use-cases?
 8217:25 <instagibbs> 4bd188c4383d6e614e18f79dc337fbabe8464c82 satoshi changes it to 520, but also disabled
 8317:26 <instagibbs> (end digression!)
 8417:28 <EthanHeilman> What happens in soft fork when half of the network has adapted it but half the network has not and someone publishes a transaction on the p2p network that uses the softforked behavior? (related to the question)
 8517:28 <arminsdev> SCRIPT_VERIFY_OP_CAT essentially decouples OP_SUCCESS verify flags from OP_CAT usage. Take a look at how this is being used here: https://github.com/0xBEEFCAF3/bitcoin/blob/armin/re-enable-op-cat/src/script/interpreter.cpp#L1985
 8617:28 <arminsdev> By introducing the discouragement flag, mempool policy can inform developers to not use CAT during a activation period.
 8717:28 <arminsdev> Take a look https://github.com/0xBEEFCAF3/bitcoin/blob/armin/re-enable-op-cat/src/validation.cpp#L1021
 8817:29 <glozow> so DISCOURAGE_OP_CAT is for granularity vs DISCOURAGE_OP_SUCCESS?
 8917:29 <rot13maxi> I get the rational of having a flag to specifically target OP_CAT for "allow" or "disallow". I'm curious about the rationale for having both
 9017:30 <Guest42> hi
 9117:32 <EthanHeilman> It is like a two phase commit, OP_CAT = true,  DISCOURAGE_OP_CAT = true. Until the network has fully moved to OP_CAT = true. Then you can set DISCOURAGE_OP_CAT =false. I believe this approach was taken with segwit as well right, right?
 9217:33 <EthanHeilman> 6. What is the expected behavior when neither flag is set?
 9317:35 <glozow> It's just an op success?
 9417:37 <arminsdev> glozow correct!
 9517:37 <EthanHeilman> and op success is discouraged
 9617:37 <glozow> so, rejected in policy and accepted in consensus
 9717:38 <EthanHeilman> 7. Why is it important to verify if OP_CAT is being executed in a non-segwitv0 or base-script context at L474:interpreter.cpp rather than inside the opcode definition
 9817:38 <EthanHeilman> https://github.com/0xBEEFCAF3/bitcoin/blob/armin/re-enable-op-cat/src/script/interpreter.cpp#L475
 9917:38 <EthanHeilman> this question is a fun one!
10017:38 <hernanmarino> glozow: thanks for clarifying that, i was wondering about the meaning of discouraged in this context
10117:39 <instagibbs> "pretty please don't use this, this an upgrade hook"
10217:40 <arminsdev> For those that want to dig a bit deeper into the different flags and their responsibilities take a look at this gh thread
10317:40 <arminsdev> https://github.com/bitcoin-inquisition/bitcoin/pull/39#discussion_r1480760257
10417:41 <reardencode> EthanHeilman: otherwise it would fail before getting to the opcode execution even for Tapscript?
10517:41 <Ayelen> 7- to avoid exponential memory usage?. Tapscript enforces a maximum stack element size of 520 bytes
10617:42 <kouloumos> I can't find when/if the discourage flag was used with previous softforks, links appreciated
10717:42 <EthanHeilman> almost! What would be the behavior for non-tapscript transactions if we enabled OP_CAT and put the check IS OPCAT ENABLED in the script definiition
10817:43 <glozow> by opcode definition, do you mean L543?
10917:44 <EthanHeilman> yes
11017:45 <hernanmarino> i have a question related to the discourage flag... this is only related to how bitcoin inquisition works right ? Activation in the real Bitcoin mainnet will not work this way, will it ?
11117:45 <EthanHeilman> If implemented it like:
11217:45 <EthanHeilman>  L543: case OP_CAT: {
11317:45 <EthanHeilman>  L544:     if not tapscript ... return set_error(serror, SCRIPT_ERR_DISABLED_OPCODE); // Disabled opcodes (CVE-2010-5137).
11417:49 <reardencode> not seeing a difference :-\
11517:49 <Mccalabrese> all nontapscript transactions would fail?
11617:49 <arminsdev> hernanmarino thats a good question. It certainly could operate the same way in mainnet. It depends on what the bitcoin core policy is for consensus changes
11717:49 <EthanHeilman> I took me a half a day to figure this out.
11817:49 <EthanHeilman> Consider the following non-tapscript script
11917:49 <EthanHeilman> OP_FALSE IF OP_CAT END_IF OP_TRUE
12017:49 <EthanHeilman> Currently this script will fail because the check for if OP_CAT is a disabled op code happens before we reach the logic that checks conditionals
12117:50 <reardencode> oh, but the if check there hiding on the line right above the switch(opcode). would be a consensus change for legacy scripts.
12217:50 <reardencode> nice one.
12317:51 <EthanHeilman> Whereas if we move that check into the op code definition (L543) then
12417:51 <EthanHeilman> OP_FALSE IF OP_CAT END_IF OP_TRUE
12517:51 <EthanHeilman> will succeed for non-tapscript transactions. Such a change will result in a hardfork.
12617:51 <EthanHeilman> reardencode exactly!
12717:51 <Ayelen> EthanHeilman: thanks
12817:51 <instagibbs> feel like we should have a term for things that fail if interpreter hits it unconditionally
12917:52 <dergoegge> EthanHeilman: would any of our tests catch this?
13017:52 <dergoegge> (i don't think so)
13117:53 <EthanHeilman> Yes, in fact I first noticed this because an existing script test caught it. For more details see:
13217:53 <EthanHeilman> https://github.com/bitcoin-inquisition/bitcoin/pull/39#discussion_r1465110469
13317:54 <EthanHeilman> I believe the script test that caught it was this one:
13417:54 <EthanHeilman> ["'a' 'b' 0", "IF CAT ELSE 1 ENDIF", "P2SH,STRICTENC", "DISABLED_OPCODE", "CAT disabled"],
13517:55 <EthanHeilman> so good work to whoever wrote that
13617:55 <dergoegge> that's great!
13717:55 <EthanHeilman> 8. This PR introduces new semantics for taproot-related script tests in script_tests.json. For example, this test. What issues or inefficiencies existed with the previous testing strategy?
13817:56 <glozow> very cool, totally doesn't make this code look scary at all
13917:56 <glozow> (the interpreter code in general i mean)
14017:56 <arminsdev> test example = https://github.com/0xBEEFCAF3/bitcoin/blob/armin/re-enable-op-cat/src/test/data/script_tests.json#L2530
14117:58 <BlueMatt[m]> do we have fuzz tests which can test an old script interpreter and a new one and check that its not a hard fork?
14217:58 <instagibbs> differential script fuzzing sounds like a dergoegge question
14317:58 <dergoegge> not that i'm aware but i've been meaning to work on it
14417:58 <monlovesmango> needing to check/validate multiple previous items on the stack?
14517:59 <EthanHeilman> I would love to see that. I did a bit of fuzzing on OP_CAT and I want to do more.
14618:00 <hernanmarino> dergoegge: that would something nice to have, I'm not an expert in fuzz testing but I'm willing to help if you or someone takes the lead
14718:00 <EthanHeilman> For reference, this is that a tapscript script test looked like before we added these new test semantics
14818:00 <EthanHeilman> [
14918:00 <EthanHeilman>     [
15018:00 <EthanHeilman>         "c24f2c1e363e09a5dd56f0",
15118:00 <EthanHeilman>         "89a0385490a11b6dc6740f3513",
15218:00 <EthanHeilman>         "7e4c18c24f2c1e363e09a5dd56f089a0385490a11b6dc6740f351387",
15318:00 <EthanHeilman>         "c0d6889cb081036e0faefa3a35157ad71086b123b2b144b649798b494c300a961d",
15418:00 <EthanHeilman>         0.00000001
15518:00 <EthanHeilman>     ],
15618:00 <EthanHeilman>     "",
15718:00 <EthanHeilman>     "0x51 0x20 0x25b1769ec1939759dd5a97f5f02186e986280ae2bd0588ad13f28c8ce5044fa6",
15818:00 <EthanHeilman>     "P2SH,WITNESS,TAPROOT",
15918:00 <EthanHeilman>     "OK",
16018:00 <EthanHeilman>     "TAPSCRIPT (OP_CAT) tests CAT on different sized random stack elements. Script is CAT PUSHDATA1 0x18 c24f2c1e363e09a5dd56f089a0385490a11b6dc6740f3513 EQUAL"
16118:00 <EthanHeilman> ],
16218:02 <EthanHeilman> This is what that a very similar tapscript script test looks like now
16318:02 <EthanHeilman> [
16418:02 <EthanHeilman>     [
16518:02 <EthanHeilman>         "c24f2c1e363e09a5dd56f0",
16618:02 <EthanHeilman>         "89a0385490a11b6dc6740f3513",
16718:02 <EthanHeilman>         "CAT 0x4c 0x18 0xc24f2c1e363e09a5dd56f089a0385490a11b6dc6740f3513 EQUAL",
16818:02 <EthanHeilman>         "<AUTOGEN:CONTROLBLOCK>",
16918:02 <EthanHeilman>         0.00000001
17018:02 <EthanHeilman>     ],
17118:02 <EthanHeilman>     "",
17218:02 <EthanHeilman>     "0x51 0x20 <AUTOGEN:TAPROOTOUTPUT>",
17318:02 <EthanHeilman>     "P2SH,WITNESS,TAPROOT,OP_CAT",
17418:02 <EthanHeilman>     "OK",
17518:02 <EthanHeilman>     "TAPSCRIPT Tests CAT on different sized random stack elements and compares the result."
17618:02 <EthanHeilman> ],
17718:03 <hernanmarino> new semantic is more human readable and less error prone
17818:04 <EthanHeilman> Yes! And you can edit it by hand without writing code to generate a valid control block so it makes it easy to add new tapscript tests to script_tests.json
17918:05 <EthanHeilman> We are five minutes past the hour, I'm happy to lurk and discuss this more, I'm going to bring the meeting to a end
18018:05 <EthanHeilman> Please add any comments or questions to the PR
18118:06 <rot13maxi> EthanHeilman was there other feedback you were hoping to get but didn't?
18218:06 <Guest42> Is there another meeting tomorrow?
18318:08 <EthanHeilman> #endmeeting