Apply strict verification flags for transaction tests and assert backwards compatibility (tests, consensus)

Dec 2, 2020

https://github.com/bitcoin/bitcoin/pull/19698

Host: glozow - PR author: glozow

The PR branch HEAD was 110239f at the time of this review club meeting.

Notes

Bitcoin script allows users to attach spending conditions to transaction outputs. A “locking” script in a transaction output specifies some spending condition, such as providing a public key that hashes to a specified digest and a valid signature for that public key. An “unlocking” script in a transaction input supplies the operations and/or data to satisfy that condition. Transaction validation includes interpreting these scripts and running their specified operations such as verifying signatures, hashing data, checking timestamps, etc.
The Bitcoin script interpreter includes code that understands all types of scripts and accepts script verification flags to adjust which rules to enforce. Determining which flags to apply is very important.
- Example 1: When a node validates a block, it determines what the consensus rules are (see GetBlockScriptFlags, which also serves as a little tour of Bitcoin consensus history) before validation. During Initial sync, the consensus rules to apply will change as the block chain advances.
- Example 2: Bitcoin Core nodes also enforce policy (or “standardness”) rules in addition to consensus rules when accepting transactions into their mempool by applying additional script verification flags.
- The SCRIPT_VERIFY_DISCOURAGE_UPGRADEABLE_NOPS policy flag prevents a node from accepting a transaction into its mempool if that transaction uses an upgradeable OP_NOP. During activation for CHECKLOCKTIMEVERIFY, for example, an in-mempool transaction using NOP2 could have been invalidated between blocks. “Discouraging” transactions in this way is preferable to re-validating all transactions in the mempool after every block.
- PR #10699 describes a procedure for adding new consensus rules via upgradable NOPs or witness versions which incorporates this idea.
PR #10699 established the precedent that all script verification flags must be soft forks, i.e. more script verification flags = more strict spending rules.
PR #19698 extends the current transaction unit tests by asserting that each test applies flags precisely by asserting an opposite result when flags are removed or added. The goal is to ensure that valid transaction tests don’t pass due to a lack of flags and invalid transaction tests don’t fail due to an excess of flags.

Questions

What is the difference between PolicyScriptChecks and ConsensusScriptChecks?
Since the code for taproot is already in interpreter.cpp, why aren’t taproot rules being enforced yet (what condition in the code would it exit on)? How would we invoke VerifyWitnessProgram to make it apply taproot spending rules?
This PR edits some of the CheckLockTimeVerify tests by removing the 1 (push a 1 to the stack) at the end of the scripts. However, it doesn’t do this for all of them (example). Why? (Hint: try removing it and running the transaction tests, src/test/test_bitcoin --log_level=all --run_test=transaction_tests)
What does TrimFlags do?
What does it mean for script verify flags to be “minimal” and “maximal?”
How do the tests check that valid transaction tests have maximal/minimal flags?
What does it mean for script verify flags to be “soft forks?” How do the tests check this?

Meeting Log

17:57

<glozow> ____ ____ ____ ____ _____

17:57

<glozow> / __"| u U /"___| U | _"\ u ___ U| _"\ u |_ " _| 🏴‍☠️

17:57

<glozow> <\___ \/ \| | \| |_) |/ |_"_| \| |_) |/ | |__/

17:58

<glozow> u___) | | |___ | _ < | | | __/ /| |

17:58

<glozow> |____/>> \____| |_| \_\ U/| |\U |_| u |_|

17:58

<glozow> )) (__) // \\ // \\_ .-,_|___|_,-. ||>>_ _// \\_

17:58

<glozow> (__) (__)(__) (__) (__) \_)-' '-(_/ (__)__) (__) (__)

17:58

<glozow> (in 2 minutes)

17:58

<jnewbery> \o/

17:59

<MarcoFalke> I think I am using the wrong IRC client to see ascii art

17:59

<willcl_ark> very nifty

18:00

<jnewbery> #startmeeting

18:00

<sequel> fixed width font helps

18:00

<MarcoFalke> copy-paste in vim helped :)

18:00

<glozow> hi everyone!

18:00

<willcl_ark> hi

18:00

<sequel> hi

18:00

<robot-dreams> hi

18:00

<MarcoFalke> hi

18:00

<elle> hi!

18:00

<mango> hi

18:00

<thomasb06> hi

18:00

<dhruvm> hi

18:00

<michaelfolkson> hi

18:00

<schulzemic> hi

18:00

<emzy> hi

18:00

<jnewbery> hi all. Welcome to review club! Feel free to say hi to let everyone know you're here

18:00

<cangr> hi

18:00

<jnewbery> anyone here for the first time?

18:01

<schulzemic> yes

18:01

<glozow> Welcome schulzemic! ☜(ﾟヮﾟ☜)

18:01

<emzy> MarcoFalke: hehe, that was what I also did.

18:01

<sequel> welcome schulzemic

18:01

<murch> hi

18:01

<cangr> yes

18:01

<jnewbery> welcome schulzemic :)

18:01

<schulzemic> Thanks!

18:01

<jnewbery> glozow is hosting today, but I just wanted to mention something first

18:01

<jnewbery> I'm currently looking for hosts for future review club meetings.

18:02

<jnewbery> I've got a lot of commitments right now, so anyone volunteering to host would be really helping me!

18:02

<jnewbery> I can help you prepare/review notes and questions, and help you get ready for the meeting.

18:02

<jnewbery> If you've been coming to the meetings for a few weeks/months, then you're ready to host! Take a look at all the people who've hosted before you here: https://bitcoincore.reviews/meetings-hosts/.

18:02

<jnewbery> I find that teaching something is one of the best ways of learning. Hosting a review club will force you to understand the PR that week. So if you're serious about improving your understanding of Bitcoin Core, then I'd highly recommend hosting a review club.

18:02

<jnewbery> Feel free to message me after the meeting in this channel or privately.

18:03

<jnewbery> ok, over to glozow for the fun part..

18:03

<glozow> woo thanks jnewbery! Today we’re talking about PR #19698 https://github.com/bitcoin/bitcoin/pull/19698

18:03

<robot-dreams> Welcome cangr and thanks jnewbery!

18:03

<glozow> Did anyone get the chance to review the PR? (y/n)

18:03

<anir> y

18:03

<jnewbery> y

18:03

<willcl_ark> y

18:03

<elle> y

18:03

<thomasb06> y

18:03

<glozow> (☞ﾟヮﾟ)☞ cangr welcome!

18:03

<robot-dreams> y-ish but want to take a closer look after the meeting

18:04

<murch> y

18:04

<cangr> thanks

18:04

<emzy> y (reading it and running make check)

18:04

<glozow> I just hope that everyone learns something, so feel free to ask any questions :) a lot of the questions today are also more conceptual than implementation

18:04

<glozow> Let’s get this party started! First warm-up question: What is the difference between `PolicyScriptChecks` and `ConsensusScriptChecks`?

18:05

<willcl_ark> As I understand it, policies are soft local rules regarding what our node will create and accept. Consensus rules by contrast are hard limits which define what we accept as valid transactions and blocks on the network.

18:05

<glozow> link to code btw: https://github.com/bitcoin/bitcoin/blob/a35b948836db20fab9b48d3b77cf9f23ffee109a/src/validation.cpp#L924

18:05

<pinheadmz> policy is the ruleset that validates entry into the mempool, consensus is the ruleset that validates entry into blocks -- confusing but mempool rules are always *more* restrictive!

18:05

<thomasb06> PolicyScriptChecks is executed first, and ConsensusScriptChecks only checks if the script doesn't pass because of flags

18:06

<dhruvm> Policy is the prerogative of the node, Consensus is what the network expects and considers honest.

18:06

<willcl_ark> and we use policy checks as a "cheap" DOS prevention before running expensive? consensus checks

18:06

<glozow> willcl_ark pinheadmz dhruvm correct about policy vs consensus! can anyone tell me what the differences are in the code?

18:06

<sequel> policy screens on the node level, consensus on the network level (should never make it into the chain)

18:07

<michaelfolkson> If a transaction doesn't follow consensus rules then it would fork the chain if included a block

18:07

<glozow> willcl_ark not really, they are both script checks and aren't that different in terms of computational complexity

18:07

<sipa> sdaftuar just wrote a great answer on this question: https://bitcoin.stackexchange.com/questions/100317/what-is-the-difference-between-policy-and-consensus-when-it-comes-to-a-bitcoin-c

18:08

<glozow> sipa: that's an awesome post

18:08

<michaelfolkson> That was amazing. Thanks for that sdaftuar

18:08

<pinheadmz> in the code, there are bitfields like this one https://github.com/bitcoin/bitcoin/blob/a35b948836db20fab9b48d3b77cf9f23ffee109a/src/policy/policy.h#L60 that define a set of rules to check

18:08

<sipa> glozow: it depends, i think

18:08

<willcl_ark> sipa: oh wow that is a nice answer

18:09

<sipa> some policy checks (specifically resource limit related ones, size of scripts, ...) let us avoid doing signature checks before it's needed

18:09

<sipa> policy is really a whole bunch of unrelated things

18:09

<glozow> ah that's true

18:09

<glozow> most of those are in PreChecks though, I believe

18:10

<murch> Policy checks only apply to unconfirmed transactinos, though, right?

18:10

<MarcoFalke> (mempool) policy even includes non-script reject reasons like tx dependencies

18:10

<michaelfolkson> And so we try to do checks from cheapest to most expensive in that order?

18:10

<pinheadmz> and IIRC sipa, for soft forks the rules apply to the mempool (policy) before the actual soft fork activation. Does that hold true for taproot? For example with the next release of bitcoind, taproot rules will actually apply in the mempool even though they are not enforced in blocks yet?

18:10

<sipa> glozow: oh, this is just about the script validation flags for policy? in that case you're right

18:10

<glozow> what I wanted to get to was `PolicyScriptChecks` and `ConsensusScriptChecks` both call `CheckInputScripts`, but with different script verification flags :)

18:10

<MarcoFalke> murch: yes

18:10

<robot-dreams> One question about the the code difference, why does Consensus use CheckInputsFromMempoolAndCache?

18:11

<pinheadmz> actually answer my own question: https://github.com/bitcoin/bitcoin/blob/a35b948836db20fab9b48d3b77cf9f23ffee109a/src/policy/policy.h#L76

18:11

<sipa> pinheadmz: actually, no: https://github.com/bitcoin/bitcoin/pull/20165

18:13

<murch> Hehe, is everyone busy reading Suhas's post or why did we stop? :D

18:13

<glozow> robot-dreams: I think it's to use cached coins or signature verification result, I'd have to check more closely

18:13

<glozow> ok yes, next question: Since the code for taproot is already in interpreter.cpp, why aren’t taproot rules being enforced yet (what condition in the code would it exit on)? How would we invoke `VerifyWitnessProgram` to make it apply taproot spending rules?

18:13

<glozow> Code we’re looking at: https://github.com/bitcoin/bitcoin/blob/2ee954daaee5758d04935492139808cb6fd95233/src/script/interpreter.cpp#L1885-L1926

18:14

<thomasb06> (ko)

18:15

<pinheadmz> well this is interesting, because if you look at SCRIPT_VERIFY_TAPROOT -- it is included in the standard verify flags in the code i posted, but then those txs are actually not relayed because of the code sipa posted -- right?

18:15

<willcl_ark> Looks like a Taproot tx would currently fail at src/validation.cpp#MemPoolAccept::PreChecks#L696

18:15

<jnewbery> robot-dreams: that's an excellent question. There's a comment here: https://github.com/bitcoin/bitcoin/blob/a35b948836db20fab9b48d3b77cf9f23ffee109a/src/validation.cpp#L959-L975 that explains why that function is being called

18:16

<willcl_ark> We should call VerifyWitnessProgram with `witversion=1` to make it apply taproot rules

100

18:17

<robot-dreams> I'm guessing a taproot transaction would not make it into the mempool due to the policy flag `SCRIPT_VERIFY_DISCOURAGE_UPGRADABLE_TAPROOT_VERSION`

101

18:18

<sipa> no, that's for unknown leaf versions

102

18:18

<sipa> the only defined leaf version is 0xc0 for tapscript currently

103

18:19

<glozow> is there a condition within that control block that would exit early if we... say... didn't have a script verification flag set? :P

104

18:19

<elle> im assuming that the SCRIPT_VERIFY_TAPROOT bit is not yet added to the flags variable. Meaning that this line is false: if (VersionBitsState(pindex->pprev, consensusparams, Consensus::DEPLOYMENT_TAPROOT, versionbitscache) == ThresholdState::ACTIVE) {

105

18:20

<elle> (https://github.com/bitcoin/bitcoin/blob/master/src/validation.cpp#L1861)

106

18:20

<glozow> elle: ding ding ding!!! winner winner

107

18:20

<glozow> today is all about script verification flags, in case y'all haven't noticed :)

108

18:20

<pinheadmz> well but isnt it included in standard flags? https://github.com/bitcoin/bitcoin/blob/a35b948836db20fab9b48d3b77cf9f23ffee109a/src/policy/policy.h#L76

109

18:21

<glozow> not consensus

110

18:21

<pinheadmz> aha, missed the context :-P

111

18:22

<glozow> and as elle pointed out, whenever we validate blocks, we get our consensus rules in the form of flags in `GetBlockScriptFlags`: https://github.com/bitcoin/bitcoin/blob/04670ef81ea2300fcba4e1a492c4c6b0e0752848/src/validation.cpp#L1822

112

18:22

<glozow> okie dokie

113

18:22

<sipa> in the mempool, taproot script validation currently happens in 0.21... except that as long as it isn't active on the network, a pre-script policy check will reject it before the script interpreter even runs

114

18:22

<glozow> Let’s dive into the test JSON files :D This PR edits some of the CTV and CSV tests by removing the `1` (push a 1 to the stack) at the end of the scripts. However, it doesn’t do this for all of them. Why?

115

18:22

<sipa> CLTV?

116

18:22

<glozow> CLTV OOPS

117

18:22

<glozow> no CTV yet

118

18:22

<michaelfolkson> I missed the CTV soft fork happened ;)

119

18:23

<jnewbery> glozow: is it something to do with the V part of CLTV?

120

18:24

<robot-dreams> Wild guess, is it because you now want maximal flags and are enforcing `CLEANSTACK`?

121

18:24

<murch> robot-dreams: That would explain why they are removed, but not why some were left in

122

18:25

<glozow> btw here's an example of a test that _isn't_ getting its 1 removed: https://github.com/bitcoin/bitcoin/pull/19698/files#diff-7e4229911841f1d419c71a0d0df95feb07b77f90c0ff39f09182eb8ca50779b9L199

123

18:25

<murch> Or maybe it does

124

18:26

<glozow> what does the script "0 CHECKLOCKTIMEVERIFY 1" do?

125

18:26

<glozow> vs a script "499999999 CHECKLOCKTIMEVERIFY 1" (which we're removing the 1 from)?

126

18:27

<sipa> the first leaves "0 1" on the stack if it succeeds; the second leaves "499999999 1" on the stack

127

18:27

<nehan> why are the policy script checks defined in STANDARD_SCRIPT_VERIFY_FLAGS in policy.h but the consensus script checks are enumerated separately in the function GetBlockScriptFlags?

128

18:27

<pinheadmz> sipa CLTV doesnt pop a number off the stack when it compares to the tx locktime ?

129

18:27

<willcl_ark> Well we need the script to end non-zero, which the first won't do without the final `1`

130

18:28

<sipa> pinheadmz: it cannot! it's a redefined NOP

131

18:28

<glozow> pinheadmz: nope! why :P

132

18:28

<glozow> AW MAN SIPA U GAVE IT AWAY

133

18:28

<jnewbery> if an opcode has verify in the name, then it leaves the stack unchanged

134

18:28

<sipa> sowwwy

135

18:28

<glozow> NOPs can't touch the stack!

136

18:28

<sipa> jnewbery: wrong

137

18:28

<jnewbery> oh really?

138

18:28

<sipa> CHECKSIGVERIFY definitely modifies the stack

139

18:28

<jnewbery> yes, totally wrong

140

18:28

<jnewbery> ignore me

141

18:28

<sipa> /ignore jnewbery

142

18:28

<michaelfolkson> That's generally right though

143

18:28

<michaelfolkson> Just some counterexamples

144

18:29

<glozow> ok we can still use our brains a little bit... WHY can't upgradeable NOPs touch the stack?

145

18:29

<pinheadmz> if the CLTV failed though, these examples would fail technically with n empty stack because the 1 at the end would never get pushed to the stack

146

18:29

<murch> Because that would be a hardfork

147

18:29

<willcl_ark> so that old clients don't fail the scripts

148

18:29

<pinheadmz> glozow bc they were soft forked in! i shouldve known

149

18:29

<robot-dreams> nehan: I think it's because consensus checks depends on which block you're on (so you can't just set hardcoded flags)

150

18:29

<glozow> yaaaaaaas

151

18:30

<pinheadmz> oops i mean they would fail without the 1 but still the first number bc of the NOPness

152

18:30

<glozow> and yes, sorry nehan, i forgot to get back to you on your question - block script flags change from block to block, so we re-calculate them every time

153

18:30

<sipa> robot-dreams: yeah, policy just always enables everything we know about; consensus flags depend on context

154

18:30

<glozow> thanks robot-dreams :)

155

18:30

<nehan> robot-dreams: thanks!

156

18:30

<glozow> alrighty, so why don't we take off the 1 at the end of "0 CHECKLOCKTIMEVERIFY 1" ?

157

18:31

<glozow> (this is a valid tx btw)

158

18:31

<willcl_ark> we need non-zero stack at the end

159

18:31

<glozow> willcl_ark: exactly!

160

18:31

<MarcoFalke> some consensus flags can be and have been backdated to the genesis block, though

161

18:32

<willcl_ark> but does `499999999 CHECKLOCKTIMEVERIFY` not leave us with nothing too?

162

18:32

<glozow> well, what does 499999999 evaluate to?

163

18:32

<sipa> pinheadmz: if the CLTV fails the script immediately aborts entirely with failure; no code after it is executed anymore, but that's not really relevant if we've already aborted

164

18:32

<willcl_ark> ah it just gives 499999999 I see now

165

18:32

<willcl_ark> was looking at the wrong opcode

166

18:32

<glozow> 👍

167

18:32

<glozow> NEXT! What does `TrimFlags` do? (Extra credit question: why does SCRIPT_VERIFY_CLEANSTACK need to be used with SCRIPT_VERIFY_P2SH and SCRIPT_VERIFY_WITNESS?)

168

18:33

<pinheadmz> MarcoFalke example ?

169

18:33

<MarcoFalke> pinheadmz: the witness v0 script flags

170

18:33

<glozow> link to the TrimFlags code: https://github.com/bitcoin/bitcoin/blob/110239f2ff673eaea8f59c650792f3641855263d/src/test/transaction_tests.cpp#L138-L148

171

18:33

<robot-dreams> I'm a bit behind... I understand why we cannot take off the 1 at the end of "0 CLTV 1", but just to confirm, why DO we take off the 1 at the end of "499999999 CLTV 1"?

172

18:34

<MarcoFalke> pinheadmz: ok wait, they aren't enabled for the p2sh exception, so still depend on context

173

18:34

<sipa> robot-dreams: what would the script do if you left the "1" off?

174

18:34

<glozow> robot-dreams: we'd like a clean stack, and in that case we'd have 4999999 and 1 at the end

175

18:34

<sipa> robot-dreams: oh, i misread

176

18:34

<glozow> would anyone who's been quiet today like to tell us what CLEANSTACK means? :)

177

18:35

<murch> robot-dreams: I am also still wondering that.

178

18:35

<sipa> glozow: you could turn it into "0 CLTV DROP 1" if you wanted cleanstack for the first :)

179

18:35

<felixweis> it means that after all opcodes have executed on the stack, the stack must be empty or 1 (i forgot)

180

18:35

<willcl_ark> Single stack element remaining after execution

181

18:35

<robot-dreams> sipa: Yes, that was my confusion. I didn't understand why `0 1` would be considered a claen stack

182

18:36

<sipa> robot-dreams: it's not

183

18:36

<murch> Ah, thanks.

184

18:36

<pinheadmz> is the cleanstack rule just a malleability fix?

185

18:37

<willcl_ark> So am I right in reading CLEANSTACk is not consensus critical before taproot, but is for taproot?

186

18:37

<glozow> felixweis: yup! we don't want any extra items on the stack. I think 0 vs 1 depends on context too, sipa explains it here https://github.com/bitcoin/bitcoin/pull/20006#issuecomment-698487304

187

18:37

<sipa> willcl_ark: CLEANSTACK really means two different things

188

18:37

<felixweis> its required since segwit and p2sh

189

18:37

<murch> So previously the test was merely considering the tx valid because we didn't enforce the cleanstack rule, and now that we're defaulting to all flags, it would fail on the cleanstack rule

190

18:38

<willcl_ark> aha

191

18:38

<sipa> the script flag CLEANSTACK is only for top-level script evaluation

192

18:38

<willcl_ark> also: "Note: CLEANSTACK should never be used without P2SH or WITNESS."

193

18:38

<sipa> witv0/taproot have an implicit cleanstack-like behavior as part of their consensus rules

194

18:39

<willcl_ark> thanks sipa.

195

18:39

<sipa> but that isn't controlled by the CLEANSTACK flag

196

18:39

<glozow> right

197

18:39

<willcl_ark> Just to keep devs on their toes :P

198

18:39

<felixweis> so if the taproot stack after execution is empty and cleanstack is enabled the input still fails?

199

18:40

<glozow> ok everybody, I don't want to lose people so let's just go back to the first question :) what does TrimFlags do?

200

18:40

<robot-dreams> `TrimFlags` enforces constraints between flags (e.g. if CLEANSTACK is on, WITNESS must be on), and does so by turning off flags.

201

18:40

<glozow> robot-dreams: correct!

202

18:41

<glozow> what does `~` do in this context?

203

18:41

<michaelfolkson> bitwise NOT

204

18:41

<glozow> michaelfolkson: yup!

205

18:41

<robot-dreams> Do we expect many more such constraints in the future?

206

18:41

<murch> felixweis: Empty stack is also a problem, I think it should end on a single element.

207

18:41

<glozow> (not a flag destructor, in some countries that is illegal)

208

18:42

<felixweis> single element != 0

209

18:42

<glozow> Arriving at our main course today: What does it mean for script verify flags to be “minimal” and “maximal?”

210

18:42

<pinheadmz> ha

211

18:42

<glozow> robot-dreams: not sure. depends on sipa i guess

212

18:43

<sipa> help

213

18:43

<sipa> i sincerely hope not

214

18:43

<murch> glozow: Valid txes should pass with all flags that are not explicitly excluded in this specific transactino, and invalid txes should fail on exactly the specific flag that's tested

215

18:44

<glozow> murch: yes! we don't want transaction tests to be passing/failing because we omitted/added extra flags

216

18:44

<willcl_ark> yes, maximal should mean that adding any extra flag will make the transaction fail

217

18:44

<murch> felixweis: I'm not sure what you mean. Do you mean to express that "0" is empty stack?

218

18:45

<glozow> willcl_ark: yes! thanks for elaborating more concretely

219

18:45

<glozow> so how do we implement this? what does the PR do to test for maximal/minimal flags?

220

18:45

<murch> And minimal means that no flags can be removed without making the tx pass as valid

221

18:45

<glozow> Code at https://github.com/bitcoin/bitcoin/blob/110239f2ff673eaea8f59c650792f3641855263d/src/test/transaction_tests.cpp#L241-L246

222

18:45

<felixweis> murch: i thought it has to be a single element value. does the value have to be boolan true?

223

18:46

<sipa> felixweis: [] is an empty stack, [""] is not

224

18:46

<sipa> OP_0 pushes "" onto the stack

225

18:46

<robot-dreams> For a valid transaction, we try every unset flag and make sure adding it invalidates the transaction.

226

18:46

<felixweis> thanks

227

18:47

<glozow> robot-dream: correct 👌

228

18:47

<murch> sipa: But does a stack have to be empty or end on a positive single element in the end. I thought it was the latter for valid txes.

229

18:47

<robot-dreams> I had a question about that, would it be helpful to try every valid *combination* of unset flags, or would that blow up the test time way too much?

230

18:47

<glozow> robot-dreams: that's the next question :P

231

18:47

<sipa> murch: you have to have a stack with a single element in it, which has to be nonzero

232

18:47

<willcl_ark> stack has to terminate with any element not `0`

233

18:48

<murch> Thx

234

18:48

<sipa> murch: (for CLEANSTACK)

235

18:48

<felixweis> what about NaN? /s

236

18:48

<murch> Right

237

18:48

<glozow> ok and what about testing that invalid transaction tests are minimal? how's that implemented?

238

18:48

<sipa> without CLEANSTACK, a non-empty stack whose top element is nonzero is enough

239

18:48

<glozow> have minimal flags* i mean

240

18:48

<murch> glozow: Presumably by trying to remove every set flag in turn and checking that this makes the tx vaild

241

18:48

<glozow> murch: correct

242

18:49

<glozow> but hey, is it possible that we're removing a flag and getting an invalid combination? e.g. CLEANSTACK without P2SH?

243

18:49

<felixweis> isnt that what trimflags is for?

244

18:49

<murch> No, because if CLEANSTACK is set P2SH is turned back on by trimFlags()

245

18:49

<robot-dreams> Yeah, I think glozow addresses this with FillFlags / TrimFlags calls

246

18:50

<glozow> felixweis murch robot-dreams yah it's almost like last question feeds into this question

247

18:50

<michaelfolkson> Just to confirm sipa. For CLEANSTACK you have to have a stack with a single element in it which has to be nonzero. Without CLEANSTACK a non-empty stack with multiple elements is ok as long as the top element is nonzero?

248

18:50

<glozow> WOOT

249

18:50

<murch> glozow: You've set us up

250

18:50

<glozow> last question! What does it mean for script verify flags to be “soft forks?” How do the tests check this?

251

18:50

<sipa> michaelfolkson: correct

252

18:51

<glozow> hint: https://github.com/bitcoin/bitcoin/pull/10699

253

18:51

<emzy> michaelfolkson: thx for the sum up.

254

18:52

<glozow> bigger hint: https://github.com/bitcoin/bitcoin/blob/3caee16946575e71e90ead9ac531f5a3a1259307/src/script/interpreter.h#L36-L40

255

18:52

<murch> glozow: I think it means that no flags should increase the set of acceptable txes when they're turned on

256

18:52

<felixweis> these flags only apply to mempool policy. so older nodes will not accept and forward new transaction features.

257

18:52

<robot-dreams> michaelfolkson: Thanks! Just to confirm, in this PR would it be necessary to change `0 CLTV 1` into `0 CLTV DROP 1` as sipa mentioned above?

258

18:53

<michaelfolkson> sipa: And so without CLEANSTACK you could have a horrible non-empty stack with TRUE FALSE 3 and it would pass. I don't think I understand this...

259

18:53

<felixweis> because we can't verify the new rules, people might be ddosing us with invalid transactions under new (yet to be definded) soft fork rules

260

18:53

<murch> glozow: TBH, I'm only getting that from interpreting your commit message, though (^_^)/

261

18:53

<glozow> robot-dreams: yes, this PR enforces that the tests do CLEANSTACK

262

18:53

<glozow> (when they can)

263

18:53

<sipa> michaelfolkson: yes; what's hard to understand about that?

264

18:54

<michaelfolkson> Why should it pass? It is a mess haha

265

18:54

<sipa> why wouldn't it?

266

18:54

<dhruvm> glozow: script verify flags are soft forks because each new flag can only be more restrictive. The tests check that here: https://github.com/bitcoin/bitcoin/pull/19698/files#diff-cc0c6a9039a1c9fe38b8a21fe28391fffbac9b8531dfda0f658919a9f74b46baR229 by making sure that the flags are maximal after `TrimFlags`?

267

18:54

<sipa> the computation succeeded

268

18:54

<felixweis> the final check just looks at the top element

269

18:54

<glozow> murch dhruvm: yup, adding a flag can never increase the space of valid scripts

270

18:55

<sipa> michaelfolkson: say the script is a boring "<pubkey> OP_CHECKSIG"; normally you'd pass it a [<sig>] as input, and it'd leave [1] on the stack

271

18:55

<sipa> if you pass it [<garbage1>,<garbage2>,<sig>], it'll leave [<garbage1>,<garbage2>,1] on the stack instead

272

18:55

<glozow> so we add/subtract individual flags to check this, but what about combinations of flags like robot-dreams mentioned earlier?

273

18:55

<felixweis> michaelfolkson: maybe some fancy script designers wanted to do multiple conditional execution paths and the shortest definition to do both didn't leave the stack clean

274

18:55

<sipa> whatever the script was enforcing worked just fine, but there is garbage left

275

18:56

<glozow> hint: https://github.com/bitcoin/bitcoin/pull/19698/files#diff-cc0c6a9039a1c9fe38b8a21fe28391fffbac9b8531dfda0f658919a9f74b46baR328

276

18:56

<sipa> michaelfolkson: i think you have to see bitcoin script as a programming language, and that language used to be defined without cleanstack... so you write scripts that do what you want given those semantics

277

18:57

<sipa> now the language does have cleanstack, and in theory that means you need slightly different scripts to do what you want

278

18:57

<sipa> in practice... not really

279

18:57

<dhruvm> glozow: since the flags have dependencies, you probably can't do all combinations?

280

18:57

<michaelfolkson> Ok I guess I'm surprised because I always thought a dirty stack (in the context of CLEANSTACK) would always fail. I misunderstood the other ways a script could still pass

281

18:57

<glozow> dhruvm: yes! so what do we do instead?

282

18:58

<dhruvm> glozow: TrimFlags?

283

18:58

<glozow> er i mean, we don't do all of them because that'd be 2^n

284

18:58

<michaelfolkson> But thanks sipa felixweis

285

18:58

<felixweis> was there ever a transaction using "p2sh" before p2sh activated? in other words would p2sh soft fork from genesis fork the chain?

286

18:58

<glozow> that's a lotta testing

287

18:58

<robot-dreams> glozow pulls out the RNG

288

18:58

<glozow> yeah! we just do random combinations :P

289

18:58

<sdaftuar> felixweis: one i think

290

18:58

<glozow> sorry, rushed the last question a little

291

18:58

<sipa> sdaftuar: oh, really?

292

18:59

<glozow> We've reached the end of our program. Thanks everyone! And now, a word from our sponsor, jnewbery

293

18:59

<jnewbery> it's the reason we can't apply P2SH/segwit back to genesis, right?

294

18:59

<dhruvm> glozow: Does that makes the test failures hard to reproduce?

295

18:59

<felixweis> sdaftuar: thanks, was it someone just experimenting with all the opcodes?

296

18:59

<MarcoFalke> (slightly off-topic) There is also a fuzz test that does random combinations: https://github.com/bitcoin/bitcoin/blob/3caee16946575e71e90ead9ac531f5a3a1259307/src/test/fuzz/script_flags.cpp#L54

297

18:59

<sdaftuar> jnewbery: yes, trying to find my notes on it

298

18:59

<jnewbery> thanks glozow. That was great. Really good dive into some pretty complex parts of the code.

299

18:59

<jnewbery> Hope everyone enjoyed!

300

18:59

<glozow> dhruvm: yeah i guess, but these are mostly sanity checks for making sure the test you wrote was correct. meta-testing 🤷

301

18:59

<willcl_ark> thanks glozow!

302

18:59

<jnewbery> I just wanted to repeat my request for hosts from the start. I have a slot open next week for anyone who's keen.

303

19:00

<felixweis> thanks glozow!!

304

19:00

<sipa> dhruvm: i expect that in practice this is not a problem; if there were an actual softforkability bug somewhere that these tests catch, you'll suddenly have tons of combinations fail

305

19:00

<cangr> thanks a lot

306

19:00

<robot-dreams> sdaftaur: Is this related to the comment at the start of `GetBlockScriptFlags` that says "one historical block violated the P2SH rules"?

307

19:00

<thomasb06> thanks glozow

308

19:00

<jnewbery> (and every week after that)

309

19:00

<elle> Thanks glozow! that was awesome. really enjoy learning about Script

310

19:00

<sipa> glozow: thanks!