Add getdescriptorinfo functional test (tests)

Jul 3, 2019

https://github.com/bitcoin/bitcoin/pull/15443

Host: jnewbery - PR author: promag

Notes

Output script descriptor are a language for describing individual scriptPubKeys or HD chains of scriptPubKeys. See the Bitcoin Core documentation.
PR 15368 added checksums to descriptors. Since descriptors contain private or public keys where a transcription error can lead to loss of funds, having to checksum to ensure correct transcription is a sensible precaution.
Those checksums are based on a similar BCH encoding scheme as the bech32 address format. There are very detailed comments in the implementation explaining how the checksum is calculated.
The PR also added a new RPC method getdescriptorinfo that can be called with a descriptor to analyse the descriptor and add the checksum. That RPC method:
- is a pure function. It has no side effects.
- doesn’t need access to any blockchain or mempool data
- doesn’t have access to the wallet.
The PR also added unit tests and updated several functional tests to use the checksums. A python implementation of the checksum was also added.
PR #15443 adds functional tests specifically for the new getdescriptorinfo RPC method.

Questions

What type of testing is used in Bitcoin Core?
What are the uses for unit tests and functional tests?
What is meant by solvable when talking about a descriptor?
Why are address and raw type descriptors always unsolvable?

Meeting Log

13:00

<jnewbery> hi!

13:00

<lightlike> hi!

13:00

<ariard> hi!

13:00

<jonatack> hi!

13:00

<ccdle12> hi!

13:01

<jnewbery> whilst we give everyone a couple of minutes to get here, I suggest we all review the notes and questions at https://bitcoin-core-review-club.github.io/15443.html

13:01

<amiti> hi!

13:02

<jnewbery> This week's PR is adding new tests. There are no changes to the C++ code.

13:02

<jnewbery> I don't think we've covered a test-only PR yet, which is a bit of an oversight, since they're a good place to start contributing

13:03

<jnewbery> I thought this one was interesting because it gives us a chance to look at output script descriptors, which are going to be important in the Bitcoin Core wallet

13:04

<jnewbery> Before we start with the *Questions* section of https://bitcoin-core-review-club.github.io/15443.html, did everything make sense in the *Notes*? Did anyone have any questions about those points?

13:05

<jnewbery> ok, great. digi_james: I know you had some questions about solvability and spendability this week. Can you state what your question was and what you learned?

13:05

<sosthene> I wonder when the descriptors were added? I don't remember seeing it before upgrading to 0.18, but maybe I wasn't paying attention

13:06

<jnewbery> sosthene: look at the docs here: https://github.com/bitcoin/bitcoin/blob/master/doc/descriptors.md . The first RPC to add support for descriptors was scantxoutset in 0.17

13:06

<digi_james> jnewbery: a descriptor is solvable if the output script and sigscript (minus signatures) can be derived from the output descriptor.

13:07

<digi_james> jnewbery: a descriptor is spendable if the signature can also be derived if the private key is also known.

13:08

<jnewbery> digi_james: that's right. So if the descriptor allows us to derive the scriptPubKey but not the scriptSig to spend it, then it's not solvable

13:08

<digi_james> So for example, a addr(P2SH_addr) is not solvable because the P2SH preimage is not known.

13:08

<jnewbery> exactly

13:09

<jnewbery> these definitions of solvable / spendable were originally used in the wallet

13:09

<jnewbery> ok, next question: What type of testing is used in Bitcoin Core?

13:10

<sosthene> unit and functional

13:11

<jnewbery> sosthene: yeah, that's mostly it. I believe there's also some fuzz testing and property-based testing (https://github.com/bitcoin/bitcoin/pull/12775)

13:11

<lightlike> I've also seen code for fuzzing. Is this used continually?

13:12

<jonatack> (plus a bit of linting in the CI and an optional serving of fuzzing that can use more data and examples)

13:12

<jnewbery> lightlike: I've never used fuzz testing myself. I'm pretty sure it's not used in the travis CI.

13:13

<drbrule> Is the testing described in the /doc folder? Wasn't sure if it's documented or just self-documented.

13:13

<sosthene> sorry can we explain what's fuzz testing?

13:13

<jnewbery> ok, so this PR adds functional tests. What are the advantages/disadvantages of that for this testing?

13:13

<jnewbery> ariard: mind taking the question about fuzz testing?

13:14

<fjahr> functional test are slow since they test the full stack

13:14

<ccdle12> jnewbury: functional tests simulate behaviour of an end user making rpc calls

13:14

<jonatack> Fuzzing: see doc/fuzzing.md

13:15

<jnewbery> jonatack: thanks

13:15

<ariard> sosthene: basically you throw millons of random-generated inputs into full-node components to discover crashs or bugs

13:15

<jnewbery> other test documentation:

13:15

<jnewbery> - unit - https://github.com/bitcoin/bitcoin/blob/master/src/test/README.md

13:15

<jnewbery> - functional/integration - https://github.com/bitcoin/bitcoin/blob/master/test/README.md

13:16

<ariard> it may be somehow semi-structurated to incrase bugs discovering success

13:16

<lightlike> also https://en.wikipedia.org/wiki/Fuzzing for a general overview

13:16

<drbrule> jnewbury: Thank you.

13:16

<ariard> s/structurated/structured

13:16

<jonatack> also: test/functional/README.md

13:17

<sosthene> ariard: got it, thanks

13:17

<jnewbery> fjahr: ccdle12: yeah, that's right. Functional tests spin up one or more full running instances of the bitcoind node, so they test the full functional behaviour, but are much slower to execute

13:18

<jnewbery> any thoughts about whether this functionality should be tested with unit or functional tests?

13:19

<fjahr> the rpc should be tested but especially the different success paths should be unit tests

13:19

<fjahr> as ccdle12 noted funcitonal is about the interaction of the user, so it should test for responding with the correct error messages etc.

13:20

<ccdle12> maybe unit for the core logic? the descriptors seem to be almost "standalone" meaning the rpc calls don't touch any other part of the codebase/affect other parts of the running daemon

13:20

<jnewbery> fjahr: that was my initial reaction too. Why is getdescriptorinfo a good candidate for unit tests?

13:21

<jonatack> it has few dependencies and can be tested in isolation

13:21

<hugohn> because it doesn't require external dependencies

13:21

<jnewbery> ccdle12: right. Because it's a pure utility, it doesn't need any setup with the rest of the node.

13:23

<jnewbery> I'm definitely not a concept NACK (some tests are better than no tests), but I think it's something to be aware of

13:23

<jnewbery> my feeling is that we rely on functional tests a bit too much in Bitcoin Core where sometimes unit tests might be more appropriate

13:23

<lightlike> jnewbery: but then we probably wouldn't test getdescriptorinfo directly, just stumbled today over a comment (https://github.com/bitcoin/bitcoin/blob/1381ddbcfcb6429b1327fd3db91ef97d8603aef9/src/test/setup_common.cpp#L74) stating that RPC tests should be moved to functional.

13:24

<jnewbery> oh, interesting

13:25

<hugohn> RE: disadvantages of functional/integration testing. If I might add, besides the fact that it's slow, IMO it might not be possible to have full 100% test coverage with functional/integration testing (unlike unit testing), because your functional tests usually test things in very specific orders and not all the possible ways the components can react. So it's helpful but, shouldn't be relied on exclusively.

13:25

<hugohn> *can interact

13:26

<jnewbery> lightlike: there's definitely room for both. I disagree that all RPC testing should be done using the functional test suite

13:26

<jnewbery> especially for pure utility functions like `getdescriptorinfo`

13:26

<jnewbery> but opinions vary!

13:27

<jnewbery> Any thoughts about Sjors comment here: https://github.com/bitcoin/bitcoin/pull/15443#pullrequestreview-205311643 ?

13:29

<jnewbery> ok, any other questions about the code itself?

13:29

<digi_james> I was confused about sjors comments.

13:30

<digi_james> descsum_create is part of the test framework, no?

13:30

<sosthene> do we have to use functional testing to test the output of a rpc call or can it be done with a unittest?

13:30

<lightlike> I haven't understood why does a (second) python implementation for checksum has to exist (and then needs to be tested).

13:30

<digi_james> I suppose we could pass the checksum as part of the input rather than generate it via a test function.

13:31

<jnewbery> digi_james: yes, descsum_create is part of the functional test framework

13:32

<jnewbery> These tests call `getdescriptorinfo` to get the checksum from the bitcoind node, and also call descsum_create() to get the checksum from the python implementation

13:32

<jnewbery> checking that the two implementations of the checksum match is a good way to test

13:33

<jnewbery> so that's an argument for doing this as a functional test: it forces us to use implementations in two different languages

13:33

<digi_james> Why arent the checksums hardcoded in the test?

13:33

<digi_james> I see...

13:33

<jnewbery> digi_james: they could be, but having a python implementation that can be re-used makes it easier to add more tests in future

13:34

<jnewbery> but also, where would you get the hard-coded values from if you didn't implement the checksum in the test code?

13:34

<fjahr> So there are tests for descsum_create() as well?

13:34

<digi_james> jnewbery: yup that makes sense to me now ...

13:35

<jnewbery> fjahr: I don't think so. We generally don't have tests for the test code.

13:35

<jnewbery> The tests test the code and the code tests the test

13:35

<fjahr> so if we introduce logic errors in both implementations they would not be noticed

13:35

<hugohn> that's def an interesting approach. introduce new logic in the test code itself...

13:36

<hugohn> yeah... you're testing for consistency not necessarily accurary

13:36

<jnewbery> one way to test the test code is to somehow break the node implementation and verify that the test starts failing

13:37

<jonatack> fjahr: with test code I think we depend a lot on review

13:37

<jonatack> (Sjors comment seems good to me... I have a few general questions regarding writing/reviewing tests, when it's time)

13:37

<digi_james> I suppose Sjors is suggesting to test a test :)

13:37

<lightlike> jnewbery: but doesn't this reasoning apply to basically everything (e.g. all kind of cryptographic functions)? I'm not sure why it is worth reprogramming function in python just for testing purposes and then keep it in the codebase.

13:38

<lightlike> (as an initial proof of concept, of course)

13:38

<fjahr> I think having some hardcoded values to test against would be good to make regressions explicit

13:38

<jnewbery> digi_james: one way to test the test code would be to use python's unittest module

13:39

<jnewbery> but in most cases, I don't think that's worthwhile

13:39

<digi_james> :)

13:40

<jnewbery> lightlike: yes, I think it's useful to have python implementations of cryptographic functions in the test framework. See https://github.com/bitcoin/bitcoin/blob/master/test/functional/test_framework/key.py for example

13:42

<fjahr> I guess another way to deal with it is to never change the python and the c++ implementation in the same PR

13:42

<fjahr> but not sure how that could be enforced

13:42

<jnewbery> any other questions about the code in https://github.com/bitcoin/bitcoin/pull/15443/files

13:42

<jonatack> Could use some logging :p

100

13:43

<jnewbery> yeah, logging would be good!

101

13:44

<jonatack> perhaps separate into subtests too

102

13:44

<jnewbery> fjahr: I wonder if https://github.com/bitcoin/bitcoin/blob/master/doc/descriptors.md#examples should be updated to include the checksums so they can be used as test vectors for other implementations

103

13:45

<digi_james> I wonder why the test vectors did not include unsolvable and private keys = true

104

13:46

<digi_james> Or the vectors in descriptor.md should be updated with such, as fjahr suggest

105

13:46

<fjahr> yeah, that would make sense to me

106

13:46

<jnewbery> digi_james: I think because that's mainly a Bitcoin Core wallet concept

107

13:47

<jonatack> A general question, if ok, about test assertion argument order:

108

13:47

<jonatack> In some test frameworks there is a convention that can be useful for test error reports to make better sense.

109

13:47

<jonatack> In Bitcoin Core it looks like there is an implicit convention of assert_bla(actual, expected)...?

110

13:47

<jonatack> I didn't see it stipulated in the testing docs... perhaps I overlooked it.

111

13:47

<jonatack> Is there a strict (or loose) convention?

112

13:48

<jnewbery> jonatack: the convention is to use the assert_bla(actual, expected) rather than assert actual == expected

113

13:48

<jnewbery> because assert_bla(a,e) will print out a and e if they don't match

114

13:49

<jnewbery> whereas assert a == e will just fail with no helpful logging

115

13:49

<jonatack> agreed... my question pertains to the arg order

116

13:50

<jnewbery> oh right. I don't think there's a convention

117

13:50

<jnewbery> I think we have assert_greater_than() but not assert_lesser_than(), so the order depends on which you want to be greater for that one :)

118

13:50

<jonatack> it looks like actual, expected in practice

119

13:51

<jonatack> but unsure how hard or soft that is

120

13:51

<jonatack> for assert_equal in any case :)

121

13:51

<jnewbery> yeah, I think I've mostly seen actual, expected

122

13:51

<jonatack> Another question about running linters before pushing test commits

123

13:52

<jonatack> I could bit by not doing that this morning

124

13:52

<jonatack> got bit by not running linters before pushing

125

13:52

<jonatack> which do you run...

126

13:53

<jonatack> I automated lint-python.sh and lint-python-deadcode.sh

127

13:54

<jnewbery> I generally don't run them locally

128

13:54

<jnewbery> travis tells you pretty quickly when they fail

129

13:54

<lightlike> also a general question: does anyone know if there is a debug.log generated for unit tests (where the LogPrintf() go to)?

130

13:55

<jnewbery> lighlike: I don't know about debug.log for unit tests, but you can run with --log_level=all to get more verbose output: https://github.com/bitcoin/bitcoin/blob/master/src/test/README.md#running-individual-tests

131

13:56

<jonatack> jnewbery: thanks

132

13:56

<jnewbery> ok, 5 mkinutes until the hour. Any final questions before we wrap up?

133

13:56

<digi_james> jnewbery: Is there an formal/informal roadmap for descriptors? Whist there are very neat, I wonder if they could exist in PBSTs? I suppose they will play a larger role in wallets managing taproot paths?

134

13:56

<jnewbery> Also, any requests for PRs to cover next week and beyond?

135

13:57

<digi_james> Internal node/wallet interfaces?

136

13:57

<jnewbery> digi_james: we want the wallet to move entirely to descriptors. There's a project tracking that here: https://github.com/bitcoin/bitcoin/projects/12

137

13:57

<jnewbery> as for the specification of the language itself, yes, it would need to be extended to support taproot

138

13:58

<jnewbery> I'd also recommend watching the talk on miniscript here: https://www.youtube.com/watch?v=XM1lzN4Zfks&feature=youtu.be which is related

139

13:58

<digi_james> jnewberry: Ah thank you, I will have a look at the project, awesome!

140

13:59

<jnewbery> alright, let's wrap it up there. Thanks everyone!

141

13:59

<ariard> I need to think more about it but I think a lightning-prefix could be useful too

142

13:59

<hugohn> thanks jnewbery

143

13:59

<lightlike> thanks!

144

14:00

<digi_james> Thanks jnewbery, and everybody else!

145

14:00

<jonatack> Thanks jnewbery and everyone!

146

21:03

<wumpus> jnewbery: I think the reasoning is that RPC is a separate 'subsystem' from whatever is being tested (it needs to be specifically initialized/torn down), so anything testing RPC *and something else* is already a functional/integration test

147

21:05

<wumpus> the internal functions used by RPC methods can of course, independently, be tested in unit tests, especially if they're pure utility functions

148

21:05

<wumpus> but really calling RPC methods from unit tests is somewhat frowned upon, unless it's unit testing of the RPC system