Add getdescriptorinfo functional test (
tests) Jul 3, 2019
Output script descriptor are a language for describing individual
scriptPubKeys or HD chains of scriptPubKeys. See the Bitcoin Core
PR 15368 added checksums
to descriptors. Since descriptors contain private or public keys where a
transcription error can lead to loss of funds, having to checksum to ensure
correct transcription is a sensible precaution.
Those checksums are based on a similar BCH encoding scheme as the bech32
address format. There are
in the implementation explaining how the checksum is calculated. The PR also added a new RPC method
getdescriptorinfo that can be called
with a descriptor to analyse the descriptor and add the checksum. That RPC
pure function. It has no side effects. doesn’t need access to any blockchain or mempool data
doesn’t have access to the wallet.
The PR also added
and updated several functional tests to use the checksums. A python
of the checksum was also added. PR #15443 adds functional tests specifically for the new
getdescriptorinfo RPC method.
What type of testing is used in Bitcoin Core?
What are the uses for unit tests and functional tests?
What is meant by
solvable when talking about a descriptor?
Why are address and raw type descriptors always unsolvable?
8 13:02 <jnewbery> This week's PR is adding new tests. There are no changes to the C++ code.
9 13:02 <jnewbery> I don't think we've covered a test-only PR yet, which is a bit of an oversight, since they're a good place to start contributing
10 13:03 <jnewbery> I thought this one was interesting because it gives us a chance to look at output script descriptors, which are going to be important in the Bitcoin Core wallet
12 13:05 <jnewbery> ok, great. digi_james: I know you had some questions about solvability and spendability this week. Can you state what your question was and what you learned?
13 13:05 <sosthene> I wonder when the descriptors were added? I don't remember seeing it before upgrading to 0.18, but maybe I wasn't paying attention
15 13:06 <digi_james> jnewbery: a descriptor is solvable if the output script and sigscript (minus signatures) can be derived from the output descriptor.
16 13:07 <digi_james> jnewbery: a descriptor is spendable if the signature can also be derived if the private key is also known.
17 13:08 <jnewbery> digi_james: that's right. So if the descriptor allows us to derive the scriptPubKey but not the scriptSig to spend it, then it's not solvable
18 13:08 <digi_james> So for example, a addr(P2SH_addr) is not solvable because the P2SH preimage is not known.
19 13:08 <jnewbery> exactly
20 13:09 <jnewbery> these definitions of solvable / spendable were originally used in the wallet
21 13:09 <jnewbery> ok, next question: What type of testing is used in Bitcoin Core?
22 13:10 <sosthene> unit and functional
24 13:11 <lightlike> I've also seen code for fuzzing. Is this used continually?
25 13:12 <jonatack> (plus a bit of linting in the CI and an optional serving of fuzzing that can use more data and examples)
26 13:12 <jnewbery> lightlike: I've never used fuzz testing myself. I'm pretty sure it's not used in the travis CI.
27 13:13 <drbrule> Is the testing described in the /doc folder? Wasn't sure if it's documented or just self-documented.
28 13:13 <sosthene> sorry can we explain what's fuzz testing?
29 13:13 <jnewbery> ok, so this PR adds functional tests. What are the advantages/disadvantages of that for this testing?
30 13:13 <jnewbery> ariard: mind taking the question about fuzz testing?
31 13:14 <fjahr> functional test are slow since they test the full stack
32 13:14 <ccdle12> jnewbury: functional tests simulate behaviour of an end user making rpc calls
33 13:14 <jonatack> Fuzzing: see doc/fuzzing.md
34 13:15 <jnewbery> jonatack: thanks
35 13:15 <ariard> sosthene: basically you throw millons of random-generated inputs into full-node components to discover crashs or bugs
36 13:15 <jnewbery> other test documentation:
39 13:16 <ariard> it may be somehow semi-structurated to incrase bugs discovering success
41 13:16 <drbrule> jnewbury: Thank you.
42 13:16 <ariard> s/structurated/structured
43 13:16 <jonatack> also: test/functional/README.md
44 13:17 <sosthene> ariard: got it, thanks
45 13:17 <jnewbery> fjahr: ccdle12: yeah, that's right. Functional tests spin up one or more full running instances of the bitcoind node, so they test the full functional behaviour, but are much slower to execute
46 13:18 <jnewbery> any thoughts about whether this functionality should be tested with unit or functional tests?
47 13:19 <fjahr> the rpc should be tested but especially the different success paths should be unit tests
48 13:19 <fjahr> as ccdle12 noted funcitonal is about the interaction of the user, so it should test for responding with the correct error messages etc.
49 13:20 <ccdle12> maybe unit for the core logic? the descriptors seem to be almost "standalone" meaning the rpc calls don't touch any other part of the codebase/affect other parts of the running daemon
50 13:20 <jnewbery> fjahr: that was my initial reaction too. Why is getdescriptorinfo a good candidate for unit tests?
51 13:21 <jonatack> it has few dependencies and can be tested in isolation
52 13:21 <hugohn> because it doesn't require external dependencies
53 13:21 <jnewbery> ccdle12: right. Because it's a pure utility, it doesn't need any setup with the rest of the node.
54 13:23 <jnewbery> I'm definitely not a concept NACK (some tests are better than no tests), but I think it's something to be aware of
55 13:23 <jnewbery> my feeling is that we rely on functional tests a bit too much in Bitcoin Core where sometimes unit tests might be more appropriate
57 13:24 <jnewbery> oh, interesting
58 13:25 <hugohn> RE: disadvantages of functional/integration testing. If I might add, besides the fact that it's slow, IMO it might not be possible to have full 100% test coverage with functional/integration testing (unlike unit testing), because your functional tests usually test things in very specific orders and not all the possible ways the components can react. So it's helpful but, shouldn't be relied on exclusively.
59 13:25 <hugohn> *can interact
60 13:26 <jnewbery> lightlike: there's definitely room for both. I disagree that all RPC testing should be done using the functional test suite
61 13:26 <jnewbery> especially for pure utility functions like `getdescriptorinfo`
62 13:26 <jnewbery> but opinions vary!
64 13:29 <jnewbery> ok, any other questions about the code itself?
65 13:29 <digi_james> I was confused about sjors comments.
66 13:30 <digi_james> descsum_create is part of the test framework, no?
67 13:30 <sosthene> do we have to use functional testing to test the output of a rpc call or can it be done with a unittest?
68 13:30 <lightlike> I haven't understood why does a (second) python implementation for checksum has to exist (and then needs to be tested).
69 13:30 <digi_james> I suppose we could pass the checksum as part of the input rather than generate it via a test function.
70 13:31 <jnewbery> digi_james: yes, descsum_create is part of the functional test framework
71 13:32 <jnewbery> These tests call `getdescriptorinfo` to get the checksum from the bitcoind node, and also call descsum_create() to get the checksum from the python implementation
72 13:32 <jnewbery> checking that the two implementations of the checksum match is a good way to test
73 13:33 <jnewbery> so that's an argument for doing this as a functional test: it forces us to use implementations in two different languages
74 13:33 <digi_james> Why arent the checksums hardcoded in the test?
75 13:33 <digi_james> I see...
76 13:33 <jnewbery> digi_james: they could be, but having a python implementation that can be re-used makes it easier to add more tests in future
77 13:34 <jnewbery> but also, where would you get the hard-coded values from if you didn't implement the checksum in the test code?
78 13:34 <fjahr> So there are tests for descsum_create() as well?
79 13:34 <digi_james> jnewbery: yup that makes sense to me now ...
80 13:35 <jnewbery> fjahr: I don't think so. We generally don't have tests for the test code.
81 13:35 <jnewbery> The tests test the code and the code tests the test
82 13:35 <fjahr> so if we introduce logic errors in both implementations they would not be noticed
83 13:35 <hugohn> that's def an interesting approach. introduce new logic in the test code itself...
84 13:36 <hugohn> yeah... you're testing for consistency not necessarily accurary
85 13:36 <jnewbery> one way to test the test code is to somehow break the node implementation and verify that the test starts failing
86 13:37 <jonatack> fjahr: with test code I think we depend a lot on review
87 13:37 <jonatack> (Sjors comment seems good to me... I have a few general questions regarding writing/reviewing tests, when it's time)
88 13:37 <digi_james> I suppose Sjors is suggesting to test a test :)
89 13:37 <lightlike> jnewbery: but doesn't this reasoning apply to basically everything (e.g. all kind of cryptographic functions)? I'm not sure why it is worth reprogramming function in python just for testing purposes and then keep it in the codebase.
90 13:38 <lightlike> (as an initial proof of concept, of course)
91 13:38 <fjahr> I think having some hardcoded values to test against would be good to make regressions explicit
92 13:38 <jnewbery> digi_james: one way to test the test code would be to use python's unittest module
93 13:39 <jnewbery> but in most cases, I don't think that's worthwhile
96 13:42 <fjahr> I guess another way to deal with it is to never change the python and the c++ implementation in the same PR
97 13:42 <fjahr> but not sure how that could be enforced
99 13:42 <jonatack> Could use some logging :p
100 13:43 <jnewbery> yeah, logging would be good!
101 13:44 <jonatack> perhaps separate into subtests too
103 13:45 <digi_james> I wonder why the test vectors did not include unsolvable and private keys = true
104 13:46 <digi_james> Or the vectors in descriptor.md should be updated with such, as fjahr suggest
105 13:46 <fjahr> yeah, that would make sense to me
106 13:46 <jnewbery> digi_james: I think because that's mainly a Bitcoin Core wallet concept
107 13:47 <jonatack> A general question, if ok, about test assertion argument order:
108 13:47 <jonatack> In some test frameworks there is a convention that can be useful for test error reports to make better sense.
109 13:47 <jonatack> In Bitcoin Core it looks like there is an implicit convention of assert_bla(actual, expected)...?
110 13:47 <jonatack> I didn't see it stipulated in the testing docs... perhaps I overlooked it.
111 13:47 <jonatack> Is there a strict (or loose) convention?
112 13:48 <jnewbery> jonatack: the convention is to use the assert_bla(actual, expected) rather than assert actual == expected
113 13:48 <jnewbery> because assert_bla(a,e) will print out a and e if they don't match
114 13:49 <jnewbery> whereas assert a == e will just fail with no helpful logging
115 13:49 <jonatack> agreed... my question pertains to the arg order
116 13:50 <jnewbery> oh right. I don't think there's a convention
117 13:50 <jnewbery> I think we have assert_greater_than() but not assert_lesser_than(), so the order depends on which you want to be greater for that one :)
118 13:50 <jonatack> it looks like actual, expected in practice
119 13:51 <jonatack> but unsure how hard or soft that is
120 13:51 <jonatack> for assert_equal in any case :)
121 13:51 <jnewbery> yeah, I think I've mostly seen actual, expected
122 13:51 <jonatack> Another question about running linters before pushing test commits
123 13:52 <jonatack> I could bit by not doing that this morning
124 13:52 <jonatack> got bit by not running linters before pushing
125 13:52 <jonatack> which do you run...
126 13:53 <jonatack> I automated lint-python.sh and lint-python-deadcode.sh
127 13:54 <jnewbery> I generally don't run them locally
128 13:54 <jnewbery> travis tells you pretty quickly when they fail
129 13:54 <lightlike> also a general question: does anyone know if there is a debug.log generated for unit tests (where the LogPrintf() go to)?
131 13:56 <jonatack> jnewbery: thanks
132 13:56 <jnewbery> ok, 5 mkinutes until the hour. Any final questions before we wrap up?
133 13:56 <digi_james> jnewbery: Is there an formal/informal roadmap for descriptors? Whist there are very neat, I wonder if they could exist in PBSTs? I suppose they will play a larger role in wallets managing taproot paths?
134 13:56 <jnewbery> Also, any requests for PRs to cover next week and beyond?
135 13:57 <digi_james> Internal node/wallet interfaces?
137 13:57 <jnewbery> as for the specification of the language itself, yes, it would need to be extended to support taproot
139 13:58 <digi_james> jnewberry: Ah thank you, I will have a look at the project, awesome!
140 13:59 <jnewbery> alright, let's wrap it up there. Thanks everyone!
141 13:59 <ariard> I need to think more about it but I think a lightning-prefix could be useful too
142 13:59 <hugohn> thanks jnewbery
143 13:59 <lightlike> thanks!
144 14:00 <digi_james> Thanks jnewbery, and everybody else!
145 14:00 <jonatack> Thanks jnewbery and everyone!
146 21:03 <wumpus> jnewbery: I think the reasoning is that RPC is a separate 'subsystem' from whatever is being tested (it needs to be specifically initialized/torn down), so anything testing RPC *and something else* is already a functional/integration test
147 21:05 <wumpus> the internal functions used by RPC methods can of course, independently, be tested in unit tests, especially if they're pure utility functions
148 21:05 <wumpus> but really calling RPC methods from unit tests is somewhat frowned upon, unless it's unit testing of the RPC system