#15443 Add getdescriptorinfo functional test (tests)
Output script descriptor are a language for describing individual scriptPubKeys or HD chains of scriptPubKeys. See the Bitcoin Core documentation.
PR 15368 added checksums to descriptors. Since descriptors contain private or public keys where a transcription error can lead to loss of funds, having to checksum to ensure correct transcription is a sensible precaution.
Those checksums are based on a similar BCH encoding scheme as the bech32 address format. There are very detailed comments in the implementation explaining how the checksum is calculated.
- The PR also added a new RPC method
getdescriptorinfothat can be called with a descriptor to analyse the descriptor and add the checksum. That RPC method:
- is a pure function. It has no side effects.
- doesn’t need access to any blockchain or mempool data
- doesn’t have access to the wallet.
- PR #15443 adds functional tests specifically for the new
What type of testing is used in Bitcoin Core?
What are the uses for unit tests and functional tests?
What is meant by solvable when talking about a descriptor?
Why are address and raw type descriptors always unsolvable?
13:00 < jnewbery> hi! 13:00 < lightlike> hi! 13:00 < ariard> hi! 13:00 < jonatack> hi! 13:00 < ccdle12> hi! 13:01 < jnewbery> whilst we give everyone a couple of minutes to get here, I suggest we all review the notes and questions at https://bitcoin-core-review-club.github.io/15443.html 13:01 < amiti> hi! 13:02 < jnewbery> This week's PR is adding new tests. There are no changes to the C++ code. 13:02 < jnewbery> I don't think we've covered a test-only PR yet, which is a bit of an oversight, since they're a good place to start contributing 13:03 < jnewbery> I thought this one was interesting because it gives us a chance to look at output script descriptors, which are going to be important in the Bitcoin Core wallet 13:04 < jnewbery> Before we start with the *Questions* section of https://bitcoin-core-review-club.github.io/15443.html, did everything make sense in the *Notes*? Did anyone have any questions about those points? 13:05 < jnewbery> ok, great. digi_james: I know you had some questions about solvability and spendability this week. Can you state what your question was and what you learned? 13:05 < sosthene> I wonder when the descriptors were added? I don't remember seeing it before upgrading to 0.18, but maybe I wasn't paying attention 13:06 < jnewbery> sosthene: look at the docs here: https://github.com/bitcoin/bitcoin/blob/master/doc/descriptors.md . The first RPC to add support for descriptors was scantxoutset in 0.17 13:06 < digi_james> jnewbery: a descriptor is solvable if the output script and sigscript (minus signatures) can be derived from the output descriptor. 13:07 < digi_james> jnewbery: a descriptor is spendable if the signature can also be derived if the private key is also known. 13:08 < jnewbery> digi_james: that's right. So if the descriptor allows us to derive the scriptPubKey but not the scriptSig to spend it, then it's not solvable 13:08 < digi_james> So for example, a addr(P2SH_addr) is not solvable because the P2SH preimage is not known. 13:08 < jnewbery> exactly 13:09 < jnewbery> these definitions of solvable / spendable were originally used in the wallet 13:09 < jnewbery> ok, next question: What type of testing is used in Bitcoin Core? 13:10 < sosthene> unit and functional 13:11 < jnewbery> sosthene: yeah, that's mostly it. I believe there's also some fuzz testing and property-based testing (https://github.com/bitcoin/bitcoin/pull/12775) 13:11 < lightlike> I've also seen code for fuzzing. Is this used continually? 13:12 < jonatack> (plus a bit of linting in the CI and an optional serving of fuzzing that can use more data and examples) 13:12 < jnewbery> lightlike: I've never used fuzz testing myself. I'm pretty sure it's not used in the travis CI. 13:13 < drbrule> Is the testing described in the /doc folder? Wasn't sure if it's documented or just self-documented. 13:13 < sosthene> sorry can we explain what's fuzz testing? 13:13 < jnewbery> ok, so this PR adds functional tests. What are the advantages/disadvantages of that for this testing? 13:13 < jnewbery> ariard: mind taking the question about fuzz testing? 13:14 < fjahr> functional test are slow since they test the full stack 13:14 < ccdle12> jnewbury: functional tests simulate behaviour of an end user making rpc calls 13:14 < jonatack> Fuzzing: see doc/fuzzing.md 13:15 < jnewbery> jonatack: thanks 13:15 < ariard> sosthene: basically you throw millons of random-generated inputs into full-node components to discover crashs or bugs 13:15 < jnewbery> other test documentation: 13:15 < jnewbery> - unit - https://github.com/bitcoin/bitcoin/blob/master/src/test/README.md 13:15 < jnewbery> - functional/integration - https://github.com/bitcoin/bitcoin/blob/master/test/README.md 13:16 < ariard> it may be somehow semi-structurated to incrase bugs discovering success 13:16 < lightlike> also https://en.wikipedia.org/wiki/Fuzzing for a general overview 13:16 < drbrule> jnewbury: Thank you. 13:16 < ariard> s/structurated/structured 13:16 < jonatack> also: test/functional/README.md 13:17 < sosthene> ariard: got it, thanks 13:17 < jnewbery> fjahr: ccdle12: yeah, that's right. Functional tests spin up one or more full running instances of the bitcoind node, so they test the full functional behaviour, but are much slower to execute 13:18 < jnewbery> any thoughts about whether this functionality should be tested with unit or functional tests? 13:19 < fjahr> the rpc should be tested but especially the different success paths should be unit tests 13:19 < fjahr> as ccdle12 noted funcitonal is about the interaction of the user, so it should test for responding with the correct error messages etc. 13:20 < ccdle12> maybe unit for the core logic? the descriptors seem to be almost "standalone" meaning the rpc calls don't touch any other part of the codebase/affect other parts of the running daemon 13:20 < jnewbery> fjahr: that was my initial reaction too. Why is getdescriptorinfo a good candidate for unit tests? 13:21 < jonatack> it has few dependencies and can be tested in isolation 13:21 < hugohn> because it doesn't require external dependencies 13:21 < jnewbery> ccdle12: right. Because it's a pure utility, it doesn't need any setup with the rest of the node. 13:23 < jnewbery> I'm definitely not a concept NACK (some tests are better than no tests), but I think it's something to be aware of 13:23 < jnewbery> my feeling is that we rely on functional tests a bit too much in Bitcoin Core where sometimes unit tests might be more appropriate 13:23 < lightlike> jnewbery: but then we probably wouldn't test getdescriptorinfo directly, just stumbled today over a comment (https://github.com/bitcoin/bitcoin/blob/1381ddbcfcb6429b1327fd3db91ef97d8603aef9/src/test/setup_common.cpp#L74) stating that RPC tests should be moved to functional. 13:24 < jnewbery> oh, interesting 13:25 < hugohn> RE: disadvantages of functional/integration testing. If I might add, besides the fact that it's slow, IMO it might not be possible to have full 100% test coverage with functional/integration testing (unlike unit testing), because your functional tests usually test things in very specific orders and not all the possible ways the components can react. So it's helpful but, shouldn't be relied on exclusively. 13:25 < hugohn> *can interact 13:26 < jnewbery> lightlike: there's definitely room for both. I disagree that all RPC testing should be done using the functional test suite 13:26 < jnewbery> especially for pure utility functions like `getdescriptorinfo` 13:26 < jnewbery> but opinions vary! 13:27 < jnewbery> Any thoughts about Sjors comment here: https://github.com/bitcoin/bitcoin/pull/15443#pullrequestreview-205311643 ? 13:29 < jnewbery> ok, any other questions about the code itself? 13:29 < digi_james> I was confused about sjors comments. 13:30 < digi_james> descsum_create is part of the test framework, no? 13:30 < sosthene> do we have to use functional testing to test the output of a rpc call or can it be done with a unittest? 13:30 < lightlike> I haven't understood why does a (second) python implementation for checksum has to exist (and then needs to be tested). 13:30 < digi_james> I suppose we could pass the checksum as part of the input rather than generate it via a test function. 13:31 < jnewbery> digi_james: yes, descsum_create is part of the functional test framework 13:32 < jnewbery> These tests call `getdescriptorinfo` to get the checksum from the bitcoind node, and also call descsum_create() to get the checksum from the python implementation 13:32 < jnewbery> checking that the two implementations of the checksum match is a good way to test 13:33 < jnewbery> so that's an argument for doing this as a functional test: it forces us to use implementations in two different languages 13:33 < digi_james> Why arent the checksums hardcoded in the test? 13:33 < digi_james> I see... 13:33 < jnewbery> digi_james: they could be, but having a python implementation that can be re-used makes it easier to add more tests in future 13:34 < jnewbery> but also, where would you get the hard-coded values from if you didn't implement the checksum in the test code? 13:34 < fjahr> So there are tests for descsum_create() as well? 13:34 < digi_james> jnewbery: yup that makes sense to me now ... 13:35 < jnewbery> fjahr: I don't think so. We generally don't have tests for the test code. 13:35 < jnewbery> The tests test the code and the code tests the test 13:35 < fjahr> so if we introduce logic errors in both implementations they would not be noticed 13:35 < hugohn> that's def an interesting approach. introduce new logic in the test code itself... 13:36 < hugohn> yeah... you're testing for consistency not necessarily accurary 13:36 < jnewbery> one way to test the test code is to somehow break the node implementation and verify that the test starts failing 13:37 < jonatack> fjahr: with test code I think we depend a lot on review 13:37 < jonatack> (Sjors comment seems good to me... I have a few general questions regarding writing/reviewing tests, when it's time) 13:37 < digi_james> I suppose Sjors is suggesting to test a test :) 13:37 < lightlike> jnewbery: but doesn't this reasoning apply to basically everything (e.g. all kind of cryptographic functions)? I'm not sure why it is worth reprogramming function in python just for testing purposes and then keep it in the codebase. 13:38 < lightlike> (as an initial proof of concept, of course) 13:38 < fjahr> I think having some hardcoded values to test against would be good to make regressions explicit 13:38 < jnewbery> digi_james: one way to test the test code would be to use python's unittest module 13:39 < jnewbery> but in most cases, I don't think that's worthwhile 13:39 < digi_james> :) 13:40 < jnewbery> lightlike: yes, I think it's useful to have python implementations of cryptographic functions in the test framework. See https://github.com/bitcoin/bitcoin/blob/master/test/functional/test_framework/key.py for example 13:42 < fjahr> I guess another way to deal with it is to never change the python and the c++ implementation in the same PR 13:42 < fjahr> but not sure how that could be enforced 13:42 < jnewbery> any other questions about the code in https://github.com/bitcoin/bitcoin/pull/15443/files 13:42 < jonatack> Could use some logging :p 13:43 < jnewbery> yeah, logging would be good! 13:44 < jonatack> perhaps separate into subtests too 13:44 < jnewbery> fjahr: I wonder if https://github.com/bitcoin/bitcoin/blob/master/doc/descriptors.md#examples should be updated to include the checksums so they can be used as test vectors for other implementations 13:45 < digi_james> I wonder why the test vectors did not include unsolvable and private keys = true 13:46 < digi_james> Or the vectors in descriptor.md should be updated with such, as fjahr suggest 13:46 < fjahr> yeah, that would make sense to me 13:46 < jnewbery> digi_james: I think because that's mainly a Bitcoin Core wallet concept 13:47 < jonatack> A general question, if ok, about test assertion argument order: 13:47 < jonatack> In some test frameworks there is a convention that can be useful for test error reports to make better sense. 13:47 < jonatack> In Bitcoin Core it looks like there is an implicit convention of assert_bla(actual, expected)...? 13:47 < jonatack> I didn't see it stipulated in the testing docs... perhaps I overlooked it. 13:47 < jonatack> Is there a strict (or loose) convention? 13:48 < jnewbery> jonatack: the convention is to use the assert_bla(actual, expected) rather than assert actual == expected 13:48 < jnewbery> because assert_bla(a,e) will print out a and e if they don't match 13:49 < jnewbery> whereas assert a == e will just fail with no helpful logging 13:49 < jonatack> agreed... my question pertains to the arg order 13:50 < jnewbery> oh right. I don't think there's a convention 13:50 < jnewbery> I think we have assert_greater_than() but not assert_lesser_than(), so the order depends on which you want to be greater for that one :) 13:50 < jonatack> it looks like actual, expected in practice 13:51 < jonatack> but unsure how hard or soft that is 13:51 < jonatack> for assert_equal in any case :) 13:51 < jnewbery> yeah, I think I've mostly seen actual, expected 13:51 < jonatack> Another question about running linters before pushing test commits 13:52 < jonatack> I could bit by not doing that this morning 13:52 < jonatack> got bit by not running linters before pushing 13:52 < jonatack> which do you run... 13:53 < jonatack> I automated lint-python.sh and lint-python-deadcode.sh 13:54 < jnewbery> I generally don't run them locally 13:54 < jnewbery> travis tells you pretty quickly when they fail 13:54 < lightlike> also a general question: does anyone know if there is a debug.log generated for unit tests (where the LogPrintf() go to)? 13:55 < jnewbery> lighlike: I don't know about debug.log for unit tests, but you can run with --log_level=all to get more verbose output: https://github.com/bitcoin/bitcoin/blob/master/src/test/README.md#running-individual-tests 13:56 < jonatack> jnewbery: thanks 13:56 < jnewbery> ok, 5 mkinutes until the hour. Any final questions before we wrap up? 13:56 < digi_james> jnewbery: Is there an formal/informal roadmap for descriptors? Whist there are very neat, I wonder if they could exist in PBSTs? I suppose they will play a larger role in wallets managing taproot paths? 13:56 < jnewbery> Also, any requests for PRs to cover next week and beyond? 13:57 < digi_james> Internal node/wallet interfaces? 13:57 < jnewbery> digi_james: we want the wallet to move entirely to descriptors. There's a project tracking that here: https://github.com/bitcoin/bitcoin/projects/12 13:57 < jnewbery> as for the specification of the language itself, yes, it would need to be extended to support taproot 13:58 < jnewbery> I'd also recommend watching the talk on miniscript here: https://www.youtube.com/watch?v=XM1lzN4Zfks&feature=youtu.be which is related 13:58 < digi_james> jnewberry: Ah thank you, I will have a look at the project, awesome! 13:59 < jnewbery> alright, let's wrap it up there. Thanks everyone! 13:59 < ariard> I need to think more about it but I think a lightning-prefix could be useful too 13:59 < hugohn> thanks jnewbery 13:59 < lightlike> thanks! 14:00 < digi_james> Thanks jnewbery, and everybody else! 14:00 < jonatack> Thanks jnewbery and everyone! 21:03 < wumpus> jnewbery: I think the reasoning is that RPC is a separate 'subsystem' from whatever is being tested (it needs to be specifically initialized/torn down), so anything testing RPC *and something else* is already a functional/integration test 21:05 < wumpus> the internal functions used by RPC methods can of course, independently, be tested in unit tests, especially if they're pure utility functions 21:05 < wumpus> but really calling RPC methods from unit tests is somewhat frowned upon, unless it's unit testing of the RPC system