Add getdescriptorinfo functional test (tests)

https://github.com/bitcoin/bitcoin/pull/15443

Host: jnewbery  -  PR author: promag

Notes

  • Output script descriptor are a language for describing individual scriptPubKeys or HD chains of scriptPubKeys. See the Bitcoin Core documentation.

  • PR 15368 added checksums to descriptors. Since descriptors contain private or public keys where a transcription error can lead to loss of funds, having to checksum to ensure correct transcription is a sensible precaution.

  • Those checksums are based on a similar BCH encoding scheme as the bech32 address format. There are very detailed comments in the implementation explaining how the checksum is calculated.

  • The PR also added a new RPC method getdescriptorinfo that can be called with a descriptor to analyse the descriptor and add the checksum. That RPC method:
    • is a pure function. It has no side effects.
    • doesn’t need access to any blockchain or mempool data
    • doesn’t have access to the wallet.
  • The PR also added unit tests and updated several functional tests to use the checksums. A python implementation of the checksum was also added.

  • PR #15443 adds functional tests specifically for the new getdescriptorinfo RPC method.

Questions

  • What type of testing is used in Bitcoin Core?

  • What are the uses for unit tests and functional tests?

  • What is meant by solvable when talking about a descriptor?

  • Why are address and raw type descriptors always unsolvable?

Meeting Log

  113:00 <jnewbery> hi!
  213:00 <lightlike> hi!
  313:00 <ariard> hi!
  413:00 <jonatack> hi!
  513:00 <ccdle12> hi!
  613:01 <jnewbery> whilst we give everyone a couple of minutes to get here, I suggest we all review the notes and questions at https://bitcoin-core-review-club.github.io/15443.html
  713:01 <amiti> hi!
  813:02 <jnewbery> This week's PR is adding new tests. There are no changes to the C++ code.
  913:02 <jnewbery> I don't think we've covered a test-only PR yet, which is a bit of an oversight, since they're a good place to start contributing
 1013:03 <jnewbery> I thought this one was interesting because it gives us a chance to look at output script descriptors, which are going to be important in the Bitcoin Core wallet
 1113:04 <jnewbery> Before we start with the *Questions* section of https://bitcoin-core-review-club.github.io/15443.html, did everything make sense in the *Notes*? Did anyone have any questions about those points?
 1213:05 <jnewbery> ok, great. digi_james: I know you had some questions about solvability and spendability this week. Can you state what your question was and what you learned?
 1313:05 <sosthene> I wonder when the descriptors were added? I don't remember seeing it before upgrading to 0.18, but maybe I wasn't paying attention
 1413:06 <jnewbery> sosthene: look at the docs here: https://github.com/bitcoin/bitcoin/blob/master/doc/descriptors.md . The first RPC to add support for descriptors was scantxoutset in 0.17
 1513:06 <digi_james> jnewbery: a descriptor is solvable if the output script and sigscript (minus signatures) can be derived from the output descriptor.
 1613:07 <digi_james> jnewbery: a descriptor is spendable if the signature can also be derived if the private key is also known.
 1713:08 <jnewbery> digi_james: that's right. So if the descriptor allows us to derive the scriptPubKey but not the scriptSig to spend it, then it's not solvable
 1813:08 <digi_james> So for example, a addr(P2SH_addr) is not solvable because the P2SH preimage is not known.
 1913:08 <jnewbery> exactly
 2013:09 <jnewbery> these definitions of solvable / spendable were originally used in the wallet
 2113:09 <jnewbery> ok, next question: What type of testing is used in Bitcoin Core?
 2213:10 <sosthene> unit and functional
 2313:11 <jnewbery> sosthene: yeah, that's mostly it. I believe there's also some fuzz testing and property-based testing (https://github.com/bitcoin/bitcoin/pull/12775)
 2413:11 <lightlike> I've also seen code for fuzzing. Is this used continually?
 2513:12 <jonatack> (plus a bit of linting in the CI and an optional serving of fuzzing that can use more data and examples)
 2613:12 <jnewbery> lightlike: I've never used fuzz testing myself. I'm pretty sure it's not used in the travis CI.
 2713:13 <drbrule> Is the testing described in the /doc folder? Wasn't sure if it's documented or just self-documented.
 2813:13 <sosthene> sorry can we explain what's fuzz testing?
 2913:13 <jnewbery> ok, so this PR adds functional tests. What are the advantages/disadvantages of that for this testing?
 3013:13 <jnewbery> ariard: mind taking the question about fuzz testing?
 3113:14 <fjahr> functional test are slow since they test the full stack
 3213:14 <ccdle12> jnewbury: functional tests simulate behaviour of an end user making rpc calls
 3313:14 <jonatack> Fuzzing: see doc/fuzzing.md
 3413:15 <jnewbery> jonatack: thanks
 3513:15 <ariard> sosthene: basically you throw millons of random-generated inputs into full-node components to discover crashs or bugs
 3613:15 <jnewbery> other test documentation:
 3713:15 <jnewbery> - unit - https://github.com/bitcoin/bitcoin/blob/master/src/test/README.md
 3813:15 <jnewbery> - functional/integration - https://github.com/bitcoin/bitcoin/blob/master/test/README.md
 3913:16 <ariard> it may be somehow semi-structurated to incrase bugs discovering success
 4013:16 <lightlike> also https://en.wikipedia.org/wiki/Fuzzing for a general overview
 4113:16 <drbrule> jnewbury: Thank you.
 4213:16 <ariard> s/structurated/structured
 4313:16 <jonatack> also: test/functional/README.md
 4413:17 <sosthene> ariard: got it, thanks
 4513:17 <jnewbery> fjahr: ccdle12: yeah, that's right. Functional tests spin up one or more full running instances of the bitcoind node, so they test the full functional behaviour, but are much slower to execute
 4613:18 <jnewbery> any thoughts about whether this functionality should be tested with unit or functional tests?
 4713:19 <fjahr> the rpc should be tested but especially the different success paths should be unit tests
 4813:19 <fjahr> as ccdle12 noted funcitonal is about the interaction of the user, so it should test for responding with the correct error messages etc.
 4913:20 <ccdle12> maybe unit for the core logic? the descriptors seem to be almost "standalone" meaning the rpc calls don't touch any other part of the codebase/affect other parts of the running daemon
 5013:20 <jnewbery> fjahr: that was my initial reaction too. Why is getdescriptorinfo a good candidate for unit tests?
 5113:21 <jonatack> it has few dependencies and can be tested in isolation
 5213:21 <hugohn> because it doesn't require external dependencies
 5313:21 <jnewbery> ccdle12: right. Because it's a pure utility, it doesn't need any setup with the rest of the node.
 5413:23 <jnewbery> I'm definitely not a concept NACK (some tests are better than no tests), but I think it's something to be aware of
 5513:23 <jnewbery> my feeling is that we rely on functional tests a bit too much in Bitcoin Core where sometimes unit tests might be more appropriate
 5613:23 <lightlike> jnewbery: but then we probably wouldn't test getdescriptorinfo directly, just stumbled today over a comment (https://github.com/bitcoin/bitcoin/blob/1381ddbcfcb6429b1327fd3db91ef97d8603aef9/src/test/setup_common.cpp#L74) stating that RPC tests should be moved to functional.
 5713:24 <jnewbery> oh, interesting
 5813:25 <hugohn> RE: disadvantages of functional/integration testing. If I might add, besides the fact that it's slow, IMO it might not be possible to have full 100% test coverage with functional/integration testing (unlike unit testing), because your functional tests usually test things in very specific orders and not all the possible ways the components can react. So it's helpful but, shouldn't be relied on exclusively.
 5913:25 <hugohn> *can interact
 6013:26 <jnewbery> lightlike: there's definitely room for both. I disagree that all RPC testing should be done using the functional test suite
 6113:26 <jnewbery> especially for pure utility functions like `getdescriptorinfo`
 6213:26 <jnewbery> but opinions vary!
 6313:27 <jnewbery> Any thoughts about Sjors comment here: https://github.com/bitcoin/bitcoin/pull/15443#pullrequestreview-205311643 ?
 6413:29 <jnewbery> ok, any other questions about the code itself?
 6513:29 <digi_james> I was confused about sjors comments.
 6613:30 <digi_james> descsum_create is part of the test framework, no?
 6713:30 <sosthene> do we have to use functional testing to test the output of a rpc call or can it be done with a unittest?
 6813:30 <lightlike> I haven't understood why does a (second) python implementation for checksum has to exist (and then needs to be tested).
 6913:30 <digi_james> I suppose we could pass the checksum as part of the input rather than generate it via a test function.
 7013:31 <jnewbery> digi_james: yes, descsum_create is part of the functional test framework
 7113:32 <jnewbery> These tests call `getdescriptorinfo` to get the checksum from the bitcoind node, and also call descsum_create() to get the checksum from the python implementation
 7213:32 <jnewbery> checking that the two implementations of the checksum match is a good way to test
 7313:33 <jnewbery> so that's an argument for doing this as a functional test: it forces us to use implementations in two different languages
 7413:33 <digi_james> Why arent the checksums hardcoded in the test?
 7513:33 <digi_james> I see...
 7613:33 <jnewbery> digi_james: they could be, but having a python implementation that can be re-used makes it easier to add more tests in future
 7713:34 <jnewbery> but also, where would you get the hard-coded values from if you didn't implement the checksum in the test code?
 7813:34 <fjahr> So there are tests for descsum_create() as well?
 7913:34 <digi_james> jnewbery: yup that makes sense to me now ...
 8013:35 <jnewbery> fjahr: I don't think so. We generally don't have tests for the test code.
 8113:35 <jnewbery> The tests test the code and the code tests the test
 8213:35 <fjahr> so if we introduce logic errors in both implementations they would not be noticed
 8313:35 <hugohn> that's def an interesting approach. introduce new logic in the test code itself...
 8413:36 <hugohn> yeah... you're testing for consistency not necessarily accurary
 8513:36 <jnewbery> one way to test the test code is to somehow break the node implementation and verify that the test starts failing
 8613:37 <jonatack> fjahr: with test code I think we depend a lot on review
 8713:37 <jonatack> (Sjors comment seems good to me... I have a few general questions regarding writing/reviewing tests, when it's time)
 8813:37 <digi_james> I suppose Sjors is suggesting to test a test :)
 8913:37 <lightlike> jnewbery: but doesn't this reasoning apply to basically everything (e.g. all kind of cryptographic functions)? I'm not sure why it is worth reprogramming function in python just for testing purposes and then keep it in the codebase.
 9013:38 <lightlike> (as an initial proof of concept, of course)
 9113:38 <fjahr> I think having some hardcoded values to test against would be good to make regressions explicit
 9213:38 <jnewbery> digi_james: one way to test the test code would be to use python's unittest module
 9313:39 <jnewbery> but in most cases, I don't think that's worthwhile
 9413:39 <digi_james> :)
 9513:40 <jnewbery> lightlike: yes, I think it's useful to have python implementations of cryptographic functions in the test framework. See https://github.com/bitcoin/bitcoin/blob/master/test/functional/test_framework/key.py for example
 9613:42 <fjahr> I guess another way to deal with it is to never change the python and the c++ implementation in the same PR
 9713:42 <fjahr> but not sure how that could be enforced
 9813:42 <jnewbery> any other questions about the code in https://github.com/bitcoin/bitcoin/pull/15443/files
 9913:42 <jonatack> Could use some logging :p
10013:43 <jnewbery> yeah, logging would be good!
10113:44 <jonatack> perhaps separate into subtests too
10213:44 <jnewbery> fjahr: I wonder if https://github.com/bitcoin/bitcoin/blob/master/doc/descriptors.md#examples should be updated to include the checksums so they can be used as test vectors for other implementations
10313:45 <digi_james> I wonder why the test vectors did not include unsolvable and private keys = true
10413:46 <digi_james> Or the vectors in descriptor.md should be updated with such, as fjahr suggest
10513:46 <fjahr> yeah, that would make sense to me
10613:46 <jnewbery> digi_james: I think because that's mainly a Bitcoin Core wallet concept
10713:47 <jonatack> A general question, if ok, about test assertion argument order:
10813:47 <jonatack> In some test frameworks there is a convention that can be useful for test error reports to make better sense.
10913:47 <jonatack> In Bitcoin Core it looks like there is an implicit convention of assert_bla(actual, expected)...?
11013:47 <jonatack> I didn't see it stipulated in the testing docs... perhaps I overlooked it.
11113:47 <jonatack> Is there a strict (or loose) convention?
11213:48 <jnewbery> jonatack: the convention is to use the assert_bla(actual, expected) rather than assert actual == expected
11313:48 <jnewbery> because assert_bla(a,e) will print out a and e if they don't match
11413:49 <jnewbery> whereas assert a == e will just fail with no helpful logging
11513:49 <jonatack> agreed... my question pertains to the arg order
11613:50 <jnewbery> oh right. I don't think there's a convention
11713:50 <jnewbery> I think we have assert_greater_than() but not assert_lesser_than(), so the order depends on which you want to be greater for that one :)
11813:50 <jonatack> it looks like actual, expected in practice
11913:51 <jonatack> but unsure how hard or soft that is
12013:51 <jonatack> for assert_equal in any case :)
12113:51 <jnewbery> yeah, I think I've mostly seen actual, expected
12213:51 <jonatack> Another question about running linters before pushing test commits
12313:52 <jonatack> I could bit by not doing that this morning
12413:52 <jonatack> got bit by not running linters before pushing
12513:52 <jonatack> which do you run...
12613:53 <jonatack> I automated lint-python.sh and lint-python-deadcode.sh
12713:54 <jnewbery> I generally don't run them locally
12813:54 <jnewbery> travis tells you pretty quickly when they fail
12913:54 <lightlike> also a general question: does anyone know if there is a debug.log generated for unit tests (where the LogPrintf() go to)?
13013:55 <jnewbery> lighlike: I don't know about debug.log for unit tests, but you can run with --log_level=all to get more verbose output: https://github.com/bitcoin/bitcoin/blob/master/src/test/README.md#running-individual-tests
13113:56 <jonatack> jnewbery: thanks
13213:56 <jnewbery> ok, 5 mkinutes until the hour. Any final questions before we wrap up?
13313:56 <digi_james> jnewbery: Is there an formal/informal roadmap for descriptors? Whist there are very neat, I wonder if they could exist in PBSTs? I suppose they will play a larger role in wallets managing taproot paths?
13413:56 <jnewbery> Also, any requests for PRs to cover next week and beyond?
13513:57 <digi_james> Internal node/wallet interfaces?
13613:57 <jnewbery> digi_james: we want the wallet to move entirely to descriptors. There's a project tracking that here: https://github.com/bitcoin/bitcoin/projects/12
13713:57 <jnewbery> as for the specification of the language itself, yes, it would need to be extended to support taproot
13813:58 <jnewbery> I'd also recommend watching the talk on miniscript here: https://www.youtube.com/watch?v=XM1lzN4Zfks&feature=youtu.be which is related
13913:58 <digi_james> jnewberry: Ah thank you, I will have a look at the project, awesome!
14013:59 <jnewbery> alright, let's wrap it up there. Thanks everyone!
14113:59 <ariard> I need to think more about it but I think a lightning-prefix could be useful too
14213:59 <hugohn> thanks jnewbery
14313:59 <lightlike> thanks!
14414:00 <digi_james> Thanks jnewbery, and everybody else!
14514:00 <jonatack> Thanks jnewbery and everyone!
14621:03 <wumpus> jnewbery: I think the reasoning is that RPC is a separate 'subsystem' from whatever is being tested (it needs to be specifically initialized/torn down), so anything testing RPC *and something else* is already a functional/integration test
14721:05 <wumpus> the internal functions used by RPC methods can of course, independently, be tested in unit tests, especially if they're pure utility functions
14821:05 <wumpus> but really calling RPC methods from unit tests is somewhat frowned upon, unless it's unit testing of the RPC system