Fuzz: BIP 42, BIP 30, CVE-2018-17144 (tests)

https://github.com/bitcoin/bitcoin/pull/17860

Host: MarcoFalke  -  PR author: MarcoFalke

The PR branch HEAD was 8b868f17 at the time of this review club meeting.

Notes

Questions

  1. Did you review the PR? Concept ACK, approach ACK, ACK <commit>, or NACK?  Don’t forget to put your PR review on GitHub or ask questions.

  2. What is the difference between black-box and white-box fuzzing? Which technique does Bitcoin Core use? Refer to “The Art, Science, and Engineering of Fuzzing: A Survey” in the notes.

  3. Did you compile the new fuzz target and run the fuzzer?

  4. How can CVE-2018-17144 be exploited? Can you explain conceptually how to create an example block that exploits the bug?

  5. What does the new fuzz test do on a high level? What “Actions” can it run?

  6. On a low level, the fuzz test reads bytes from a given seed via the FuzzedDataProvider. Integral values are read from individual bytes starting at the end of the seed. Take a look at the example seed (zzzii, without a new line) in the qa-assets repo. How many blocks are mined? Either explain by looking at the seed and code or obtain the result by modifying and running the fuzz test on the seed.

  7. How can the fuzz target be tested for accuracy? Hint: How would you modify the Bitcoin Core source code to trigger the CVE? See also this pull request: consensus: Explain why fCheckDuplicateInputs can not be skipped and remove it

  8. Could you find a seed that triggers the CVE? You can either create it manually, if you found an answer to question 4, or you can let the fuzzer run until it hits an assertion. If you let the fuzzer run: Was it successful? If not, what is the bottleneck of the fuzz test and could it be improved?

Meeting Log

  111:31 <instagibbs> I can't actually get the fuzztester to "Do anything". The readme focuses on setting up but doesn't actually give an example with expected output.
  211:31 <instagibbs> ./test/fuzz/test_runner.py ${DIR_FUZZ_IN}
  311:32 <instagibbs> results in it just printing out a bunch of names in arrays "Fuzz targets found" and "Fuzz targets selected" then exiting
  411:33 <instagibbs> I'll gladly improve the readme, if I just knew what the heck is supposed to be happening :)
  513:38 <jonatack> instagibbs: I see output like https://github.com/bitcoin/bitcoin/pull/17229#pullrequestreview-333863703
  613:39 <jonatack> or https://github.com/bitcoin/bitcoin/pull/17777#pullrequestreview-336198693
  713:39 <jonatack> and it runs indefinitely until halted with C-c or C-d
  813:40 <jonatack> I'm not sure what a fuzz failure looks like yet, it just runs and runs :p
  913:41 <instagibbs> hmmmmm
 1013:41 <instagibbs> mine definitely just stops
 1113:41 <instagibbs> with nothing useful
 1213:46 <instagibbs> jonatack, it's possible the readme is super poorly written and made me think it would run something, when all its doing is telling me what tests are available and exiting
 1313:48 <jonatack> yup definitely, improving fuzz.md has been on my todo list since last spring
 1413:49 <jonatack> just stopping seems wrong
 1513:50 <instagibbs> ok, ive upgraded to a subprocess command error message :)
 1613:50 <jonatack> i asked practicalswift when reviewing one of his fuzz prs and he confirmed that it just runs ad infinitum until halted
 1713:50 <instagibbs> somehow
 1813:50 <instagibbs> if i run specific ones like you did in that PR it works fwiw
 1913:50 <instagibbs> it's the python script the readme has that isn't working for me
 2013:52 <jonatack> hm, i recall the readme not working for me either, only following per-pr practicalswift instructions did
 2114:00 <instagibbs> actually scratch that, i actually broke it more, back to "nothing" for the `./test/fuzz/test_runner.py` incantation. If I specify a particular target, it says so and exits even faster
 2214:00 <instagibbs> :thinking:
 2314:01 <instagibbs> anyways I can at least run them one at a time thanks to your link :)
 2415:37 <jonatack> instagibbs: hmmm i retried that test runner script and am seeing the same as what you report
 2515:37 <jonatack> list of targets found and selected with: ./test/fuzz/test_runner.py ../qa-assets/fuzz_seed_corpus hex
 2615:38 <jonatack> or when no target supplied, then: subprocess.CalledProcessError: Command '['/home/jon/projects/bitcoin/bitcoin/src/test/fuzz/bech32', '-runs=1', '-detect_leaks=0', '../qa-assets/fuzz_seed_corpus/bech32']' returned non-zero exit status 1.
 2715:50 <instagibbs> make sure the path is right, i was getting no file found error, once i fixed the path it did nothing again

— Day changed Fri Jan 10 2020

 2817:12 <fjahr> instagibbs: are you on macos or at least using libFuzzer? I have the same problem you are describing. jonatack: how about you? AFL or libFuzzer?

— Day changed Sat Jan 11 2020

 2905:02 <jonatack> fjahr: i've been using libFuzzer so far and it works fine on individual fuzz tests so i'm probably using the python runner improperly

— Day changed Tue Jan 14 2020

 3023:02 <fanquake> If anyone would like to test/verify the changes in #17916, I've written some notes up here: https://github.com/bitcoin/bitcoin/pull/17916#issuecomment-574485179

— Day changed Wed Jan 15 2020

 3111:00 <pinheadmz> having a bit of trouble fuzzing on macos. i followed the guide in bitcoin/fuzzing and the comment about macos specifically.
 3211:00 <pinheadmz> getting "subprocess timed out: Currently only libFuzzer is supported"
 3311:00 <pinheadmz> the comment says "Take care of having an LLVM/Clang environment that contains fuzzing libraries. The default one from Apple is not enough, so that you will have to install it with brew, if not already done."
 3411:00 <pinheadmz> so, hm: brew install clang ?
 3511:01 <MarcoFalk_> The test_runner.py only works with libFuzzer
 3611:01 <MarcoFalk_> The test_runner.py is not a requirement for any of the questions in today's review club
 3711:01 <pinheadmz> I did clone and make AFL -- thats something different ?
 3811:02 <MarcoFalk_> It is mostly used by travis to read all seeds, not (yet) for generating seeds
 3911:02 <MarcoFalk_> Yes, afl is a bit different from libFuzzer
 4011:03 <pinheadmz> so the python isnt necessary - the fuzz tests will run with configure --enable-fuzz; make ?
 4111:04 <MarcoFalk_> no, you still need to run the target ./src/test/fuzz/utxo_total_supply
 4211:04 <MarcoFalk_> (with appropriate options)
 4311:04 <MarcoFalk_> afl needs some more memory, e.g. -m200
 4411:08 <pinheadmz> ok I see, libFuzzer for the test_runner, but individual tests can be executed with AFL, and theres a specific command for that in bitcoin/fuzzing.md
 4511:09 <MarcoFalk_> libFuzzer instrumented binaries can also be run individually
 4611:13 <jonatack> Been having fun with some of the resources here: https://google.github.io/oss-fuzz/reference/useful-links/
 4711:36 <pinheadmz> this helped me too: https://github.com/bitcoin/bitcoin/issues/17914 not quite there yet, but moving forward....
 4811:48 <pinheadmz> theres a Makefile option `LDFLAGS = -L/usr/local/lib/darwin/` but that directory doesnt exist on my computer, could it be referring to /usr/local/lib/ ? that directory has all my .dylib 's
 4911:52 <MarcoFalk_> Hmm, disappointing that mac is making it so hard to run fuzzers
 5011:53 <MarcoFalk_> We'll get started in about one hour (18 UTC)
 5111:53 <pinheadmz> Yeah :-/ i really wanna try and watch this run. Configure says "linker did not accept requested flags, you are missing required libraries" - is there a way to see whcih libs exactly are missing?
 5211:54 <MarcoFalk_> Did you switch to libFuzzer?
 5311:55 <pinheadmz> no still AFL
 5411:55 <raj_> pinheadmz: what exactly is the issue?
 5511:55 <MarcoFalk_> hmm. Not sure if I can help with afl or mac. Maybe pastebin the config log for others to take a look?
 5611:56 <pinheadmz> raj_: trying to follow docs/fuzzing.md cloned and make'd AFL, trying to run configure with CC=${AFLPATH}/afl-clang CXX=${AFLPATH}/afl-clang++
 5711:56 <pinheadmz> and trying different LDFLAGS since /usr/local/lib/darwin/ is nonexistnet
 5811:56 <pinheadmz> getting errors about missing libraries
 5911:57 <pinheadmz> ignoring the configure errors, making anyway and running afl-fuzz, getting "No instrumentation detected"
 6011:59 <raj_> have you tried `make distclean` and starting over?
 6111:59 <pinheadmz> i make clean each time ;-) is that sufficient?
 6212:00 <raj_> not very sure. When in doubt i just distclean. :p
 6312:00 <raj_> usually works.
 6412:01 <raj_> also check if you have afl-clang in path.
 6512:04 <pinheadmz> don't the CXX= flags handle that?
 6612:05 <raj_> it does provided you have setup $ALFPATH correctly. So thats what i meant to check.
 6712:06 <pinheadmz> yeah, like `ls ${AFLPATH}/afl-clang` returns the afl-clang executable
 6812:07 <pinheadmz> this missing libraries error is the thing i think is the culprit
 6912:08 <raj_> seems like be some dependencies are missing?
 7012:14 <raj_> have you tried libfuzzer? that worked with lesser issues for me.
 7112:21 <pinheadmz> yeah going down that rabbit hole next :-)
 7212:48 <fjahr> pinheadmz: I am on mac and having issues as well. I read somewhere that AFL is not supported on mac, so I have been trying libfuzzer but I am stuck with fuzz binaries getting stuck without output (see my posts a few days ago here in the channel and in #bitcoin-core-dev).
 7312:49 <MarcoFalk_> Is it not possible to install a vanilla clang on MacOS?
 7412:50 <fjahr> I am using a non-systems clang installed via brew, if that's what you mean
 7512:51 <MarcoFalk_> Ah, so compilation worked, but the output is empty?
 7612:52 <fjahr> yeah, the fuzz bin starts but just doesn't do anything, no output and it doesn't shut down either
 7712:52 <fjahr> even when running with `-help=1`
 7812:56 <pinheadmz> fjahr: shoot me a link to install libfuzzer on osx ?
 7912:57 <MarcoFalk_> It might be bundeled with the clang+llvm version shipped in brew
 8012:57 <fjahr> it's included in llvm/clang if you install via brew
 8112:57 <fjahr> just not in the systems one
 8213:00 <MarcoFalk_> hi everyone. Let's get started. It sounds like some people had technical problems getting the fuzzer to run
 8313:00 <jnewbery> #startmeeting
 8413:00 <pinheadmz> hi
 8513:00 <jnewbery> hi
 8613:00 <ajonas> hi
 8713:00 <andrewtoth_> hi
 8813:00 <jonatack> hi
 8913:00 <emzy> Hi
 9013:01 <kanzure> hi
 9113:01 <felixweis> hi
 9213:01 <TheBigCohooNah> lol, hi, going to listen in and not understand anything =)
 9313:01 <fjahr> hi
 9413:01 <MarcoFalk_> Has everyone had a chance to look at the pull request?
 9513:02 <jonatack> y
 9613:02 <raj_> hi.
 9713:02 <andrewtoth_> y
 9813:02 <jnewbery> y
 9913:02 <raj_> y
10013:02 <emzy> y
10113:02 <MarcoFalk_> Nice
10213:03 <MarcoFalk_> Ok, so let's start with some general questions. Anyone want to explain the difference between a black-box and a white-box fuzzer?
10313:03 <jonatack> Black-box fuzzing, aka i/o-driven or data-driven testing, includes most
10413:03 <jonatack> Btraditional fuzzers and is blind to the internals of the PUT (program under test),
10513:03 <jonatack> observing only the i/o behavior and treating it as a black-box.
10613:03 <emzy> white-box is with source code
10713:04 <jonatack> White-box fuzzing generates test cases by analysing the internals of the PUT and the information gathered when executing the PUT.
10813:04 <raj_> black-box: doesnt look into internals. white-box: looks at internals and takes test decisions
10913:04 <MarcoFalk_> So what is coverage guided fuzzing and how does it fit in to the black/white categories?
11013:05 <andrewtoth_> gray-box?
11113:05 <jonatack> Grey to black
11213:05 <MarcoFalk_> libFuzzer and AFL are both coverage guided and that is what Bitcoin Core uses, so we will focus on these
11313:05 <MarcoFalk_> How does the coverage guiding work?
11413:06 <andrewtoth_> a function is exposed in the harness to allow fuzz data to get passed into it
11513:06 <jonatack> Guided by API coverage?
11613:07 <jonatack> API meaning certain functions put under test.
11713:07 <MarcoFalk_> Yes, some part of the interface is exposed in a fuzz target
11813:07 <jonatack> idea is FDD turst APIs into fuzz targets
11913:07 <raj_> anything to do with code path?
12013:07 <jonatack> turns
12113:07 <MarcoFalk_> But how does the coverage guiding help the fuzzer to fuzz that function?
12213:07 <jonatack> FDD (fuzz-driven development :D)
12313:08 <MarcoFalk_> raj_: Yes, something with code path :)
12413:08 <raj_> i am struggling with the concept of coverage. What does it means exactly? Is it like a set of all possible execution paths?
12513:09 <MarcoFalk_> coverage can mean many things. Common ones are "function coverage" "line coverage" or "branch coverage"
12613:09 <jonatack> see https://marcofalke.github.io/btc_cov/test_bitcoin.coverage/index.html
12713:10 <jonatack> and https://marcofalke.github.io/btc_cov/total.coverage/index.html
12813:10 <MarcoFalk_> Let's not go into the details of what kind of coverage the fuzzers use, but discuss the general idea
12913:10 <andrewtoth_> ok, but then what is coverage "guiding"? Is that just exposing functions to the fuzzer and mutating the fuzz data to be a proper input?
13013:10 <raj_> jonatack: really cool. thanks..
13113:12 <MarcoFalk_> Ok, let's take a step back. What is a "seed"?
13213:12 <raj_> initial set of test cases.
13313:12 <MarcoFalk_> raj_: right
13413:12 <MarcoFalk_> What happens with a "seed"?
13513:13 <andrewtoth_> it's given to the fuzzer
13613:13 <fjahr> i guess it is saved and different inputs are generated from it so that it can be used to replay the results if an error is detected
13713:14 <MarcoFalk_> andrewtoth_: Right
13813:14 <MarcoFalk_> fjahr: Right
13913:15 <MarcoFalk_> Ok, and when compiling with a fuzzer, the binary is instrumented with the fuzz engine and some annotations to collect coverage data
14013:15 <raj_> as per the pdf https://arxiv.org/pdf/1812.00140.pdf the fuzzer also construct some testing configuration from given seed. That makes the process more efficient.
14113:16 <MarcoFalk_> For each input (or seed) coverage data is collected when run
14213:16 <MarcoFalk_> How does the coverage data help the fuzzer?
14313:16 <felixweis> how did you generate the fuzz_seed_corpus?
14413:17 <raj_> gives the fuzzer clue on how to increase coverage in next iteration?
14513:17 <fjahr> it shows if new branches were discovered with the given seed or not
14613:17 <MarcoFalk_> fjahr: Correct
14713:17 <MarcoFalk_> Does it influence the future decisions of the fuzzer? What happens with a seed that did or did not increase coverage?
14813:18 <andrewtoth_> that would depend on the fuzzer no?
14913:18 <felixweis> also the filenames are random or simply hashes of the file contents
15013:18 <MarcoFalk_> andrewtoth_: Yes. (Assuming AFL and libFuzzer for now, i.e. coverage-guided fuzzers)
15113:19 <MarcoFalk_> felixweis: The name of the seeds or inputs depends on the fuzzer
15213:20 <jonatack> In this case, yes, seed trimming takes place to use the coverage to progressively remove a portion of the seed as long as the coverage remains constant
15313:21 <pinheadmz> Is the data sent to bitcoind totally random or can you give it structure ie. transaction format with some limits on field size and value etc
15413:21 <felixweis> if an input increases coverage it is saved to the directory of the other inputs. otherwise it is discarded
15513:21 <MarcoFalk_> jonatack: Correct. Coverage is useful to minimize a seed corpus or even a single seed
15613:23 <MarcoFalk_> pinheadmz: The data or inputs are not sent to bitcoind they are sent to the fuzz target (which is a separate binary, but linked with all the bitcoind libraries)
15713:23 <MarcoFalk_> felixweis: correct
15813:24 <MarcoFalk_> And to answer felix's question how the initial set of inputs is created: AFL needs an initial set, which is recommended to be generated by hand. libFuzzer does not, it can start with an empty string as the first seed
15913:25 <MarcoFalk_> Any other questions?
16013:25 <felixweis> but that search space is just too enormous
16113:25 <MarcoFalk_> felixweis: Yes, the search space is generally never exhausted
16213:25 <raj_> for this the seed is "zzzii"
16313:25 <andrewtoth_> so how does this PR guide the fuzzer exactly?
16413:25 <raj_> does it mean anything, or just random?
16513:26 <MarcoFalk_> felixweis: However, with coverage data, at least it can span the relevant parts relatively quickly
16613:26 <raj_> for `utxo_total_supply` target i mean.
16713:26 <andrewtoth_> does that happen automatically in utxo_total_supply? it knows which branches have been taken?
16813:28 <MarcoFalk_> andrewtoth_: Yes, the fuzz engine is hidden in AFL or libFuzzer and with the coverage instrumentation it can figure out by itself what branches have been taken
16913:28 <MarcoFalk_> raj_: We will get into that seed later
17013:28 <raj_> okay
17113:29 <MarcoFalk_> Ok, if there are no further questions about the general idea of coverage guided fuzzing, we will jump into the idea of the CVE.
17213:29 <MarcoFalk_> How can CVE-2018-17144 be exploited?
17313:29 <pinheadmz> a single tx spending from the smae input twice
17413:29 <MarcoFalk_> Can you explain conceptually how to create an example block that exploits the bug?
17513:29 <MarcoFalk_> pinheadmz: Correct
17613:29 <pinheadmz> it happened on testnet!
17713:29 <pinheadmz> and there are still testnet nods stuck on that fork from like 18 months ago
17813:30 <andrewtoth_> but only when included in a block, will be rejected from mempool
17913:30 <MarcoFalk_> andrewtoth_: Correct
18013:30 <pinheadmz> luke-jr mentioned ot me that the bug is so bad, the bad block can't even be re-org'ed out
18113:31 <MarcoFalk_> ok, looking into the code of the newly added fuzz target.
18213:31 <MarcoFalk_> What does the new fuzz test do on a high level? What “Actions” can it run?
18313:31 <pinheadmz> so for this bug specifically, what are the benefits of a fuzz test instead of jsut an explicit test in the python suite?
18413:31 <MarcoFalk_> pinheadmz: Good q. We do have a test in the python test suite
18513:32 <MarcoFalk_> However, a fuzz test is more flexible in the sense that it can also actively search for similar CVEs
18613:33 <jonatack> Fuzzing might find cases the test-writer did not think of.
18713:33 <MarcoFalk_> The CVE showed different behavior when the duplicate inputs were created in the same block, vs created in an earlier block
18813:33 <MarcoFalk_> jonatack: correct
18913:33 <raj_> 3 actions. Crate input, Create tx and Create Block
19013:33 <MarcoFalk_> So when the python test only checks for the case where the inputs were confirmed, the fuzz test might find the case where the inputs were created in the same block
19113:34 <MarcoFalk_> Or even find a case that was not imagined by anyone
19213:34 <MarcoFalk_> raj_: Correct
19313:34 <pinheadmz> and maybe im underestimating the size of the fuzz test, bt with random data, what is the probability of getting anything useful like that? especially with such a dense protocol like bitcoin
19413:34 <raj_> jus to clarify if i got it correctly. The bug is one utxo being spent by 2 different inputs in a same tx?
19513:35 <MarcoFalk_> raj_: Yes
19613:35 <andrewtoth_> pinheadmz i believe the goal is to have these tests run continuously with lots of cpus, to find a very large space of inputs
19713:35 <jnewbery> pinheadmz: I think that goes back to your previous question about whether the fuzzed data is totally random or whether it has some structure. Perhaps we could talk about that?
19813:35 <raj_> +1
19913:35 <MarcoFalk_> jnewbery: Yes, thx
20013:36 <pinheadmz> yeah thanks
20113:36 <andrewtoth_> raj_ the bug is one tx spending the same utxo twice
20213:36 <MarcoFalk_> So is the fuzzer sending raw blocks?
20313:36 <jnewbery> no!
20413:37 <MarcoFalk_> As mentioned by raj_ the fuzzer has 3 actions. Crate input, Create tx and Create Block
20513:37 <MarcoFalk_> Why would it not be feasible to send a raw block?
20613:38 <jnewbery> It looks to me like the fuzzer is providing entropy which is being used by utxo_total_supply.cpp to generate inputs
20713:38 <MarcoFalk_> jnewbery: Correct
20813:38 <jnewbery> anything that calls fuzzed_data_provider.ConsumeIntegralInRange is taking that entropy and generating stuff with it
20913:38 <MarcoFalk_> But why is it not possible to just treat the input seed as a raw block?
21013:39 <jnewbery> one reason is that generating blocks is difficult because they need proof-of-work!
21113:39 <jnewbery> apart from all the structure that blocks have
21213:40 <MarcoFalk_> jnewbery: Correct. Also, blocks are highly structured. E.g. the transactions need to deserialize correctly and match the merkle root
21313:40 <raj_> `const auto action = static_cast<Action>(fuzzed_data_provider.ConsumeIntegralInRange(0, 2));` How is an integer being casted into the enums? is it like 0 for Create Input, 1 for Create tx?
21413:40 <MarcoFalk_> raj_: I think in c++ you can cast integers to non-scoped enums
21513:40 <pinheadmz> Structuring the entropy with minimal formatting makes sense
21613:41 <pinheadmz> so the input scripts are randomly generated though? and we just hope to brute force enough of them that something interesting happens?
21713:41 <MarcoFalk_> So looking at my example seed zzzii, what does it do?
21813:41 <MarcoFalk_> How many blocks are mined? Either explain by looking at the seed and code or obtain the result by modifying and running the fuzz test on the seed.
21913:43 <pinheadmz> 2000 blocks
22013:43 <pinheadmz> 20 * coinbase maturity (which is 100)
22113:43 <jnewbery> is "zzzii" ascii? 0x7a7a7a6969?
22213:44 <raj_> would really appreciate a walk through of this.
22313:44 <MarcoFalk_> jnewbery: Yes, this happens to be ascii. In general seeds are not ascii, though
22413:45 <MarcoFalk_> Hint: The fuzzer reads this byte by byte from the right
22513:45 <MarcoFalk_> The first read is here
22613:45 <MarcoFalk_> int64_t duplicate_coinbase_height = fuzzed_data_provider.ConsumeIntegralInRange(0, 20 * COINBASE_MATURITY);
22713:46 <pinheadmz> while (fuzzed_data_provider.remaining_bytes())
22813:46 <pinheadmz> trying to see where the data provider gets initialized
22913:47 <andrewtoth_> 105?
23013:47 <MarcoFalk_> pinheadmz: It is initialized here: https://github.com/bitcoin/bitcoin/pull/17860/files#diff-7a6cf1c54083f72e3110c6a049e26842R29
23113:47 <MarcoFalk_> andrewtoth_: why?
23213:47 <pinheadmz> and the buffer data is that 5-char string ?
23313:47 <MarcoFalk_> pinheadmz: Yes, in this case the seed is only 5-chars
23413:48 <MarcoFalk_> or let's say 5 bytes
23513:48 <andrewtoth_> ConsumeIntegralInRange just gets the first byte, checking if it's in the range?
23613:48 <andrewtoth_> so first byte is 105?
23713:48 <dude> Converts the byte to range
23813:48 <MarcoFalk_> andrewtoth_: It will read as much bytes as it needs to reach the upper range
23913:49 <pinheadmz> ah, then the int in range(0, 2) becomes a command: input / tx / block
24013:49 <pinheadmz> so its like a little opcode script !
24113:49 <MarcoFalk_> So how many bytes does it read for duplicate_coinbase_height?
24213:49 <pinheadmz> just one ?
24313:50 <MarcoFalk_> https://github.com/bitcoin/bitcoin/pull/17860/files#diff-7a6cf1c54083f72e3110c6a049e26842R88
24413:50 <MarcoFalk_> pinheadmz: What is the upper bound?
24513:50 <dude> size / 8
24613:50 <dude> I think
24713:50 <MarcoFalk_> hint: COINBASE_MATURITY is 100
24813:50 <pinheadmz> oh i see it consumes as many bytes as it needs to find something in the range (0, 2000
24913:50 <andrewtoth_> if upper bound is 20000, then upper bound is 20000?
25013:50 <MarcoFalk_> pinheadmz: Yes!
25113:50 <dude> Wait are you sure?
25213:51 <dude> I thought it modded the input to fit into the range
25313:51 <pinheadmz> so thats what 2 bytes? max value is 65535
25413:51 <MarcoFalk_> How many bytes are needed to represent a random number up to 2000?
25513:51 <MarcoFalk_> pinheadmz: Yes
25613:51 <MarcoFalk_> Ok, so the "ii" in the seed gets deserialized as the duplicate_coinbase_height
25713:52 <MarcoFalk_> What is the next read of the fuzzer?
25813:52 <andrewtoth_> ii is 6969, so that's 26985, greater than 2000 though...
25913:52 <pinheadmz> while there's bytes remaining, it grabs one at a time and runs the corresponding Action::...
26013:52 <MarcoFalk_> andrewtoth_: Yes, it is wrapped by the fuzzed data provider
26113:52 <dude> It's because it's modded https://github.com/bitcoin/bitcoin/blob/556820ee576d02528de8cc5998579b044b3666c9/src/test/fuzz/FuzzedDataProvider.h#L105
26213:52 <MarcoFalk_> dude: Correct
26313:53 <MarcoFalk_> thx for the link
26413:53 <MarcoFalk_> So what is the next place where bytes are consumed by the fuzzer?
26513:53 <raj_> so duplicate_coinbase_height is always 0-2000 right?
26613:53 <MarcoFalk_> yup, in this fuzz target
26713:54 <jnewbery> MarcoFalke: so the coinbase height is 985?
26813:54 <pinheadmz> MarcoFalk_: L112?
26913:54 <MarcoFalk_> pinheadmz: Correct
27013:54 <pinheadmz> the last 3 bytes are interpreted as actions?
27113:54 <MarcoFalk_> Hint: https://github.com/bitcoin/bitcoin/pull/17860/files#diff-7a6cf1c54083f72e3110c6a049e26842R113
27213:55 <MarcoFalk_> Jup
27313:55 <jnewbery> Jup to height or Jup to actions?
27413:55 <jonatack> hehe
27513:56 <dude> What do you think about this comment: https://github.com/bitcoin/bitcoin/pull/17860#issuecomment-571846309
27613:56 <MarcoFalk_> jnewbery: Seems plausible. Haven't checked, but you can with std::cout << duplicate_coinbase_height << std::endl; and running this seed
27713:56 <dude> Can answer later if it's a tangent
27813:56 <MarcoFalk_> So how many blocks are mined?
27913:56 <MarcoFalk_> by the seed "zzzii"?
28013:57 <pinheadmz> 985? 0x6969 % 2000
28113:57 <MarcoFalk_> Oh, that is just the duplicate coinbase height. Not used to determine the number of blocks mined.
28213:58 <MarcoFalk_> Blocks are only mined when the action is CREATE_BLOCK
28313:58 <raj_> 'z's are ascii 122. so what action they translates into??
28413:58 <raj_> thats outside (0,2)
28513:58 <pinheadmz> 122 % 2 = 0
28613:59 <pinheadmz> which is create_input
28713:59 <raj_> oh..
28813:59 <raj_> so just create input three times?
28913:59 <MarcoFalk_> 0x7a % 3 = 2
29013:59 <andrewtoth_> 0 blocks is the answer? hmm...
29113:59 <MarcoFalk_> ;)
29213:59 <pinheadmz> MarcoFalk_: since you know the seeds will be modded like this anyway, why not just use actual bytes 0x00 0x01 0x02 ?
29313:59 <jnewbery> it's got to be modded by (range + 1) because the index starts at 0
29414:00 <andrewtoth_> ahh it's modded by 3
29514:00 <pinheadmz> jnewbery: ah ty!
29614:00 <andrewtoth_> so 3 blocks are mined
29714:00 <MarcoFalk_> pinheadmz: It was created by libFuzzer, which does not know that it is being modded anyway
29814:00 <MarcoFalk_> andrewtoth_: Correct!
29914:00 <andrewtoth_> this is a confusing PR...
30014:00 <pinheadmz> its great haha
30114:00 <dude> If you just have switch statements for 0x00, 0x01, 0x02, the fuzzer will generate bytes outside of the range and not get modded
30214:00 <dude> So it's more efficient that way
30314:00 <MarcoFalk_> last q: Did anyone run the fuzzer?
30414:00 <dude> Yes
30514:00 <MarcoFalk_> Did it find the CVE?
30614:00 <raj_> yes.
30714:00 <pinheadmz> did anyone git it to run on OSX by any chance?
30814:00 <dude> Eh, not sure, was there an assert I had to disable?
30914:01 <dude> Yeah I got it running on OSX
31014:01 <andrewtoth_> i couldn't get AFL to work, similar to raj_'s comment it kept crashing
31114:01 <dude> Did you turn off AFL_NO_FORKSRV?
31214:01 <dude> or I mean "on"
31314:01 <MarcoFalk_> raj_: Nice
31414:01 <pinheadmz> dude: yah, did you use AFL or libFuzzer ?
31514:01 <dude> AFL
31614:01 <andrewtoth_> didn't know you could build and run and then just `src/test/fuzz/utxo_total_supply`, could have some docs for that
31714:01 <emzy> Yes
31814:01 <raj_> I tried with libfuzzer. Ran 5 workers for 24 hours. didint find any crash.
31914:01 <andrewtoth_> haven't tried AFL_NO_FORKSRV, but i'm running linux
32014:01 <dude> https://github.com/bitcoin/bitcoin/pull/17860#issuecomment-573726974
32114:02 <dude> Oh I see
32214:02 <raj_> 4 of them stoped at times out.
32314:02 <MarcoFalk_> Note that the CVE was fixed, so it has to be reintroduced first ;)
32414:02 <emzy> I was runnig it on linux. And I found after 2h nothing.
32514:02 <dude> Timeout can usually just be erroneous, you would need to double-check
32614:02 <MarcoFalk_> How would you reintroduce the CVE?
32714:02 <andrewtoth_> i tried running afl with -m200 but same error for me
32814:02 <raj_> got few slow-unit test artifacts. not sure what they are..
32914:02 <raj_> but no cve..
33014:03 <andrewtoth_> revert the PR that fixed it, as well I think there was a PR that removed the bool flag entirely
33114:03 <jnewbery> revert 14247
33214:03 <emzy> andrewtoth_: -m200 woked for me
33314:03 <MarcoFalk_> Yes
33414:04 <MarcoFalk_> Ok, for reference, this is the seed I found: https://paste.ubuntu.com/p/KHNBFgpxK3/
33514:04 <MarcoFalk_> Let's wrap up
33614:04 <MarcoFalk_> We can chat about technical issues on MacOS later
33714:04 <MarcoFalk_> Any last questions?
33814:04 <pinheadmz> lol at "deadly signal"
33914:04 <pinheadmz> This was really cool thanks MarcoFalk_
34014:04 <jnewbery> Thanks MarcoFalk_! Before everyone goes, what did you all think? It sounds like there were a few issues getting fuzzing set up. Would you all like to do another session on fuzzing?
34114:05 <dude> Yes! but need better docs
34214:05 <MarcoFalk_> dude: Agree
34314:05 <emzy> Yes
34414:05 <MarcoFalk_> Unfortunately I don't have MacOs, so I can't provide documenation
34514:05 <andrewtoth_> yes, more fuzzing docs please
34614:05 <dude> I have MacOS
34714:05 <andrewtoth_> i'm running on linux and couldn't get afl working
34814:05 <jonatack> Yes, I've had rewriting fuzzing.md on my list since last Spring
34914:05 <jnewbery> I think improving the docs (including for MacOS) would be an excellent contribution from any one of you
35014:05 <andrewtoth_> MarcoFalk_ is the seed 7P9p/sc7BQHvJto7OzsyZZKSkpKSklBQUFBQUFBQUFBQUFCSkpKSkpI+Pj4+Pj4+PpJQUFBQUFBQ
35114:05 <andrewtoth_> UFBQUFBQkpKSkpKSPj4+Pj4+Pj7smGH7xdR67Ho7j4M7L2VwO/tidNQvNfX19fXsmGH7xdR67Ho7
35214:05 <andrewtoth_> j4M7L2VwO/tidNQvNfX19fU=?
35314:06 <raj_> jnewbery: yes would love to. It seems like an important concept. would love to learn more. Thanks for the PR. it was a great one, have learnt a lot in last week.
35414:06 <jonatack> andrewtoth_: same, i use libFuzzer
35514:06 <jnewbery> anyone who's set up fuzzing for the first time this week is best placed to improve the docs since you've probably recently run into all the problems and gotchas
35614:06 <andrewtoth_> I'll have to get fuzzing working well first before I can document how to do it ;)
35714:06 <MarcoFalk_> jnewbery: Great suggestion
35814:07 <jnewbery> #endmeeting
35914:07 <andrewtoth_> thanks MarcoFalk_ and everyone!
36014:07 <emzy> thanks MarcoFalk_ and everyone!
36114:07 <dude> :D
36214:07 <MarcoFalk_> If there are any issues with fuzzing, please file a new issue or leave a comment in https://github.com/bitcoin/bitcoin/issues/17914
36314:08 <jonatack> thanks MarcoFalk_, excellent meeting and study resources. Learning a lot.
36414:08 <dude> MarcoFalk_, do you run afl-tmin, afl-cmin on your corpus?
36514:08 <MarcoFalk_> If you find the docs hard, please also file an issue or improve them :)