The PR branch HEAD was fa6a0084 at the time of this review club meeting.
Notes
A few weeks ago, we looked at how fuzzing can find consensus bugs, such as money printing (brrr). This week we will use fuzzing to find a remote crasher.
BIP 37 describes a way to filter transaction relay using bloom filters. For the purposes of this pull request, we only need to know the different message types it introduces and how the Bitcoin Core node processes them.
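For orientation, these are the message types BIP 37 introduces, with the constant names used in Bitcoin Core's protocol.h (the comments are a one-line summary of each, not the BIP's full semantics):

```cpp
// BIP 37 message types as declared in src/protocol.h (definitions in protocol.cpp)
namespace NetMsgType {
extern const char* FILTERLOAD;  // "filterload": install a bloom filter for this peer
extern const char* FILTERADD;   // "filteradd": add a single element to the peer's filter
extern const char* FILTERCLEAR; // "filterclear": remove the peer's filter
extern const char* MERKLEBLOCK; // "merkleblock": a block filtered down to the peer's filter
} // namespace NetMsgType
```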
CVE-2013-5700 was a vulnerability that allowed a remote peer to crash a node by sending just two P2P messages.
CNode is the data structure that represents a peer connection in Bitcoin Core.
CConnman is the connection manager in Bitcoin Core. At startup, it creates a thread to handle messages from connections. This thread, often referred to as the “main” thread in Bitcoin Core, takes care of message processing and validation. See ThreadMessageHandler.
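A heavily condensed sketch of that loop (paraphrased from net.cpp; locking, reference counting and the idle wait on the condition variable are omitted, so treat names and details as approximate):

```cpp
// Paraphrased sketch of CConnman::ThreadMessageHandler (net.cpp)
void CConnman::ThreadMessageHandler()
{
    while (!flagInterruptMsgProc) {
        for (CNode* pnode : vNodes) { // snapshot of the current connections
            if (pnode->fDisconnect) continue;
            // Hand this peer's received messages to net_processing/validation...
            m_msgproc->ProcessMessages(pnode, flagInterruptMsgProc);
            // ...and let it queue up anything it wants to send back.
            m_msgproc->SendMessages(pnode);
        }
        // (Omitted: wait until there is more work to do.)
    }
}
```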
The pull request we are looking at this week extends CConnman with several features for testing purposes:
Adding and removing nodes directly (without having to create a socket)
Pushing a serialized message onto a node’s receive buffer
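The test-only interface the PR adds for this looks roughly like the following (paraphrased; the exact names and signatures on the PR branch may differ):

```cpp
// Paraphrased sketch of the test-only CConnman extension (src/test/util/net.h)
struct ConnmanTestMsg : public CConnman {
    using CConnman::CConnman;

    // Register/unregister nodes directly, without opening any sockets.
    void AddTestNode(CNode& node);
    void ClearTestNodes();

    // Put a serialized message into the node's receive buffer, as if it had
    // arrived over the network, and hand queued messages to net_processing once.
    bool ReceiveMsgFrom(CNode& node, CSerializedNetMsg& ser_msg) const;
    void ProcessMessagesOnce(CNode& node);
};
```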
The fuzz test itself mainly does two things for each fuzz input (seed):
Add a random number of test nodes (the number of nodes is read from the fuzz input)
Pick a random peer and send it some random bytes as a message, i.e. put the bytes into the receive buffer of that peer
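Putting the two steps together, the harness is structured roughly like this (a simplified sketch reusing the helper names from the ConnmanTestMsg sketch above; `g_setup` and `MakeTestNode` are stand-ins for the harness's actual setup and node-construction code, and field names may differ slightly from the PR branch):

```cpp
// Simplified sketch of the process_message(s) fuzz harness.
#include <net.h>
#include <protocol.h>
#include <test/fuzz/FuzzedDataProvider.h>
#include <test/util/net.h>

void test_one_input(const std::vector<uint8_t>& buffer)
{
    FuzzedDataProvider provider(buffer.data(), buffer.size());
    // g_setup stands in for the testing context created once in initialize().
    auto& connman = static_cast<ConnmanTestMsg&>(*g_setup->m_node.connman);

    // 1) Add a fuzzer-chosen number of test nodes (no sockets involved).
    std::vector<CNode*> peers;
    const int num_peers = provider.ConsumeIntegralInRange<int>(1, 3);
    for (int i = 0; i < num_peers; ++i) {
        CNode* node = MakeTestNode(i); // hypothetical helper: builds a CNode with bloom filters enabled
        peers.push_back(node);
        connman.AddTestNode(*node);
    }

    // 2) Repeatedly pick a random peer, put fuzzer-generated bytes into its
    //    receive buffer as a message, and let net_processing handle it.
    while (provider.ConsumeBool()) {
        CSerializedNetMsg msg;
        msg.command = provider.ConsumeRandomLengthString(CMessageHeader::COMMAND_SIZE);
        msg.data = provider.ConsumeBytes<unsigned char>(
            provider.ConsumeIntegralInRange<size_t>(0, 4096));
        CNode& peer = *peers[provider.ConsumeIntegralInRange<size_t>(0, peers.size() - 1)];
        connman.ReceiveMsgFrom(peer, msg);
        connman.ProcessMessagesOnce(peer);
    }
    connman.ClearTestNodes();
}
```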
Questions
How would a remote node crash a Bitcoin Core node that is vulnerable to the attack? What steps would it need to take, and what messages would it send?
How was the vulnerability fixed?
Where in CNode are bloom filters for transaction relay filtering stored?
Where are bloom filter messages handled by Bitcoin Core?
Based on the previous questions, how would a fuzz test for Bitcoin Core trigger this vulnerability? What steps are needed, at a high level, to achieve this setup?
What patch needs to be applied to current Bitcoin Core to reintroduce the vulnerability?
Did you get fuzzing to run on your machine? If not, what issues did you run into?
Meeting Log
<MarcoFalke> ok, so let's get started with the net_processing vulnerability in the bloom filter that the fuzz test is trying to find. What are the messages a remote node needs to send?
<robot-visions> Send a `filterload` where the bloom filter's underlying array is empty, then send a `filteradd` with arbitrary data; see [#18515](https://github.com/bitcoin/bitcoin/pull/18515)
<robot-visions> Vulnerability was fixed by introducing `isEmpty` and `isFull` flags to return early if the Bloom filter's underlying array is empty; see [#2914](https://github.com/bitcoin/bitcoin/pull/2914)
<lightlike> It was fixed covertly, by adding a UpdateEmptyFull() function that is processed on filterload (net_processing) and sets isFull if the underlying array is empty. Filteradd will not insert anything if isFull is set
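For reference, a condensed view of the crash site and of the covert fix being discussed (paraphrased from src/bloom.h and src/bloom.cpp; see PR #2914 for the actual change):

```cpp
// Crash site: with an empty filter, vData.size() == 0 and the modulo below is
// a division by zero as soon as a later `filteradd` reaches insert() -> Hash().
inline unsigned int CBloomFilter::Hash(unsigned int nHashNum,
                                       const std::vector<unsigned char>& vDataToHash) const
{
    return MurmurHash3(nHashNum * 0xFBA4C795 + nTweak, vDataToHash) % (vData.size() * 8);
}

// Covert fix (paraphrased): track whether the filter is trivially full/empty.
// For an empty vData both loops are vacuous, so isFull ends up true and
// insert() bails out before ever reaching the modulo above.
void CBloomFilter::UpdateEmptyFull()
{
    bool full = true;
    bool empty = true;
    for (const unsigned char b : vData) {
        full &= (b == 0xff);
        empty &= (b == 0);
    }
    isFull = full;
    isEmpty = empty;
}

void CBloomFilter::insert(const std::vector<unsigned char>& vKey)
{
    if (isFull) return; // also shields the empty-filter case
    for (unsigned int i = 0; i < nHashFuncs; i++) {
        const unsigned int nIndex = Hash(i, vKey);
        vData[nIndex >> 3] |= (1 << (7 & nIndex)); // set the corresponding bit
    }
    isEmpty = false;
}
```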
<lightlike> I understand why it was fixed this way back then, but wouldn't it make sense to do something more obvious today (like also checking for zero in Hash())?
<theStack> i was quite fascinated by this covert fix. i can imagine it is really hard to fix something while needing to pretend to be doing something different
<MarcoFalke> I think theStack has a good point. If the fix for this trivial exploit is a one-line patch, it might make it easy for bad players to exploit it
<jnewbery> lightlike chanho: I agree that after some time has passed (perhaps a couple of years) these covert fixes should be cleared up and made public. Leaving them in goes against the goal of having the code as clear and straightforward as possible.
<jnewbery> to be fair, I don't think they're left in this way on purpose. I expect the person who fixes it probably just forgets and it's not their priority after two years
<MarcoFalke> Pretty much what it does is, (1) create a few CNode (with the bloomfilters initialized), then (2) pretend they are sending random messages by passing them into the ProcessMessage function
<sipa> jnewbery: an alternative is (in general) not removing covert fixes (because touching the code again introduces risks of its own), but still documenting them in the code
<andrewtoth> What is the purpose of having a random number of nodes created? Is it because it affects what they do with the messages if they have a different number of peers?
<theStack> If a constant number of nodes were sufficient, could that CNode creation code move into initialize()? Or is there any other reason it has to be created again and again for each input?
<MarcoFalke> andrewtoth: Good question. The purpose of the fuzz test was to be more than just a test to find the bloomfilter bug. It should also test other parts of the code like transaction relay, which has a lot of state
<MarcoFalke> Unfortunately, the current fuzzer can not test transaction relay because messages are constructed from something that looks like a random stream
<andrewtoth> MarcoFalke: so to test transaction relay, a separate harness that uses the fuzz inputs to generate different relay messages needs to be built?
<theStack> MarcoFalke: would you count both libFuzzer and AFL in the category of "modern fuzz engines"? or are there better ones (commercial or anything)?
<MarcoFalke> sipa: Yes. I don't know how fuzz engines work, but I could also imagine that it might dump the binary and see if there are any strings in it.
<MarcoFalke> If you inspect the seeds manually with cat, please use `cat --show-nonprinting`, otherwise it might execute arbitrary code in your terminal
<lightlike> I noticed if I stop the fuzzer (Ctrl+C) and start it again with the same seed dir, the coverage will be significantly lower than before, as if it didn't reload all the work done before. Does anyone know why this is the case? (I thought the seeds would be updated after each try)
<MarcoFalke> I think the fastest I got was 600k, but with 9 workers in parallel, so I am not sure if that counts because you can't compare it against a single worker. (The libfuzzer workers work together on the same seed corpus)
<MarcoFalke> lightlike: Good question. I believe it is because of "adjacent" coverage. Bitcoin Core runs a lot of threads (like the stale tip check in the scheduler), and those are not included in the coverage if you start afresh
<MarcoFalke> I think my greatest fear would be to find a crasher and then try to reproduce it with the same seed and it wouldn't crash because of non-determinism
<lightlike> I didn't find the bug after 10M steps and stopped, but found it after 2.2 million steps with a version in which I replaced the message-type fuzzing with an array of all known message types, for which only the index is fuzzed.
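Roughly what that variant looks like (getAllNetMessageTypes() is the existing helper in protocol.h; `provider` and `msg` follow the harness sketch in the notes and are illustrative):

```cpp
// Instead of letting the fuzzer invent raw command strings (which almost never
// form a valid message type), pick one of the known types by a fuzzed index.
const std::vector<std::string>& types = getAllNetMessageTypes();
msg.command = types[provider.ConsumeIntegralInRange<size_t>(0, types.size() - 1)];
```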
<robot-visions> MarcoFalke: I was able to get things to work by following the "macOS hints for libFuzzer", but I had to remove the `--disable-asm` flag when running `./configure`
<theStack> MarcoFalke: i had a linking issue with the http_request fuzz test, which uses internal libevent functions (evhttp_parse_firstline_) that are named slightly differently in my libevent version -- could fix it locally though, will open an issue later
<robot-visions> MarcoFalke: I think the docs are reasonable right now. It says (1) "you may need to use `--disable-asm` to avoid errors", and (2) "here's what worked on our setup (which included `--disable-asm`)". It doesn't say you *must* use that flag.
<MarcoFalke> nehan_: Good point. The warning is caused because a fuzz test was modified. To not invalidate existing seeds, the fuzz harness still needs to consume the same bytes it did previously, but now they are unused. I think this can be fixed by prefixing the line with (void)
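An illustration of that suggestion (the particular call is made up; the point is that the same bytes are still consumed while the result is explicitly discarded):

```cpp
// The harness used to use this value; after the refactor it no longer does,
// but it must still consume the bytes so existing seeds keep their meaning.
// The cast to void marks the result as intentionally unused, so the compiler
// does not warn about it.
(void)fuzzed_data_provider.ConsumeIntegral<int>();
```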
<theStack> to shortly bring up the subject of remaining covert fixes again: it can be confusing to code readers if they remain. E.g. it led to issue #16886 and its fix PR #16922 (the latter one by me), both made under the wrong assumption that the point of the empty/full flags was optimization, as we didn't know about the covert fix
<robot-visions> On a similar note to what theStack mentioned, would it make sense to now consider a peer "misbehaving" if they send an empty bloom filter?
<theStack> robot-visions: personally i think that would be fine, don't know though how strictly BIP37 is interpreted -- afair it only defines an upper filter size limit, but not a lower limit
<lightlike> theStack: I found it interesting that I didn't find any detailed description on how to crash the node using the CVE apart from the one in your PR. Did you find any, or did you just figure it out yourself?
<theStack> lightlike: i was also wondering the same. the only information i got was that it is triggered by a division by zero. then i inspected the CBloomFilter class for divisions, and since the class is not that large i could figure it out quite fast
<MarcoFalke> Obviously you can't make them public on the first day they are discovered, and if the time has passed to make them public, they might be forgotten about
<jonatack> About the "DE:" keys that prefix new messages in the fuzzer output, https://llvm.org/docs/LibFuzzer.html#output doesn't mention them, my guess is they might mean Dictionary Entry?
<lightlike> Internally to the fuzzer, there is also "PersAutoDict" that you see in the output: "Persistent dictionary modified by the fuzzer, consists of entries that led to successful discoveries in past mutations."