Compact block harness (tests)
https://github.com/bitcoin/bitcoin/pull/33300
Host: marcofleon -
Notes
-
This PR adds a fuzz harness for testing compact block relay, along with a few test infrastructure changes to improve speed and stability. Earlier review clubs on fuzzing include: #17860, #18521, #21142, #30605.
-
The harness simulates P2P network behavior by having three peers send various messages (CMPCTBLOCK, BLOCKTXN, HEADERS, SENDCMPCT, TX) to the test node, triggering its compact block logic.
-
The goal is to find bugs and ensure assertions hold. See #33296 for an example of a bug found by this very fuzz test. Nice!
Compact Blocks Protocol Overview
-
BIP 152 introduced the compact block relay protocol, which reduces bandwidth and latency when propagating blocks by sending an encoded version of a block under the assumption that most of the transactions are already in our peer’s mempool.
- Compact blocks consist of:
- A header
- Short transaction identifiers (6 bytes instead of 32 bytes)
- Prefilled transactions
- A vector where each entry is a differentially-encoded
uint16_t
index and the complete serialized transaction. - The current code only prefills the coinbase. However, the protocol allows for the inclusion of other transactions that we might expect our peer is missing. See this
TODO
.
- A vector where each entry is a differentially-encoded
-
The receiving node reconstructs the block using transactions from its mempool. If any transactions are missing, it requests them via a
GETBLOCKTXN
message. - There are two modes:
- low-bandwidth: block sent only after peer requests it via
INV
/GETDATA
- high-bandwidth: block sent to peers without request as soon as it is received, even before full validation is completed
- low-bandwidth: block sent only after peer requests it via
Fuzz Testing
-
Fuzzing is a testing technique that provides quasi-random inputs to code to discover bugs and unexpected behavior. Fuzzers use feedback like code coverage as a guide for mutation of those inputs, allowing them to explore deeper paths within the target code.
-
See our fuzzing docs to start fuzzing with libFuzzer and AFL++.
-
Check out this PR’s test coverage (
blockencodings.cpp
andnet_processing.cpp
are probably most useful to look at). Walking through a coverage report to see which branches are hit or not hit is a good way to start evaluating a fuzz harness. -
An effective fuzz test should be deterministic, to make bugs reproducible. If you’re able to get the AFL++ fuzzer running, you will see a stability metric. The goal is to get to 100%, perfect determinism! After gathering some inputs, you can also run the fuzz determinism script over the corpus to check for potential sources of instability.
-
The faster (more iterations per second) a fuzz harness is, the more quickly it will discover new code paths and potential bugs. You’ll see this metric as exec/s when fuzzing. Any operation that significantly slows down an iteration (like reading from and writing to disk) should be optimized or avoided if possible.
Fun Exercise
-
If you get the
cmpctblock
harness running, revert PR #33296 and see if you can reproduce the crash on thisAssume
statement. -
Yeah I know, fuzzing is cool.
Questions
-
Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK? What was your review approach? Were you able to get the fuzz test running?
-
Where in the codebase are the main compact block helpers and processing logic? Name some of the relevant classes and functions. (Hint: nothing like a little search for
NetMsgType::CMPCTBLOCK
to get you started) -
The fuzz test sends
SENDCMPCT
messages withhigh_bandwidth
randomly set. How many high bandwidth peers are allowed and does the fuzz harness test this limit? More generally, why would a peer choose to be high or low bandwidth? -
Compare
-testdatadir
and the new-fuzzcopydatadir
. Why is the latter useful for performance, and why isn’t a freshTestingSetup
that mines blocks on each iteration acceptable here? -
Look at
create_block
in the harness. How many transactions do the generated blocks contain, and where do they come from? What compact block scenarios might be missed with only a few transactions in a block? -
In commit 254e13c
FinalizeHeader
is moved toutil.h
so it can be used by any fuzz test. What is the purpose of this function and why doesn’t it slow down the fuzzer? -
Commit ed813c4 sorts
m_dirty_blockindex
by block hash instead of pointer address. What non-determinism does this fix? The author notes this slows production code for no production benefit. Why can’tEnableFuzzDeterminism()
be used here? How do you think this non-determinism should be best handled (if not the way the PR currently does)? -
The harness adds prefilled transactions to a compact block beyond the coinbase. Walk through the index calculations for prefilling transactions at indices 1 and 3 in a 4-transaction block. Do you think this code could be simplified or cleaned up, and if so, what do you suggest?