Compact block harness (tests)

https://github.com/bitcoin/bitcoin/pull/33300

Host: marcofleon  -  PR author: Crypt-iQ

Notes

  • This PR adds a fuzz harness for testing compact block relay, along with a few test infrastructure changes to improve speed and stability. Earlier review clubs on fuzzing include: #17860, #18521, #21142, #30605.

  • The harness simulates P2P network behavior by having three peers send various messages (CMPCTBLOCK, BLOCKTXN, HEADERS, SENDCMPCT, TX) to the test node, triggering its compact block logic.

  • The goal is to find bugs and ensure assertions hold. See #33296 for an example of a bug found by this very fuzz test. Nice!

Compact Blocks Protocol Overview

  • BIP 152 introduced the compact block relay protocol, which reduces bandwidth and latency when propagating blocks by sending an encoded version of a block under the assumption that most of the transactions are already in our peer’s mempool.

  • Compact blocks consist of:
    • A header
    • Short transaction identifiers (6 bytes instead of 32 bytes)
    • Prefilled transactions
      • A vector where each entry is a differentially-encoded uint16_t index and the complete serialized transaction.
      • The current code only prefills the coinbase. However, the protocol allows for the inclusion of other transactions that we might expect our peer is missing. See this TODO.
  • The receiving node reconstructs the block using transactions from its mempool. If any transactions are missing, it requests them via a GETBLOCKTXN message.

  • There are two modes:
    • low-bandwidth: block sent only after peer requests it via INV/GETDATA
    • high-bandwidth: block sent to peers without request as soon as it is received, even before full validation is completed

Fuzz Testing

  • Fuzzing is a testing technique that provides quasi-random inputs to code to discover bugs and unexpected behavior. Fuzzers use feedback like code coverage as a guide for mutation of those inputs, allowing them to explore deeper paths within the target code.

  • See our fuzzing docs to start fuzzing with libFuzzer and AFL++.

  • Check out this PR’s test coverage (blockencodings.cpp and net_processing.cpp are probably most useful to look at). Walking through a coverage report to see which branches are hit or not hit is a good way to start evaluating a fuzz harness.

  • An effective fuzz test should be deterministic, to make bugs reproducible. If you’re able to get the AFL++ fuzzer running, you will see a stability metric. The goal is to get to 100%, perfect determinism! After gathering some inputs, you can also run the fuzz determinism script over the corpus to check for potential sources of instability.

  • The faster (more iterations per second) a fuzz harness is, the more quickly it will discover new code paths and potential bugs. You’ll see this metric as exec/s when fuzzing. Any operation that significantly slows down an iteration (like reading from and writing to disk) should be optimized or avoided if possible.

Fun Exercise

  • If you get the cmpctblock harness running, revert PR #33296 and see if you can reproduce the crash on this Assume statement.

  • Yeah I know, fuzzing is cool.

Questions

  1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK? What was your review approach? Were you able to get the fuzz test running?

  2. Where in the codebase are the main compact block helpers and processing logic? Name some of the relevant classes and functions. (Hint: nothing like a little search for NetMsgType::CMPCTBLOCK to get you started)

  3. The fuzz test sends SENDCMPCT messages with high_bandwidth randomly set. How many high bandwidth peers are allowed and does the fuzz harness test this limit? More generally, why would a peer choose to be high or low bandwidth?

  4. Compare -testdatadir and the new -fuzzcopydatadir. Why is the latter useful for performance, and why isn’t a fresh TestingSetup that mines blocks on each iteration acceptable here?

  5. Look at create_block in the harness. How many transactions do the generated blocks contain, and where do they come from? What compact block scenarios might be missed with only a few transactions in a block?

  6. In commit 254e13c FinalizeHeader is moved to util.h so it can be used by any fuzz test. What is the purpose of this function and why doesn’t it slow down the fuzzer?

  7. Commit ed813c4 sorts m_dirty_blockindex by block hash instead of pointer address. What non-determinism does this fix? The author notes this slows production code for no production benefit. Why can’t EnableFuzzDeterminism() be used here? How do you think this non-determinism should be best handled (if not the way the PR currently does)?

  8. The harness adds prefilled transactions to a compact block beyond the coinbase. Walk through the index calculations for prefilling transactions at indices 1 and 3 in a 4-transaction block. Do you think this code could be simplified or cleaned up, and if so, what do you suggest?