Add make check-valgrind to run the unit tests under Valgrind (tests)

https://github.com/bitcoin/bitcoin/pull/17639

Host: jonatack  -  PR author: practicalswift

Notes

Definitions

  • Undefined Behavior (UB): there are no restrictions on the behavior of the program if certain rules of the language are violated. Examples of undefined behavior are uninitialized scalars, memory access outside of array bounds, signed integer overflow, null pointer dereference, access to an object through a pointer of a different type, etc. Compilers are not required to handle undefined behavior (although many simple situations are diagnosed), and the compiled program is not required to do anything meaningful.

  • Uninitialized variables are Undefined Behavior, a typical outcome of which is reading stale stack or register values. In the best case, it generates a pseudorandom result which in edge cases might be the expected result or fixed by the compiler. In the worst case, an exploit may be created that leaks secrets by allowing use of an attacker-controlled value.

  • MemorySanitizer (MSan) (wiki) is a tool that detects uninitialized reads. It consists of a compiler instrumentation module and a run-time library. Typical slowdown is 3x and memory usage is 2x. MSan is supported on Linux/NetBSD/FreeBSD and requires compiling and linking all the code in your program, including libraries it uses and the C++ standard library, with -fsanitize=memory -fPIE -pie.

  • Valgrind is a debugging and profiling tool suite to make programs faster and more correct. Its most popular tool, Memcheck, can detect memory-related errors common in C and C++ programs that can lead to crashes and unpredictable behavior. Here is a tutorial. Memcheck is not perfect: typical slowdown is 3-10x, memory usage is 2x, it can produce false positives (there are mechanisms for suppressing these; see the valgrind.supp file in Bitcoin Core), and it doesn’t currently detect out-of-range reads or writes to arrays allocated statically or on the stack.

Events

  • On 10 November 2019, bitcoinVBR opened PR 17433 “fix nMinerConfirmationWindow not initialized”. The PR was misunderstood as a style change and closed.

  • On 12 November, bitcoinVBR opened a similar change in PR 17448 “remove unused variable- consensus.nMinerConfirmationWindow”. Despite a bad merge by the author while making the PR, the issue had became clear.

  • On 14 November, a clean third version was merged, PR 17449 “fix uninitialized variable nMinerConfirmationWindow”. It was for this 2-line fix that the Bitcoin Core 0.19.0.1 patch release was made – an unusual event.

  • What was the issue? An uninitialized read in PR 16713, merged 27 September 2019, that got past review by more than a half dozen reviewers (and also your host today).

  • How did this flaw escape detection?
  • On 30 October, PR 15921 – a review club PR with quite a few ACKs – was merged which contained another uninitialized read. Bitcoin Core contributor practicalswift found the issue by compiling Bitcoin with -fsanitize=memory and with valgrind, which both caught the issue. It was fixed in PR 17624.

  • On 29 November, practicalswift opened PR 17633 to enable running the functional tests with valgrind memcheck, which was recently merged. Note that the valgrind --exit-on-first-error=yes option is available since version 3.14 on Linux. Valgrind support on macOS may be more limited. A PR is open to automate setting that exit flag when running functional tests with valgrind.

  • On 30 November, practicalswift opened today’s PR 17639 to enable running the unit tests with valgrind memcheck.

What can we do to mitigate uninitialized variables?

  • Recommended reading: the discussion in PR 17627 “Suppress false positive warning about uninitialized entropy buffers”.

  • Test/fuzz/review better… or just write perfect code.
    • We do these things (test, fuzz, and review), but the recent issues show that it’s not enough.
    • Attackers may find what programmers don’t think of, and what fuzzers don’t hit.
  • Compile with -Werror=uninitialized or -Wconditional-uninitialized and fix all uninitialized variables in the codebase.
    • A fair amount of code to change.
    • The value chosen to initialize doesn’t necessarily make sense.
    • Code may not express intent anymore.
    • Similarly, static analysis isn’t smart enough.
  • Use MemorySanitizer or valgrind.
    • MSan requires compiling everything in the process, which is cumbersome.
    • Can’t deploy in production.
    • Several times slower and memory-hungry.
    • Resource exhaustion can generate false positives.
  • Pre-initialize variables with dummy values, e.g. compile with Clang -ftrivial-auto-var-init=pattern.
    • Raises testing sensitivity concerns.
    • If a flaw is introduced, it may be undetectable by valgrind.
    • Compilers can warn when they’re certain a value will be undefined, and pre-emptive dummy initialization suppresses those warnings.
    • Valgrind has special macros that can be used to mark memory as undefined. It may be best if dummy initialization were to be always done via a macro that would allow disabling it for testing, or valgrind annotating it.

Sources/resources (credit to practicalswift for most of these)

Questions

  1. Did you review the PR? Concept ACK, approach ACK, ACK <commit>, or NACK?  Were you able to run the unit tests with valgrind? Don’t forget to put your PR review on GitHub or ask questions.

  2. Have you tried reproducing and catching the uninitialized read errors that were found?

  3. Which options do you use – or plan to begin using – to catch these?

  4. What would be the best way to generalize detecting uninitialized variables reliably in Bitcoin Core?

  5. Extra credit: can you improve the PR to run the tests in parallel in a way that prints generated suppressions to the user and does not potentially introduce false positives due to resource exhaustion?

Meeting Log

  118:33 <jonatack> Hi everyone, we'll get started a little under a half hour
  218:34 <jonatack> About the session notes today: if you have any corrections or improvements to suggest, please share!
  318:35 <jonatack> I'd like the notes, questions, and discussion to be a resource that people can use or come back to on this subject.
  419:00 <jonatack> #startmeeting
  519:00 <MarcoFalke> hi
  619:00 <fanquake> hi
  719:00 <emzy> Hi
  819:00 <jonatack> Hi all! Welcome to this week's episode of the Bitcoin Core PR Review club!
  919:00 <amiti> hi
 1019:00 <jnewbery_6> hi
 1119:00 <jonatack> #topic Today we are looking at PR 17639, "Add make check-valgrind to run the unit tests under Valgrind (tests)".
 1219:00 <jonatack> We usually start Bitcoin Core IRC meetings with a 'hi' so it's clear who's at keyboard. Feel free to say hi, even if you arrive in the middle of the meeting!
 1319:00 <jkczyz> hi
 1419:00 <jonatack> Please jump in at any point with thoughts and questions. Don't be shy! This discussion is about your thoughts and input.
 1519:00 <jonatack> If you didn't have the chance to read the notes yet, I encourage you to do so.
 1619:00 <michaelfolkson> hi
 1719:00 <jonatack> I think they provide a good background summary for this discussion and for reviewing the PR.
 1819:00 <jonatack> For today's episode, I thought we could begin by talking about review of this PR.
 1919:00 <jonatack> And then springboard from the PR out to a larger discussion on preventing errors like this
 2019:01 <jonatack> both on an individual level of integrating helpful tools into our testing and reviewing workflow
 2119:01 <michaelfolkson> Excellent
 2219:01 <michaelfolkson> *notes
 2319:01 <jonatack> and on the broader level of the Bitcoin Core project itself
 2419:01 <kanzure> hi
 2519:01 <jonatack> which is certainly an issue that the maintainers are thinking about
 2619:01 <jonatack> That said, feel free to digress, comment, and ask questions at any time without waiting. Just jump in and go for it. The more the merrier!
 2719:01 <jonatack> michaelfolkson: ty
 2819:02 <jonatack> #topic Did you review the PR? Thoughts? Concept ACK, approach ACK, ACK <commit>, or NACK?
 2919:02 <jonatack> Were you able to run the unit tests with valgrind?
 3019:03 <jonatack> For me, they start but then seem stalled at "Running 387 test cases...", so i need to debug that
 3119:03 <michaelfolkson> Not on Mac OS. Getting set up on Ubuntu now
 3219:04 <amiti> I was v disappointed to find out that valgrind is incompatible with catalina
 3319:04 <jonatack> yes, macOS users get better clang and worse valgrinding, iiuc
 3419:04 <emzy> ACK. But I had to remove "--exit-on-first-error=yes --error-exitcode=1" from the makefile. Because my debian 10 has only valgrind-3.12.0.SVN
 3519:05 <jonatack> emzy: good point, that flag is v3.14 and later
 3619:05 <emzy> jonatack: right
 3719:06 <jonatack> i was surprised that the unit tests hang, because the functional tests run for me with the --valgrind flag (though they time out and i have to fiddle with the suppresions file)
 3819:06 <jonatack> suppressions*
 3919:06 <fanquake> emzy how are you installing it? Debian 10 has Valgrind 3.14: https://packages.debian.org/buster/valgrind
 4019:07 <jonatack> note that there is a good first issue opened by MarcoFalke to fix the functional test valgrind timeouts to make it more useable
 4119:08 <emzy> fanquake: my bad. I have Debian GNU/Linux 9.11 (stretch) on that machine... I have to update.
 4219:08 <MarcoFalke> Jup, it is for the ci/travis run. Looks like someone is working on it already: https://github.com/bitcoin/bitcoin/pull/17770
 4319:09 <andrewtoth_> hi
 4419:10 <jonatack> nice. the GFI tag seems to work well.
 4519:10 <jonatack> emzy: so today's PR worked for you after rm-ing that flag? How long did the unit tests take to run?
 4619:11 <emzy> jonatack: not shure. I will run it now, and time it.
 4719:12 <michaelfolkson> Will they be done by the end of this meeting?
 4819:12 <MarcoFalke> I'd say no
 4919:13 <jonatack> emzy: i'm interested, because reading the manual and searching online i saw varying perf figures: 3x, 10x, 30-40x, 50x, and up to 300x slowdowns
 5019:13 <jonatack> it's module-dependent, but here we're interested in the default memcheck perf, i think
 5119:14 <emzy> I think it was about 20 min. But I was not watching :)
 5219:14 <emzy> I was building without qt.
 5319:14 <jonatack> emzy: be sure to put your review on GH when you can.
 5419:15 <emzy> ok
 5519:15 <jonatack> topic Have you tried reproducing and catching the uninitialized read errors that were found?
 5619:15 <jonatack> #topic Have you tried reproducing and catching the uninitialized read errors that were found?
 5719:15 <emzy> no
 5819:17 <MarcoFalke> I think both of them need a node that has finished IBD
 5919:17 <jonatack> I did a few days ago while reviewing one of practicalswift's valgrind PRs. MarcoFalke: yes i was out of IBD.
 6019:18 <jonatack> Thanks to help from MarcoFalke and practitalswift. The main issue for me was the suppressions file needed additions.
 6119:20 <michaelfolkson> One of them needed booting up on mainnet to replicate I read but not full IBD. Made me feel bad about always running tests on regtest
 6219:20 <jonatack> Question: which of the tools mentioned in the notes do you like using for your own workflow, or plan to adopt?
 6319:22 <jonatack> e.g. MSan, valgrind, -Werror= / -Wconditional=, etc
 6419:22 <jonatack> each of them seems to have tradeoffs
 6519:23 <jonatack> MSan is cumbersome to set up and use AFAICT
 6619:23 <jonatack> Valgrind seems even slower than MSan
 6719:23 <michaelfolkson> MSan works on Mac?
 6819:24 <michaelfolkson> It appears no
 6919:24 <amiti> I tried using the memory sanitizer on mac & ran into issues as well
 7019:25 <jonatack> MSan compat: in https://clang.llvm.org/docs/MemorySanitizer.html it says Linux/NetBSD/FreeBSD
 7119:25 <jonatack> I'm learning here too
 7219:26 <jonatack> amiti: can you describe the issues you encountered?
 7319:26 <michaelfolkson> Regardless these tools are intense. Are we moving to a world with two tiers of testers? Most that don't run these tools and a small minority that do?
 7419:26 <amiti> I'm trying to remember ...
 7519:27 <jonatack> I have not yet tried MSan, only valgrind and various clang compile flags
 7619:27 <emzy> would it be enough to run Valgrind on one OS. Or have you also check on others?
 7719:28 <jonatack> practicalswift did a tutorial here on installing MSan in a Bitcoin Core issue: https://github.com/bitcoin/bitcoin/issues/17620#issuecomment-559156227
 7819:28 <jonatack> he walked provoostenator through it to help get him set up
 7919:29 <michaelfolkson> Or we try to identify which tests should be run with Valgrind, MSan and which don't need to be...
 8019:29 <MarcoFalke> emzy: In theory it would have to be run on each shipped binary (thus each OS), because slightly different code is compiled for each OS
 8119:29 <MarcoFalke> michaelfolkson: All tests should be run in valgrind, msan, ... etc :)
 8219:30 <emzy> ok makes sense. could be that there are OS specific code segments.
 8319:31 <michaelfolkson> MarcoFalke: But there could be two tiers of testers? A non-Valgrind/MSan tested ACK and a Valgrind tested ACK?
 8419:31 <MarcoFalke> emzy: Jup
 8519:31 <jonatack> MarcoFalke: how feasible does it seem to have CI run these (assuming the CI provider works properly...)
 8619:32 <MarcoFalke> jonatack: It already does :)
 8719:32 <jonatack> oh?
 8819:32 <emzy> nice
 8919:32 <jonatack> Things move fast on Bitcoin Core :)
 9019:32 <MarcoFalke> Heh, this is on current master: https://travis-ci.org/bitcoin/bitcoin/jobs/626430212
 9119:32 <michaelfolkson> Ah
 9219:33 <MarcoFalke> It wraps all binaries in valgrind (see the wrap-valgrind step and https://github.com/bitcoin/bitcoin/blob/master/ci/test/wrap-valgrind.sh )
 9319:34 <MarcoFalke> Then runs the tests as normal (make check && ./test/functional/test_runner.py)
 9419:34 <MarcoFalke> Well, not all functional tests
 9519:34 <MarcoFalke> See https://github.com/bitcoin/bitcoin/pull/17770
 9619:35 <emzy> "make check-valgrind" took 21 minutes for me. "./configure --enable-hardening --without-gui --disable-wallet"
 9719:36 <jonatack> emzy: that's reasonable, something i could run on critical PRs
 9819:37 <emzy> I will try again on Debian 10. Maybe the newer valgrind is slower.
 9919:38 <jonatack> MarcoFalke: Is there a CI build that runs memory checks than valgrind? MSan, Werror, etc?
10019:38 <jonatack> other than*
10119:38 <MarcoFalke> not yet
10219:38 <jonatack> What plans?
10319:39 <MarcoFalke> I presume the -Werror=uninitialized had a lot of false positives?
10419:39 <MarcoFalke> MSan is a subset of what valgrind does, so it doesn't seem worth it
10519:40 <MarcoFalke> Source: https://github.com/google/sanitizers/wiki/MemorySanitizer (search for "subset")
10619:41 <jonatack> This article suggests some advantages of sanitizers over valgrind https://lemire.me/blog/2019/05/16/building-better-software-with-better-tools-sanitizers-versus-valgrind/
10719:42 <jonatack> MarcoFalke's link: "MSan implements a subset of functionality found in Valgrind (Memcheck tool). It is significantly faster than Memcheck (TODO:benchmark)."
10819:43 <jonatack> #topic Is it feasible to run the unit tests in parallel with valgrind?
10919:43 <MarcoFalke> I mean I won't object someone contributing a ci run that builds with msan
11019:43 <MarcoFalke> jonatack: Yes, travis does it
11119:44 <MarcoFalke> See my link above
11219:45 <MarcoFalke> Takes about 30 minutes (with wallet unit tests, etc), but it works
11319:45 <jonatack> I recall the Valgrind manual has a section about running in parallel
11419:45 <jonatack> MarcoFalke: great!
11519:45 <fanquake> 27m26.144s to run make check-valgrind inside a debian container on my machine
11619:46 <fanquake> That's with minimum depends, so skipping a bunch of tests
11719:47 <jonatack> I'd like to get in the habit of running individual unit tests with valgrind. For example, after working on a new test, doing a final valgrind check before pushing.
11819:50 <michaelfolkson> I have a question on initialization. I'm trying to reconcile the advice from the video, basically "always initialize local variables" and Greg's "No constant is an acceptable random value"
11919:51 <michaelfolkson> Greg's advice is Bitcoin specific right? Because a constant would be more dangerous than an uninitialized variable?
12019:51 <jonatack> MarcoFalke: I don't recall false positives with the (few) times I tried -Werror but it was with the --enable-werror flag, which might be different?
12119:52 <jonatack> michaelfolkson: good question
12219:52 <MarcoFalke> michaelfolkson: How is valgrind supposed to find the uninitizlied variable when it was initialized by you?
12319:53 <MarcoFalke> And how do you know what value to inizialize to when the value is supposed to be passed in or parsed from somwhere else?
12419:54 <michaelfolkson> MarcoFalke: It isn't. But that's not the only reason we're using Valgrind?
12519:54 <jonatack> michaelfolkson: as you probably remarked, i recapped gregory's thoughts in the notes about pre-initialising with dummy values, but i need to re-watch the video with those caveats in mind
12619:54 <michaelfolkson> Greg's comment here: https://github.com/bitcoin/bitcoin/pull/17627#issuecomment-562387426
12719:55 <michaelfolkson> And the video I'm referring to is here: https://www.youtube.com/watch?v=I-XUHPimq3o
12819:55 <michaelfolkson> From Jon's notes
12919:55 <MarcoFalke> michaelfolkson: no, we use valgrind for other checks as well (e.g. use-after-free, I believe)
13019:55 <MarcoFalke> or whatever else valgrind checks for
13119:56 <jonatack> the video went into which are the optimal dummy values for different types, and the question of pattern, random, fixed dummy vals
13219:56 <jonatack> 3 minutes!
13319:56 <jonatack> any last thoughts?
13419:57 <MarcoFalke> So picking a constant as a random value is horribly wrong
13519:57 <MarcoFalke> Picking a constant for a user preference (or similar) is less horrible, but still wrong
13619:57 <michaelfolkson> Yup. That would surely apply to all software that relies on randomization not just Bitcoin
13719:58 <michaelfolkson> Maybe I misunderstood the video but "always initialize local variables" seemed to be the advice
13819:58 <MarcoFalke> It is not only about random values. Any value that is parsed/read/input from somewhere else (i.e. not the code)
13919:59 <jonatack> OK let's wrap up. Feel free to continue after of course.
14019:59 <jonatack> #action review this PR, ask questions, keep learning, keep sharpening your tools
14119:59 <jonatack> #action see you next year!
14219:59 <jonatack> #endmeeting
14319:59 <michaelfolkson> OK cool. Many thanks jonatack and everyone else
14420:00 <MarcoFalke> Thanks for hosting jonatack!
14520:00 <michaelfolkson> Have a great Christmas :)
14620:00 <MarcoFalke> Happy holidays everyone!
14720:00 <jonatack> Thanks everyone, and ty to maintainers MarcoFalke and fanquake for joining us.
14820:00 <emzy> Thanks and Happy holidays everyone!
14920:01 <amiti> thanks all ! thanks for the notes & hosting jonatack
15020:02 <jonatack> ty amiti :)