Bench: add priority level to the benchmark framework (tests)

https://github.com/bitcoin/bitcoin/pull/26158

Host: stickies-v  -  PR author: furszy

The PR branch HEAD was cbc077e at the time of this review club meeting.

Notes

  • Benchmark tests help us monitor the performance of specific parts of the code by running it a large number of iterations and measuring how long it takes to run that piece of code, as well as how stable the results are.

  • Benchmark tests are defined in src/bench and are based on the nanobench framework. Instructions on how to compile and run the benchmarks can be found in doc/benchmarking.md. It is highly recommended to do this prior to the review club.

  • To ensure all benchmark tests work (i.e. can actually be ran without failing), make check runs the bench_bitcoin binary with option -sanity-check which sets the number of iterations for each test to just 1.

  • During an IRC meeting (#topic acceptable runtimes of new benchmarks (achow101)), it was suggested that not all benchmarks need to be run or sanity checked all the time.

Questions

  1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK?

  2. What is the initial motivation behind this proposed change? Does the approach in this PR sufficiently address that? Can you think of another, not necessarily (strictly) better, approach?

  3. Are there any downsides to make check only running -priority-level=high benchmarks by default? Do you think the benefits outweigh the downsides?

  4. Why do we need to static_cast<PriorityLevel> the return value when operator|(PriorityLevel a, PriorityLevel b)’s return type is PriorityLevel already?

  5. In which sense is PriorityLevel an enum? In which sense is it a bit vector? What do you think about this approach?

  6. Which levels can we pass to -priority-level? Are they all represented in map_priority_level? How are the other one(s) resolved? Can you think of a different approach?

  7. Were you able to compile and run the benchmarks? How did you test that the -priority-level argument works as expected?

  8. Which benchmarks do you think should/could be labeled as LOW or MEDIUM priority? What did you base that decision on?

  9. Quiz: Which of the following are valid PriorityLevels:

    • a. PriorityLevel{0x00}
    • b. PriorityLevel{"low"}
    • c. PriorityLevel::DEFAULT_PRIORITY_LEVEL
    • d. PriorityLevel{3}
    • e. PriorityLevel{4}
    • f. PriorityLevel{0xff}
    • g. auto static_cast<PriorityLevel>(8)

Meeting Log

  117:00 <stickies-v> #startmeeting
  217:00 <stacie> hi!
  317:00 <emzy> hi
  417:00 <pablomartin> hello!
  517:00 <amovfx_> hi!
  617:00 <araujo88> hello
  717:00 <kouloumos> hi
  817:00 <furszy> hi
  917:00 <stickies-v> hi everyone! today we're reviewing https://bitcoincore.reviews/26158, authored by furszy who is joining us here today too, whoo!
 1017:00 <theStack> hi
 1117:00 <LarryRuane> hi
 1217:01 <brunoerg> hhi
 1317:01 <brunoerg> hi*
 1417:01 <stickies-v> anyone joining us for the first time today?
 1517:01 <araujo88> I am
 1617:01 <stickies-v> feel free to say hi, and either lurk or participate as much as you want to
 1717:02 <araujo88> ok thanks
 1817:02 <stickies-v> very glad you could make it araujo88 , hope you'll enjoy today's club!
 1917:03 <amovfx_> welcome arauho88
 2017:03 <amovfx_> welcome *araujo88
 2117:03 <furszy> stickies-v: thanks for hosting it!
 2217:03 <araujo88> thanks everyone, I'm hoping to contribute to as much as I can
 2317:03 <stickies-v> first up as per usual, who's been able to have a look at the notes/questions or the PR? (y/n)
 2417:03 <pablomartin> y
 2517:04 <emzy> n
 2617:04 <kouloumos> yy
 2717:04 <stacie> 50%. reviewed PR but not the questions you'll be asking today
 2817:04 <amovfx_> y
 2917:04 <LarryRuane> n
 3017:05 <theStack> n
 3117:05 <stickies-v> okay a few people with newly gained expertise here, nice - that should make for a good discussion
 3217:05 <araujo88> y
 3317:06 <stickies-v> for those that reviewed it, what are your initial thoughts? would you give it a Concept ACK, approach ACK, tested ACK, or NACK?
 3417:06 <amovfx_> Concept , tested ACK
 3517:06 <amovfx_> I like the idea
 3617:06 <pablomartin> tested ACK, haven't added the proper comments on the pr yet
 3717:07 <stickies-v> nice!! how did you do your testing amovfx_ pablomartin ?
 3817:07 <stacie> approach ACK. the IRC meeting linked in the PR review club page was helpful in understanding the motivation for this PR (new wallet benchmark tests taking long to set up)
 3917:08 <amovfx_> I changed some of the benchmarks to low and medium
 4017:08 <pablomartin> compiled the code changes, changed the priorities on some benchs, ran the benchs with the diff priorities (low, high, med, all and a combination)
 4117:08 <amovfx_> then rand the commands
 4217:08 <kouloumos> I understand how priority levels can be useful, what I think I haven't grasp yet is the tradeoff between not running all the benchmarks and time saved
 4317:09 <kouloumos> Although I guess, the introduction of priorities allows for the introduction of complex benchmarks in the future
 4417:09 <pablomartin> oh, forgot to mentioned, recompiled only the bench runner to verify the behaviour
 4517:09 <amovfx_> I think that needs to be measured as bench marks havn't been bucketed yet
 4617:09 <stickies-v> that's a useful test, try and see how it will operate when this feature is actually being used
 4717:09 <stickies-v> kouloumos covers one of the next questions already, but we might as well dive into it now already
 4817:10 <stickies-v> what are people's thoughts on the trade-offs of this PR? should we or should we not always run all the benchmarks?
 4917:10 <kouloumos> oups
 5017:11 <stickies-v> maybe a better question to start: why are we even running all the benchmarks with `make check` in the first place?
 5117:11 <stacie> I like that this PR preserves existing behavior (running all the current benchmarks) but it does lead to an important question of what makes a benchmark test worth running during make check.
 5217:11 <amovfx_> They should all be run, but I can see the need for this during development time when a specific feature is being changed
 5317:12 <amovfx_> +1 stacie
 5417:12 <pablomartin> classifying all the benchmarks as high, as the pr does, doesn't make any difference the current behaviour... I guess perhaps some of the current benchmarks would be re-evaluated/ discussed their priority classification
 5517:12 <amovfx_> stickies-v: Is it for coverage? Or regression testing?
 5617:12 <stacie> I think it's fair to ask, if we skip some benchmark tests, why not skip them all and let the CI take care of it?
 5717:12 <kouloumos> I would say regression testing
 5817:13 <stickies-v> pablomartin: exactly, this PR shouldn't change any behaviour - it allows changing behaviour in the future
 5917:14 <stickies-v> stacie: the CI is pretty slow feedback when you're developing, it takes a while for tests to fail. Having `make check` fail locally is much much faster
 6017:15 <stickies-v> is running the benchmarks important for compilation-time errors or runtime-errors?
 6117:15 <stacie> stickies-v ah, that makes sense!
 6217:16 <amovfx_> I"m going to guess run time
 6317:16 <amovfx_> cause bitcoin_bench wouldn't have built in the first place?
 6417:17 <stickies-v> amovfx_: yes! all the benchmarks are already compiled with `make`, but they don't catch certain runtime errors like segfaults etc
 6517:17 <amovfx_> osom
 6617:18 <amovfx_> I would have thought the unit tests would be catching things like seg faults
 6717:18 <pablomartin> benchmarks are for performance to measure all the crytographic algorithms
 6817:18 <amovfx_> only the crypto algorithms?
 6917:18 <pablomartin> regression tests that also run by default during the "make check" would catch the compilationd and runtime errors...
 7017:18 <kouloumos> stickies-v couldnt we catch those sefgaults, along with other regressions during unit and functional tests? Do benchmarks offer any other benefit on that front?
 7117:19 <pablomartin> amovfx_: sorry, not only... also: rolling bloom filter, coins selection, thread queue, wallet balance.
 7217:20 <stickies-v> amovfx_ kouloumos: unit tests would catch segfaults in the functions they're testing, but the benchmark tests themselves can also have segfaults that we'd ideally catch before they're merged into master
 7317:21 <kouloumos> pablomartin: I believe that's not the case when we run them with `make check`, because of the`-sanity-check` flag, which is only to test if they are running, we don't measure performance at that point
 7417:21 <pablomartin> true, it's sanity, they run only once
 7517:22 <stickies-v> i'll move on to the next questions already but feel free to keep discussing the previous question - we're async!
 7617:22 <stickies-v> Why do we need to `static_cast<PriorityLevel>` the return value when `operator|(PriorityLevel a, PriorityLevel b)`'s return type is `PriorityLevel` already?
 7717:22 <stickies-v> (link: https://github.com/bitcoin-core-review-club/bitcoin/blob/cbc077e77a5d0ba8ae11b1f5d89480c997fdef55/src/bench/bench.h#L50)
 7817:23 <michaelfolkson> hi
 7917:23 <amovfx_> T is a complete enumeration type and PriorityLevel is implicitly convertible.?
 8017:23 <pablomartin> hi michaelfolkson
 8117:24 <stickies-v> amovfx_: is _not_ implicitly convertible, you mean?
 8217:24 <furszy> extra note, we do have a benchmark that actually tests the duplicate inputs error
 8317:25 <furszy> introduced https://github.com/bitcoin/bitcoin/pull/14400
 8417:25 <amovfx_> I thought PrioirtyLevel would be cast to uint8_t
 8517:26 <stacie> I have a question about that line, the PR author furszy says this code "can make the software crash if I pass a combination of tags that isn't declared (e.g. "medium, low") and call PriorityToString(levels) (at the assertion point)." https://github.com/bitcoin/bitcoin/pull/26158#discussion_r982590159 Is this concern still valid?
 8617:26 <stickies-v> that would be true for an unscoped enum, but PriorityLevel is a scoped enum
 8717:26 <amovfx_> ah
 8817:26 <stickies-v> the question was actually also about the opposite conversion, where we already have an int and need to cast it back into a PriorityLevel
 8917:26 <furszy> stacie: not on the latest push
 9017:27 <kouloumos> Also, a note, for anyone interested in this kind of stuff, nanobench (the benchmarking framework we are using) is a relatively new addition, and it's maintained by a Bitcoin Core contributor. It was added with https://github.com/bitcoin/bitcoin/pull/18011
 9117:27 <amovfx_> stickies_v: I thought that automatically happens, as PrioirtyLevel inherits from uint8_t
 9217:27 <amovfx_> so it just takes ints
 9317:27 <stickies-v> to which the answer is: neither scoped nor unscoped enums can be implicitly converted _from_ their underlying type, which is why we use the explicit `static_cast` here
 9417:27 <amovfx_> PrioirtyLevel is esseintailly an int
 9517:28 <amovfx_> stickies-v: TIL
 9617:28 <pablomartin> stickies-v ok, thanks
 9717:28 <stacie> furszy: ok ty!
 9817:28 <amovfx_> is the underlying type in this case int?
 9917:28 <stickies-v> mmm I wouldn't call it essentially an int, it's an enum and every enum has an underlying type, in this case it's a uint8_t because we don't need a larger int
10017:28 <stickies-v> and unsigned because we use it as a bitvector (more on that later)
10117:28 <amovfx_> ah good good
10217:29 <amovfx_> I understand now
10317:29 <stickies-v> okay turns out "later" is "now" already hah, moving on to the next question:
10417:29 <stickies-v> In which sense is `PriorityLevel` an enum? In which sense is it a bit vector? What do you think about this approach?
10517:30 <furszy> stacie: on the latest push, changed the PriorityToString function to incrementally add the priority types to the string instead of only looking the enum value on the `map_priority_level` map (which isn't covering every possible combination).
10617:31 <amovfx_> It's a bitvector when it is being used as an argument in a bench mark
10717:31 <amovfx_> err
10817:31 <amovfx_> enum sry
10917:33 <amovfx_> a bitvector when it is used in things like PriorityToString
11017:33 <kouloumos> It's a really cool approach, I've noticed that we are using it in other places in the codebase as well.
11117:33 <amovfx_> where values are compared
11217:34 <amovfx_> and the & and | operators are invoked
11317:34 <stacie> furszy: clever! that is a good solution
11417:35 <pablomartin> if understood the questions correctly... prioritylevel it's used to make it easy to understand/ setup for user/ dev... internally the mappings and the "all" priorities resolution to include the new categories that could be added uses the bit vector for practicality
11517:35 <stickies-v> amovfx_: yep exactly, the only place where it really is an enum is where we define the priority level of each benchmark
11617:37 <stickies-v> do you think it's confusing that we use the PriorityLevel both to represent an actual priority level (the enum use case), as well as the aggregation of multiple priority levels (the bit vector approach)?
11717:38 <amovfx_> Nope, I can't really see it being done another way really
11817:38 <amovfx_> bitvectors are perfect for that imo
11917:40 <amovfx_> allows for the system to be expanded really easy too, we can go to PRIORITYLEVEL::MAX and PRIORITYLEVEL::ULTRA
12017:40 <stickies-v> amovfx_: but I'd argue an enum is conceptually entirely different to a bitvector? an enum item is meant to represent one item of a limited, specified set of items
12117:40 <furszy> pablomartin: yes, simpler to only define the primitive types, then use a bitmap to accept any combinations of them.
12217:40 <amovfx_> I can see that point of view
12317:41 <stickies-v> personally, I'd prefer keeping the PriorityLevel enum for actual priority levels, and just using a uint8_t whenever we want to represent the aggregation - that way the intent is more clear imo
12417:41 <stacie> I can see how that would be confusing. It's kind of like having a multiple choice question with options for a, b, c, and then "all the above"
12517:42 <amovfx_> I can also see this enum thing being made a general utility class too
12617:42 <stickies-v> stacie: and then also choices for all combinations of all the options :D in which case, you might as well just have people check off multiple boxes instead
12717:42 <pablomartin> stickies-v i see your point now, i didnt get it at firt from your note on the pr
12817:43 <furszy> stickies-v: the cool thing about the scoped enum is the removal of all the "static_cast"
12917:44 <furszy> still, if it's confusing, then all the static_cast could be re-added.
13017:44 <stacie> stickies-v: yes! same energy as trying to pick a radio button or a check box for a UI. It didn't initially stand out as confusing to me, but I see the discrepancy now. I'm just not sure if as a developer, I've been trained to think a certain way haha
13117:44 <stickies-v> furszy: yeah I agree, that makes for clean code. but that can also be done in a separate class?
13217:45 <stickies-v> alright next question: Which levels can we pass to `-priority-level`? Are they all represented in `map_priority_level`? How are the other one(s) resolved? Can you think of a different approach?
13317:45 <stickies-v> (link: https://github.com/bitcoin-core-review-club/bitcoin/blob/cbc077e77a5d0ba8ae11b1f5d89480c997fdef55/src/bench/bench.cpp#L47-L51)
13417:46 <amovfx_> any combination and amount of low,medium,high
13517:46 <amovfx_> all is defualyt
13617:46 <amovfx_> *default
13717:46 <furszy> stickies-v: could be, but.. we might be over-engineering it with that. The same behavior is being used in the sources already for the net processing ServiceFlags field.
13817:48 <stickies-v> amovfx_: technically we can also pass low,medium,high,all
13917:48 <stickies-v> (or any combination thereof)
14017:48 <amovfx_> all is recognized?
14117:49 <amovfx_> thats cool i guess
14217:49 <pablomartin> yes
14317:49 <stickies-v> well that's the next part of the question! anyone's got an answer to that?
14417:50 <pablomartin> stickies-v I'm not sure about the last one "g."
14517:50 <kouloumos> it's recognized because of the if clause in stringtopriority and prioritytostring functions
14617:50 <amovfx_> is all DEFAULT_PRIORIT_LEVEL?
14717:50 <stacie> I think it's somewhere in bench.cpp RunAll() but my C++ isn't good enough to find the exact line
14817:50 <pablomartin> yes also
14917:51 <pablomartin> stacie it's where kouloumos said above
15017:51 <amovfx_> ah yes, I see now, thanks
15117:51 <stacie> we're looking for the line of code that runs all benchmark tests if a priority isn't specified right?
15217:51 <stickies-v> kouloumos: stacie yeah it's manually defined in https://github.com/bitcoin-core-review-club/bitcoin/blob/cbc077e77a5d0ba8ae11b1f5d89480c997fdef55/src/bench/bench.cpp#L55
15317:52 <amovfx_> well, if priority isnt specified, it gets the default value
15417:52 <pablomartin> true
15517:53 <kouloumos> Talking about those transformation functions, I'm still not sure about the usefulness of PriorityToString. What's the benefit of using it instead of hardcoding the strings in the help message and the default value?
15617:54 <stacie> oh I see it now, ty!
15717:55 <kouloumos> To add more context, it's only used here https://github.com/bitcoin-core-review-club/bitcoin/blob/cbc077e77a5d0ba8ae11b1f5d89480c997fdef55/src/bench/bench_bitcoin.cpp#L33-L35 and here https://github.com/bitcoin-core-review-club/bitcoin/blob/cbc077e77a5d0ba8ae11b1f5d89480c997fdef55/src/bench/bench_bitcoin.cpp#L130
15817:55 <stickies-v> kouloumos: then you'd be hardcoding that in 2 separate places which incurs the risk of people forgetting to update both locations in future updates
15917:57 <stickies-v> alright let's quickly look at the quiz before we have to wrap up
16017:58 <amovfx_> all but b, 1011111?
16117:58 <stickies-v> Which of the following are valid PriorityLevels? a. PriorityLevel{0x00} | b. PriorityLevel{"low"} | c. PriorityLevel::DEFAULT_PRIORITY_LEVEL | d. PriorityLevel{3} |e. PriorityLevel{4} | f. PriorityLevel{0xff} | g. auto static_cast<PriorityLevel>(8)
16217:59 <pablomartin> not sure about "g."
16317:59 <amovfx_> oh wait not g or d
16417:59 <theStack> amovfx_: that would be also be my guess
16517:59 <kouloumos> a, e
16618:00 <amovfx_> 0xff is 8 or 16?
16718:00 <stickies-v> 0xff is 256
16818:00 <amovfx_> or wait thats 255?
16918:00 <amovfx_> derp
17018:00 <amovfx_> I think any ints are default priority levels
17118:00 <stickies-v> *255 sorry
17218:01 <amovfx_> but they wont work in the code, if you force them in, some wont be found in the maps
17318:01 <stickies-v> okay there are more valid ones than invalid ones, so we'll cover those
17418:01 <amovfx_> because anything greater thatn 4 and not 255 is out of the enum members
17518:01 <amovfx_> err not an enum member
17618:01 <stickies-v> b. is invalid because we can't construct a PriorityLevel from string, we have StringToPriority for that
17718:01 <amovfx_> aye
17818:02 <pablomartin> right
17918:02 <stickies-v> c. is invalid because `DEFAULT_PRIORITY_LEVEL`is a const defined in `benchmark`, so it's `benchmark::DEFAULT_PRIORITY_LEVEL` - it's not a member of `PriorityLevel`
18018:02 <amovfx_> o shit
18118:02 <stickies-v> everything else is valid!
18218:02 <pablomartin> nice
18318:02 <amovfx_> I thought that would evaluate to 0xff
18418:03 <amovfx_> but yea, its not a member
18518:03 <amovfx_> Well shit, this was excellent
18618:03 <amovfx_> learned lots
18718:03 <stickies-v> alright, let's wrap it up here - sorry for going slightly over time!
18818:03 <stickies-v> #endmeeting