The PR branch HEAD was cbc077e at the time of this review club meeting.
Notes
Benchmark tests help us monitor the performance of specific parts of the code by running a piece of code a large number of times and measuring how long each run takes, as well as how stable the results are.
Benchmark tests are defined in src/bench and are based on the nanobench framework. Instructions on how to compile and run the benchmarks can be found in doc/benchmarking.md. It is highly recommended to do this before the review club meeting.
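To see what a benchmark looks like, here is a minimal sketch of a benchmark definition in the style of src/bench (illustrative only; `ExampleSort` is a made-up benchmark, and the two-argument `BENCHMARK` registration is the shape this PR introduces — the exact macro may differ at the reviewed commit):

```cpp
#include <bench/bench.h>

#include <algorithm>
#include <functional>
#include <numeric>
#include <vector>

// nanobench calls the lambda passed to bench.run() many times and
// reports timing statistics for it.
static void ExampleSort(benchmark::Bench& bench)
{
    std::vector<int> v(1000);
    bench.run([&] {
        std::iota(v.begin(), v.end(), 0);
        std::sort(v.begin(), v.end(), std::greater<int>{});
    });
}

// With this PR, a benchmark is registered together with a priority
// level instead of just its name.
BENCHMARK(ExampleSort, benchmark::PriorityLevel::HIGH);
```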
To ensure all benchmark tests work (i.e. can actually be run without failing), make check runs the bench_bitcoin binary with the -sanity-check option, which sets the number of iterations for each test to just 1.
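Under the hood, a sanity check only needs each benchmark body to execute once, so the nanobench instance can be clamped to a single measurement. A minimal sketch of the idea (`ConfigureBench` is a hypothetical helper, not the actual bench_bitcoin code path; `epochs()` and `epochIterations()` are real nanobench setters):

```cpp
#include <bench/nanobench.h>

// Sketch: when sanity checking we only care that the code runs
// without failing, not about timing quality, so one epoch with a
// single iteration is enough.
void ConfigureBench(ankerl::nanobench::Bench& bench, bool sanity_check)
{
    if (sanity_check) {
        bench.epochs(1).epochIterations(1);
    }
}
```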
During an IRC meeting (#topic acceptable runtimes of new benchmarks (achow101)), it was suggested that not all benchmarks need to be run or sanity checked all the time.
Questions
What is the initial motivation behind this proposed change? Does the approach in this PR sufficiently address that? Can you think of another, not necessarily (strictly) better, approach?
Are there any downsides to make check only running -priority-level=high benchmarks by default? Do you think the benefits outweigh the downsides?
In which sense is PriorityLevel an enum? In which sense is it a bit vector? What do you think about this approach?
Which levels can we pass to -priority-level? Are they all represented in map_priority_level? How are the other one(s) resolved? Can you think of a different approach?
Were you able to compile and run the benchmarks? How did you test that the -priority-level argument works as expected?
Which benchmarks do you think should/could be labeled as LOW or MEDIUM priority? What did you base that decision on?
Quiz: Which of the following are valid PriorityLevels: a. PriorityLevel{0x00}, b. PriorityLevel{"low"}, c. PriorityLevel::DEFAULT_PRIORITY_LEVEL, d. PriorityLevel{3}, e. PriorityLevel{4}, f. PriorityLevel{0xff}, g. static_cast<PriorityLevel>(8)?
Meeting Log
<stacie> approach ACK. the IRC meeting linked in the PR review club page was helpful in understanding the motivation for this PR (new wallet benchmark tests taking a long time to set up)
<pablomartin> compiled the code changes, changed the priorities on some benchs, ran the benchs with the diff priorities (low, high, med, all and a combination)
<kouloumos> I understand how priority levels can be useful; what I think I haven't grasped yet is the tradeoff between not running all the benchmarks and the time saved
<stacie> I like that this PR preserves existing behavior (running all the current benchmarks) but it does lead to an important question of what makes a benchmark test worth running during make check.
<pablomartin> classifying all the benchmarks as high, as the PR does, doesn't make any difference to the current behaviour... I guess perhaps some of the current benchmarks would have their priority classification re-evaluated/discussed
<stickies-v> stacie: the CI is pretty slow feedback when you're developing, it takes a while for tests to fail. Having `make check` fail locally is much much faster
<kouloumos> stickies-v: couldn't we catch those segfaults, along with other regressions, during unit and functional tests? Do benchmarks offer any other benefit on that front?
<stickies-v> amovfx_ kouloumos: unit tests would catch segfaults in the functions they're testing, but the benchmark tests themselves can also have segfaults that we'd ideally catch before they're merged into master
<kouloumos> pablomartin: I believe that's not the case when we run them with `make check`, because of the `-sanity-check` flag, which only tests that they run; we don't measure performance at that point
<stickies-v> Why do we need to `static_cast<PriorityLevel>` the return value when `operator|(PriorityLevel a, PriorityLevel b)`'s return type is `PriorityLevel` already?
<stacie> I have a question about that line, the PR author furszy says this code "can make the software crash if I pass a combination of tags that isn't declared (e.g. "medium, low") and call PriorityToString(levels) (at the assertion point)." https://github.com/bitcoin/bitcoin/pull/26158#discussion_r982590159 Is this concern still valid?
<kouloumos> Also, a note, for anyone interested in this kind of stuff, nanobench (the benchmarking framework we are using) is a relatively new addition, and it's maintained by a Bitcoin Core contributor. It was added with https://github.com/bitcoin/bitcoin/pull/18011
<stickies-v> to which the answer is: neither scoped nor unscoped enums can be implicitly converted _from_ their underlying type, which is why we use the explicit `static_cast` here
<stickies-v> mmm I wouldn't call it essentially an int, it's an enum and every enum has an underlying type, in this case it's a uint8_t because we don't need a larger int
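Putting those answers together, the pattern under discussion looks roughly like this (a sketch, not line-for-line PR code): the enum has a fixed `uint8_t` underlying type with one bit per level, and `operator|` needs an explicit cast because `a | b` promotes to `int`, which does not implicitly convert back to the enum:

```cpp
#include <cstdint>

enum PriorityLevel : uint8_t {
    LOW = 1 << 0,
    MEDIUM = 1 << 1,
    HIGH = 1 << 2,
};

constexpr PriorityLevel operator|(PriorityLevel a, PriorityLevel b)
{
    // The bitwise OR of two uint8_t values yields an int, and neither
    // scoped nor unscoped enums implicitly convert *from* an integral
    // type, hence the explicit cast.
    return static_cast<PriorityLevel>(static_cast<uint8_t>(a) | static_cast<uint8_t>(b));
}
```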
<furszy> stacie: on the latest push, changed the PriorityToString function to incrementally add the priority types to the string instead of only looking up the enum value in the `map_priority_level` map (which doesn't cover every possible combination).
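That incremental construction could look roughly like the following (a sketch of the approach furszy describes, reusing the `PriorityLevel` enum sketched above; the real function's exact shape may differ):

```cpp
#include <cstdint>
#include <string>

// Sketch: build the label flag by flag, so that any OR-combination of
// levels (e.g. LOW | MEDIUM) produces a readable string instead of
// requiring every combination to exist in a lookup map.
std::string PriorityToString(uint8_t priority)
{
    std::string str;
    if (priority & PriorityLevel::LOW) str += "low,";
    if (priority & PriorityLevel::MEDIUM) str += "medium,";
    if (priority & PriorityLevel::HIGH) str += "high,";
    if (!str.empty()) str.pop_back(); // drop the trailing comma
    return str;
}
```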
<pablomartin> if I understood the questions correctly... PriorityLevel is used to make it easy for users/devs to understand and set up... internally, the mappings and the resolution of "all" (so that it includes any new categories that could be added) use the bit vector for practicality
<stickies-v> do you think it's confusing that we use the PriorityLevel both to represent an actual priority level (the enum use case), as well as the aggregation of multiple priority levels (the bit vector approach)?
<stickies-v> amovfx_: but I'd argue an enum is conceptually entirely different to a bitvector? an enum item is meant to represent one item of a limited, specified set of items
<stickies-v> personally, I'd prefer keeping the PriorityLevel enum for actual priority levels, and just using a uint8_t whenever we want to represent the aggregation - that way the intent is more clear imo
<stickies-v> stacie: and then also choices for all combinations of all the options :D in which case, you might as well just have people check off multiple boxes instead
<stacie> stickies-v: yes! same energy as trying to pick a radio button or a check box for a UI. It didn't initially stand out as confusing to me, but I see the discrepancy now. I'm just not sure if as a developer, I've been trained to think a certain way haha
<stickies-v> alright next question: Which levels can we pass to `-priority-level`? Are they all represented in `map_priority_level`? How are the other one(s) resolved? Can you think of a different approach?
<furszy> stickies-v: could be, but.. we might be over-engineering it with that. The same behavior is being used in the sources already for the net processing ServiceFlags field.
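One way the string-to-level resolution can work (a sketch reusing the enum above; the PR's actual `map_priority_level` contents and its handling of "all" may differ): "low", "medium" and "high" live in the map, while "all" is resolved separately as the OR of every defined level.

```cpp
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>

const std::map<std::string, PriorityLevel> map_priority_level{
    {"low", PriorityLevel::LOW},
    {"medium", PriorityLevel::MEDIUM},
    {"high", PriorityLevel::HIGH},
};

// Sketch: "all" is not in the map, so it is special-cased here.
uint8_t StringToPriority(const std::string& str)
{
    if (str == "all") {
        return PriorityLevel::LOW | PriorityLevel::MEDIUM | PriorityLevel::HIGH;
    }
    const auto it{map_priority_level.find(str)};
    if (it == map_priority_level.end()) {
        throw std::runtime_error("Unknown priority level: " + str);
    }
    return it->second;
}
```

A comma-separated argument such as `-priority-level=low,medium` would then just OR the per-token results together.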
<kouloumos> Talking about those transformation functions, I'm still not sure about the usefulness of PriorityToString. What's the benefit of using it instead of hardcoding the strings in the help message and the default value?
<stickies-v> kouloumos: then you'd be hardcoding that in 2 separate places which incurs the risk of people forgetting to update both locations in future updates
<stickies-v> Which of the following are valid PriorityLevels? a. PriorityLevel{0x00} | b. PriorityLevel{"low"} | c. PriorityLevel::DEFAULT_PRIORITY_LEVEL | d. PriorityLevel{3} | e. PriorityLevel{4} | f. PriorityLevel{0xff} | g. static_cast<PriorityLevel>(8)
<stickies-v> c. is invalid because `DEFAULT_PRIORITY_LEVEL` is a const defined in the `benchmark` namespace, so it's `benchmark::DEFAULT_PRIORITY_LEVEL` - it's not a member of `PriorityLevel`
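For reference, here is how each quiz option behaves under a fixed-underlying-type enum like the sketch above (my annotations under C++17 rules, not from the meeting log):

```cpp
auto a = PriorityLevel{0x00};           // valid: list-init from an integral constant (C++17), no narrowing
// auto b = PriorityLevel{"low"};       // invalid: a string literal cannot initialize an enum
// auto c = PriorityLevel::DEFAULT_PRIORITY_LEVEL; // invalid: not an enumerator of PriorityLevel
auto d = PriorityLevel{3};              // valid: equals LOW | MEDIUM, even without a named enumerator
auto e = PriorityLevel{4};              // valid: same value as HIGH
auto f = PriorityLevel{0xff};           // valid: fits in uint8_t, so no narrowing
auto g = static_cast<PriorityLevel>(8); // valid: explicit cast within the underlying type's range
```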