logging: use std::bitset for categories (utils/log/libs)

https://github.com/bitcoin/bitcoin/pull/26697

Host: larryruane  -  PR author: LarryRuane

The PR branch HEAD was 40e1923e856661fdf68cf783ed9c6d1931dfbdf3 at the time of this review club meeting.

Notes

  • The logging facility of bitcoind writes debug and informational messages to debug.log at the top level of the data directory. A brief description can be found in the developer notes.

  • LogPrintf() writes to debug.log unconditionallly. These messages are considered important enough to always be written.

  • LogPrint() takes a category as its first argument, and only writes to debug.log if the specified category is enabled. The list of categories is defined by enum LogFlags.

  • The user can enable a logging category, for example NET (p2p messages) by specifing the configuration option debug=net. The strings are defined by [LogCategories](https://github.com/bitcoin/bitcoin/blob/0c2a1288a210916254c939d080ce12ffc5547841/src/logging.cpp#L150) but the mapping from the enum category symbols to strings is trivial.

  • The logging categories can also be enabled and disabled at runtime using the logging RPC.

  • Multiple logging categories can be enabled simultaneously by specifying multiple debug= config options.

  • PR #9424 changed the logging categories from a set of strings (like "net") to a enumeration type (NET).

Questions

  1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK?

  2. What does this PR do, and what is the problem it proposes to address?

  3. What are the advantages of std::bitset compared with bit manipulations on integers?

  4. An earlier attempt, PR #26619, was closed in favor of the current PR. What are some advantages and disadvantages of each approach?

  5. There are two special logging categories, ALL and NONE. Why are these needed?

  6. Logger::m_categories was wrapped by std::atomic. Why was that needed? Why is it no longer needed?

  7. Following on to that question, some concern was expressed on the PR about performance, and the need for benchmarking. Why would this be important?

  8. An unknown P2P message does not cause a LogPrintf() message. Why not? Wouldn’t it be helpful to see if a peer is sending us a message we don’t understand?

  9. Does the same principle apply to RPC, REST, or ZMQ requests?

  10. The Logger::WillLogCategory is annotated with const; what does this mean? How is it possible to modify the object by locking m_cs?

Meeting Log

  117:00 <LarryRuane> #startmeeting
  217:00 <effexzi> Hi every1
  317:00 <stickies-v> hi
  417:00 <emzy> hi
  517:00 <kouloumos> hi
  617:00 <brunoerg> hi
  717:00 <Yaz> hi
  817:00 <lightlike> hi
  917:00 <codo> hi
 1017:00 <LarryRuane> Hi everyone, please feel free to say hi
 1117:01 <LarryRuane> If you just want to "lurk" that's fine too!
 1217:01 <rozehnal_paul> hi
 1317:01 <LarryRuane> Is anyone here new to review club? Please feel free to introduce yourself
 1417:01 <Yaz> Hi, my name is Yazid, an Industrial Engineer, looking to strengthen myself in bitcoins source code:)
 1517:02 <LarryRuane> Reminder, if anyone is interested in hosting review club, just let me or @stickies-v or @glozow know, that would be great!
 1617:02 <dzxzg> Hi, second time at review club, just here to watch
 1717:02 <LarryRuane> Yaz: hi! welcome, glad you're here!
 1817:02 <coreyphillips> hi. I'm new. Will mostly lurk for now.
 1917:02 <LarryRuane> dzxzg: welcome to you too, glad you're back! :)
 2017:02 <b_101> hello every1
 2117:03 <LarryRuane> hello @coreyphillips great to have you here, feel free to lurk! Or ask questions!
 2217:03 <LarryRuane> Today's review is: https://bitcoincore.reviews/26697
 2317:04 <LarryRuane> We'll go through the questions, but feel free to continue discussing anything we've already covered
 2417:04 <LarryRuane> So let's begin, Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK?
 2517:04 <stickies-v> Approach ACK
 2617:05 <codo> Tested ACK
 2717:05 <emzy> Concept and tested ACK (not good at C++) :)
 2817:05 <LarryRuane> Oh I should also ask, does anyone have any questions or comments about the Notes? Anything you'd like to add?
 2917:06 <codo> The last part of the last Q I did not understand, but maybe I will if we come to that.
 3017:06 <glozow> For people who reviewed, how did you go about doing so?
 3117:07 <emzy> I first git clone it and compile it. Read the conversation and test it. Then try to understand a bit the code.
 3217:08 <roze_paul> I just read over the file changes and comments, going back to maxwell's PR in late 2016...but i didn't test...AFAIK all the testing added amounts to one in-line code (asserts)
 3317:08 <emzy> try to figure out what to test. I'm more testing then doing code review.
 3417:09 <emzy> That is manual testing.
 3517:09 <LarryRuane> good! Did anyone try the `logging` rpc? It's cool how you can enable and disable logging categories on the fly
 3617:09 <codo> I wrote down what I did in a comment in the PR: https://github.com/bitcoin/bitcoin/pull/26697#issuecomment-1387281337
 3717:09 <kouloumos> I haven't finish looking at the implementation yet, but what I started with is gathering context about how the change came to be, other related logging changes, other usages of bit operation in the codebase and looked a bit into the performance
 3817:09 <emzy> Yes, did that.
 3917:10 <emzy> Tested the RPC logging enable and disable on the fly. Very usefull.
 4017:10 <LarryRuane> those all are great! It's something I struggle with myself, how to go about reviewing a PR (there are so many things one can do)
 4117:10 <LarryRuane> let's go to Q2, What does this PR do, and what is the problem it proposes to address?
 4217:11 <codo> There was a limit on the number of possible logging categories. This PR removes that limit.
 4317:11 <roze_paul> it extends the number of log-topics available to an arbitrary amount by using the std::bitset function
 4417:12 <LarryRuane> roze_paul: yes, although it's actually a type
 4517:12 <emzy> Before it there was a bit set or unset to set the categories. That was limitet by the size of the interger used.
 4617:13 <LarryRuane> did you notice that `std::bitset` has a fixed size (number of bits)? the fact that the number of bits is specified within angle brackets indicates that
 4717:14 <LarryRuane> https://github.com/bitcoin-core-review-club/bitcoin/commit/40e1923e856661fdf68cf783ed9c6d1931dfbdf3#diff-21abb6b14af1e9330a6f0c89a87231035a439248c556ef5e110eb0617b88a1f4R77 (`ALL` is a constant)
 4817:14 <LarryRuane> emzy: +1
 4917:14 <brunoerg> So, `ALL` defines the maximum size?
 5017:15 <LarryRuane> yes, which is ... a little confusing! but it probably makes sense to do that identifier rather than create another one to indicate size
 5117:16 <brunoerg> makes sense
 5217:16 <LarryRuane> Q3 What are the advantages of std::bitset compared with bit manipulations on integers?
 5317:17 <andrew_m_> it has more internal methods?
 5417:17 <roze_paul> we get to get rid of the manual bitshifting code...which i don't understand why this is such a huge advantage, but it was stated in the notes as an advantage
 5517:17 <b_101> will size (ALL) wll change everytime a new log category gets added?
 5617:17 <emzy> It will be grow with more options for categories.
 5717:17 <lightlike> code looks cleaner, no need to manage the bits of an integer with "1 << 4" and such.
 5817:17 <roze_paul> other than making the code more concise
 5917:17 <roze_paul> +1 b_101
 6017:17 <LarryRuane> b_101: yes, but automatically
 6117:18 <LarryRuane> lightlike: yes, there may be fewer code conflicts to resolve (like if two PRs allocate the same bit)... although those aren't too hard to resolve
 6217:18 <emzy> I think it make the code more high level, without the bit shifting.
 6317:19 <roze_paul> another adv: we now get to utilize std::bitset's built-in functions like set() and reset()
 6417:19 <LarryRuane> here's what @lightlike is referring to: https://github.com/bitcoin-core-review-club/bitcoin/commit/40e1923e856661fdf68cf783ed9c6d1931dfbdf3#diff-21abb6b14af1e9330a6f0c89a87231035a439248c556ef5e110eb0617b88a1f4L44
 6517:19 <LarryRuane> emzy: yes, I like that about it too
 6617:20 <roze_paul> i can't recall if the version we are replacing also used a form of set and reset() ??
 6717:20 <LarryRuane> roze_paul: yes, I think those are conceptually more clear than `0` and `~0` :)
 6817:20 <kouloumos> test() is also cool
 6917:20 <LarryRuane> roze_paul: i think it does bitwise and (`&`) and or (`|`)
 7017:21 <LarryRuane> also for me, the type `uint32_t` is generic, just seeing that type doesn't indicate what how it's being used
 7117:21 <brunoerg> seems elegant set() and reset() for enabling and disabling categories, it's a good reason..
 7217:22 <LarryRuane> although that (`uint32_t` being too generic) could be improved with `using LoggingFlags = uint32_t` (or `typedef uint32_t LoggingFlags`)
 7317:22 <LarryRuane> (i think `using` is preferred)
 7417:22 <emzy> set() and reset() sounds more like C++ than C to me. :)
 7517:23 <LarryRuane> on the other hand, `std::bitset` may reveal too much of the underlying representation... but I don't think so because could be _conceptual_ bits (flags)
 7617:23 <emzy> So it fits better in my mind.
 7717:23 <LarryRuane> emzy: yes me too
 7817:24 <codo> is it also more portable?
 7917:24 <LarryRuane> codo: good point, it's definitely more abstract, so yes, I'd say more portable (hadn't thought of that)
 8017:25 <roze_paul> by portable, we mean between architectures && machines?
 8117:26 <LarryRuane> (i'll go on but again, feel free to keep discussing previous questions)
 8217:26 <LarryRuane> we kind of covered this already, but Q4: An earlier attempt, PR #26619, was closed in favor of the current PR. What are some advantages and disadvantages of each approach?
 8317:26 <LarryRuane> roze_paul: right, although i would say `uint32_t` is definitely the same everywhere
 8417:27 <LarryRuane> https://github.com/bitcoin/bitcoin/pull/26619
 8517:27 <emzy> I think the only change was using a bigger integer. So only extend it to 64 options.
 8617:28 <roze_paul> re. q4: the previous approach (26619) required less work, in that there was no approach change, just a change from 32 to 64 bit integers...in that sense it probably required less testing and review, which is an advantage
 8717:28 <brunoerg> #26619 you're just increasing the limit but not making it flexible like the new approach?
 8817:28 <LarryRuane> emzy: yes, smaller diff, less review and risk
 8917:29 <LarryRuane> brunoerg: right, it's conceivable that we could need more than 64 logging categories in the future (new subsystems or make them more fine-grained)
 9017:29 <LarryRuane> make existing categories more fine-grain
 9117:30 <LarryRuane> roze_paul: +1
 9217:31 <LarryRuane> I think another advantage of the closed 26619 is that we can still use std::atomic (I couldn't figure out how to wrap `std::bitset` with `std::atomic` but maybe there's a way)
 9317:32 <kouloumos> although "levels" seems to be targeting that fine-graining, right?
 9417:33 <LarryRuane> kouloumos: yes, I guess that can serve that purpose too, good point (i personally don't like levels, I've always like categories only)
 9517:33 <LarryRuane> let's try Q5 There are two special logging categories, ALL and NONE. Why are these needed?
 9617:34 <roze_paul> @larry do you use trace levels, to get all the data, and then filter the logging data yourself...just thinking of a way to work around using levels..
 9717:35 <roze_paul> Q5: not entirely sure, but i think calling all will turn on all logging topics, and none will turn of (bitset.reset) all topics...ALL also conveys the number of total number of topics, if one wants that info
 9817:36 <LarryRuane> Yes you could do that, I think the default is to get all the logging for a given category
 9917:37 <LarryRuane> well I think `ALL` being the total number of categories is just a code-internal thing, not anything the user is aware of
10017:37 <LarryRuane> For Q5, i would say `debug=all` is more convenient than `-debug=net -debug=tor -debug=mempool ...`
10117:38 <roze_paul> most definitely more convenient
10217:38 <brunoerg> it's also easier to enable and disable all categories
10317:38 <LarryRuane> the functional tests do this, because they assume that by default you would want to see all categories, AND, there's one other reason, anyone know why?
10417:38 <LarryRuane> brunoerg: +1 that's a good point
10517:38 <kouloumos> I think `NONE` indicates the unconditional logging done with `LogPrintf()`
10617:39 <LarryRuane> kouloumos: i think that would be `ALL`, rather than `NONE`
10717:40 <LarryRuane> well I'm not sure, maybe that's wrong, need to think about it!
10817:41 <LarryRuane> notice you can't write `LogPrint(ALL, ...)` because those calls are always specified to a particular category
10917:42 <kouloumos> I think that's what this implies https://github.com/bitcoin/bitcoin/blob/0c2a1288a210916254c939d080ce12ffc5547841/src/logging.h#L236
11017:43 <LarryRuane> anyway, I think the other reason the functional tests enable all categories is to test the `LogPrint` calls! It would be bad if you enabled a category, and the `LogPrint` dereferenced a null pointer or something and crashed the process
11117:44 <LarryRuane> kouloumos: Oh i see, you're right! thanks
11217:45 <LarryRuane> we kind of touched on this already, but Q6 `Logger::m_categories` was wrapped by `std::atomic`. Why was that needed? Why is it no longer needed?
11317:45 <roze_paul> i believe we replaced std::atomic with a rw-lock?
11417:46 <kouloumos> interesting! I've seen that for functional tests there are some logging categories that we ignore, could this become be an issue if such regression occurs for those categories?
11517:46 <LarryRuane> roze_paul: yes but just a regular mutex lock, not a read-write lock
11617:47 <roze_paul> @LarryRuane that's the StdLockGuard scoped_lock ?
11717:47 <LarryRuane> kouloumos: good point!
11817:47 <LarryRuane> roze_paul: correct
11917:48 <kouloumos> I think it was needed because of concurrency due to different components wanting to access the logger. I was curious why it was now replaced with locks, but I think you already touched why.
12017:49 <LarryRuane> kouloumos: yes, I couldn't figure out how to wrap a `std::bitset` variable within `std::atomic` (but probably worth trying harder!)
12117:50 <LarryRuane> good transition to Q7 Following on to that question, some concern was expressed on the PR about performance, and the need for benchmarking. Why would this be important?
12217:51 <LarryRuane> i see that @kouloumos just posted some really helpful benchmark results to the PR: https://github.com/bitcoin/bitcoin/pull/26697#pullrequestreview-1253717862
12317:51 <kouloumos> Cause logging is a common operation, so a slowdown could have a significant impact
12417:52 <LarryRuane> kouloumos: right, especially if the relevant logging category is _not_ enabled (if enabled, probably performance isn't of much concern)
12517:52 <emzy> If I read that corectly than there is no slow down from the change. I'm right?
12617:53 <LarryRuane> that's why I think LoggingNoCategory is the most important of those results
12717:53 <LarryRuane> no, there's a significant slowdown in LoggingNoCategory
12817:54 <LarryRuane> but it's hard to say how important that difference is (is it a drop in the ocean compared with a half-drop in the ocean? Or is it a drop in the glass of water?) ... depends on the surrounding code
12917:54 <emzy> Oh I see.
13017:55 <LarryRuane> Q8 An unknown P2P message does not cause a LogPrintf() message. Why not? Wouldn’t it be helpful to see if a peer is sending us a message we don’t understand?
13117:55 <kouloumos> Also looking into those benchmarks, I was wondering how significant are the results of such a logging benchmark. They are benchmarking the performance of a single invocation, is this actually a good metric?
13217:55 <codo> re Q8: The daemon should ignore any unknown messages so it can't be DOS'sed.
13317:55 <brunoerg> codo: +1
13417:55 <LarryRuane> yes I think benchmarks attempt to isolate just one particular area of the code, so I think that's good
13517:56 <brunoerg> DoS is a good explanation
13617:56 <b_101> can the change of locking method of m_categories cause an impact?
13717:56 <kouloumos> +1 they were concerns about such a logging attack in the past https://github.com/bitcoin/bitcoin/issues/21559
13817:56 <LarryRuane> codo: yes! that's exactly what i had in mind... what would be the nature of the DoS?
13917:56 <roze_paul> codo +1
14017:57 <LarryRuane> oh there's the answer to my question right in the title of 21559 -- disk filling
14117:57 <brunoerg> cool
14217:57 <LarryRuane> Q9 Does the same principle apply to RPC, REST, or ZMQ requests?
14317:57 <roze_paul> q9 i would imagine yes. same attack vector
14417:57 <emzy> For shure the contents of the unknown P2P message must not be logged.
14517:58 <LarryRuane> emzy: only the contents? what about the fact that it happened?
14617:58 <brunoerg> what is "unknown P2P message" for us?
14717:58 <brunoerg> unknown message or any message from an unknown peer?
14817:59 <LarryRuane> unknown message ... in a way, all peers are unknown :)
14917:59 <LarryRuane> I'm thinking that for Q9, no, because those sources are trusted (can do lots of other harmful things anyway)
15017:59 <emzy> LarryRuane: seems to be relevant that there are unknown messages. Could be also a local problem. So I would like to have that option.
15118:00 <LarryRuane> yes I think logging unknown messages from RPC etc. should be logged .. is that what you're saying?
15218:00 <LarryRuane> we're at time, didn't get to Q10 The Logger::WillLogCategory is annotated with const; what does this mean? How is it possible to modify the object by locking m_cs?
15318:01 <emzy> But LogPrintf() is unconditionallly. So It sould be in LogPrint() net
15418:01 <LarryRuane> but if anyone would like to discuss, I'll stick around!
15518:01 <LarryRuane> #endmeeting
15618:01 <glozow> thanks LarryRuane! great meeting
15718:01 <effexzi> Thanks every1
15818:01 <codo> thank you LarryRuane a lot
15918:01 <emzy> Thank you LarryRuane and all others!
16018:01 <d33r_gee> thanks LarryRuane and every1
16118:02 <roze_paul> q10 i couldn't find where m_cs was first declared, so i don't understand what m_cs even `is`
16218:02 <roze_paul> thanks larry!!!
16318:02 <brunoerg> Thanks LarryRuane
16418:02 <svav> Thanks LarryRuane and all!
16518:02 <brunoerg> learned a lot
16618:02 <LarryRuane> thanks to all who participated! was really fun for me to host!
16718:02 <kouloumos> Thank you LarryRuane!
16818:06 <codo> I'd like to discuss Q10 also if more are interested.
16918:06 <b_101> thnks LarryRuane: for hosting and everyone!
17018:06 <b_101> yes I would like to understand Q10
17118:07 <codo> I think the first part might be: the annotation with const is good practice
17218:07 <codo> The second part of the question I do not understand
17318:10 <b_101> const means no data will be changed by the fuction, right?
17418:11 <b_101> they can't change the data members to be more precise
17518:12 <codo> yes that is how I understand it also
17618:12 <codo> but the function does not change anything, so that is why I think it is for good practice
17718:13 <codo> that it shouts out "I'm only reading, not writing"
17818:15 <b_101> ok, not sure about the second part either
17918:16 <LarryRuane> b_101: both of you are on exactly the right track ... `const` means that calling this method won't change the state of the object (it's a "read-only" call)
18018:16 <LarryRuane> but if a member variable is labeled `mutable`, then the method _can_ change the variable
18118:16 <LarryRuane> (just that variable)
18218:17 <LarryRuane> mostly done for locking
18318:17 <codo> ah
18418:17 <LarryRuane> https://en.cppreference.com/w/cpp/language/cv refers to the M&M rule: mutable and mutex go together
18518:18 <LarryRuane> there's one other common use of `mutable` besides locking ... do you know what it is?
18618:19 <codo> I don't
18718:19 <b_101> so in this case m_cs is the mutable data member?
18818:19 <LarryRuane> right
18918:21 <b_101> so this `StdLockGuard scoped_lock(m_cs);` mutates m_cs?
19018:22 <LarryRuane> yes
19118:22 <b_101> thx LarryRuane: very interesting
19218:23 <LarryRuane> that acquires (locks) `m_cs` ... then it also get mutated (unlocked) when `scoped_lock` goes out of scope (by its destructor)
19318:23 <b_101> right, thanks for clarifying that!
19418:23 <LarryRuane> `mutable` is also often used for member variables that are merely a cache of some kind, for performance ... doesn't change functionality
19518:24 <LarryRuane> here's an example (you can see from the name of the variable member): https://github.com/bitcoin/bitcoin/blob/aef8b4f43b0c4300aa6cf2c5cf5c19f55e73499c/src/coins.h#L220
19618:25 <LarryRuane> the comment just above that line is very helpful
19718:25 <b_101> yes
19818:26 <LarryRuane> but `mutable` does allow you to cheat, in that a const method can make changes that are actual functional changes! i.e. change the _logical_ state of the object
19918:26 <b_101> I have been studying mutex and locks, a little confused why Bitcoin Core has so many LOC macros instead of using standard c++ lock functions
20018:27 <LarryRuane> i.e. nothing in the language verifies that your use of `mutable` and `const` functions are non-functional ... there's probably no way to automatically check that
20118:27 <b_101> LarryRuane: ok, thx
20218:29 <LarryRuane> b_101: yes, I think `LOCK()` is the most common way to do locking (I don't know why the logging subsystem doesn't use `LOCK`), but what that macro does is actually declare a (local) variable with some constructed artificial name,
20318:30 <LarryRuane> and that variable's constructor actually does the mutex lock, and its destructor does the unlock.. which is very clever! it's hard to make the mistake of forgetting to unlock a mutex (like it is if you're doing unlocking explicitly)
20418:31 <LarryRuane> you may notice lots of places where there's an opening brace (`{`) then a call to `LOCK`, then a small amount of code, then the close brace ... that's to limit the duration of the mutex locking to just the block of code (not locked after the close brace)
20518:32 <b_101> yes, I still have to read more and make some basic c++ toy projects to fullly understand it
20618:32 <LarryRuane> `StdLockGuard scoped_lock(m_cs)` has that same property as `LOCK` where the unlock is automatic, but with `StdLockGuard` you're making the variable visible (you choose the name), instead of it being hidden within the `LOCK` macro
20718:33 <LarryRuane> and as i said, i don't really know why it's done that way in the logging subsystem ... maybe historical? maybe `LOCK` didn't exist when the logging code was written? not sure