Remove shutdown from kernel library (validation)

https://github.com/bitcoin/bitcoin/pull/27711

Host: stickies-v  -  PR author: TheCharlatan

The PR branch HEAD was a6a3c3245303d05917c04460e71790e33241f3b5 at the time of this review club meeting.

Notes

  • The libbitcoinkernel project is an effort to decouple Bitcoin Core’s consensus engine from other non-consensus modules (such as the various indices) in the codebase. We have previously covered libbitcoinkernel-related PRs #25527, #24410 and #20158.

  • #27636 introduced a kernel::Notifications interface, which can then be implemented by node implementations (e.g. KernelNotifications) to trigger the desired behaviour for an event.

  • One such type of event is the consensus engine requiring a shutdown, expectedly or unexpectedly.

  • This PR #27711 adds two new notification methods kernel::Notifications::startShutdown and kernel::Notifications::fatalError to allow the node to implement the necessary behaviour.

  • Moreover, this PR moves the shutdown files as well as the remaining usages of uiInterface out of the kernel code, as started in #27636.

Questions

  1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK? What was your review approach?

  2. Why do we have startShutdown both in kernel/notifications_interface.h as well as in node/kernel_notifications.h?

  3. How does fRequestShutdown relate to this PR, and can you elaborate on its role in terminating long-running kernel functions?

  4. How does the notification interface contribute to the decoupling of most non-consensus code from libbitcoinkernel?

  5. Can you describe the flow of startShutdown and fatalError notifications in this new setup? Who are the producers and consumers of these notifications?

  6. Are there any potential race conditions or synchronization issues that might arise with the use of the notification interface in this context?

  7. Why is KernelNotifications::m_shutdown_requested a reference value? Do you have any ideas for alternative approaches to triggering a shutdown?

Meeting Log

  117:00 <stickies-v> #startmeeting
  217:00 <LarryRuane> hi
  317:01 <stickies-v> welcome everyone! Today we're looking at #27711, authored by TheCharlatan. The notes and questions are available on https://bitcoincore.reviews/27711
  417:01 <abubakarsadiq> hello
  517:01 <effexzi> Hi every1
  617:02 <stickies-v> note: we'll be focusing on the previous PR HEAD (https://github.com/bitcoin-core-review-club/bitcoin/commit/a6a3c3245303d05917c04460e71790e33241f3b5), which is _not_ the current HEAD anymore (but we don't want to overhaul the notes and questions last minute)
  717:02 <stickies-v> anyone joining us for the first time today? even if you're just lurking, feel free to say hi!
  817:03 <LarryRuane> silly question, but is there a way to see the commits before that one?
  917:03 <LarryRuane> (on GitHub)
 1017:03 <TheCharlatan> hi :)
 1117:04 <stickies-v> LarryRuane: GitHub allows you to browse a repo at a certain commit (like checkout), and if you do that and then click the commit history button, you can see the history indeed
 1217:05 <stickies-v> https://github.com/bitcoin-core-review-club/bitcoin/commits/a6a3c3245303d05917c04460e71790e33241f3b5 would be the link
 1317:05 <stickies-v> (an extra `s` after `commit` in the url)
 1417:05 <LarryRuane> that's it, perfect, thanks
 1517:05 <stickies-v> who got the chance to review the PR or read the notes? (y/n)
 1617:05 <stickies-v> hi TheCharlatan, thanks for joining us today!
 1717:06 <LarryRuane> do others do a lot of URL hacking on github? I do, but always wondered if it's only because I don't know of better ways
 1817:06 <mutex> when in doubt I revert to cli :-D
 1917:07 <LarryRuane> I have not much at all, sorry, I got stuck on a question that didn't make sense with the most recent head
 2017:07 <LarryRuane> mutex: is that the `gh` command? I haven't used it but maybe should learn how to
 2117:07 <mutex> this is my first PR review, not well versed in C++ so i'm at a bit of a disadvantage
 2217:07 <TheCharlatan> LarryRuane, url's are made to be hacked :D
 2317:08 <mutex> LarryRuane: git command, I haven't used the 'gh' tool
 2417:08 <abubakarsadiq> I read the notes and was reading about the kernel project :)
 2517:08 <LarryRuane> :) .. yeah i actually don't mind it! but just was wondering.. sorry for the sidetrack
 2617:08 <stickies-v> hey mutex, glad to have you here! there's a bit of a learning curve but hang in there :-)
 2717:09 <stickies-v> the PR turned out to be a bit more complex than I imagined when I first selected it hah, as also evidenced by the recent discussions and today's force-push
 2817:09 <LarryRuane> mutex: yes i of course use `git` constantly, but to interact with github, there's a `gh` command you can install, which i did, but i haven't used it yet
 2917:09 <stickies-v> I see there's not been too much review, so perhaps we can start if anyone has any questions about the purpose of this PR? does the approach make sense?
 3017:10 <LarryRuane> stickies-v: yes, i started reading the comments on the PR was like, wow, this is way over my head!
 3117:10 <stickies-v> LarryRuane: I've not had any use case where `git` and shell alias/functions couldn't help me out tbh
 3217:12 <mutex> concept ACK
 3317:12 <LarryRuane> is this a step toward getting rid of the fRequestShutdown flag?
 3417:12 <LarryRuane> (but I'm unclear on how exactly that flag interacts with the consensus kernel)
 3517:13 <stickies-v> LarryRuane: oh, good question. I don't think that's quite in the scope of the libbitcoinkernel project, since `fRequestShutdown` is owned by the node
 3617:14 <stickies-v> I think the version of the PR that we're reviewing doesn't really make any progress towards that goal either, since we're still very much relying on a global boolean value
 3717:15 <stickies-v> the new, current version, however, seems to be moving in that direction with the `interrupt()` function
 3817:15 <mutex> is there more than one PR we should be looking at?
 3917:16 <stickies-v> mutex: the most relevant prior PR to this one is https://github.com/bitcoin/bitcoin/pull/27636, that introduced the notification interface
 4017:17 <stickies-v> but I'd say the PR is pretty self contained
 4117:17 <stickies-v> gonna start with the first question:
 4217:17 <stickies-v> Why do we have `startShutdown` both in `kernel/notifications_interface.h` as well as in `node/kernel_notifications.h`?
 4317:19 <LarryRuane> (that's the question i got stumped on, since it's not in both of those places)
 4417:19 <mutex> yeah I'm not seeing it
 4517:19 <LarryRuane> (in the latest PR head)
 4617:21 <stickies-v> yeah, it's in commit a6a3c3245303d05917c04460e71790e33241f3b5
 4717:23 <stickies-v> but, I think it probably doesn't make sense to keep discussing questions that no one's seen the code for
 4817:23 <stickies-v> so I suggest we stop covering the prepared questions and instead just move to general Q&A on the PR, if there are any?
 4917:25 <mutex> is this an issue of the questions being prepared for code that has already been superceded?
 5017:26 <LarryRuane> Here's something I wondered while reviewing this PR (but the latest on github), but just a c++ question, https://github.com/bitcoin/bitcoin/pull/27711/files#diff-d6d633592a40f5f3d8b03863e41547de8751b874c1d20f129a616b9dd719b999R26 there are empty braces at the end of that declaration, do those need to be there? what do they do?
 5117:27 <stickies-v> mutex: yeah, there was quite a big code overhaul this morning. we do include the HEAD that we look at on the bitcoincore.reviews website, but I now see that it only is mentioned for reviews in the past, not for current/upcoming ones
 5217:27 <mutex> I have a quesiton about this comment: https://github.com/bitcoin/bitcoin/pull/27711/files#diff-04c8e84ee77f6085c064940c211aedb95c781be20927aec64553d8448253af97R44 No signals on windows? I thought windows was POSIX compliant?
 5317:27 <mutex> (modern windows)
 5417:28 <LarryRuane> mutex: +1 good question, what happens when you hit control-c on windows? Does that even work there? How does it work without signals?
 5517:29 <TheCharlatan> afaik windows has its own event thing that you can use to emulate posix signals.
 5617:30 <TheCharlatan> (I'm not a windows dev though)
 5717:30 <TheCharlatan> https://stackoverflow.com/questions/51476296/what-signals-should-i-use-to-terminate-kill-applications-on-windows
 5817:30 <mutex> my knowledge here is from 20 years ago, where WindowsNT had a POSIX subsystem, i'm sure things have changed since then ;-)
 5917:30 <stickies-v> LarryRuane: I think `KernelNotifications notifications` and `KernelNotifications notifications{}` are equivalent, since `KernelNotifications` is a class with a default constructor
 6017:31 <evansmj> `As a standalone library the libbitcoinkernel should not have to rely on code that accesses shutdown state through a global.`  was `static std::atomic<bool> fRequestShutdown(false);` the original global?  the state being just true/false?
 6117:31 <LarryRuane> thanks
 6217:31 <stickies-v> but with the braces we ensure that it's always initialized, so probably better practice to do it like this?
 6317:31 <LarryRuane> i see, so even if the class does have a default constructor, probably no harm in specifying those braces
 6417:32 <stickies-v> evansmj: that's my understanding, yes!
 6517:38 <LarryRuane> in case this is also helpful to others... here's a list of the files that are considered part of libbitcoinkernel: https://github.com/bitcoin/bitcoin/blob/2026301405f83c925ca68db6a3cd5134ed619ca7/src/Makefile.am#L911
 6617:39 <LarryRuane> (well, not "considered" ... actually are!)
 6717:39 <LarryRuane> it's more than I thought
 6817:39 <stickies-v> with the "#TODO" line above it quite important:
 6917:39 <stickies-v> "...as more and more modules are decoupled from the consensus engine, this list will shrink to only those which are absolutely necessary."
 7017:40 <LarryRuane> yes, great point, i had missed that!
 7117:41 <mutex> So the PR changes the signaling method, does the behavior change at all? I see a few exceptions being thrown, does that change the behavior other than to stop hashing?
 7217:42 <mutex> I don't think i know enough about the surrounding code to understand
 7317:42 <LarryRuane> so the first two commits (in the review club version, not the latest PR tip) are removing calls to the global `ShutdownRequested()` function from those files (that make up libbitcoinkernel), IIUC
 7417:43 <stickies-v> mutex: I've not looked at the current version too much yet, but I think behaviour is not meant to change, no
 7517:44 <evansmj> so now, classes interested in the shutdown state need to look at the kernel Context notifications for the signal interrupt flag.  what kinds of things/classes are interested in this?  i see bitcoind checks it immediately in AppInit(), is that somehow the main looping check?
 7617:45 <TheCharlatan> behaviour should not change except for the two subtle changes/bugfixes in the latest PR tip.
 7717:45 <stickies-v> LarryRuane: I think just the first commit? but otherwise, yeah
 7817:45 <LarryRuane> yes, you're right
 7917:51 <stickies-v> evansmj: I think long-running operations, mostly? In `master`, I'd look at callsites of `ShutdownRequested()`, e.g. here when we're loading the blockindex: https://github.com/bitcoin/bitcoin/blob/2026301405f83c925ca68db6a3cd5134ed619ca7/src/node/blockstorage.cpp#L264
 8017:54 <LarryRuane> I assume one of the goals of a PR like this is making consensus code more testable? So that last commit, "kernel: Remove shutdown from kernel library" changes all the calls (in libbitcoinkernel) to the global `StartShutdown()`, to instead call a notification callback, for example `chainman.GetNotifications().startShutdown(ShutdownReason::StopAfterBlockImport)`
 8117:55 <LarryRuane> so with that change, test code won't *really* get shutdown (the test process) when this condition occurs
 8217:56 <LarryRuane> test code wouldn't really get shut down, it might check that this callback did happen (like presumably it's supposed to), and fail the test if not .. something like that
 8317:58 <stickies-v> LarryRuane: yeah I think that's definitely true, and probably holds for pretty much all decoupling? the less intertwined everything is, the more straightforward your tests can be
 8417:58 <stickies-v> I didn't think of that though, nice
 8517:59 <stickies-v> alright, looks like we're out of time
 8618:00 <evansmj> thanks
 8718:00 <stickies-v> thanks for your work on this PR TheCharlatan, and thanks everyone else for attending!
 8818:00 <stickies-v> #endmeeting