De-globalize ChainstateManager (consensus)

Host: dongcarl  -  PR author: dongcarl

The PR branch HEAD was 09daa42b at the time of this review club meeting.


  • This is a draft PR which removes the g_chainman global variable and ::Chain{,state}Active() functions in favor of having callers either use or obtain local references to the relevant consensus-related object (ChainstateManager, CChainState, CChain, BlockManager, etc.).

  • Prior to this PR, there were numerous references to the g_chainman global variable and ::Chain{,state}Active() functions, some of them even in contexts where a local ChainstateManager reference existed.

  • This PR is a first step towards encapsulating our consensus engine, which allows for:

    • Clearer visibility into what currently lies in consensus codepaths and what depends on our consensus engine

    • Coalescing duplicate consensus initialization codepaths, mitigating against bugs that arise out of test/non-test initialization inconsistencies

  • Some historical context can be gleaned from:

    • The introduction of CChainState in PR 10279

    • The introduction of BlockManager in PR 16194

    • The introduction of ChainstateManager in PR 17737

    • The addition of the chainman pointer to NodeContext in PR 18698


  1. Did you review the PRs? Concept ACK, approach ACK, ACK, or NACK?.

  2. How did the PRs linked in the historical context notes change the organization of validation.cpp? What kind of state/logic now belongs in each consensus-related object (ChainstateManager, CChainState, BlockManager)?

  3. Where and how are the aforementioned consensus-related objects initialized?

  4. What are the advantages and disadvantages of using global variables and functions?

  5. Why are the first few fix commits in the branch necessary? Why does the previous code work prior to de-globalizing g_chainman and why doesn’t it work now without the fix?

Meeting Log

  117:00 <dongcarl> #startmeeting
  217:00 <jamesob> hi
  317:00 <b10c> hi
  417:00 <dongcarl> hello!
  517:00 <michaelfolkson> hi
  617:00 <emzy> hi
  717:00 <@jnewbery> Hi folks! Welcome to Bitcoin Core PR Review Club. Feel free to say hi to let everyone know you're here
  817:00 <rav3n> hi
  917:00 <elle> hi!
 1017:00 <dappdever> hello
 1117:00 <@jnewbery> (or don't. Lurkers are also welcome!)
 1217:00 <@jnewbery> Anyone here for the first time?
 1317:01 <dappdever> I am!
 1417:01 <troygiorshev> hi!
 1517:01 <dongcarl> Welcome!
 1617:01 <@jnewbery> dappdever: welcome!
 1717:01 <Klopp> Hi allI am new (as a lurker)
 1817:01 <dongcarl> Alright, let's get started. I think this is the first Draft PR?
 1917:01 <dongcarl> Did everyone get a chance to review the PR? How about a quick y/n from everyone
 2017:02 <dappdever> y
 2117:02 <troygiorshev> y partial :D
 2217:02 <elle> y
 2317:02 <Klopp> n
 2417:02 <rav3n> n
 2517:02 <emzy> y
 2617:02 <b10c> n
 2717:02 <@jnewbery> 5/82 y
 2817:02 <michaelfolkson> At a high level
 2917:02 <dongcarl> Hehe it is quite a chonky one
 3017:02 <dongcarl> Let's delve into the questions!
 3117:02 <jonatack> hi
 3217:02 <MarcoFalke> 3/82 and concept
 3317:03 <dongcarl> At a high level: Concept ACK, approach ACK, ACK, or NACK?
 3417:03 <washroom> hi
 3517:03 <jonatack> will look at it after release branch-off
 3617:03 <jamesob> n
 3717:03 <jamesob> Concept ACK
 3817:04 <emzy> Concept ACK
 3917:04 <elle> Concept ACK
 4017:04 <@jnewbery> Big big concept ACK
 4117:04 <MarcoFalke> Concept ACK
 4217:04 <dappdever> concept ack
 4317:04 <troygiorshev> Concept ACK
 4417:04 <dongcarl> Alright, let's get into the details
 4517:05 <nehan> hi
 4617:05 <michaelfolkson> Concept ACK (assuming I've understood it). Is "consensus engine" another term for "libconsensus"?
 4717:05 <dongcarl> Oh good question!
 4817:05 <dongcarl> The way I've thought about it is that "consensus engine" is an encapsulation of how our consensus code works _currently_
 4917:06 <dongcarl> Next Q: How did the PRs linked in the historical context notes change the organization of validation.cpp? What kind of state/logic now belongs in each consensus-related object (ChainstateManager, CChainState, BlockManager)?
 5017:08 <dongcarl> For those needing a link:
 5117:08 <michaelfolkson> BlockManager is chainstate agnostic block metadata
 5217:08 <elle> CChainState now owns mempool, blockfeeestimator and other things that represent our local view of the best chain.
 5317:08 <dappdever> It seems to have compartmentalized logic into objects, de-coupling a bit
 5417:08 <troygiorshev> CChainState is what will eventually become the libconsensus. It hold everything related to the curent tip of the chain.
 5517:09 <jamesob> troygiorshev: CChainstate and not ChainstateManager?
 5617:09 <MarcoFalke> jamesob: assumeutxo is not yet merged ;)
 5717:09 <jamesob> heh
 5817:10 <dongcarl> dappdever: You're right on that one! You can take a look at validation pre-10279 to see its state
 5917:10 <dongcarl> I call validation pre-10279 the "primordial soup of validation globals"
 6017:10 <troygiorshev> jamesob: yeah I think! Everything to do with, say, connecting blocks is part of CChainState? Might be CChainstate through ChainstateManager
 6117:10 <michaelfolkson> They both have chainstate in them though right? I want to understand the MarcoFalke joke
 6217:11 <@jnewbery> ChainstateManager can hold multiple chains. It was introduced as part of the assumeutxo project
 6317:11 <jamesob> at the moment, there's only one chainstate running around. But eventually there may be multiple. In any case, at the moment ChainstateManager is responsible for constructing chainstates, which involves e.g. injecting a BlockManager instance
 6417:12 <@jnewbery> currently there's only one chain - the chain that we consider to be the best. With assumeutxo, there would be multiple chains
 6517:12 <washroom> would ChainstateManager allow the daemon to run mainnet and signet simultaneously?
 6617:12 <@jnewbery> washroom: no
 6717:12 <dongcarl> elle michaelfolkson troygiorshev: True as well. A very rough approximation I have in my head: BlockManager = the blockchain/metadata, CChainState = everything related to current tip (UTXO, etc)
 6817:12 <michaelfolkson> Multiple chains ie the one we are assuming is ok from tip and the chain we are validating in the background?
 6917:12 <@jnewbery> All the chains would be running the same consensus parameters
 7017:12 <michaelfolkson> Surely you don't need more than two....
 7117:12 <jamesob> michaelfolkson: right
 7217:13 <nehan> multiple chains mean multiple mempools too?
 7317:13 <dongcarl> jamesob is quite the expert here, as he write two of the 4 PRs linked!
 7417:13 <jamesob> michaelfolkson: in concept, you could have n chainstates and validate n separate regions of the chain historically, simultaneously (maybe with something like utreexo) but that's way off
 7517:13 <washroom> jnewbery michaelfolkson ah I see now, a method to assess all available chain tips
 7617:14 <@jnewbery> hi nehan! No, only one mempool. We only have a mempool for transactions that we think can be added to the best current chain
 7717:14 <troygiorshev> dongcarl: and an eventual libconsensus api would most closely follow which class? CChainSate, no?
 7817:15 <washroom> mempool would just pertain to chain-tip extension rather than assessing or constructing branching consensus-equivalent chains -- right?
 7917:15 <jamesob> troygiorshev: if I had to guess, probably none - it would be some functional layer that accesses chainstate objects through the chainmanager... but defer to dongcarl there
 8017:15 <dongcarl> troygiorshev: I don't like to speculate about eventual APIs, because we're sooooo far from that right now. BUT I think a good intermediary goal is to encapsulate ChainstateManager and its dependencies first!
 8117:16 <@jnewbery> dongcarl: I agree. If there's obvious incremental improvements, we should do them, regardless of what the end state could be
 8217:16 <troygiorshev> jamesob dongcarl: ok!
 8317:17 <@jnewbery> even if we never get to libconsensus, simply cleaning up/encapsulating/modularizing are very beneficial
 8417:17 <washroom> +1
 8517:17 <dongcarl> jnewbery: +1
 8617:17 <dongcarl> Alright, feel free to keep discussing but here's the next topic: Where and how are the aforementioned consensus-related objects initialized?
 8717:17 <michaelfolkson> washroom: The start of the statement is right. But the assumeutxo project is validating from genesis in the background while assuming the tip is ok
 8817:18 <michaelfolkson> (Or at least the vanilla version is. And it isn't merged yet)
 8917:18 <dongcarl> bonus if you can identify some differences between constructing (as in C++ constructors) and initializing!
 9017:21 <dongcarl> Carl's tip for codebase navigation: try out sourcetrail! It's like grep, but you can click around and it understands C++/Python/etc
 9117:21 <troygiorshev> +1
 9217:22 <jamesob> hint: init.cpp
 9317:22 <@jnewbery> Construction means something very specific in C++. It's the operations that get called when a new object gets instantiated. Those operations include the initialization list, the constructor functions of any base classes and the constructor function of the class itself
 9417:22 <troygiorshev> ChainstateManager is a global g_chainman in validation.cpp, which is then pointed to by the NodeContext node in AppInitMain in init.cpp
 9517:23 <@jnewbery> I think initialization is more vague and means 'make sure the object is ready to use'
 9617:24 <troygiorshev> g_chainman is constructed, er, at some point before main() is run?
 9717:24 <@jnewbery> troygiorshev: yes, exactly
 9817:24 <dongcarl> troygiorshev: Yes!
 9917:24 * troygiorshev wipes sweat
10017:24 <@jnewbery> we don't have control over when global objects are constructed, and it could even be before main()
10117:25 <michaelfolkson> I thought initialization did not infer the initialized object was in a usable state. Just that it was initialized
10217:25 <sipa> jnewbery: it has to be before main
10317:25 <dongcarl> michaelfolkson: This is correct -> "construction did not infer the constructed object was in a usable state. Just that it was constructed"
10417:26 <troygiorshev> sipa: as long as it's a global in the global namespace, right?
10517:26 <sipa> troygiorshev: right
10617:26 <@jnewbery> often, initialization happens during construction, but if initialization requires some context, then obviously we can't initialize the object during construction if it's a global
10717:27 <sipa> jnewbery: what do you mean by initialization in this context?
10817:28 <dongcarl> Exactly what jnewbery said! If a variable is a global, and it needs some context to get initialized to steady state, then we need a two-step initialization. See: InitializeChainstate
10917:29 <sipa> oh, initialization as in an application-level concept, ok
11017:29 <dongcarl> Ah I should clarify, what I mean by initialization is "making the object ready to use/at stead-state"
11117:29 <dongcarl> Yup!
11217:29 <@jnewbery> sipa: I mean things like what dongcarl said. Make the object ready to use.
11317:30 <dongcarl> So ChainstateManager is constructed as a global and initialized in init.cpp... Anywhere else it's initialized?
11417:30 <jamesob> (fwiw, here's the C++ definition of initialization:
11517:30 <@jnewbery> sipa: does this: only apply to objects with internal linkage? ie if they're in the global namespace then the have to be constructed before main?
11617:31 <sipa> jnewbery: it may also be outdated; C++11 added guarantees around initialization order
11717:32 <@jnewbery> sipa: ah. thanks!
11817:32 <@jnewbery> jamesob: I think that's just for simple types. It doesn't say anything about classes
11917:32 <sipa> ah, they may be right - that globals are only guaranteed to be initialized before calling any function in the same compilation unit
12017:32 <dappdever> @dongcarl it seems that init.cpp is the only caller of InitiatilizeChainState
12117:33 <dongcarl> Hint for my small q: take a look at how our test framework starts up
12217:34 <dappdever> copied from the global chainman?
12317:34 <troygiorshev> c++ stackoverflow being outdated about a compilation question really embodies spooky season
12417:34 <sipa> jnewbery: c++11 added a guarantee that local static variables are only initialized once, even when the function is called simultaneously from multiple threads for the first time
12517:35 <nehan> testing setup calls it
12617:35 <dongcarl> dappdever: Not copied, per se, but in both init and TestingSetup, we point node.chainman to the g_chainman!
12717:35 <dongcarl> nehan: Bingo!
12817:36 <dappdever> ah, yes, got it!
12917:36 <dongcarl> The reason why this is important from a higher level is to realize that we've got two different initialization codepaths
13017:36 <dongcarl> And because of that, it is possible to have differences between how our test framework initializes, and how our software actually works
13117:37 <nehan> can you talk more about that? why does the test setup do this outside of initializing a node?
13217:37 <nehan> oh -- these are the C++ tests. nevermind.
13317:37 <dongcarl> nehan: Hehe yeah, it _is_ initializing a node!
13417:38 <dongcarl> Okay, let's get to the next one!
13517:38 <nehan> followup question: why does the code path for the test setup "count" as a different code path that one should care about?
13617:38 <MarcoFalke> Ideally the unit tests would re-use most of the init.cpp logic
13717:38 <dongcarl> nehan: Because I don't think the test framework ever calls AppInitMain
13817:38 <nehan> MarcoFalke: would that require factoring it out of main()?
13917:39 <nehan> dongcarl: right, that. why not?
14017:39 <@jnewbery> nehan: take a look at setup_common to see how the unit tests set up their environment
14117:39 <michaelfolkson> MarcoFalke: Unless you're testing the init.cpp logic. But that would be functional testing...?
14217:40 <troygiorshev> Is this a consious choice (along the lines of keeping Arrange Act Assert separate) or is it force on us?
14317:40 <MarcoFalke> Well, if the unit tests need to spin up a node, they should do it as similar as AppInitMain as possible
14417:40 <dongcarl> I'm not sure there's a reason why it's done the way it is right now. However, what I think would be a nice goal to have: an initialization codepath that's used by all binaries with the right toggles.
14517:41 <@jnewbery> Take a look at the logic inside init.cpp. A lot of the code there is actually initializing the individual sub-components. We need to do the same setup inside our unit tests, otherwise we're not actually testing the code paths that run in production
14617:41 <dappdever> do the new chainmanager tests use the same setup code? It seems like they no longer use the global instance (?)
14717:41 <troygiorshev> dongcarl jnewbery: ok thanks
14817:41 <dongcarl> As always jnewbery, puts it nicely :-)
14917:42 <dongcarl> Let's move on to the next one so we have time!
15017:42 <@jnewbery> I think it'd be nice if as much of the initialization of the subcomponents was done within those subcomponents, perhaps inside the constructor function. That way the unit tests could just instantiate that subcomponent and know that it's testing what will actually run on the live system
15117:42 <dongcarl> What are the advantages and disadvantages of using global variables and functions?
15217:42 <MarcoFalke> jnewbery: Indeed. And removing globals helps there
15317:43 <dongcarl> dappdever: Take a look here! :-)
15417:43 <@jnewbery> And then of course we'd be able to unit test units of our code, without worrying about global state interfering
15517:43 <MarcoFalke> advantage of globals would be less verbosity in code and if there can only be one instance of an object, passing it around explicitly doesn't help code clarity
15617:43 <elle> dongcarl: advantages: easy access from all over. disadvantages: easy access from all over. Plus: hard to mock for tests. Also: not always known if the object is yet initialised (in its steady state)
15717:43 <troygiorshev> advantage: can access from anywhere. disadvantage: can access from anywhere...
15817:44 <@jnewbery> troygiorshev: :)
15917:44 <dongcarl> elle: Hehe exactly!
16017:44 <troygiorshev> elle: well said!
16117:44 <dappdever> if globals can be modified from anywhere, it seems like a security risk
16217:44 <michaelfolkson> Lots of disadvantages mostly covered. Thread safety too
16317:44 <nehan> it's easy to get lazy and not have good encapsulation and abstraction
16417:45 <michaelfolkson> Getting rid of globals is also a pain
16517:45 <MarcoFalke> michaelfolkson: I'd say thread safety is another orthogonal issue
16617:45 <dongcarl> nehan: ...and end up needing a 80 commit PR to clean it up haha
16717:46 <sipa> dongcarl: squash it all
16817:46 <MarcoFalke> Our NodeContext is not thread safe. It just happens to be modified in at most one thread at a time
16917:46 <sipa> (not a serious suggestion)
17017:46 <dongcarl> I like what I'm hearing from everyone
17117:47 <dappdever> do globals perform better because there is less initiatlization?
17217:47 <nehan> is there a reason setup_common.cpp doesn't call AppInitMain()? what makes it hard to do so, if anything?
17317:47 <michaelfolkson> I guess I'm saying use of globals can impact thread safety but avoiding them doesn't guarantee it MarcoFalke
17417:47 <dongcarl> nehan: Hmmm... Don't think there is. MarcoFalke?
17517:48 <MarcoFalke> nehan: Good question. I think AppInitMain does too much (like calling weird compat code I don't understand)
17617:48 <dongcarl> dappdever: Since initialization is mostly a one-and-done affair, the performance impact is not something we'd be too concerned about. I also doubt there's any performance gains to be made from being global.
17717:48 <nehan> seems like that would be best, if possible, to test the initialization flow (as you pointed out)
17817:49 <MarcoFalke> AppInitMain also parses from the argsmanager, but unit tests historically didn't use "command line" args for initialization
17917:49 <dongcarl> nehan: Agreed, this is definitely in my list of followup projects :-)
18017:50 <dongcarl> Alright the last one is a little tricky, so let's leave some time for it!
18117:50 <nehan> dongcarl: ok! I haven't looked closely yet at the differences between TestingSetup() and AppInitMain() but will do so
18217:50 <dongcarl> Why are the first few fix commits in the branch necessary? Why does the previous code work prior to de-globalizing g_chainman and why doesn’t it work now without the fix?
18317:50 <@jnewbery> nehan: At that point, you're arguably not unit testing - it'd be closer to integration testing
18417:50 <dongcarl> I'm talking about
18517:50 <@jnewbery> with unit testing you want to be able to isolate each individual component
18617:50 <dongcarl>
18717:50 <dongcarl> and
18817:51 <nehan> jnewbery: one could think of it as unit testing the initialization function. i'm not sure it's in any way bad to test the same code path used in actual use, if possible.
18917:51 <MarcoFalke> good point jnewbery. AppInitMain should also be split up to accomodate the testing setups that don't need all the node components.
19017:51 <nehan> MarcoFalke: yeah, maybe it can be split up so the unit tests can use pieces
19117:52 <dongcarl> COALESCE THE INIT CODEPATHS 2020
19217:52 <MarcoFalke> Most tests don't even need a chain. (low level net tests for example)
19317:52 <MarcoFalke> dongcarl: good luck getting that done in 2020 ;)
19417:53 <@jnewbery> nehan: yes, a unit test that tests initialization is fine. But you shouldn't need to initialize the whole system if you just want to test one or a few subcomponent
19517:53 <nehan> dongcarl: i will make hats
19617:53 <dongcarl> Hehe
19717:53 <nehan> jnewbery: agreed!
19817:53 <sipa> dongcarl: last-minute presidential campaign?
19917:53 <dongcarl> For those looking at the first few commits: please ignore "test: Add new ChainTestingSetup and use it"
20017:53 <jamesob> sipa: I'm writing him in regardless
20117:54 <dongcarl> legoooo
20217:55 <dappdever> @dongcarl are the tests using the same instance of m_node across tests, whereas before each tests was creating the chain from a new node instance?
20317:57 <dongcarl> dappdever: Close! It does have something to do with m_node. It's mostly because after this PR, the code will now reach for the chainman inside the local context instead of g_chainman. So it's important that the local node context contains the chainman we want to work with instead of not having one!
20417:57 <dongcarl> Alright we're nearing the end now!
20517:58 <dongcarl> I want to thank everyone who came out and participated, jnewbery for organizing!
20617:58 <@jnewbery> Anyone have any last minute questions?
20717:58 <dongcarl> Any last questions?
20817:58 <dongcarl> Heh
20917:58 <troygiorshev> when undraft and merge
21017:58 <washroom> wait there are voters who didn't write in dongcarl ?
21117:58 <dongcarl> Surprised me too!
21217:59 <washroom> : P
21317:59 <dongcarl> troygiorshev: As soon as I'm 100% that it's bug-free!
21417:59 <@jnewbery> dongcarl: how do you plan to get this merged? are you going to slice off parts of it into smaller PRs?
21517:59 <MarcoFalke> dongcarl: How are you going to split up the pull?
21618:00 <dongcarl> Aha! I think the FIX commits will definitely be split off first. Those are very easy.
21718:00 <troygiorshev> not all of us are ryanofsky and can review it all in one go :)
21818:00 <michaelfolkson> 82 commits. Split into groups of 5 haha
21918:00 <michaelfolkson> (Joke)
22018:00 <dongcarl> I can split out the rest of the commits into 7 bundles for review and discussions.
22118:01 <dongcarl> However, I would hope that we can get it merged in quick succession
22218:01 <dongcarl> As a lot of the benefits come from the latter bundles of commits
22318:01 <dongcarl> Oh time's up!
22418:01 <@jnewbery> dongcarl: great. Feel free to announce them in this channel so people know where to look
22518:01 <dongcarl> jnewbery: Will do!
22618:01 <dongcarl> #endmeeting
22718:01 <washroom> ty dongcarl
22818:01 <troygiorshev> thanks dongcarl!
22918:01 <emzy> Thank you dongcarl, jnewbery and all.
23018:01 <nehan> thanks!
23118:01 <@jnewbery> Thanks Carl. That was great!
23218:01 <dappdever> thank you!~
23318:01 <dongcarl> Thank y'all!
23418:02 <michaelfolkson> Thanks dongcarl!
23518:04 <jonatack> Thank you, Carl! 🏆