The key things this commit adds are an IpcProcess::spawn() method, which a bitcoin parent process can call to spawn a new child process, and an IpcProcess::serve() method, which a bitcoin child process can call after startup to communicate back to its parent process.
In follow-up PR #10102 Multiprocess bitcoin, this functionality is used so a bitcoin-gui process can spawn a bitcoin-node process, and a bitcoin-node process can spawn a bitcoin-wallet process, and GUI, node, and wallet functionality can run in separate processes that are protected from each other. In further follow-ups #19460 Add bitcoin-wallet -ipcconnect option and #19461 Add bitcoin-gui -ipcconnect option, more flexibility is added so node, wallet, and GUI processes can be started and stopped independently.
All communication between Bitcoin Core processes happens through internal C++ interfaces, which are just C++ classes with pure virtual methods. The virtual methods allow bitcoin GUI, node, and wallet code to be written in a straightforward way that doesn’t require dealing with complications of IPC. IPC is handled by the multiprocess framework added in this PR, to avoid the need to complicate application code with low level I/O.
Specifically, follow-up PR #10102 Multiprocess bitcoin uses the IPC framework added here to generate subclasses for each C++ interface class, with every overridden virtual method of every subclass implemented to send method calls and arguments to a remote process, wait for a response, and then return the response as the method return value. For example, if the GUI wants to find out if a wallet address is spendable, it calls the interfaces::Wallet::isSpendable() method. If GUI code and wallet code are running in the same process, this directly invokes the interfaces::WalletImpl::isSpendable() method implementation. But if GUI and wallet code are running in different processes, the multiprocess framework instead provides a different (generated) interfaces::Wallet class implementation that forwards the method request and arguments to a remote wallet process, waits for the results, and returns them, instead of directly calling wallet code.
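To make the forwarding idea concrete, here is a minimal sketch, not the generated code from the PR. It assumes a heavily simplified `interfaces::Wallet` with a string argument (the real signature differs), and the `WalletProxy`, `Transport`, `HandleRequest`, and `GuiCheck` names and the text request format are hypothetical stand-ins for what the framework generates and what Cap'n Proto encodes.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

namespace interfaces {
// Simplified stand-in for the real interface: only pure virtual methods.
class Wallet
{
public:
    virtual ~Wallet() = default;
    virtual bool isSpendable(const std::string& address) = 0;
};
} // namespace interfaces

// Same-process implementation: runs wallet logic directly.
class WalletImpl : public interfaces::Wallet
{
public:
    bool isSpendable(const std::string& address) override
    {
        return address == "my-address"; // placeholder wallet logic
    }
};

// Hypothetical transport: send a request, block, return the response.
using Transport = std::function<std::string(const std::string&)>;

// Cross-process implementation: forwards the call instead of running wallet code.
class WalletProxy : public interfaces::Wallet
{
public:
    explicit WalletProxy(Transport transport) : m_transport(std::move(transport)) {}
    bool isSpendable(const std::string& address) override
    {
        return m_transport("isSpendable " + address) == "true";
    }

private:
    Transport m_transport;
};

// Stand-in for the remote wallet process: decode the request and call the real object.
std::string HandleRequest(interfaces::Wallet& wallet, const std::string& request)
{
    const std::string prefix = "isSpendable ";
    if (request.compare(0, prefix.size(), prefix) == 0) {
        return wallet.isSpendable(request.substr(prefix.size())) ? "true" : "false";
    }
    return "error";
}

// Caller code is the same whichever implementation it is handed.
bool GuiCheck(interfaces::Wallet& wallet)
{
    return wallet.isSpendable("my-address");
}
```

`GuiCheck` never learns whether it was handed a `WalletImpl` or a `WalletProxy`, which is the property that lets GUI code stay unchanged when the wallet moves to another process.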
The design goal is for cross-process communication to happen through normal method calls, and for node, wallet, and GUI code not to have to change drastically to support process separation.
The IpcProcess class has spawn(exe_name, pid) and serve(exit_status) methods and is responsible for spawning new child processes, creating pipes child and parent processes can use to communicate, and passing pipe file descriptors to IpcProtocol objects in parent and child processes.
The IpcProtocol class is what actually sends method calls across the pipe, turning every method call into a request and a response, and tracking object lifetimes. The IpcProtocol and IpcProcess classes could have been melded together into a single class, but separating them allows the spawn and pipe setup code to work with protocols other than Cap’n Proto, which is the internal protocol currently used by the IpcProtocol class.
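The split between process setup and protocol can be illustrated with a small POSIX sketch. This is not the PR's implementation: it assumes a Unix socket pair as the pipe, uses `fork()` without `exec` so the example stays self-contained, and replaces the Cap'n Proto protocol with a trivial "read request, write reply" exchange. `SpawnAndCall`, `ReadAll`, and `WriteAll` are hypothetical helper names.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Read from fd until the peer closes its writing side.
static std::string ReadAll(int fd)
{
    std::string data;
    char buf[256];
    ssize_t n;
    while ((n = read(fd, buf, sizeof(buf))) > 0) data.append(buf, static_cast<size_t>(n));
    return data;
}

// Write the whole buffer, retrying on short writes.
static void WriteAll(int fd, const std::string& data)
{
    size_t off = 0;
    while (off < data.size()) {
        ssize_t n = write(fd, data.data() + off, data.size() - off);
        if (n <= 0) throw std::runtime_error("write failed");
        off += static_cast<size_t>(n);
    }
}

// "Process" part: create the channel and the child. "Protocol" part: one request, one reply.
std::string SpawnAndCall(const std::string& request)
{
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) throw std::runtime_error("socketpair failed");

    pid_t pid = fork();
    if (pid < 0) throw std::runtime_error("fork failed");

    if (pid == 0) {
        // Child: serve a single request on its end of the socket pair, then exit.
        close(fds[0]);
        int status = 0;
        try {
            WriteAll(fds[1], "served:" + ReadAll(fds[1]));
        } catch (...) {
            status = 1;
        }
        close(fds[1]);
        _exit(status);
    }

    // Parent: send the request, signal end of request, wait for the reply.
    close(fds[1]);
    WriteAll(fds[0], request);
    shutdown(fds[0], SHUT_WR);
    std::string response = ReadAll(fds[0]);
    close(fds[0]);
    int status = 0;
    waitpid(pid, &status, 0);
    return response;
}
```

Only the body of the child branch and the parent's send and receive steps would change if a different wire protocol were used; the socket pair and process creation stay the same.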
The Init interface is similar to other cross-process C++ interfaces like interfaces::Node, interfaces::Wallet, interfaces::Chain and interfaces::ChainClient, providing virtual methods that can be called from other processes. What makes it special is that unlike other interfaces which are not implemented by every process—interfaces::Node is only implemented by the node process and interfaces::Wallet is only implemented by the wallet process—interfaces::Init is implemented by every process that supports being spawned, and it is the initial interface returned by the IpcProtocol::connect(fd) method, allowing the parent process to control the child process after the connection is established. The interfaces::Init interface has methods that allow the parent process to get access to every interface supported by the child process, and when the parent process frees the interfaces::Init object, the child process shuts down.
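A rough sketch of that shape follows, assuming simplified `Node` and `Wallet` interfaces and hypothetical `makeNode()` and `makeWallet()` accessors. The method names, the `WalletProcessInit` class, and the shutdown flag are illustrative and not taken from the PR.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Simplified stand-ins for the real cross-process interfaces.
struct Node {
    virtual ~Node() = default;
    virtual std::string name() = 0;
};
struct Wallet {
    virtual ~Wallet() = default;
    virtual std::string name() = 0;
};

// Every spawnable process implements Init; unsupported interfaces return null.
class Init
{
public:
    virtual ~Init() = default;
    virtual std::unique_ptr<Node> makeNode() { return nullptr; }
    virtual std::unique_ptr<Wallet> makeWallet() { return nullptr; }
};

struct WalletImpl : Wallet {
    std::string name() override { return "wallet"; }
};

// Init as a wallet process might implement it (hypothetical).
class WalletProcessInit : public Init
{
public:
    explicit WalletProcessInit(bool& shutdown_requested) : m_shutdown_requested(shutdown_requested) {}
    // Freeing the Init object is the signal for the process to shut down.
    ~WalletProcessInit() override { m_shutdown_requested = true; }
    std::unique_ptr<Wallet> makeWallet() override { return std::unique_ptr<Wallet>(new WalletImpl); }

private:
    bool& m_shutdown_requested;
};
```

In this sketch a parent holding a `std::unique_ptr<Init>` would call `makeWallet()` to reach the wallet interface, and resetting the pointer sets the flag the child would check before exiting.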
Questions
The entry points for spawned bitcoin-node and bitcoin-wallet processes both call IpcProcess::serve() then immediately exit when it returns. How do the child processes provide useful functionality to the parent processes if they never run the code after the IpcProcess::serve() calls?
When does the IpcProcessImpl::serve() method return true and when does it return false? Is it ever expected to return false, or is it always an error?
The IpcProcessImpl::spawn() implementation has a lambda that generates a vector of command line arguments for the process that should be spawned. What does the generated command line look like, and why does the generated command line for the child process depend on m_argv[0] of the parent process? Why does it include a pipe file descriptor (int fd)?
The MakeCapnpProtocol(LocalInit& init) function returns an IpcProtocolImpl protocol implementation which is a dumb wrapper around libmultiprocess functions that translate interfaces::Init method calls to pipe reads & writes (for a parent process) and translate pipe reads & writes to interfaces::Init interface method calls (for a child process). If we wanted to replace libmultiprocess and use a different protocol to communicate across the pipe, would the interfaces::IpcProtocol interface need to change? If communication needed to go to a different channel other than a pipe, like an IP address, or an SSL socket, would the interfaces::IpcProtocol interface need to change then? How would it change?
The new init_bitcoind.cpp file introduced in this PR is linked into the bitcoind executable and the new init_bitcoin-node.cpp file is linked into the bitcoin-node executable. Without this change, and before this PR, the bitcoind and bitcoin-node executables were identical. In follow-up PR #10102 Multiprocess bitcoin there are more changes to init_bitcoind.cpp and init_bitcoin-node.cpp that give bitcoin-node significantly different behavior from bitcoind, running wallet code in a new spawned bitcoin-wallet process instead of in the same process. In this PR, the differences between init_bitcoin-node.cpp and init_bitcoind.cpp are more minor, but what are they? Do they lead to differences in observable behavior?
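As a starting point for the third question, here is a rough sketch of how a child command line could be derived from the parent's argv[0]. It is not copied from the PR: the `ChildCommandLine` helper is hypothetical, the `-ipcfd` option name is an assumption for illustration, and path handling is reduced to splitting on the last `/`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Build the argument vector for a child executable that sits next to the parent.
std::vector<std::string> ChildCommandLine(const std::string& parent_argv0,
                                          const std::string& new_exe_name,
                                          int fd)
{
    // Keep the parent's directory (if any) and swap in the child's executable name.
    const std::string::size_type slash = parent_argv0.find_last_of('/');
    const std::string dir = slash == std::string::npos ? "" : parent_argv0.substr(0, slash + 1);
    // Pass the inherited descriptor number so the child knows which fd to serve on.
    return {dir + new_exe_name, "-ipcfd", std::to_string(fd)};
}
```

Under these assumptions, a parent started as `/usr/local/bin/bitcoin-gui` that spawns `bitcoin-node` on descriptor 7 would produce `/usr/local/bin/bitcoin-node -ipcfd 7`.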
<fjahr> I haven't studied capnp before. I remember seeing some discussion on it in the core dev irc. Is it worth re-reading any of that to get more of the context about its use in core?
<ryanofsky> If it already has a thread to handle requests for #234, it calls the wallet getbalance method on that thread, otherwise it makes a new thread to handle that request and future ones
<lightlike> a very general q: what is the main goal for introducing multiprocessing to core: Mostly architectural, i.e. better separation of wallet/node/gui? or would it also affect performance?
<ryanofsky> lightlike, there are different goals and tradeoffs. generally good for security, bad for performance, good for flexibility, like being able to have node run in background and attach/detach wallets and guis
<ryanofsky> you could write a custom protocol, the advantage of using capnp is that when you want to add a new method or class or parameter, you just add it in a schema instead of having to write manual code
<ryanofsky> in capnp each object has an identity, and you call a method on a specific object. while in gRPC you just define request and response formats and have to look up the objects yourself from the request
<michaelfolkson> Adding spawn support seems like it would be one of the final steps after untangling all the code between the different components. Is that all done now and ready to be merged?
<ryanofsky> So this PR adds support for spawning, and then the next pr 10102 calls it to actually make bitcoin-node, bitcoin-gui, bitcoin-wallet processes specialize and talk to each other with the spawn support here
<jkczyz> Ah, so by "bidirectional", it doesn't simply mean returning data back from child to parent process but rather supporting callbacks from child to parent more generally
<ryanofsky> michaelfolkson, sure, one pre-question. The notes focused on IpcProcess, IpcProtocol, and Init classes introduced in this PR. Was it clear what these classes do, and anyone want to summarize?
<ryanofsky> when a client passes a server a std::function or an object, the server can call back to that function or call an object method at any time, and the framework handles it
<ryanofsky> nehan, right, and one reason IpcProtocol is separate from IpcProcess, is so different protocols other than capnp could be supported in the future
<ryanofsky> michaelfolkson, right in 10102, bitcoin-gui spawns a bitcoin-node, and bitcoin-node spawns a bitcoin-wallet, so these are long running tasks
<ryanofsky> troygiorshev, yes. Init does a few things but the main reason it exists is because the only way the framework allows processes to communicate is by calling object methods, and Init is the object a spawned process provides to start off with
<michaelfolkson> I get why you want to be able to start/stop processes. But generally the processes would be running continuously and concurrently. They aren't going to be regularly stop/started?
<nehan> in addition to what's described in init.h, it would be nice to see a diagram which shows the steps for the node, gui, and wallet. it's not obvious to me why the gui spawns a node and the node spawns the wallet?
<troygiorshev> related is, if the gui spawns a node, then does that mean we can't nicely shut down the gui without shutting down the node? Or is the distinction between parent and child sort of flexible?
<ryanofsky> nehan, that's a good question. gui spawning node and node spawning wallet are just artifacts of the way the code works currently and are just supported so no user changes are required
<ryanofsky> yes. Should note that the parent/child relationships talked about with respect to spawning aren't some permanent part of the connections. Connections are fully bidirectional
<ryanofsky> So if the node spawns a few wallets on startup, and some separate wallet processes are started which connect back to the node, all the wallets are equivalent
<ryanofsky> I guess the first question was: How do the child processes provide useful functionality to the parent processes if they never run the code after the IpcProcess::serve() calls in main()?
<ryanofsky> They definitely don't need to run code after. The question was assuming if you saw code in main that said if (condition) exit, you might be suspicious
<lightlike> i thought that the protocol implementation (capnp) has a loop in its serve() method, so everything useful would happen during the IpcProcess::serve() call.
<ryanofsky> that line is saying "if I am a child process spawned to handle requests from a parent, and I am done handling requests, then exit without executing the rest of main()"
<ryanofsky> michaelfolkson, oh I just meant it is the equivalent place in terms of blocking, lines 62 and 79 both block the thread and wait for and respond to incoming requests
<ryanofsky> but question 4 first asks if you wanted to get rid of capnproto, and use a different protocol like gRPC or JSONRPC or something custom, would the method definitions have to change?
<nehan> ryanofsky: why did you choose file descriptors as the interface instead of something more general? what would it take to run processes on different machines?
<michaelfolkson> It is really hard to catch up on all these years of work in an afternoon haha. I think we covered a previous PR on interfaces at a previous PR review club
<ryanofsky> michaelfolkson, Basically yes with exception that starting bitcoin-wallet tool and connecting to node isn't greatly useful even with those 4 prs
<ryanofsky> followup PR would add a "serve" wallet tool subcommand or something similar so the wallet tool could connect to the node and then actually do useful operations