The PR branch HEAD was f51006b818 at the time of this review club meeting.
Notes
There are a variety of different types of P2P connections that Bitcoin Core nodes
can have with peers. See the ConnectionType enum in src/net.h for the different
types.
Until now, the Bitcoin Core functional test framework allowed only 2 types of
P2P connections:
manual connections from one bitcoind instance to another. See the
BitcoinTestFramework.connect_nodes() method, which calls the addnode RPC.
inbound connections made from the test framework to the bitcoind
instance. See the TestNode.add_p2p_connection() method.
This week’s PR adds the ability to create 2 additional types of regtest P2P
connections from the bitcoind instances to the functional test framework:
full outbound connections
block-relay-only outbound connections
The PR adds functionality to both the bitcoind C++ code and to the functional
test framework Python code:
In bitcoind, a test-only addconnection RPC is added to allow opening
full outbound or block-relay-only outbound connections in regtest.
In the functional test framework, TestNode.add_outbound_p2p_connection()
is added to enable making either full outbound or block-relay-only outbound
connections, along with P2PInterface.peer_accept_connection() to allow the
test framework to accept and listen to incoming connections.
We’ll focus mainly on the C++ code in this review club.
Questions
What are Bitcoin Core’s different connection types?
In the test framework, what is the difference between connecting two
bitcoind nodes using connect_nodes() and using a P2PConnection object?
Why would it be useful to create outbound or block-relay-only connections in
the test framework?
How does this PR enable creating outbound and block-relay-only connections
in the test framework?
Why do we make some RPC methods hidden, or “test only”?
For each different type of connection, what is the maximum number of
connections that Bitcoin Core will have by default?
What are CSemaphore and CSemaphoreGrant used for in CConnman?
<jnewbery> There are changes to the C++ and Python code in that PR. We'll concentrate on the C++ changes in this meeting, but it's important to review the Python code too
<amiti> sergei-t: good question, since outbound-full-relay & block-relay-only are the most common connections, I focused on those for this PR, but went a route that would allow for adding more connection types in the future
<ariard> sergei-t: intuitvely as those connections (feeler/addr_fetch) are short-lived we have less logic around them (e.g we don't have eviction for them)
<stacie> I have a q about the manual connection type. Are these connections restricted to being full relay or block relay only? I’m assuming they default to full relay but could they be block relay only?
<jnewbery> dappdever: yes. This PR includes basic tests for outbound full relay and outbound block relay only tests, but more testing of our p2p functionality would definitely be useful
<jnewbery> pinheadmz: anchor connections are a thing, but that's not a distinct connection type. We save all our block-relay-only connections on shutdown
<jnewbery> stacie: test code needs to be maintained just the same as all other code, so I'd NACK it if it doesn't meet the quality bar or doesn't add enough value to justify the ongoing maintainance cost
<jnewbery> sergei-t: I think there are quite a few subtle behavior differences between manual connections and full relay connections, but I'd need to dig around in the code to find them
<amiti> sergei-t: I don't think thats the only difference. for example, if you restart your node, you'll reconnect to your manually added peers. whereas you probably wont reconnect to the same full-relay outbound peers.
<willcl_ark> Connecting two nodes directly connects them via their listening p2p ports whereas using P2P.py passes the connection through the P2PConnection class, which lets you instrument and act upon each message exchanged.
<sergei-t> I guess, the testing mechanics is just unclear to me: when we say "connect from testing framework to the node"... what does actually happen?
<dappdever> Is there an example test for connecting two bitcoind nodes directly? I understood the P2P connection code fairly well but was unsure what to compare it to
<sergei-t> OK, so the testing framework maintains a "fake" node that connects to the "real" node? and there are two ways of doing so - "directly" or via P2P?
<jnewbery> sergei-t: I think you're getting your terminology mixed up. A test contains one or more bitcoind instances under test. We can interact with those instances from the test framework using the RPC interface or the P2P interface. We can also get the different bitcoind instances to connect to each other using p2p.
<jnewbery> P2PConnection is a low-level class that the Python framework uses to connect to the bitcoind node (and after this PR is merged, it will be able to accept an outbound connection _from_ the bitcoind node)
<sergei-t> jnewbery: is it correct that the test framework spins up one or more "real" nodes that it tests and an additionally a "fake" node that can connect to "real" nodes?
<jnewbery> sergei-t: there's no "fake" node. The test framework can open P2P connections to the bitcoind instances. You could call the test framework a "fake" node if you wanted to, but I don't think it makes things any clearer (in fact, I think it's confusing terminology)
<amiti> for people interested in learning more: when you are writing a functional test, in `set_test_params` you have to set `self.num_nodes`. this creates `TestNode` objects that wrap a bitcoind instance (and does things like forward along RPC calls)
<lightlike> sergei-t: I think it is a matter of terminology. I still think of the P2PConnection instances as objects, mock/fake nodes that we control and use to send the real node whatever p2p messages we wish.
<jnewbery> sergei: question 2 should say `P2PConnection`, not "P2P Connection". The difference is that the second option is creating a p2p connection using the python P2PConnection object, which as lightlike says, is a mock/fake to test the p2p interface
<jnewbery> dappdever elle troygiorshev: exactly, so we can test the code paths where we treat outbound and block-relay-only connections differently from manual connections
<willcl_ark> It adds `add_outbound_p2p_connection()` to test/functional/test_framework/test_node.py::TestNode which can be added in "outbound" or "blockrelay" mode
<jnewbery> that's right! There's a new RPC called addconnection, which calls through to a new CConnman method called AddConnection(). Good team effort :)
<sergei-t> because the behavior of these RPCs is not triggered by an RPC call in "real world"? I.e., full relay connections only appear to randomly chosen nodes, it the node is specified, it's a manual connection by definition
<jnewbery> emzy: I think that's the most important reason. As soon as you make an interface public, then it becomes part of some user's workflow and you can never change it!
<sergei-t> it would be like manual connection but with behavior of full relay connection... _if_ we were to add it, would it make more sense to parameterize manual connection instead?
<jnewbery> sergei-t: it's possible, but I think a good general principle is that new features/interfaces should be as tightly restricted as possible to begin with. It's much easier to give users features than take them away.
<amiti> when implementing, I initially tried adding the functionality to the `addnode`RPC, but didn't like adding yet another definition to an already overloaded endpoint, so separated it out into a new RPC endpoint. then made it test only for mostly these reasons, like more flexibility & its easier to isolate test needs vs trying to encapsulate all possible uses on mainnet
<stacie> I don't know if those constants are the correct answer (MAX_OUTBOUND_FULL_RELAY_CONNECTIONS, MAX_BLOCK_RELAY_ONLY_CONNECTIONS, etc.) but that link to net.h is where I found them
<ariard> there is a difference between a default setting for a wide majority of users and fine-setting its peering according to your network deployment
<amiti> addr_fetch connections are short-lived connections only (possibly) made on startup, I don't remember exactly how the counting works but for the majority of the time running a node, we won't have any open
<sergei-t> Semaphores are used to not exceed the max allowed number of connections (but I'm not exactly sure how they do it)... what does "grant" mean in this context?
<pinheadmz> jnewbery maybe i missed this but why do we need special connections just for addr fetch? do we not trust our exisintg peers? (something seomthing eclipse attack?)
<jnewbery> sergei-t: exactly. It's used to not exceed the maximum number of outbound connections. If you can't get a semaphore grant it means that all outbound connection slots are filled
<amiti> so, if you're starting up a node for the first time. you connect to the DNS seeds with addr fetch connections. they send you back addresses of nodes to populate your addrman, you close those connections and start opening connections to the general network
<amiti> there's also a fallback set of seeds incase there's a crazy infrastructure attack or you don't want to connect to DNS, but I believe these are the only two groups that we automatically connect to via addrfetch
<pedr0fr> I have a question. I want to try the ZMQ sequence notifier (merged on the 23rd of September - PR #19572).I am willing to wait for bitcoin core 0.21.0rc1, which, if I understand correctly, was planned for the 1st of November (https://github.com/bitcoin/bitcoin/issues/18947). When is 0.21.0rc1 expected to be released? Thank you.
<willcl_ark> amiti: well you say we only do it at fresh startup; what about a nodes that's starting up again after being offline for a while, or a node that's been running for some time without being turned off
<amiti> willcl_ark: if the node has been running for a while, you'd expect the addrman to be populated with addresses from normal address relay behaviors
<amiti> willcl_ark: I don't believe so. if you were to shut down the node, wipe your `peers.dat` and then restart, you might create addr fetch connections again
<sipa> addr_fetch was originally introduced for situations where we can't just resolve DNS as a means of getting IP addresses (specifically when only connecting through tor)
<sipa> so instead of asking a DNS seed, we *connect* to one (which means the exit node we're using will resolve for us, and open a connection to one of the resolved IPs); we still want a lot of IPs though, so we ask that node