The PR branch HEAD was 30e2ba6c at the time of this review club meeting.
Notes
Bitcoin Core uses headers-first synchronization (implemented in PR
4468). It first downloads
headers from its peers to build a tree of verified headers. Once the headers
are downloaded and verified, it requests blocks that build towards the
most-work chain. The decision whether to request a block for download is made
based on the headers tree.
This automatic download behavior is beneficial for the normal operation
of a node, saving bandwidth and protecting it against DoS attacks.
On the other hand, this also restricts the possibilities for analysis of blocks
that have not been chosen for download. Examples are stale
blocks (i.e. blocks that have
not become part of the best chain).
This PR introduces a new RPC getblockfrompeer that manually attempts to fetch
a block from a peer, while specifying the block hash and the id of the peer.
This is achieved by constructing a GETDATA message for the block and sending it
to the selected peer.
What are the possible applications of the new RPC getblockfrompeer?
Where in the codebase does a node make the autonomous decision whether to
request a block for download after hearing about it from a peer?
What is the main decision criterion?
The first commit collects various Ensure* helper functions and moves them
to rpc/server_util.h. What is the purpose of these functions, and why did this PR move them?
What happens if we request a block from a peer, but the peer doesn’t have the
block either?
This PR requests a block but adds no additional logic for dealing with the
answer to the manual block request. Why is this not necessary?
What would happen if we’d use getblockfrompeer to request a block for which
our node didn’t previously know the header?
Can you think of ways to further improve the new RPC command in the future?
(some ideas are mentioned in the PR discussion)
<sipa_> raj: i'll only mention this once, but a per peeve of mine is that blocks in a non-active branch of the chain shouldn't be called orphaned. it's not like they don't have parents...
<lightlike> larryruane: Not sure about testing, I think it's more about being able to better observe and analyze what is happening on mainnet with stale blocks.
<raj> lightlike, it also seems there should be a way to fetch any blocks from the network (as long as it known to be in the tree). Isn't there any current way to do that?
<jnewbery> right, you could imagine having a full node and wanting to fetch a transaction, but the block that it's in has been pruned. You could run this commend to redownload the block, and then call getrawtransaction with the txid and blockhash
<larryruane> glozow: I wondered the same thing, so I fetched an older block (my node is pruned) about an hour ago and it's still there (`getblock` shows it)
<lightlike> Where in the codebase does a node make the autonomous decision whether to request a block for download after hearing about it from a peer? What is the main decision criterion?
<larryruane> lightlike: "... pruned at the next restart" I just restarted my node, and that old block (beyond my prune window) is still there... I think it's because it's now in the `blocks` directory (which is persisted)
<jnewbery> I think that if we redownloaded a block we'd generally keep it for at least a couple of days. We'd write it to the latest block file (after the tip block), and we'd only prune that file once the tip had been buried by 288 blocks.
<theStack> is it already possible now to explicitely keep old blocks on a pruned node? or would this PR enable the first time ever to have "gaps" between blocks?
<jnewbery> I may not have got that exactly 100% accurate, but roughly I'd expect the block to stay around for a short while due to the way files are pruned and the minimum prune value
<jnewbery> ah yes, here's the constant for keeping all files that contain a block within the last 2 days: /** Block files containing a block-height within MIN_BLOCKS_TO_KEEP of ::ChainActive().Tip() will not be pruned. */
<larryruane> while testing this PR, i ran `getchaintips` but was surprised that i was not able to fetch any of those stale blocks from my peers (although i didn't try every one) ... maybe those peers just didn't ever consider those blocks to be part of best chain?
<lightlike> So, if the last header of a chain of headers doesn't have at least as much work as our chain, we wouldn't download the block automatically.
<theStack> hope that my question is not too much off-topic, but i wondered (more than once) what is rationale of the 550 MB minimum prune size? i expected more like a multiple of 144 :) (i guess it's correlated with MIN_BLOCKS_TO_KEEP somehow though?)
<larryruane> in case this is helpful to anyone, here's a command you can use to see which of your peers are non-pruning (look for NETWORK): `bitcoin-cli getpeerinfo|jq '.[]|(.id,.servicesnames)'`
<lightlike> Next q: The first commit collects various Ensure* helper functions and moves them to rpc/server_util.h. What is the purpose of these functions, and why did this PR move them?
<lightlike> raj: yes, there seems to be some manual steps possible (try existing peers, if you don't get the block you want disconnect and them and try others) that could be automated
<b10c> ray: when the forkmonitor detects two blocks, at least one of their nodes will have the block. And for everyone else it can still be useful to get pruned blocks
<lightlike> larryruane: yes - also I think it's possible (or will be with increasing modularity) not to have certain parts (mempool, peerman etc.), in which case the specific rpcs don't make sense but an assert would be wrong.
<lightlike> larryruane: correct! This PR uses EnsurePeerman() in rpc/blockchain, which was located in rpc/net before. But rpc/net includes other stuff from rpc/blockchain, hence the circular dependency
<larryruane> and there's a tool that detects those dependencies, but there's a way to add exceptions, and we'd like to keep the exceptions to a minimum (i think)
<lightlike> so that means automation would either send to all peers at the same time (and potentially receive the block many times which would be wasteful), or it would need to define some kind of waiting period after requesting from one peer, which introduces some complications.
<lightlike> raj: I think it could (thats exactly what NOTFOUND does for transactions), but this is not currently implemented (probably because it is not such a common situation for blocks?)
<lightlike> Next q: This PR requests a block but adds no additional logic for dealing with the answer to the manual block request. Why is this not necessary?
<larryruane> it's a cool design that when we receive a block, we don't really care *why* we're receiving it ... we're like, oh cool, here's another block i can learn about!
<lightlike> larryruane: yes, in particular AcceptBlock() in validation just deals fine with saving the block to disk, but only because we have marked it as requested before (BlockRequested())
<lightlike> which leads to the next q: What would happen if we’d use getblockfrompeer to request a block for which our node didn’t previously know the header?
<larryruane> lightlike: again I enabled "net" logging and tried this (by just specifying a nonexistent block hash arg) and it did send it out to our peer... who just ignored it (didn't reply)
<lightlike> But yes, the node sends it out - and if the peer knows about the block it will send it to us anyway - but we will drop it because we didn't call BlockRequested() in this case.
<lightlike> since time is getting low, last q: Can you think of ways to further improve the new RPC command in the future? (some ideas are mentioned in the PR discussion)
<larryruane> jnewbery: that logic in AcceptBlock() seems to prevent DoS vector, because those checks are very quick (we don't even look at the transactions for example)
<larryruane> I just want to say, the review that jnewbery did on this PR is amazing, I hope I can be half as good someday! If you haven't already, check out the resolved comments