The PR branch HEAD was 48e5385 at the time of this review club meeting.
During Initial Block Download (IBD), after downloading the full headers chain, nodes have a BLOCK_DOWNLOAD_WINDOW=1024-block window within their best
chain during which blocks are downloaded in parallel.
Blocks still need to be validated sequentially, since their validity depends on all outputs they spend being confirmed in previous blocks.
When a node receives a block, it attempts to connect the block to the current tip, validate it, and, if it is valid, advance the tip.
The node may have up to MAX_BLOCKS_IN_TRANSIT_PER_PEER=16 in-flight requests to each peer. The
node will never send more than one request for the same block out at a time.
Blocks are requested only from outbound peers, unless that is not possible.
Peers are not trusted to just always serve correct data in a timely manner. The node tries to
detect if block download is stalling based on the following criteria:
- The node is unable to connect a new chain of blocks past the current tip, e.g. the tip is at height i and blocks [i+2: i+1024] have arrived, but block i+1 hasn't.
- The node cannot make any more requests: all of the 1024 blocks in the window have either already been received or are currently in flight.
Once the node detects that it is being stalled, it starts a
timer and gives each peer by which it is “bottlenecked” two more seconds to fulfil the block request(s) before
disconnecting them. Then, it connects to a new peer and continues requesting the blocks it needs.
However, the node will still consider itself stalled, because the criteria are still met; the new peer simply becomes the "bottleneck". On master, this peer is also given only 2 seconds to fulfil the block request before being disconnected. This is a problem: if our own internet connection is very slow, we might needlessly churn through peers, thinking they are stalling us when we are actually the bottleneck.
Without looking at the PR, what solutions would you consider? Feel free to be creative.
<stickies-v> alright that's a great start! usually for tested ack we mean that we've done testing that goes beyond the functional tests though, because the CI runs the functional tests anyway - so them failing would get picked up pretty quickly
<stickies-v> what "testing" means depends on the PR, your understanding of it, and how creative you can get in trying to poke holes in it. so there's no one definition. In general, it's also good to describe on your review how you tested something, so people can replicate and/or test other angles
<lightlike> what I did to test this at some point is to locally reduce MAX_BLOCKS_IN_TRANSIT_PER_PEER=2 and BLOCK_DOWNLOAD_WINDOW=16 (or similar), and run with -onlynet=tor - that way, I would get into stalling situations very soon.
<larryruane_> I'm running the PR now, and one fun thing to do is, during IBD, `bitcoin-cli getpeerinfo` and you can see how the outstanding block requests are more or less interleaved among the peers (no 2 peers downloading the same block)
<alecc> some clarification on the PR, why does it keep stalling on newly connected peers after only one stalls? i.e. is it intentional to assume new peers are stalling? i was thinking a possible alternative to adjusting the stall window would be to change this assumption? maybe i'm misinterpreting some part
<amovfx> I think it would help with selecting nodes that have good behavior, it randomly samples and starts building a confidence metric on the nodes that are returning what you want, in this case blocks
<stickies-v> dergoegge: perhaps one alternative solution would be to instead of disconnecting the peer, allowing one (or more) extra peers to download the same block that's being stalled? that would increase bandwidth usage though, so probably less preferable than what this PR proposes
<lightlike> I think if we have a peer that is significantly slower than the rest of the peers, such that it stalls the download, we probably want to disconnect it and exchange it for a better peer. It's not like we want to keep our current peers at all costs.
<larryruane_> well, I guess I was thinking, when we hit the stall condition, still boot that peer (as we do today), but when we then connect the replacement peer, have the window be larger ... (not sure how to shrink it again)
<alecc> for the actual code, when it's processing a getdata message in `net_processing.cpp`, in `FindNextBlocksToDownload` it sets the stalling bit, and then sets `m_stalling_since` to current time based on that
<larryruane_> I did have a question about the text in the notes tho, which is "The node is unable to connect a new chain of blocks past the current tip, e.g. if the tip is at height i and blocks [i+2: i+1024] have arrived, but block i+1 hasn’t." .... my question is, couldn't there be other missing block in [i+2: i+1024] BUT, all those blocks are assigned to the same peer (as block i+1 is assigned to)? ... small difference
<sipa> lightlike: Agree, there is no reason to keep our peers at all costs, though this is dependent on the fact that this is only relevant during IBD, when we mostly care about bandwidth and less about other properties (like partition resistance, latency, ...). After IBD, we're much more careful about when to disconnect.
<sipa> And because I see people talking about increasing the 1024 window: I don't think that helps. For most of IBD, 1024 blocks is way too much; that value was set at a time when blocks were tiny. It still makes sense early in the chain, but arguably we should be shrinking it.
<lightlike> larryruane_, stickies-v: we consider ourselves stalled if we have a peer with an open slot to which we want to assign a block, but can't because all of the blocks in the 1024-block window are already either present or assigned to someone else. So yes, I think it's possible to have multiple stallers.
<alecc> yashraj: i think it's the opposite - if all but i+1 arrived (i+2 -> i+1024) and we haven't seen i+1 for a certain amount of time we consider ourselves stalled, i guess i'm not super sure exactly what happens if i+1 arrives and then the others are taking a while
<lightlike> sipa: yes, that is true, I meant that it could be a possible situation that multiple peers are actually stalling us (withholding blocks from us), that will get marked as stallers one after another - but only one at the same time.
<amovfx> I don't have the understanding to say whether it does or if it could be exploited. I think it would make it more secure as we have an adaptive timeout, so the attacker couldn't use the max timeout consistently
<alecc> dergoegge: it seems the PR is generally leaning towards assuming some nodes are not stalling, so I'd imagine if you were eclipsed, by extending the allowed stall window it would take longer to maybe escape the eclipse im thinking
<stickies-v> next question: before this PR, if a node has such a slow internet connection that it cannot download blocks in less than 2 seconds, will it be continuously stalled and churning through peers? Why or why not?
<lightlike> alecc: I would say this PR leans towards assuming not __all of our peers__ are stalling. The first staller will be disconnected as fast as before, just subsequent peers will be given more time if that happened.
<alecc> yashraj: to elaborate on what lightlike said, the stall timeout also decreases at a slower rate than it increases so it would try to go against the situation where you just flip back and forth between increasing the window then decreasing
<stickies-v> if we can't download any block in < 2 seconds, then as soon as we end up with one stalling peer, we would disconnect that peer. afterwards, we would give every new peer 2 seconds to catch up, but that would never happen, so we would be stuck forever I think?
<larryruane_> lightlike: yes I agree, remember that the 2s (or with the PR, can be greater) timeout comes into play ONLY if we've entered this "stalled" condition ... with similarly slow peers, (or like if our own network connection is slow), we may never enter that state
<stickies-v> I think if all of our nodes are just not sending us any blocks at all (probably malicious, or some serious technical issues), then we would just not make any progress at all, and no single node would be marked as stalling
<stickies-v> alecc: so we wouldn't disconnect them for stalling, but luckily as dergoegge points out there are other mechanisms that disconnect nodes when they're not sending us anything - and hopefully eventually we'd stumble upon a node that's able to send us something! that second mechanism is slower than the stalling disconnect, though
<larryruane_> dergoegge: "Last question" -- we still disconnect the peer that's stalling us, as before, but we increase the 2s timeout for the replacement peer .. and increase again if it happens again, etc. (exponentially)
<alecc> amovfx: from what i could find, using std::atomic allows for atomic operations that let different threads edit memory at the same time without unexpected behavior (compare_exchange_strong is used in the pr for this reason). i think this is easier than using another mutex/locking? there's probably more to it
<stickies-v> alecc: you're right that we use std::atomic to protect against UB during threading, but std::atomic doesn't allow multiple threads to edit memory at the same time - it just prevents data races from happening by using atomic operations, i.e. makes sure that one thread waits for the other one to finish before proceeding