Currently when connections are full, if we receive in inbound peer request, we look for a current connection to evict so the new peer can have a slot. To find an evict-able peer we go through all our peers and “protect” multiple categories of peers, then we evict the “worst” peer that is left unprotected. If there are no peers left to evict, the inbound connection is denied.
With this PR, if the inbound connection is on our whitelist we still loop through all our current connection, removing protected peers from the evict-able list. However, now we keep track of the last peer we protected (aka the last peer we removed from the evict-able list). If we end up with no peers left to evict, the last protected peer is evicted instead. This should preserve the peer eviction logic but just in the case of a whitebind inbound, we evict the “worst best” connection.
How is the function signature of EraseLastKElements changed in this PR? Why is this necessary? Do you see an alternative approach?
In SelectNodeToEvict(), is the order in which we make the various EraseLastKElements() calls important? Why (not)?
EraseLastKElements used to be a templated function, but this is changed in the first commit. Why is this the case? Do you see any downsides to this change?
Suppose we pass a vector of 40 eviction candidates to SelectNodeToEvict(). Before and after this PR, what’s the theoretical maximum of Tor nodes that can be protected from eviction?
<stickies-v> welcome everyone! Today we're looking at #27600, authored by pinheadmz, who I'm delighted is here as well! The notes and questions are available on https://bitcoincore.reviews/27600
<stickies-v> for those of you who were able to review, would you give it a Concept ACK, Approach ACK, Tested ACK, or NACK? what was your review approach?
<LarryRuane> I find it easier to look at the diffs from within VScode (i.e. `code` on ubuntu), as opposed to GitHub, because then I can see more context, and search symbols and such
<LarryRuane> (sorry for the sidetrack) one thing I don't like about using vscode is that when looking at a diff, it doesn't allow you to "tag" (jump to the definition of the symbol your cursor is on)
<stickies-v> LarryRuane: yeah, that is by far my biggest annoyance in reviewing code on vscode. opening the file works, but in multi-commit PRs not ideal, have to actually check out the commit too
<LarryRuane> abubakarsadiq: I was wondering about that ... there are times where we evict an outbound, aren't there? I'm not sure, but if it's not performing well?
<LarryRuane> i know that during IBD, if there's a peer who's holding us up (we're like 1000 blocks ahead of that peer), then we can kick it (but I forget the details)
<stickies-v> outbound peers are irrelevant here because we choose our outbounds, whereas for inbounds we are connected to, so we don't really control who fills up our slots (although we do filter, but it's more of a passive thing). but in some cases (e.g. with whitebind), we want to make sure we keep some space for certain nodes
<LarryRuane> stickies-v: lightlike: isn't there some kind of "ban" score? so if a peer misbehaves a little bit, we decrement its score (from 100), and only if it reaches zero do we kick it?
<stickies-v> pinheadmz: oh yeah, great point! we don't blindly always want to evict a peer whenever a new node tries to connect to us, because that would make it easy for an attacker to fill up all our (inbound) slots
<lightlike> sebastianvanstaa: they do if we have too many outbound peers (because we sometimes create extra ones), see EvictExtraOutboundPeers in net_processing
<alex_wiederin> stickies-v: 'force' ensures that at least one is selected for eviction. Previously it would not pick one if it had already excluded all peers to protect
<LarryRuane> stickies-v: if our eviction candidate list is empty (everyone on it has been protected), then we normally return `nullopt`, but if force is set, then we return the least-good protected peer
<stickies-v> pinheadmz: this makes me wonder what happens if we pass a vector containing only whitelisted candidates to `SelectNodeToEvict()`, I think that means we'll be returning a default-initialized `last_out.id`?
<lightlike> before, I guess the order could have some small effects if some peers would be protected through multiple criteria, but I can't really see a systematic bias there.
<stickies-v> lightlike: that's my understanding as well. On master, I don't think the order has any impact. After this PR, it affects which `std::optional<NodeId>` is returned, where the implicit assumption is that the "least important" protection rule needs to get executed last
<alex_wiederin> stickies-v: if say the number of current peers is 4 (which the first EraseCall covers) do the other EraseCalls take effect? Not sure I understand the EraseLast function - new to C++
<stickies-v> wow, sorry. brain fart. I was only considering the edge case where all candidates get evicted, in which case the order does not matter on `master` (but it still does on this PR)
<lightlike> I'm not sure that all the EraseLastKElements calls are ordered by importance. e.g. the "recently sent us novel blocks" seems more important/harder to game to me than having low minimum ping times.
<pinheadmz> i guess it might make sense to specifically pick one quality (instaed of all) and for exmple "force' will just evict the worst ping time peer
<stickies-v> yashraj: with this PR, the order in which we execute the `EraseLastKElements` calls is important we keep overwriting `last_out` with the most recently erased candidate. does that make sense?
<LarryRuane> stickies-v: "Do you see an alternative approach?" yes, `last` could be passed to `EraseLastKElements()` as an in-out argument (and make that function return void again),
<lightlike> a very simple alternative approach would be to just pick a random peer before the first EraseLastKElements, and evict that one if at the end of the function there is no better peer left because every peer is protected.
<stickies-v> pinheadmz: but this only applies in case of `force==true`, i.e. we trust the inbound peer, so I think we can be less careful about eviction here?
<lightlike> pinheadmz: this would only apply when you have a whitelisted incoming peer connecting to you and all peers are protected. seems a like a very special-case scenario to me.
<LarryRuane> i actually like that change to eliminate the templating... I think it's confusing when code is over-generalized... if it's only instantiated one way, then why not "hard code" that way?
<stickies-v> LarryRuane: that's a good alternative! generally i'm not a huge fan of out-parameters but in this case it would quite significantly clean up the code
<LarryRuane> but on the other hand (arguing against myself), those `if` statements make it very clear to the reader what's going on, without having to dig into the function being called
<stickies-v> I agree with LarryRuane about templating. Much easier to reason about what a function is doing when it's not templated, so I like to avoid it when possible
<stickies-v> I think another approach would be to have `EraseLastKElements` return all (instead of just the last) removed elements, which imo is a bit more intuitive/general, but perhaps not worth it. And this can also be coupled with LarryRuane's idea of using an out-parameter to simplify the current code
<lightlike> pinheadmz: why does ProtectEvictionCandidatesByRatio need to be changed? It protects only 50% at best as far as I understand, so it should leave an unprotected peer usually so last wouldn't be needed if we get that far? Or is this just for the edge case of having 1 peer left?
<stickies-v> Suppose we pass a vector of 40 eviction candidates to `SelectNodeToEvict()`. Before and after this PR, what's the theoretical maximum of Tor nodes that can be protected from eviction?
<stickies-v> lightlike: I really like your suggestion to just select a random (non-whitelisted, inbound) peer, it seems sensible at first sight and would simplify this PR so much
<pinheadmz> whitebind is also not entirely attack-proof for example i have a ful node with whitebind on a non-standard port that i use for my SPV wallet... but anyone could discover that port by sweeping
<stickies-v> LarryRuane: I think both with and without this PR 34/40 Tor nodes would be the maximum number (assuming they're not NoBan and inbound, I underspecified the question a bit) of nodes that can be protected by `SelectNodeToEvict`