The PR branch HEAD was 36ee76d at the time of this review club meeting.
Notes
Bitcoin Core uses port 8333 as the default port on mainnet (18333 on testnet).
This means that nodes will listen on the default port for incoming connections,
unless another port is specified using the -port or -bind
startup options.
However, nodes that listen on non-standard ports are unlikely to receive incoming
connections, because the automatic connection logic disfavors these addresses heavily.
In preparation for this PR, PR #23306
changed the address manager behavior such that an addrman entry is now defined
by both IP and port, so that multiple entries with different ports and the same
IP can coexist.
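As a rough illustration of what keying entries by both IP and port means, here is a minimal sketch. `Service` and `ToyAddrMan` are hypothetical names invented for this example and do not mirror the real addrman data structures; the sketch only shows that two entries with the same IP and different ports can coexist, while an exact duplicate is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """Hypothetical entry key: an IP address together with a port."""
    ip: str
    port: int


class ToyAddrMan:
    """Toy address table keyed by (IP, port); not the real addrman."""

    def __init__(self):
        self._entries = {}

    def add(self, ip, port, source):
        key = Service(ip, port)
        if key in self._entries:
            return False  # same IP and same port: already known
        self._entries[key] = source
        return True

    def size(self):
        return len(self._entries)


addrman = ToyAddrMan()
addrman.add("203.0.113.5", 8333, "peer-a")
addrman.add("203.0.113.5", 8444, "peer-b")  # same IP, different port: new entry
addrman.add("203.0.113.5", 8333, "peer-c")  # exact duplicate: ignored
print(addrman.size())
```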
This PR changes the logic for automatic outgoing connections by dropping the preferential
treatment for the default port. It doesn’t treat all ports as equal, though:
it introduces a list of “bad ports” that are still disfavored for outgoing connections.
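The selection behavior can be sketched roughly as follows. This is a simplified illustration, not the actual connection loop: `BAD_PORTS` is a small illustrative subset chosen for this example, `pick_candidate` and its parameters are hypothetical, and the threshold of 50 tries is the figure mentioned in the meeting discussion.

```python
import random

# Illustrative subset only (ssh, telnet, smtp); the list in the PR is longer.
BAD_PORTS = {22, 23, 25}

# Number of tries during which bad ports are skipped.
TRIES_BEFORE_BAD_PORT = 50


def is_bad_port(port):
    return port in BAD_PORTS


def pick_candidate(candidates, select=random.choice, max_tries=100):
    """Pick an (ip, port) pair for an automatic outgoing connection.

    Bad ports are skipped for the first 50 tries and only accepted
    afterwards. Returns (address, number_of_tries), or None on failure.
    """
    tries = 0
    while tries < max_tries:
        tries += 1
        addr = select(candidates)
        if tries <= TRIES_BEFORE_BAD_PORT and is_bad_port(addr[1]):
            continue  # still disfavored: try another candidate
        return addr, tries
    return None


# A non-default, non-bad port is accepted on the first try.
print(pick_candidate([("198.51.100.7", 8444)]))
# If only bad-port candidates exist, one is accepted once the threshold passes.
print(pick_candidate([("198.51.100.7", 22)]))
```

In this sketch, port 8333 and any other port outside the list are treated identically; only listed ports have to wait out the threshold.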
Later commits also adjust the address gossip relay logic to include the port
of an address in a hash that is used to determine which peers to relay an
address to.
What were the historical reasons for the preferential treatment of the default port?
What are the benefits of removing this preferential treatment with this PR?
Before this change, automatic connections to peers listening on non-default ports
were discouraged, but not impossible. Under what circumstances would a node still
connect to such a peer?
After this PR, the default port still plays a role in Bitcoin Core. Where is it
still used? Should it be a long-term goal to abandon the notion of a default port
entirely?
The PR introduces a list of “bad ports” that was taken from internet browsers.
Do you agree with having a list like this in general?
Are there any reasons to deviate from the list used by browsers?
What is the reason for allowing callers to pass salts to CServiceHash and
then initializing it with CServiceHash(0, 0) in commit
d0abce9?
<ziggie> how does addrman disfavour other ports right now, does disfavour mean no chance to get a connection to another port than 8333, or is there a way ?
<stickies-v> based on sipa's answer I found somewhere, another reason could be to make it harder for an attacker to fill people's addrtable with many IP/port combinations of the same node, which could potentially be used for eclipse attack
<sipa> Though I don't know to what extent this is public. I recently saw some (alleged) leaked satoshi emails that justified this preference, and it only mentioned the concern about eclipse attacking (before that term existed).
<lightlike> maybe it was also about reputational concerns? Bad publicity if bitcoin nodes connect to you on various ports, even if this is not DOS-worthy?
<glozow> Hopefully over time we move towards a healthy balance of 8333 and non-8333 nodes to make Bitcoin connection traffic a bit less easily identifiable?
<stickies-v> it allows people that can't/don't want to listen on 8333 to still receive incoming connections, increasing the number of available nodes to connect to for the entire network
<lightlike> glozow: I think that incoming Bitcoin connection traffic would still be identifiable without too much effort. But blocking it is not as easy as just blocking a single port.
<stickies-v> glozow is the 8333/n-8333 a healthiness indicator for the network though? I think the network doesn't really care about the balance itself - it just allows more people to participate?
<sipa> svav: The historical reason, as far as I know, was concerns about someone being able to listen on 1000s of ports on the same machine, rumouring all of those as separate addrs, and thereby sort of cheaply eclipse attacking the network.
<sipa> (and it doesn't apply anymore since addrman, which buckets based on source range of IP anyway; it doesn't treat multiple ports on the same IP any different anymore from multiple IPs in the same range)
<willcl_ark> With bitcoin traffic so easily identifiable on the wire I do wonder how much benefit it can bring to someone being censored at e.g. ISP level on port 8333 though... However if people have a simple local block on the port, I suppose it can help a little
<emzy> I can think of an easy eclipse attack with configurable ports. Run 10 bitcoind on the same random port and filter the internet connection of the victim to that port.
<stickies-v> and perception matters. It's much easier to claim a network needs to close certain ports for security reasons (without specifically targeting use cases), than to specifically target bitcoin packets (which you have to be specific about)?
<larryruane> basic question... doesn't ability to connect to alternate ports already exist because it's used by the functional tests (regtest)? Does this PR enable such for non-regtest? (seems like it's doing a lot more than that)
<sipa> larryruane: It's not that functionality to connect to custom ports doesn't exist (it has always existed), and for manual connections you can do whatever you like. The change is that this PR stops the *automatic* outgoing connection selection mechanism from *disfavoring* non-8333.
<lightlike> larryruane: the ability was always there (and it is possible to connect to other ports via manual connections) it's just the automatic connections, where we wouldn't connect (although we technically could)
<svav> Someone explain this please - If you don't have a standard port for Bitcoin, isn't this going to make it difficult for the network to function, because no-one knows a standard port that will be used??
<sipa> svav: That's the bootstrap problem, and it's an annoying problem, but we do have some mechanisms for it. It isn't particularly made harder by not having a standard port though.
<lightlike> svav: if you are on a non-standard port, you also advertise your own address with it in addr gossip relay, so others will know to connect to you on that port.
<stickies-v> svav I think another way to look at it is that the IP address is as unknown as the port, so if you know one you should be able to know the other through the same communication?
<sipa> In IPv4 it's kind of possible to literally try to connect to every IP address on a particular port (certain botnets have done that), which would be a... very naive way of bootstrapping that's technically made impossible by using random ports. On the other hand... don't do that.
<lightlike> but this means if you for some reason chose a new random port every second day, you'll likely not get many incoming connections - so that would not be advised
<stickies-v> sipa oh right lightlike did comment that on the PR. Would a straightforward solution then not be to upgrade the seeders to relay ports too? Is there anything technically complicating that?
<lightlike> Before this change, automatic connections to peers listening on non-default ports were discouraged, but not impossible. Under what circumstances would a node still connect to such a peer?
<sipa> stickies-v: That would be very hard, actually, because the DNS system isn't designed for resolving ports, only IPs. But there are alternatives to using DNS in the first place.
<lightlike> glozow: correct! and this behavior is kept the same for the "bad port" list, so if nothing else works for 50 tries, we'll also try a "bad port"
<glozow> so our treatment of "bad ports" is treated how we used to treat non-8333, and non-bad non-8333 and 8333 is treated the same as how we used to treat 8333
<lightlike> it would just be bad if DNS nodes listed IPs that are listening on non-default ports (so that other nodes would try to connect to them on the default port and fail). But I think this is not the case with the current seeder software.
<lightlike> but as mentioned before, the default port is also added to the DNS seeder results we get, to be able to connect to these addresses and save them to addrman
<stickies-v> I'm not sure there's a need for that - it wouldn't really be user friendly to make everyone (including people who don't know what a port is) define which port they want to use?
<svav> Do we know a reason why this PR (and 23306) was felt necessary at this stage? Is it just to make Bitcoin more resilient? Is there any reason to feel default ports make it vulnerable?
<lightlike> next q: The PR introduces a list of “bad ports” that was taken from internet browsers. Do you agree with having a list like this in general? Are there any reasons to deviate from the list used by browsers?
<stickies-v> have we had any/significant amount of reports from people unable to use port 8333 or is that more of a preventative thing? difficult to measure of course, just wondering how big of a role that played in the prioritization
<sipa> And, after realizing how little of a change the previous PR was (the one permitting multiple ports per ip), there was little reason not to go for it
<bitplebpaul92> lightlike the rationale of avoiding ssh ports and other ports where attempted communications might result in a banned IP address makes sense to me
<lightlike> I agree. there is issue https://github.com/bitcoin/bitcoin/issues/24284 with a suggestion to also include ports used by browsers (which are obviously not on the browser's lists) that may make sense
<svav> Re security leak, you can see 8333 means Bitcoin node here, but once you know that, are you then easily able to further compromise the node? I mean is it easy to start reading node traffic?
<michaelfolkson> Not sure how one tries to discourage other protocols from using "your" protocol's port. Other than loudly trying to claim it as your protocol's port
<lightlike> so 80 and 443 may be particularly good choices to run a bitcoin node, because the traffic isn't looked into deeply anyway if everyone uses them?
<stickies-v> we want the randomness to be deterministic, so by passing the same (0, 0) salts the same IP:port should lead to the same hash consistently
<glozow> We always use the same salt so that, if we get the same address again (within the 24hr time slot), we relay it to the same "random" peers, so there's no advantage to sending us the same address twice
<lightlike> stickies-v glozow : exactly! If we'd use a different salt, we'd send a given address to different peers in that 24h window, which is not what we want.
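The point about fixed salts can be illustrated with a small sketch. It rests on stated assumptions rather than the actual implementation: keyed BLAKE2b from Python's `hashlib` stands in for the hash used by `CServiceHash`, and `service_hash`, `relay_targets`, the peer ids, and the simple 24-hour bucketing are hypothetical simplifications. Because the salt is constant, the same IP:port seen twice within one window maps to the same "random" peers.

```python
import hashlib

DAY = 24 * 60 * 60
FIXED_SALT = b"\x00" * 16  # stands in for the constant (0, 0) salts


def service_hash(ip, port, salt=FIXED_SALT):
    """Deterministic hash over IP and port (stand-in, not CServiceHash)."""
    data = f"{ip}:{port}".encode()
    return hashlib.blake2b(data, key=salt, digest_size=8).digest()


def relay_targets(ip, port, peer_ids, now, n=2):
    """Choose n peers to relay an address to, stable within a 24h window."""
    addr_hash = service_hash(ip, port)
    bucket = now // DAY  # simplified time slot

    def score(peer_id):
        msg = addr_hash + bucket.to_bytes(8, "little") + peer_id.to_bytes(8, "little")
        return hashlib.blake2b(msg, digest_size=8).digest()

    return sorted(peer_ids, key=score)[:n]


peers = [1, 2, 3, 4, 5, 6, 7, 8]
t0 = 1000 * DAY  # start of a window
first = relay_targets("203.0.113.5", 8444, peers, t0)
again = relay_targets("203.0.113.5", 8444, peers, t0 + 3600)  # same window
print(first == again)
```

With a fresh random salt on every call, the two selections above would generally differ, so resending the same address could reach additional peers; the constant salt removes that advantage.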