Erlay: bandwidth-efficient transaction relay protocol (p2p)

https://github.com/bitcoin/bitcoin/pull/18261

Host: mzumsande  -  PR author: naumenkogs

The PR branch HEAD was c452e5d at the time of this review club meeting.

In this PR Review Club meeting, we’ll discuss BIP330 and the high-level concepts behind Erlay.

We’ll also look at the first few net processing commits in more detail.

Notes

  • Erlay is a proposal for a new method of transaction relay based on a combination of flooding and reconciliation (the current transaction relay is flooding-only). The idea was presented in a 2019 paper, Bandwidth-Efficient Transaction Relay for Bitcoin, and is specified in BIP330.

  • Reconciliation works asymmetrically in Erlay, depending on the direction of the connection. When we interact with an outbound peer (i.e. we initiated the connection), we are the requester of a reconciliation, and our peer is the responder (vice versa when interacting with an inbound peer).

  • In the simplest scenario, being a requester means that we send a reconciliation request (REQRECONCIL message), receive a sketch in response (SKETCH), and reply with the reconciliation difference (RECONCILDIFF) that we extract by comparing our own sketch with the received one. At this point, both peers know which transactions the other side is missing and can announce them (a toy walk-through of this exchange follows these notes).

  • A sketch is a compact representation of a set of transactions (or rather of their short IDs) whose size depends only on the number of differences it is built to recover, not on the size of the set. Erlay uses an existing set reconciliation algorithm called PinSketch (the short-ID mapping that sketches operate on is illustrated after these notes).

  • Erlay does not completely abandon flooding but uses it much more sparingly. Nodes will still flood certain transactions to a limited number of peers; the goal is that only well-connected publicly reachable nodes flood transactions to other publicly reachable nodes via outbound connections.
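
To make the requester/responder exchange above concrete, here is a toy, self-contained C++ walk-through. It models both reconciliation sets as plain sets of 32-bit short IDs and computes the difference directly; real Erlay never exchanges full sets, but instead encodes them into compact PinSketch sketches, and the struct and helper names below are illustrative rather than taken from the PR.

```cpp
// Toy model of one Erlay reconciliation round (requester side), for intuition
// only. Real Erlay never exchanges full short-ID sets: the responder encodes
// its set into a compact PinSketch sketch and the requester decodes the
// symmetric difference from the combination of both sketches.
#include <cstdint>
#include <iostream>
#include <set>
#include <vector>

using ShortId = uint32_t;  // BIP330 short transaction IDs are 32-bit

struct ReconcilDiff {
    std::vector<ShortId> ask_for;   // transactions we are missing
    std::vector<ShortId> announce;  // transactions our peer is missing
};

// The requester compares the two sets and tells the responder which
// announcements each side still needs (RECONCILDIFF in BIP330 terms).
ReconcilDiff ComputeDiff(const std::set<ShortId>& ours, const std::set<ShortId>& theirs)
{
    ReconcilDiff diff;
    for (ShortId id : theirs) {
        if (!ours.count(id)) diff.ask_for.push_back(id);
    }
    for (ShortId id : ours) {
        if (!theirs.count(id)) diff.announce.push_back(id);
    }
    return diff;
}

int main()
{
    // Reconciliation sets accumulated since the last round (normally the
    // short IDs of transactions each side would otherwise have announced).
    std::set<ShortId> our_set{11, 22, 33, 44};
    std::set<ShortId> peer_set{22, 33, 55};

    // 1. we send REQRECONCIL, 2. the peer answers with SKETCH,
    // 3. we decode the difference and reply with RECONCILDIFF:
    ReconcilDiff diff = ComputeDiff(our_set, peer_set);

    std::cout << "missing locally:";
    for (ShortId id : diff.ask_for) std::cout << ' ' << id;
    std::cout << "\nmissing at peer:";
    for (ShortId id : diff.announce) std::cout << ' ' << id;
    std::cout << '\n';
    // After this, the usual INV/GETDATA/TX flow fetches the actual transactions.
}
```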

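The meeting log below also touches on the short-ID computation ("1 + s mod 0xFFFFFFFF"). The minimal sketch here only illustrates why that final step exists: PinSketch cannot represent the element 0, so the salted 64-bit hash is folded into the range [1, 0xFFFFFFFF]. BIP330 derives the hash with SipHash-2-4 keyed by a per-connection salt that both peers contribute to; the std::hash stand-in below is just to keep the example self-contained.

```cpp
// Minimal illustration of Erlay's short-ID mapping. BIP330 derives s with
// SipHash-2-4 keyed from a salt both peers contributed to; std::hash is used
// here only as a placeholder so the example compiles on its own.
#include <cstdint>
#include <functional>
#include <iostream>
#include <string>

// Fold a salted 64-bit hash into a nonzero 32-bit ID: PinSketch cannot
// represent the element 0, so the result lands in [1, 0xFFFFFFFF].
uint32_t ShortId(uint64_t s)
{
    return static_cast<uint32_t>(1 + (s % 0xFFFFFFFF));
}

int main()
{
    const std::string wtxid_plus_salt = "wtxid-bytes-with-per-connection-salt";
    const uint64_t s = std::hash<std::string>{}(wtxid_plus_salt);  // placeholder hash
    std::cout << "short id: " << ShortId(s) << '\n';
}
```
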
Questions

  1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK?

  2. What advantages does a reconciliation-based approach to transaction relay have over the current flooding method? What are the trade-offs?

  3. Which of the existing message types participating in transaction relay will be sent less often with Erlay? Which ones will be unaffected?

  4. Can you explain why a reconciliation-based approach scales better than flooding (after all, the reconciliation messages also consume regular traffic)?

  5. How do two peers reach agreement about whether to use reconciliation?

  6. If two peers have agreed to use reconciliation, does that mean there will be no flooding on this connection?

  7. In the Limit transaction flooding commit, MAX_OUTBOUND_FLOOD_TO is set to 8. Considering that 8 is also the maximum number of outbound connections participating in transaction relay, why do you think this value was chosen?

  8. Can you think of possible new attack vectors that would specifically apply to Erlay?

Meeting Log

  18:00 <jnewbery> #startmeeting
  18:00 <jnewbery> hi folks! Welcome to PR Review Club. Feel free to say hi to let everyone know you're here.
  18:00 <glozow> supppp
  18:00 <pinheadmz> hi !!
  18:00 <dhruvm> hi
  18:00 <jnewbery> (or not. lurkers are also welcome at this club)
  18:00 <dergoegge> hi!
  18:00 <joelklabo> Hey everyone 👋
  18:00 <sipa> hi
  18:00 <OliP> Hi
  18:00 <lightlike> hi!
  18:00 <emzy> hi
  18:00 <maqusat> hey
  18:00 <michaelfolkson> hi
  18:00 <jnewbery> Today, we're going to talk about Erlay. Notes and questions are here: https://bitcoincore.reviews/18261
  18:00 <ma33> hi
  18:00 <AnthonyRonning> hi
  18:00 <NelsonGaldeman56> Hi!
  18:00 <ajonas> Hi
  18:00 <gleb> hi! Thank you for coming.
  18:00 <jnewbery> Thank you to lightlike for hosting and to gleb for writing the PR!
  18:00 <ariard> hello
  18:01 <jnewbery> is anyone here for the first time?
  18:01 <effexzi> Hey...
  18:01 <ecola> hi
  18:01 <felixweis> hi
  18:01 <rich> hi
  18:01 <dergoegge> first time not lurking
  18:01 <willcl_ark> hi
  18:01 <OliP> First time here!
  18:01 <jonatack> hi (lurking while reviewing 18017 ;)
  18:01 <schmidty> hi
  18:01 <DylanM> Hi first time, lurking mostly
  18:01 <gleb> A lot of new names since I attended last time, great :)
  18:01 <jnewbery> dergoegge: lovely. Thanks for joining us!
  18:01 <amiti> hi
  18:01 <jnewbery> OliP, DylanM: welcome, welcome
  18:01 <fodediop> hi
  18:01 <OliP> jnewbery: Many thanks
  18:02 <jnewbery> couple of reminders. We're all here to learn. Lightlike will guide the discussion but feel free to ask questions at any time
  18:02 <jnewbery> Also, you don't have to ask if you can ask a question. Just ask!
  18:02 <tuition> hi
  18:02 <svav> hi
  18:02 <larryruane_> hi
  18:02 <willcl_ark> great turnout!
  18:02 <jnewbery> And one more reminder before I hand over: I'm always looking for hosts for review clubs. If you think that's something you'd like to have a go at, please message me
  18:02 <jnewbery> ok, enough of me. Over to lightlike
  18:03 <lightlike> Ok, thanks jnewbery - so today we'll talk about Erlay!
  18:03 <lightlike> First, since this a really large PR and there is also a BIP and a paper with lots of information, we won't get deep into too much of the code today.
  18:03 <lightlike> (that would be a possibility for a follow-up)
  18:03 <lightlike> But let's start with first things first - Who had the time to review the PR this week? (y/n)
  18:04 <willcl_ark> y
  18:04 <OliP> y
  18:04 <pinheadmz> y
  18:04 <ecola> n
  18:04 <dergoegge> ~y
  18:04 <rich> n
  18:04 <sergei-t> n
  18:04 <svav> y
  18:04 <AnthonyRonning> n
  18:04 <DylanM> n
  18:04 <jnewbery> y (just the four commits)
  18:04 <fodediop> n
  18:04 <ariard> y
  18:04 <emzy> y (without the code)
  18:04 <michaelfolkson> y
  18:04 <dhruvm> n(ot yet)
  18:04 <glozow> looked at bip, 0.2y for the PR
  18:04 <joelklabo> n (read about erlay)
  18:05 <lightlike> Great! So what is your initial impression? (Concept ACK or NACK)
  18:05 <emzy> Concept ACK
  18:05 <rich> Concept ACK
  18:05 <dergoegge> Concept ACK
  18:05 <pinheadmz> concept ACK
  18:05 <joelklabo> concept ACK
  18:05 <lightlike> seems still a bit early for code ACKs :-)
  18:05 <OliP> Concept ACK
  18:05 <larryruane_> n
  18:05 <fodediop> Concept ACK
  18:05 <dhruvm> Concept ACK - the gains seem incredible!
  18:05 <AnthonyRonning> concept ack
  18:05 <sergei-t> concept ack (based on prior knowledge about erlay)
  18:06 <svav> Concept ack
  18:06 <lightlike> Ok - that's a lot of concept ACKS!
  18:06 <lightlike> So, let's dive into the questions:
  18:07 <lightlike> What advantages does a reconciliation-based approach to transaction relay have over the current flooding method. What are the trade-offs?
  18:07 <emzy> Less bandwidth usage at the cost of slower propagation of TX through the network.
  18:07 <dhruvm> a reconciliation-based approach minimizes redundant INV messages sent to nodes which already have the transaction. 224/524 ~ 42.7% of all bytes exchanged for tx relay are redundant if flooding is used in contrast.
  18:07 <ecola> savings in bandwith
  18:07 <pinheadmz> less bandwidth required
  18:07 <AnthonyRonning> Significantly reduces bandwidth and allows for better connectivity. It increases the time it takes to propagate a tx to all nodes.
  18:07 <dergoegge> less bandwidth usage, increased connectivity and privacy benefits vs. slightly slower tx propagation.
  18:07 <pinheadmz> obfuscation of the network graph perhaps ?
  18:07 <tuition_> lower transaction bandwidth use at the cost of marginal increases in txn propagation delays
  18:08 <fodediop> It allows for susbtantial bandwidth savings on trasaction propagation
  18:08 <sergei-t> Con: more complex protocol logic, the notion of "well-connected publicly reachable" nodes (centralization risks?)
  18:08 <dhruvm> also the growth in bandwidth is sub-linear to number of peer connections per node.
  18:08 <pinheadmz> trade offs might be the extra round trips if txs are missing
  18:08 <willcl_ark> It will allow nodes to make more connections due to lowering bandwidth usage, and therefore make eclipse attacks more costly/difficult
  18:08 <joelklabo> less bandwidth, more code complexity
  18:09 <sergei-t> Privacy implications: do peers now know more about which txs I have and don't have?
  18:09 <sipa> pinheadmz: that's already accounted for in the model
  18:09 <tuition_> willcl_ark nice point about decreasing bandwidth constraints allowing more peering to mitigate eclipse attacks
  18:09 <dhruvm> sergei-t: peers already knew that with flooding i think
  18:09 <ajonas> +1 willcl_ark and that plays nicely with dhruvm's comment that things only gets better with more conns
  18:10 <pinheadmz> tradeoff then might be computational resources
  18:10 <gleb> sergei-t: Erlay attempts to leak no more than already leaks through flooding. In practice, reviewers should confirm the defences indeed work.
  18:10 <pinheadmz> extra math per-peer
  18:10 <lightlike> lots of great answers here! I think especially the scaling for more connections are important.
  18:10 <sipa> tuition_: yeah, that's the big one; the savings are modest for current average connection counts, but with increased connections the benefits grow
  18:11 <rich> seems to be some implication that tx origination could be obscured since it's not pushed out preemptively via inv (amiti?)
  18:12 <lightlike> Not sure about this, but at the current connectivity (8 OB), probably a similar bandwidht saving could be gained by only using short TX Ids with Salting (as is part of Erlay) but no reconciliation
  18:12 <lightlike> sipa, gleb: would you agree to this?
  18:12 <amiti> rich: good question, I don't know the answer :)
  18:13 <sipa> lightlike: yeah, with salting
  18:13 ℹ Dulce is now known as Guest73056
  18:13 <gleb> lightlike sipa: but then it's hard to deduplicate.
  18:13 <dhruvm> rich: origination seems to be done with reconciliation and not flooding, so originator gets privacy iiuc
  18:13 <gleb> I would say just short ids are not so easy to get done right as well
  18:14 <sipa> gleb: right
  18:14 <lightlike> ok, moving on: Which of the existing message types participating in transaction relay will be sent less often with Erlay? Which ones will be unaffected?
  18:14 <pinheadmz> a lot less INV
  18:14 <dhruvm> INV will be sent less often. TX messages will be unaffected.
  18:14 <emzy> INVs will be send less often. The missing TX itself still needs to be send.
  18:14 <sipa> dhruvm: both recon-based announcement and inv announcement leak which txids the sender has; the point is that the recon-based ones are less frequent (on a per peer basis)
  18:15 <pinheadmz> and a little less GETDATA as well i presume?
  18:15 <dergoegge> unaffected: tx, getdata less: inv
  18:15 <dhruvm> sipa: ah, that makes sense.
  18:15 <lightlike> pinheadmz: why would there be less GETDATA?
  18:15 <gleb> pinheadmz: why do you think less getdata? I might be forgetting somethign at this point :)
  18:15 <dhruvm> pinheadmz: there should be less GETDATA if the missing txes are known at set-difference time?
  18:16 <pinheadmz> could be wrong but curent flood goes INV-> then GETDATA<- then TX-> right?
  18:16 <gleb> number of getdata = number of txs, logically, right?
  18:16 <felixweis> i wonder if sending a sketch of transactions that are dependent and required together to meet minrelayfee requirements of a nodes mempool can help with package relay?
  18:17 <willcl_ark> sipa: Does it not make some timing analysis attacks to detect origination more difficult or, less accurate?
  18:17 <gleb> felixweis: Hard to talk about optimizations while we have no basic design for package relay...
  18:17 <sipa> pinheadmz: every node should GETDATA every transaction exactly once
  18:17 <sipa> pinheadmz: both before and after erlay
  18:17 <dhruvm> pinheadmz: I think that's right and am wondering why we'd need as many GETDATA as well. At set difference, the other node could just send the TX messages.
  18:17 <lightlike> I think it's important thet the current flow with INV->GETDATA->TX will be unchanged - it's just that the initial INV is sent only for those transaction we are reasonably sure our peer actually needs.
  18:17 <glozow> right now I think you'd only request 1 getdata at a time, unless you got it by txid and there's different wtxids you don't know about
  18:17 <pinheadmz> sipa a ha right
  18:18 <pinheadmz> but do we still use GETDATA to reconcile if a tx is missing?
  18:18 <pinheadmz> i thought bip330 introduced other messgaes
  18:18 <glozow> yeah, gettx
  18:18 <gleb> pinheadmz: you're probably referring to an older version of the bip
  18:18 <glozow> oh, i guess if you're sending gettxes then you're sending fewer getdatas
  18:19 <pinheadmz> gleb https://github.com/naumenkogs/bips/blob/bip_0330_updates/bip-0330.mediawiki ?
  18:19 <gleb> pinheadmz: yeah, no gettx there, right?
  18:19 <gleb> The latest version is linked in the PR.
  18:19 <rich> willcl_ark: that is also how I understood "the point is that the recon-based ones are less frequent (on a per peer basis)"
  18:19 <pinheadmz> oh I see `reconcildiff` triggers the usual inv->getdata->tx
  18:20 <gleb> pinheadmz: exactly!
  18:20 <lightlike> pinheadmz: yes!
  18:20 <pinheadmz> i guess i thought it was more like getblocktxn from compact blocks
  18:20 <ariard> felixweis: thought about it either adding another snapshot for package ids only or eating the bullet of wasted bandwidth in case of redundancy
  18:20 <gleb> dhruvm: in short, sketches operate over short ids, but we can't request by short ids for reasons.
  18:20 <felixweis> whats the rationale for the last point in Short ID computation? 1 + s mod 0xFFFFFFFF. why not use s already?
  18:21 <sipa> felixweis: the pinsketch algorithm needs item IDs that are nonzero
  18:21 <lightlike> next question:
  18:21 <dhruvm> gleb: i see. are the reasons documented somewhere?
  18:21 <felixweis> ah! makes sense
  18:21 <ma33> Is their an analysis of when Erlay becomes too computationally expensive? I remember seeing a comparison with another set-recon scheme in the paper with up to 50 differences. What if the number of differences increases to, say, 500?
  18:22 <lightlike> Can you explain why a reconciliation-based approach scales better than flooding (after all, the reconciliation messages also consume regular traffic)?
  18:22 <felixweis> re-reading the next sentence would have helped already lol
  18:23 <pinheadmz> lots of redundancy sending 255 byte transactions to all nodes all the time
  18:23 <rich> It doesn't if you only have one peer, but with more peers sending the same tx inv, there is deduplication savings.
  18:23 <lightlike> ma33: I think in the paper there is a suggestion to make larger set differences computationally viable via bisection - which is not needed for bitcoin though becuse typical differences are smaller.
  18:23 <ariard> dhruvm: 32-bit short ids means you can have tx-relay jamming based on malicious collisions
  18:23 <sipa> pinheadmz: again, the transactions are only sent once already
  18:23 <willcl_ark> instead of receiving every tx from every peer, you just receive it from one
  18:23 <gleb> dhruvm: that's one of the design choices we made while working on the implementation, I think you get it naturally when you review the protocol closely. Not sure all of them should be documented. But we can discuss this as well
  18:23 <sipa> willcl_ark: transactions are only sent once already
  18:24 <pinheadmz> sipa we dont re-broadcast entire TX to every peer besides the one who sent it to us ?
  18:24 <pinheadmz> (currently, before PR)
  18:24 <dhruvm> ariard: gleb: thanks.
  18:24 <gleb> ma33: yeah, check graphs in the minisketch repo. It has nice data on the time it takes to compute
  18:24 <sipa> pinheadmz: we _announce_ it to every other peer, by txid; not the full tx, obviously
  18:24 <pinheadmz> ahaahahaha yes
  18:24 <sipa> the peer requests the actual transaction from one node that announced it
  18:24 <willcl_ark> ah
  18:24 <pinheadmz> so what were saving is 32 byte txid > 32 bit short txid
  18:25 <sipa> nope, far more than that
  18:25 <sipa> because erlay doesn't send short ids; it sends sketches
  18:25 <pinheadmz> oh right
  18:25 <sipa> and the size of sketches scales with the bandwidth of actual differences
  18:25 <rich> I wonder what the crossover point for number of peers where reconciliation starts to pay off?
  18:25 ⚡ pinheadmz 💡
  18:26 <rich> I suppose it also has to due with how interconnected your peers are too.
  18:26 <lightlike> yes, that is the core of it. If we just sent short txids over each link as part of the recon, the scaling wouldn't be better than now.
  18:26 <sipa> rich: yeah, and on the exact parametrization; i believe gleb experimented with various configurations
  18:26 <gleb> yeah, the data is in the paper.
  18:26 <maqusat> only diff of tx ids sent instead of the whole set
  18:27 <gleb> rich: I think 9 peers actually. Erlay would cost 8x, short ids would pay 9x, current flooding is 32x (hence full txid)
  18:28 <gleb> or not.... sorry, disregard for now.
  18:28 <rich> gleb: do we have some way to model how interconnected typical peers are?
  18:28 <lightlike> sipa, gleb: how would Erlay deal with it if someone dumped thousands of transactions at the same time into the network (e.g. for utxo consolidation)? would sketches be able to deal with that?
  18:29 <ma33> rich TxProbe paper had some stats on the Bitcoin graph, albeit only on testnet
  18:29 <gleb> rich: my simulator generated a topology similar to bitcoin today: 10k reachable nodes + 50k non-reachable nodes. And then random distribution. That's it
  18:29 <rich> gleb: +1 seems reasonable
  18:30 <gleb> lightlike: i haven't tried this scenario. Worst case we waste a limited amount of bandwidth and fallback to flooding.
  18:30 <dhruvm> lightlike: while that might make the sketch diffs larger it should still be atleast as good as flooding right (and then some due to short txids)?
  18:30 <sipa> lightlike: if you dump 1000s of transactions on any given 0.21 node, the ones above 5000 will just be ignored; the rest would be requested with some delay... then that peer would start propagating them, but at a limited rate
  18:31 <ariard> rich: https://github.com/bitcoin/bitcoin/pull/15759#issuecomment-480868516 see discussions about some interconnection models
  18:31 <sipa> erlay doesn't really come into play much, i think
  18:31 <rich> ariard: thx!
  18:31 <lightlike> ok, thanks!
  18:31 <dergoegge> why is the reconil. frequency fixed at 1 every 2 seconds? could it make sense to have the frequency change based on how many new txs the node has seen in the last x hours/minutes? (i.e. reconcil more frequently if there are more txs coming through)
  18:32 <gleb> dergoegge: that's possibly, I kind of assumed a "normal" operation of the network in the window of 3..14 txs per second appearing randomly in the network.
  18:33 <lightlike> moving on with the q (getting into the actual commits):
  18:33 <lightlike> How do two peers reach agreement about whether to use reconciliation?
  18:33 <dhruvm> When a node receives a WTXIDRELAY message from a peer that supports tx-relay, it announces a SENDRECON message and requests to act as a reconciliation-requestor for outbound connections and a reconciliation-responder for inbound connections.
  18:33 <sipa> gleb: just because the initiating node hasn't received many transactions doesn't mean there aren't many; it may just not have heard about them>
  18:34 <glozow> do i have this correct? :
  18:34 <glozow> both send `SENDRECON`, both must not have txrelay off, both must use wtxid relay, peer who initiates outbound must have requestor=on/responder=off, peer who receives inbound connection must have requestor=off/responder=on
  18:34 <sipa> dergoegge: more generally, i think it's a bit scary to make the frequency of propagation depend on seen transactions (e.g. i could imagine some scary fix point where all nodes end up deciding there are no transactions, and stop reconciling entirely...)
  18:35 <lightlike> dhruvm, glozow: exactly! both nodes send a SENDRECON message.
  18:35 <ariard> glozow: I think they should be mauve peers to be correct
  18:35 <glozow> ariard: mauve?
  18:35 <lightlike> so everything is happening during VERSION negotiation and fixed for the lifetime of the connection - no later changes.
  18:35 <pinheadmz> as part of the version handshake, after WTXID is set
  18:36 <ajonas> Well we would fall back to flooding on a conn that supports reconciliation if we don’t have enough outbound flood conns, right?
  18:36 <jnewbery> Is there a reason that both parties add salt?
  18:37 <glozow> so that nobody can get nodes to use duplicate salts?
  18:37 <pinheadmz> jnewbery so attackers cant try to find coliding short ids
  18:37 <pinheadmz> or if they do itll only affect one pair of nodes ;-)
  18:37 <dergoegge> gleb, sipa: i see. although there could be a limit say "reconciliate at least every ~n seconds" to avoid stopping entirely
  18:38 <sipa> dergoegge: yeah, not saying that doesn't make sense, just that it does bring in some possibly scary considerations
  18:38 <gleb> two salts may be redundant actually, what if the salt was always picked by the "connection initiator" side randomly per peer, and they use that?
  18:39 <gleb> what do you think guys? sipa?
  18:39 <sipa> gleb: i think that's probably ok, but i find it easier to reason about if both contribute
  18:39 <gleb> i see.
  18:39 <dergoegge> sipa thats true, thanks for the answer
  18:39 <ma33> If I understand it correctly, a node can request a sketch from only a maximum of 8 of its (outbound) peers, correct? Or do peers in blocks-only-mode also participate in recon?
  18:39 <jnewbery> I don't understand the attack. I could make more than one of my peers have the same salt?
  18:39 <gleb> yeah, it's indeed "safer" when both contribute
  18:40 <jnewbery> but I can't force the same salt transitively to other connections
  18:40 <sipa> jnewbery: the biggest reason for the salt is that not all connections use the same salt; if there were, an attacker could easily grind transactions that propagate very badly over all connections
  18:40 <jnewbery> certainly I understand the reason for a salt per connection, just not the need for both the requestor and responder to contribute
  18:41 <sipa> so if both contribute entropy, you can just say "assume not both parties are the grinding attacker", which is obviously true - the attacker doesn't need to relay to themselves
  18:41 <gleb> ma33: not blocks-only, no, but all tx-relaying peers can be requested for a sketch (if they indicated support). Yes, this is *currently* limited to 8.
  18:41 <dhruvm> ma33: recon seems like a tx-relay thing, not sure blocks-only nodes would get the network characteristics they desire with recon. but erlay perhaps reduces the need to be blocks-only?
  18:41 <lightlike> ma33: It would request sketches from all of its peers, its just that the current limit for outbounds is 8.
  18:41 <sipa> blocks-only connections definitely should not do erlay; it reveals information about their mempool
  18:42 <lightlike> ajonas: your question wrt fallback to flooding kind of touches my next question
  18:42 <jnewbery> the party that sends the SENDRECON message second can grind the salt value
  18:42 <lightlike> If two peers have agreed to use reconciliation, does that mean there will be no flooding on this connection?
  18:42 <rich> if I understand correctly, blocks-only would still be lower bandwidth than erlay, but with the limitation of no mempool
  18:42 <dhruvm> I am not certain but it seems flooding is between listening nodes. Reconciliation is between all pairs of peers?
  18:43 <rich> (and blocks come too infrequently to worry about inv message deduplication for them)
  18:43 <sipa> jnewbery: yes, but they can't grind it so that the same combined salt is used on all their connections
  18:44 <sdaftuar> sipa: does it matter if a peer chooses to make relaying with themselves not work?
  18:44 <sdaftuar> ie, they could just not relay transactions anyway
  18:44 <gleb> dhruvm: what you're referring to is an *intuition* behind how erlay achieves it's properties. In practice, the relay policy is a bit more specific. This is because we have no notion of reachable node in the codebase (a node can't for sure know if it's reachable or not )
  18:44 <dergoegge> There will be at least 8 flooding connections even if they also support reconciliation. If the reconciliation set is full we also fall back to flooding i think.
  18:44 <pinheadmz> and what is the worst case scenario for a succesful shortid collision? just one peer doesnt get a tx message? theyll see it in a block at least
  18:44 <lightlike> rich: yes, blocks-only is strictly lower bandwidth - but obviously tx relay is necessary for the network, so you dont contribute that by being blocks-only.
  18:44 <ariard> dhruvm: beyond the bandwidth saving, blocks-only are also (hopefully!) hidden peers, making it harder to partition a victim node
  18:44 <sipa> sdaftuar: generally in cryptographic analysis you always assume that at least one of the involved parties is honest; if everyone is an attacker, who cares?
  18:44 <sipa> i think the same applies here
  18:44 <ma33> @gleb
  18:45 <ma33> gleb got it, thanks
  18:45 <ma33> dhruvm nice follow up question. any ideas?
  18:45 <sdaftuar> sipa: not sure i follow -- the idea is that one of the peers can break relay on their own link. that doesn't generalize to any other connections on the network that don't involve them
  18:45 <dhruvm> gleb: ah, i should have read more than the paper :)
  18:45 <sdaftuar> i'm just saying they could do that already by not relaying to begin with
  18:46 <sipa> sdaftuar: ah yes
  18:46 <dhruvm> ariard: not sure i follow. how are block-only peers hidden?
  18:46 <sipa> dhruvm: connectivity graph is much harder to infer for blocks-only connections (because of no addr and no tx relay)
  18:46 <sdaftuar> i don't know that it matters much either way, no downside either to both sides contributing entropy
  18:47 <lightlike> dergoegge: yes, but those connections won't be flooding only. We would flood TO up to 8 peers if we are a reachable note, but still get txes FROM these peers via reconciliation
  18:47 <jnewbery> sdaftuar: yes, this is what I was trying to ask earlier. The worst I could do would be have the same salt for all my peers, which isn't attacking anyone else
  18:47 <sdaftuar> jnewbery: yeah that makes sense to me
  18:47 <gleb> jnewbery: this is my understanding as well.
  18:48 <sipa> right, that's a fair point
  18:48 <dhruvm> sipa: i see. so two incentives to be blocks-only: protection from graph inference and bandwidth reduction.
  18:48 <dergoegge> lightlike: ah yes, thanks for clarifying
  18:48 <gleb> In some older design, we needed 2 salts, but not really anymore
  18:48 <sipa> dhruvm: indeed
  18:48 <sdaftuar> one thing that occurred to me is if having a bad salt incurred cpu cost on your counterparty
  18:48 <sdaftuar> in which case that might be a motivation to just avoid the influence of an attacker in that regard, but no idea if that might be true
  18:49 <lightlike> next q:
  18:49 <lightlike> In the Limit transaction flooding commit, MAX_OUTBOUND_FLOOD_TO is set to 8. Considering that 8 is also the maximum number of outbound connections participating in transaction relay, why do you think this value was chosen?
  18:49 <lightlike> (already partly answered I think)
  18:49 <ariard> dhruvm: if you're picked up as a outbound block-relay-only the protection for graph inference is also benefiting your peer
  18:50 <ariard> it's a shared incentive
  18:50 <sipa> sdaftuar: i believe siphash does assume that keys are random; if the exact key can be chosen exactly, there may be attacks beyond the ability to grind colliding transactions (e.g. i could imagine that some pathological keys exist that result in strictly more collisions that average keys); the actual key is computed with SHA256, but with too little entropy, that might be grindable too
  18:50 <dhruvm> ma33: as sipa and ariard mentioned, the other incentive for block-only is security related.
  18:50 <dhruvm> ariard: yeah that makes sense because the inference ability is on the link.
  18:51 <gleb> I think this q is a hard one. MAX_OUTBOUND_FLOOD_TO=8 is based on my simultions: it provides low enough latency while not flooding too much (assuming we have more max conns)
  18:52 <gleb> (low latency achieved in a combination of "diffusion" interval we wait before flooding out a tx, which was another parameter)
  18:53 <lightlike> gleb: Ok - I thought part of the motivation was to not flood more than right now when, as a next step, the number ob MAX outbound connections is adjusted upwards.
  18:53 <dhruvm> gleb: so it was the empirical sweet-spot between all-flooding(bandwidth-intensive) and all-reconciliation(latency-intensive)?
  18:53 <gleb> lightlike: yeah, that's a way to think about it.
  18:53 <rich> gleb: so can we think about it as the default outbound relay connections are added to the flood connections?
  18:53 <gleb> dhruvm: yeah, how often are reconciliations vs how often we flood
  18:54 <lightlike> Final question: Can you think of possible new attack vectors that would specifically apply to Erlay?
  18:54 <gleb> rich: sorry, i don't understand your question
  18:54 <rich> gleb: nevermind
  18:55 <sdaftuar> lightlike: i think that's a great question that i hope reviewers give a lot of thought to!
  18:55 <rich> perhaps new attack vectors related to latency if enough nodes reduce/exchange flood connections and you can somehow synchronize when reconciliations happen
  18:56 <glozow> fingerprinting-type attacks? (which is why there are delays in responses)?
  18:56 <tuition_> /me wonders about any sort of race conditions
  18:56 <lightlike> gleb: in your simulations, did you simulate the mixed scenario of Erlay and legacy nodes? I am a bit concerned that, since flooding is faster, TX relay might be centralised towards the legacy nodes, and the new Erlay nodes only get the "leftovers" and don't really have anything to reconcile.
  18:56 <gleb> glozow: yeah, I only accounted for a "first-spy estimator", and also my general understanding of spy heuristics. Def a lot of room for research
  18:56 <ariard> targeting the limited size of the local reconcliation set?
  18:57 <ariard> broadcasting hard to compute sketch?
  18:57 <dhruvm> Reducing redundancy can potentially open up tx censorship, but I can't actually think of a specific attack.
  18:57 <glozow> what are the upper bounds of how hard it could be to compute a sketch?
  18:57 <gleb> lightlike: I think no, but that would mean erlay nodes just won't get the benefit in early days? That's possible.
  18:57 <willcl_ark> what happens if the flood nodes censor certain transactions
  18:58 <lightlike> gleb: it might also mean more bandwidth for legacy nodes.
  18:58 <gleb> willcl_ark: every node gets every transaction anyway. Censorship can happen the same way it can happen today (if maaany reachable nodes censor a tx)
  18:58 <ariard> willcl_ark: you damage their latency but they should propagate through reconciliation?
  18:58 <tuition_> Do we expect miners will run with erlay and non erlay nodes? it seems they'll want to be maximally aware of all txns (hence non erlay) but will also want to guard against eclipse attacks (and maybe run block only hence no erlay)
  18:58 <gleb> lightlike: need to think about this one
  18:58 <rich> I guess you are creating/storing a sketch and holding onto it for some response, you'd want to be careful someone couldn't blow up your memory.
  18:58 <lightlike> becasue they send all the TXes now before Erlay nodes get a chance
  18:59 <gleb> tuition: if they run reachable nodes, even erlay-enabled, their latency is gonna be lower than what we have today!
  18:59 <gleb> because in erlay, flooding artificial delays are reduced.
  19:00 <ariard> lightlike: depend if you assume that reachable nodes are going to be deploy edfaster than non-listening ones?
  19:00 <lightlike> ok, great discussion, time is up already!
  19:00 <lightlike> !endmeeting
  19:00 <glozow> thank you lightlike!
  19:00 <willcl_ark> thanks lightlike, very interesting!
  19:00 <ajonas> thanks lightlike
  19:00 <ecola> thank you lightlike, very interesting !
  19:00 <dhruvm> Thanks lightlike, gleb, sipa! This was 🔥!
  19:00 <rich> thanks lightlike!
  19:00 <dergoegge> thank you, this was fun!
  19:00 <maqusat> thank you!
  19:01 <shafiunmiraz0> Thank you lightlike
  19:01 <jonatack> thanks gleb!
  19:01 <OliP> Thank you!
  19:01 <tuition_> thanks lightlike gleb sipa
  19:01 <jonatack> thanks lightlike!
  19:01 <gleb> I hope you guys stick around for the PR until it's done :)
  19:01 <felixweis> thanks lightlike!
  19:01 <fodediop> Thank you lightlike 🙏🏿
  19:01 <effexzi> Thanks.
  19:01 <jnewbery> thank you lightlike!