Use legacy relaying to download blocks in blocks-only mode (p2p)

https://github.com/bitcoin/bitcoin/pull/22340

Host: dergoegge  -  PR author: dergoegge

The PR branch HEAD was 0c2f9346024cfca29d5f0880cb4471961ffd112b at the time of this review club meeting.

Notes

  • After a block is mined it is broadcast to the p2p network where it will eventually be relayed to all nodes on the network. There are two methods available for relaying blocks: legacy relay and compact block relay.

    • Legacy Relay: A node participating in legacy relaying will always send or request entire blocks. For nodes that maintain a mempool this is quite bandwidth inefficient, since they probably already have most of the transactions from a new block in their mempool.

    • Compact Block Relay: Compact block relay is specified in BIP 152. The goal is to address the bandwidth inefficiencies of legacy relaying by only relaying the transactions of a new block that the requesting peer has not yet seen. Check out this Compact Blocks FAQ for bechmarks and more info.

  • Bitcoin Core 0.12 introduced a -blocksonly setting that can reduce a node’s bandwidth usage by 88%. The reduction is achieved by not participating in transaction relay. For more info check out this post on blocksonly mode by Gregory Maxwell. Blocksonly nodes currently use compact block relaying to download blocks even though they don’t maintain a full mempool.

  • This PR makes blocksonly nodes use legacy relaying to download new blocks.

Questions

  1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK?

  2. What is the sequence of messages used in legacy and compact block relaying?

  3. Why does compact block relay waste bandwidth for blocksonly nodes during block download? How much bandwidth is waisted?

  4. Can and should a blocksonly node still serve compact blocks?

  5. What is PeerManagerImpl::m_ignore_incoming_txs? Where and how does this PR use it to achieve its goals?

  6. What do you think of Gleb’s comments on the usage of sendrawtransaction? Can you think of other exceptions in which a blocksonly node would still want to download compact blocks?

Meeting Log

  110:00 <dergoegge> #startmeeting
  210:00 <amiti> hi
  310:00 <dergoegge> Hi everyone! Welcome to this week's PR Review Club!
  410:00 <willcl_ark> hi
  510:00 <glozow> hi
  610:00 <raj> hello
  710:00 <BlockHead> Whats up
  810:00 <jnewbery> hi
  910:00 <dergoegge> Feel to say hi to let people know you are here (lurkers are also welcome)
 1010:00 <JanB> hi
 1110:00 <svav> Hi All
 1210:00 <murch1> Hi
 1310:00 <dergoegge> Anyone here for the first time?
 1410:00 <merkle_noob[m]> Hi everyone...
 1510:00 <larryruane> hi
 1610:00 <schmidty> hi
 1710:00 <b10c> hi
 1810:00 <dergoegge> today we are looking at #22340 - Use legacy relaying to download blocks in blocks-only mode
 1910:00 <lightlike> hi
 2010:01 <JanB> firstimer , will be watching only for this time
 2110:01 <dergoegge> notes and questions are in the usual place: https://bitcoincore.reviews/22340
 2210:01 <theStack> hi
 2310:01 <jnewbery> JanB: welcome :)
 2410:01 <glozow> welcome JanB :D
 2510:01 <BlockHead> also my 1st time
 2610:01 <dergoegge> JanB: welcome!
 2710:01 <dergoegge> we have a couple of prepared questions but feel free to jump in at any point if you have any questions or points you want to add
 2810:01 <jnewbery> BlockHead: Welcome!
 2910:01 <Azorcode> Hello Everyone
 3010:01 <JanB> tnx :)
 3110:02 <dergoegge> Ok lets get started, Who had a chance to review the PR? Concept ACK, approach ACK, tested ACK, or NACK?
 3210:02 <murchandamus> n
 3310:02 <b10c> n
 3410:02 <amiti> started reviewing, concept ACK!
 3510:02 <dergoegge> BlockHead: Welcome!
 3610:02 <raj> initial pass. Concept ACK..
 3710:02 <theStack> y, concept ACK
 3810:02 <larryruane> review y, tested n, ACK
 3910:02 <svav> n
 4010:02 <glozow> concept ACK, needs a functional test in p2p_compactblocks_hb.py imo
 4110:02 <murchandamus> Sounds like a concept ack though
 4210:02 <lightlike> concept ACK
 4310:02 <amiti> yeah agree
 4410:02 <jnewbery> reviewed && ACKed
 4510:02 <raj> glozow, +1..
 4610:02 <willcl_ark> ACK from me
 4710:03 <amiti> (agree with glozow)
 4810:03 <dergoegge> glozow: good point, i agree
 4910:03 <dergoegge> First question: What is the sequence of messages used in legacy and compact block relaying?
 5010:03 <willcl_ark> Yeah I also thought a test would be nice, to avoid a future change inadvertantly reverting this nice win :)
 5110:04 <amiti> I started tinkering with the functional test, but I'm getting surprising results. I thought one difference would be a blocksonly node sending sendcmpct with 0 vs 1, but that's not what I'm getting in the tests 🤔
 5210:04 <amiti> but maybe we can get to that later :)
 5310:05 <dergoegge> amiti: interesting, we can get to that later
 5410:05 <larryruane> for the sequence, there's a really nice diagram in the PR description
 5510:06 <BlockHead> larryruane yeah i think that sequence diagram is taken from BIP 152 doc
 5610:06 <glozow> both would do headers first,
 5710:06 <glozow> low-bandwidth compact block would request GETDATA(CMPCT_BLOCK) and get a compact block and then do GETBLOCKTXN?
 5810:06 <glozow> legacy would do a GETDATA(BLOCK) and then a BLOCK
 5910:06 <glozow> high-bandwdith compact block sends a compact block straight away, with coinbase prefilled?
 6010:07 <dergoegge> glozow: yes, i think the coinbase is prefilled
 6110:07 <jnewbery> larryruane: the sequence diagram is from BIP152: https://github.com/bitcoin/bips/blob/master/bip-0152.mediawiki
 6210:08 <glozow> headers first for low bandwidth compact block and for legacy*?
 6310:08 <dergoegge> glozow: i think both inv and headers is possible but unsure
 6410:08 <larryruane> i did have a question about case C, low bandwidth relaying ... since node B is sending a getdata(CMPCT) message, why is it necessary for that node to announce ahead of time that it wants to use this mode (the sendcmpct(0) message?
 6510:09 <glozow> larryruane: i think you'd need a version of bitcoin client that understands compact block p2p messages
 6610:09 <larryruane> so for example, let's say node B doesn't send the sendcmpct(0) message ... then later sends a getdata(CMPCT)? is that just a protocol error?
 6710:10 <dergoegge> larryruane: yes the peer would not respond with a cmpctblock message
 6810:10 <lightlike> GETBLOCKTXN is not always sent - if we have all the txes in our mempool, would it be skipped?
 6910:10 <murchandamus> IIRC, only really old nodes would announce blocks with INV messages, any newer nodes would always announce it with the header
 7010:10 <larryruane> glozow: okay, but if it's sending a getdata(CMPCT), doesn't that indicate that it understands compact block p2p messages?
 7110:10 <BlockHead> glozow it looks like headers/inv are sent via legacy and low band. and not highband.
 7210:11 <dergoegge> lightlike: i think it is sent and it requests all the txs that are missing
 7310:11 <theStack> ad legacy relaying diagram: what does it depend on whether a peer is notified via headers or via inv? (according to the protocol documentation, headers is only sent in response to getheaders: https://en.bitcoin.it/wiki/Protocol_documentation#headers)
 7410:11 <jnewbery> dergoegge: yes, Bitcoin Core will only prefill the coinbase. There's a TODO in the code to make it prefill transactions that it didn't have in its mempool as well: https://github.com/bitcoin/bitcoin/blob/4f1a75b1aa9402e62bc2ed3e0296e4fba81254e4/src/blockencodings.cpp#L23-L28
 7510:12 <glozow> BlockHead: ye!
 7610:12 <lightlike> dergoegge: yes, but what if there are no txs missing (because we have them all in our mempool)?
 7710:12 <glozow> lightlike: then we don't send any GETBLOCKTXN
 7810:12 <jnewbery> murchandamus: we'll announce blocks using INV to any peers that don't send us a `sendheaders` during version handshake
 7910:12 <murchandamus> jnewbery: Okay, thanks
 8010:13 <dergoegge> lightlike: oh right, i am not sure in that case but i guess it could be omitted
 8110:13 <glozow> lightlike: also, could get them from `vExtraTxnForCompact`
 8210:13 <dergoegge> Ok next question: Why does compact block relay waste bandwidth for blocksonly nodes during block download? How much bandwidth is wasted?
 8310:14 <lightlike> we can also revert to inv-mode - at least, this seems to happen sometimes in the functional tests intermittently, when the CI has strange delays for some reason
 8410:14 <glozow> larryruane: I think we usually disconnect a peer that sends a message we don't expect? not always tho, would need to look at what BIP152 says
 8510:14 <jnewbery> lightlike: we only send a `getblocktxn` if we weren't able to reconstruct the block from our mempool/extra txns: https://github.com/bitcoin/bitcoin/blob/4f1a75b1aa9402e62bc2ed3e0296e4fba81254e4/src/net_processing.cpp#L3522-L3535
 8610:15 <larryruane> dergoegge: is it because the node will need to request a bunch of tx that we would have gotten with the full block?
 8710:15 <BlockHead> dergoegge block only nodes don't have a mempool, so it always needs all the detail about new blocks
 8810:16 <dergoegge> larryuane: yes the requesting of the txs adds a bit of overhead but there is also another message that causes some waste
 8910:17 <jnewbery> larryruane: I may be mistaken, but it looks like we'll respond to a getdata(cmpctblock) even if the peer hasn't sent us a sendcmpct message: https://github.com/bitcoin/bitcoin/blob/4f1a75b1aa9402e62bc2ed3e0296e4fba81254e4/src/net_processing.cpp#L1849-L1866
 9010:18 <dergoegge> BlockHead: yes blocksonly nodes don't have a mempool
 9110:18 <lightlike> i think most of the overhead is them sending us all the short-ids for the txes, when we'd request all of them anyway. that is unnecessary
 9210:18 <dergoegge> lightlike: exactly
 9310:19 <dergoegge> the unnecessary shortids are in the cmpctblock and getblocktxn message
 9410:19 <jnewbery> larryruane: I think you can think of a sendcmpct(0, version) to mean "I can provide compact blocks" and a sendcmpct(1, version) to mean "I want you to use HB compact blocks to announce new blocks"
 9510:20 <dergoegge> one thing to note here is that bandwidth is only wasted when downloading blocks
 9610:20 <dergoegge> which ties into the next question: Can and should a blocksonly node still serve compact blocks to their peers?
 9710:20 <lightlike> dergoegge: Are you sure? I thought they do have a mempool as well, it is just not updated via p2p? for example, if I restart a normal node with a full mempool into blocks-only mode, the mempool will still be full?!
 9810:21 <larryruane> jnewbery: +1 thanks
 9910:21 <jnewbery> lightlike: yes, you're right. There is still a mempool, but the node won't request transactions from peers when it receives an inv
10010:21 <amiti> lightlike: that's true, but rare right? you'd have a mempool initially, but then after some time it'd likely clear out?
10110:22 <lightlike> amiti: yes, I agree. with time, it will clear out.
10210:22 <amiti> dergoegge: yes, I think blocksonly nodes serving compact blocks makes sense
10310:22 <jnewbery> I don't think there's currently a config option to switch the mempool off entirely, although it shouldn't be too much work to implement that now that so much work has been done to separate out the components
10410:23 <theStack> i'd also say serving compact blocks from blocksonly nodes makes sense
10510:23 <larryruane> but even in blocks-only, don't i need a mempool for transactions that i'm initiating?
10610:23 <glozow> larryruane: yes, you do. which is why you have one i think
10710:24 <amiti> larryruane: you can initiate transactions in blocksonly node but its not recommended. its a dead give away that they are your own =P
10810:24 <dergoegge> amiti theStack: yes so blocksonly nodes can and should still serve compact blocks since they really only the entire block to be able to do that
10910:24 <larryruane> amiti: glozow: gotcha thanks makes sense
11010:25 <dergoegge> btw. we forgot a part of the previous question. "how much bandwidth is wasted?" these two comments have some data: https://github.com/bitcoin/bitcoin/pull/22340#issuecomment-869016710, https://github.com/bitcoin/bitcoin/pull/22340#issuecomment-872723147
11110:26 <dergoegge> What is PeerManagerImpl::m_ignore_incoming_txs? Where and how does this PR use it to achieve its goals?
11210:27 <larryruane> hey in case this may be useful to anyone else, to measure the bandwidth of your local bitcoind, run `sudo iftop -f 'port 8333' in another window (on linux, you may have to apt install it)
11310:27 <theStack> m_ignore_incoming_txs it is set to true if the node was started with -blocksonly
11410:27 <larryruane> sorry that didn't format right `sudo iftop -f 'port 8333'`
11510:28 <dergoegge> larryruane: to collect the data for this PR i used -debug=all and then later parsed the logs
11610:28 <larryruane> theStack: that's right https://github.com/bitcoin/bitcoin/blob/master/src/init.cpp#L1164
11710:28 <dergoegge> theStack: correct!
11810:30 <lightlike> I think the naming of the variable m_ignore_incoming_txs is a bit misleading. we don't just ignore them, we'll disconnect the peer if they send us txes.
11910:31 <jnewbery> lightlike: good point. Maybe m_ignore_tx_announcements would have been more appropriate.
12010:32 <jnewbery> (some discussion of the naming here: https://github.com/bitcoin/bitcoin/pull/20217#discussion_r510099354)
12110:33 <amiti> what about m_please_don't_send_me_txns ?? :)
12210:33 <dergoegge> So one thing we do in this PR is not send sendcmpct(1) if m_ignore_incoming_txs=true
12310:33 <lightlike> jnewbery: but we seem to disconnect even for sending INVs. maybe just m_blocksonly_mode ?
12410:33 <raj> amiti, I like it.. :D
12510:33 <dergoegge> what does that achieve?
12610:34 <jnewbery> lightlike: oh yes, you're right that we'll disconnect if they send an INV
12710:35 <larryruane> just for us newbies, INV used to be used to announce both blocks and transactions, but now is used only for transactions?
12810:36 <jnewbery> (the reason we disconnect is that it's a violation of the protocol. We've sent them fRelay=false in our version message to them, requesting that they don't relay txs to us)
12910:36 <willcl_ark> Agree the name could be changed, but I quite like the behaviour, it's better for low-bandwidth nodes to disconnect if people start sending you unsolicited messages, wasting your bandwidth
13010:36 <lightlike> larryruane: you are right, i meant tx INVs, there are two types of INVs, and Block INVs are ok in blocksonly mode
13110:38 <merkle_noob[m]> Please, I have a question: What is the difference between sending a headers announcement and an INV announcement?
13210:38 <sipa> in an INV announcement, you send just the block hash
13310:38 <sipa> in a headers announcement, the block header (80 bytes) is sent instead
13410:39 <larryruane> notice too that at least by default, we set up a block-relay-only connection to some of our peers, even without the local `-blocksonly` flag ... does that cause `ignores_incoming_txs` to be set to true?
13510:39 <merkle_noob[m]> sipa: Thanks...
13610:39 <JanB> dergoegge by not sending sendcmpct(1) we default to Legacy relaying (?)
13710:39 <amiti> larryruane: no, -blocksonly is a mode & sets ignore_incomping_txs
13810:39 <amiti> but block-relay-only is an attribute of a connection
13910:40 <larryruane> but should the changes this PR is making apply to the latter case too?
14010:40 <jnewbery> markle_noob[m]: headers-first syncing was implemented in https://github.com/bitcoin/bitcoin/pull/4468
14110:41 <dergoegge> JanB: almost correct, there is a second step to defaulting to legacy relay. by not sending sendcmpct(1) we don't request high bandwidth mode, so our peer won't send us cmcptblock messages to announce blocks.
14210:42 <JanB> dergoegge: ah yes, i c
14310:43 <dergoegge> So now that high bandwidth mode is covered what do we need to do for low bandwidth mode?
14410:43 <amiti> larryruane: I don't think so. I could be running a node with a full mempool and have a block-relay-only connection to you. Just because you don't send me txns doesn't mean I necessarily don't already have them to reconstruct the block from short ids.
14510:43 <amiti> with -blocksonly mode, although its possible I have mempool txns based on edge cases, most likely I don't have mempool txns, so will need to get all the transactions
14610:44 <raj> dergoegge, we should also ensure not to send sendcmpct(0)?
14710:44 <theStack> dergoegge: if we receive an announcement via headers or inv, request the full block rather than compact blocks
14810:44 <lightlike> a weird thing is that I think even in blocksonly mode, one has two block-relay-only connections ;)
14910:44 <amiti> lightlike: hahhaha, yeah that's true =P
15010:45 <larryruane> amiti: yes, you're exactly right, i see now, +1
15110:45 <dergoegge> raj: we actually need to still send sendcmpct(0) otherwise a peer won't request compact blocks and a blocksonly node still wants to serve those
15210:45 <dergoegge> theStack: correct!
15310:46 <jnewbery> lightlike: amiti: (I know you know this alredy, but for everyone else) block-relay-only connections also don't gossip addrs, so even in blocksonly mode their behaviour is slightly different from the other connections
15410:46 <merkle_noob[m]> jnewbery: Thanks for the link...
15510:46 <raj> dergoegge, oh right.. thanks..
15610:46 <dergoegge> raj: Peers of a blocksonly node don’t request compact blocks if `fSupportsDesiredCmpctVersion=false`. The only place that `fSupportsDesiredCmpctVersion` is set to true is https://github.com/bitcoin/bitcoin/blob/6dfee13f650521f7542df0926aff01af9ac6a328/src/net_processing.cpp#L2712-L2717
15710:47 <dergoegge> amiti: since we are talking about sendcmpct what did you observe in your tests?
15810:48 <JanB> dergoegge: that's done by the extra check added on line 2125 ? (to request the full block)
15910:48 <theStack> so if i see that correctly, at the point of the diff where the second change occurs, the GETDATA message is already prepared to request the full block, and we add an additional condition (!m_ignore_incoming_txs) to when to modify the message to request compact blocks
16010:50 <jnewbery> theStack: right - there's no point in requesting a compact block if we don't have transactions in our mempool
16110:50 <amiti> dergoegge: even with the guard to MaybeSetPeersAsAnnouncing... commented out, the sendcmpct announce bool was set to False for a blocksonly node. I thought this is what that clause would change (so, true on master, false on PR). but I might be missing something in my test setup or my understanding =P
16210:50 <amiti> https://github.com/amitiuttarwar/bitcoin/commit/9053613f4c4131615c2a395da9d33f47dd8e4720
16310:50 <dergoegge> JanB: yes so if m_ignore_incoming_txs = true then we don't send getdata(CMPCT) and instead send a normal getdata(BLOCK)
16410:50 <larryruane> theStack: looks like to me too, and also I found it interesting (not changed by this PR) that only the first entry in the CInv vector needs to say MSG_CMPCT_BLOCK ... all the others can still say MSG_BLOCK
16510:51 <lightlike> larryruane: One more thought about your earlier question: I think the main reason for having block-relay-only connections is to not be subjected to privacy issues with tx and addr relay. By allowing compactblock downloading, you do reveal something about your mempool to them which you otherwise wouldn't (which txes of the block you still need), but I think it's not really possibly to abuse this, because creating valid blocks we'd
16610:51 <lightlike> download from you requires POW.
16710:52 <dergoegge> amiti: sendcmpct is send multiple times and the first time it is send the announce flag defaults to false
16810:52 <dergoegge> https://github.com/bitcoin/bitcoin/blob/6dfee13f650521f7542df0926aff01af9ac6a328/src/net_processing.cpp#L2678
16910:52 <dergoegge> maybe that is what you are seeing?
17010:52 <larryruane> lightlike: +1
17110:53 <theStack> jnewbery: yes that makes sense
17210:53 <amiti> I see two sendcmpct messages being sent & both have announce = false, whether or not the m_ignore_incoming_txs guard is set
17310:53 <theStack> larryruane: hm is it even possible to mix different types within an INV? (i have to make myself more familiar with the protocol messages...)
17410:53 <jnewbery> amiti: we'll only send a sendcmpct(hb=true) to a peer if that peer has sent us a valid block at the tip
17510:54 <amiti> ohhh
17610:54 <jnewbery> during version handshake we'll send two sendcmpct(hb=false) to the peer to let them know we support version 1 and version 2 compact blocks
17710:54 <dergoegge> jnewbery: oh yes that is also something that is saw in my manual testing
17810:54 <jnewbery> (plug: https://github.com/bitcoin/bitcoin/pull/20799 means we can remove version 1 compact blocks)
17910:55 <amiti> ok so in order to see the change in behavior, I should have the p2pconn send the blocksonly node a valid block at the tip in the test setup?
18010:55 <dergoegge> in the interest of time lets move on: What do you think of Gleb’s comments on the usage of sendrawtransaction? Can you think of other exceptions in which a blocksonly node would still want to download compact blocks?
18110:55 <dergoegge> gelb's comment: https://github.com/bitcoin/bitcoin/pull/22340#issuecomment-875542706
18210:55 <jnewbery> but we only send sendcmpct(hb=true) if the peer is the first to provide a valid block at the tip. It's in the BlockChecked callback in the validation interface
18310:55 <jnewbery> amiti: yes, that should do it as long as the node is out of IBD
18410:56 <amiti> jnewbery: ok! that helps a lot, thanks :)
18510:58 <amiti> dergoegge: I think its possible but highly unlikely
18610:58 <lightlike> dergoegge: if a node in blocksonly mode somehow receives up-to-date txes from somewhere else. I don't really think that the scenario by gleb sounds very probable.
18710:58 <jnewbery> dergoegge: I think it's unlikely that anyone is using the node in this way
18810:59 <dergoegge> lightlike: yeah i can't think of another way for a node to receive all txs besides sendrawtransaction
18910:59 <dergoegge> i also think gleb's scenario is pretty unlikely
19010:59 <dergoegge> if you find other exceptions please comment on the PR :)
19111:00 <dergoegge> #endmeeting