Replace chain relayTransactions/submitMemoryPool by higher method (wallet)

https://github.com/bitcoin/bitcoin/pull/15713

Host: jnewbery  -  PR author: ariard

Notes

  • The node-wallet interface was introduced in PR 14437. That PR added several interfaces, including the Chain interface, which is used by the wallet to access blockchain and mempool state.
  • PR 14437 and the following PRs (14711, 15288 and 10973) did not make significant changes to which node functions the wallet could call. From the code comments “TODO: Current chain methods are too low level, exposing too much of the internal workings of the bitcoin node, and not being very convenient to use. Chain methods should be cleaned up and simplified over time.”
  • Since that PR was merged, several PRs have been opened and merged to clean up and simplify the Chain interface:
    • PR 15670 combined two interface methods findFirstBlockWithTime and findFirstBlockWithTimeAndHeight
    • PR 15842 replaced two interface methods isPotentialtip and waitForNotifications with a higher-level method.
  • this is the next PR in that sequence, and replaces two interface methods relayTransactions and submitMemoryPool with a higher-level method.
  • the eventual end-point of this sequence of PRs is PR 16426 which removes the wallet’s ability to directly lock cs_main. That PR also reverses the locking order from cs_main before cs_wallet to cs_wallet before cs_main.
  • This PR also impacts work to cleanly separate the node and wallet responsibilities for rebroadcasting transactions. See issue 15734 for details.

Questions

  • Why is the wallet able to directly call functionality to broadcast transactions to peers?
  • What is a transaction rebroadcast? How (and how often) are they currently triggered?
  • What are some reasons that submitting a transaction to the mempool might fail?
  • What do we mean by locking order? Why must locks always be acquired in the same order?

Meeting Log

  107:00 <jnewbery> hi
  207:00 * bpo wave
  307:00 <lightlike> hi
  407:00 <ariard> hi
  507:01 <emzy> hi
  607:01 * jkczyz waves
  707:01 <fjahr> hi
  807:01 <jnewbery> Hi folks, we’re having internet issues at the office, so my connectivity might be intermittent
  907:01 <jonatack> hi !
 1007:01 <jonatack> * wipes sweat from sweltering brow in the heatwave
 1107:02 <amiti> hi
 1207:02 <davterra> hi all
 1307:02 <jnewbery> Before we begin, I'd like to encourage everyone to leave review comments on github. Review club should be here to help answer your questions, but the point is to give you the knowledge and tools to take part in the review process
 1407:03 <jnewbery> even if your comment is "I built and tested this PR by doing <x>", that's still helpful!
 1507:03 <jnewbery> ok, onto this week's PR: https://github.com/bitcoin/bitcoin/pull/15713
 1607:04 <jnewbery> Did people have a chance to build, review and read the notes/questions at https://bitcoin-core-review-club.github.io/15713.html ?
 1707:04 <jonatack> done
 1807:04 <bpo> I haven't but going to try to follow along as best I can
 1907:04 <amiti> q: with PRs like this that have multiple commits, do you often go through and make sure they all build & pass tests separately?
 2007:04 <amiti> seems ... tedious
 2107:05 <jkczyz> I briefly reviewed and read through comments
 2207:05 <lightlike> built it and looked at code
 2307:05 <jnewbery> amiti: I don't run tests on all commits, but I might build commits if I suspect there might be build issues
 2407:06 <jnewbery> first question: Why is the wallet able to directly call functionality to broadcast transactions to peers?
 2507:06 <jkczyz> It's running in the same process?
 2607:07 <jnewbery> jkczyz: that's true, but perhaps I should rephrase my question: "Why does the wallet call functionality directly?"
 2707:07 <jonatack> The wallet still relays txns
 2807:08 <jnewbery> jonatack: that's correct. The reason it does that is historic.
 2907:08 <jnewbery> if you look back through the git history, you'll see that the wallet and node functionality were much more mixed up in earlier versions
 3007:09 <jnewbery> we're trying to separate out the wallet from the node, and the fact that the wallet can tell the node to relay transactions to peers is really a hangover from when it was more mixed up
 3107:09 <jnewbery> Next question: What is a transaction rebroadcast? How (and how often) are they currently triggered?
 3207:11 <jonatack> * cue up amiti
 3307:11 <amiti> when a new block comes in, a node will rebroadcast any wallet txns that were not confirmed (with a variable delay)
 3407:11 <emzy> It is rebroadcasted if it is not included in a block.
 3507:11 <jnewbery> amiti: emzy: that's right!
 3607:12 <jnewbery> The code for that is here: https://github.com/bitcoin/bitcoin/blob/0626b8cbdf0aa971500eb5613c7ab4096c496966/src/wallet/wallet.cpp#L2327
 3707:13 <emzy> oh nice it is random.
 3807:13 <bpo> All wallets rebroadcast all transactions if the wallet sees that a new chain tip doesn't include the wallet's transactions? Isn't that a large overhead?
 3907:13 <lightlike> why is the reboradcast done that often? Shouldn't the tx have propagated into enough peers' mempools after the first relay?
 4007:13 <jnewbery> Note that we won't rebroadcast every block. There's a random timer before each resend
 4107:14 <harding> As an aside, transaction rebroadcast is really useful if your node didn't have any peers at the time you created your transaction, perhaps because you were having issues with Internet at the office. :-)
 4207:14 <bpo> lol
 4307:14 <hugohn> is it based on the arrival of a new block or a periodic schedule? I looked up who calls MabeResendWalletTxs() & it seems it's only called by the scheduler?
 4407:14 <hugohn> *scheduler
 4507:15 <jnewbery> bpo: lightlike: the rebroadcast logic right now isn't great. The main purpose it's serving is to make sure that your transaction doesn't get dropped. Like harding says, that's mostly useful if you tried to send the tx when you didn't have connectivity
 4607:15 <emzy> But sometimes you like to lose the transaction and don't rebroadcast.
 4707:15 <jnewbery> the downside is that it's terrible for privacy
 4807:16 <jnewbery> amiti has a proposal to improve rebroadcast logic here: https://gist.github.com/amitiuttarwar/b592ee410e1f02ac0d44fcbed4621dba
 4907:16 <bpo> jnewbery so I potential improvement would be to have logic that would estimate how likely that your txn was received by peers (If transaction was relayed when node had no peers, try again when I have peers)
 5007:16 <bpo> reviewing b592 now
 5107:17 <michaelfolkson> When the node and wallet are eventually entirely separated your node will rebroadcast other people's transactions as well as your own?
 5207:17 <jnewbery> hugohn: it's based on both. I can't remember the exact logic, but the idea is that we won't rebroadcast too frequently (that's the part that's handled by the scheduler), and we only rebroadcast after a block is received
 5307:17 <amiti> michaelfolkson: as per my proposal, yes
 5407:18 <jnewbery> michaelfolkson: we don't need to wait for full separation for that
 5507:18 <michaelfolkson> Just for the privacy benefit? Or because that helps other people's transactions be propagated (in the case that they have bad connectivity)?
 5607:18 <jnewbery> but yes, rebroadcasting transactions that aren't yours is how we preserve privacy
 5707:19 <jnewbery> Next question: What are some reasons that submitting a transaction to the mempool might fail?
 5807:19 <jnewbery> hint: take a look at AcceptToMemoryPoolWorker()
 5907:20 <emzy> invalid transaction
 6007:20 <hugohn> non-standard txs
 6107:20 <bpo> mempool overflow, network segmentation to miners
 6207:20 <lightlike> https://github.com/bitcoin/bitcoin/pull/15713#discussion_r306122680 :-)
 6307:20 <jnewbery> emzy: hugohn: yes, txs could be consensus-invalid or policy-invalid
 6407:20 <hugohn> RBF tx that violates RBF rules
 6507:21 <jonatack> No transactions are allowed below minRelayTxFee
 6607:21 <jnewbery> lightlike: yeah, that's pretty comprehensive :)
 6707:22 <michaelfolkson> MAX_FEE_EXCEEDED?!
 6807:22 <emzy> Dublicate transaction
 6907:22 <harding> A feerate below the node's current dynamic minimum feerate (BIP133).
 7007:22 <michaelfolkson> Didn't realize there was a max
 7107:23 <lightlike> there is some logic to prevent "absurd"ly high fees afaik
 7207:23 <jnewbery> michaelfolkson: Yeah, MAX_FEE_EXCEEDED is pretty weird. When we submit a transaction to the mempool locally, we set a max feerate to make sure we don't accidentally set a huge feerate.
 7307:23 <emzy> michaelfolkson: somtimes there was a transcations with way to much fee. like you forget to include one output
 7407:24 <harding> A transaction that's not valid now, but will be valid after a BIP9 soft fork transitions to activated.
 7507:24 <jnewbery> that's only for txs submitted to the mempool from an RPC like `sendrawtransaction` or by the wallet
 7607:24 <jonatack> git grep -ni absurd
 7707:24 <harding> Um, s/valid/policy allowed/g
 7807:24 <jnewbery> if we receive a tx over P2P, we don't reject it for 'absurdly high fee'
 7907:24 <michaelfolkson> Plenty of tales of people making that mistake and losing all their money to fees. Didn't realize there was this check
 8007:25 <michaelfolkson> But sorry I'm derailing discussion
 8107:25 <jnewbery> (I'd like to remove that argument from the AcceptToMemoryPool() interface and have the client check before submitting it. WIP PR here: https://github.com/bitcoin/bitcoin/pull/15810)
 8207:26 <jnewbery> harding: that's a good one. I think it's pretty rare that we make policy looser (ie more permissive), but one example is making segwit v1+ segwit txs standard recently
 8307:26 <harding> Another set of reasons would be because of ancestor limits.
 8407:27 <jnewbery> harding: yes ancestor/descendant size and count limits
 8507:27 <jnewbery> ok, so why in the interface method do we only care about a subset of those errors: https://github.com/bitcoin/bitcoin/pull/15713/commits/f2da7210c8e3f9ab5a5dac7b3c39bf9f0252faf7#diff-a6585739217ca647d3f10c40cccff7f1R305
 8607:29 <hugohn> because presumably when constructing the tx from the wallet, it should already perform all the other checks?
 8707:30 <jnewbery> hugohn: not quite. The answer is in the comment above the line I linked to
 8807:30 <jonatack> "Chain clients only care about failures to accept the tx to the mempool. Disregard non-mempool related failures" ?\
 8907:31 <jnewbery> This PR is a refactor, so we're trying to maintain behaviour from before the PR
 9007:33 <harding> So if the wallet created a consensus-invalid transaction, we don't worry about it here (doesn't matter---it'll never confirm anyway). We only care about mempool errors that might be temporary and that we can get around by rebroadcasting?
 9107:33 <jnewbery> check where the new broadcastTransaction() interface method is called, and the code that it is replacing
 9207:34 <lightlike> Wouldn't it be better if the wallet knew and cared if its tx is accepted to the mempool but not relayed because P2P is not working?
 9307:34 <jnewbery> so if the tx gets into the mempool, but is not relayed to peers because P2P is disabled, the wallet still considers it successful
 9407:35 <jnewbery> lightlike: potentially yes, but we don't do anything about it now. This PR doesn't change that
 9507:35 <jnewbery> lightlike: what different behaviour would you like to see if the wallet knew about it?
 9607:36 <lightlike> jnewbery: maybe alert the user at least.
 9707:36 <jnewbery> yes, I could imagine that being useful
 9807:36 <jnewbery> ok, final question: What do we mean by locking order? Why must locks always be acquired in the same order?
 9907:36 <lightlike> jnewbery: and possibly some faster rebroadcast tries
10007:37 <hugohn> potential deadlocks
10107:37 <jkczyz> The order in which locks are acquired and released to avoid deadlocks
10207:38 <jnewbery> lighlike: maybe. I think that would go away once amiti's proposal is implemented. It'd be the node/mempool's responsibility to rebroadcast once it has peers
10307:38 <jonatack> for thread consistency
10407:38 <jnewbery> hugohn: jkczyz: correct. If one thread acquires lock A then lock B, and another acquires lock B then lock A, then there could be a deadlock
10507:38 <jonatack> unless the locks can be acquired in a single op with std::lock
10607:39 <jnewbery> there's a build flag that will check lock orders. I can't remember what it's called, but we run it on at least one travis job
10707:39 <ariard> --enable-debug
10807:39 <jnewbery> thanks ariard!
10907:40 <jnewbery> The last PR in this sequence is https://github.com/bitcoin/bitcoin/pull/16426 which actually reverses the cs_main / cs_wallet lock order
11007:40 <hugohn> probably should have a wrapper locking method & make the individual locks private, to ensure the right order is always honored
11107:40 <jnewbery> currently we have to take cs_main before we take cs_wallet. After 16426, it'll be the other way round
11207:41 <jnewbery> Were there any other general questions about the PR?
11307:41 <hugohn> or better yet, remove locks altogether! which ariard is working on 🙂
11407:41 <hugohn> question about https://github.com/bitcoin/bitcoin/pull/15713/files#diff-b2bb174788c7409b671c46ccc86034bdL2024
11507:42 <hugohn> I see we're removing GetDepthInMainChain() & IsCoinBase() checks in wallet.cpp
11607:42 <amiti> I also have a question- can you help me understand the `NODISCARD` thing here? https://github.com/bitcoin/bitcoin/pull/15713/files?file-filters%5B%5D=.h#diff-3f841b5d557d579ef74c43ff582b16ebL22
11707:42 <hugohn> is it because those are already checked by ATMP?
11807:42 <jnewbery> hugohn: they're already checked by BroadcastTransaction() in node/transaction.cpp I think
11907:43 <jnewbery> amiti: I believe it just means that the return value shouldn't be dropped
12007:43 <hugohn> I see. I also see a IsCoinBase() check in ATMP. So perhaps redundant hmm?
12107:44 <jnewbery> ie it's forbidden to call this function without assigning the return value to something
12207:44 <harding> hugohn: ATMP is really meant to run on transactions you receive from other nodes, and you don't want other nodes sending you loose coinbase transactions.
12307:45 <jnewbery> looks like NODISCARD was added here: https://github.com/bitcoin/bitcoin/pull/13815
12407:46 <hugohn> harding: you mean we shouldn't rely on ATMP checks for our own txs? they both route to the same path though yeah? I suppose defensive programming doesn't hurt
12507:46 <jonatack> Question: Should there be tests for the changed code? SubmitMemoryPoolAndRelay? As someone who is in the habit of TDD, this seems somewhat... precarious :D
12607:46 <amiti> jnewbery: cool, thanks.I found the definition in `attributes.h` but didn't know how to interpret what its doing
12707:47 <jnewbery> jonatack: how would you go about testing this?
12807:49 <jnewbery> some of the functionality is moving across the interface border from the wallet to the node (eg the IsCoinBase() and GetDepthInMainChain() checks that used to be in RelayWalletTransction())
12907:49 <jonatack> Apologies, lost the connection
13007:50 <jnewbery> So I don't think you could test no regressions in unit tests
13107:50 <jnewbery> I think this code should be fairly well covered by the functional tests
13207:51 <jonatack> For example, when I find myself writing code comments like " // Note: this will need to be updated if BroadcastTransactions() is updated to return other non-mempool failures // that Chain clients do not need to know about."
13307:51 <harding> hugohn: sorry, looks like I misunderstood you. I thought you were proposing to remove code from ATMP.
13407:52 <jonatack> ... I try to add a test to ensure that.
13507:53 <jonatack> (Perhaps the way the codebase is currently structured makes testing difficult, thus my question)
13607:53 <jnewbery> Yeah, ajtowns had a suggestion for that here: https://github.com/bitcoin/bitcoin/pull/15713#discussion_r306122680
13707:54 <jnewbery> I don't have any great suggestions
13807:54 <jnewbery> ok, 5 minutes left. Any last minute questions?
13907:54 <jkczyz> Could you clarify the outcome of the discussion around the addition of the relay parameter to BroadcastTransaction?
14007:55 <jkczyz> err to SubmitMemoryPoolAndRelay?
14107:55 <hugohn> harding: ah ok, yeah. not removing anything from ATMP, but perhaps leveraging ATMP logic to avoid redundant checks.
14207:57 <jnewbery> jkczyz: I don't think the interface methods that we end up with in this PR are ideal. I'd prefer to see something like https://gist.github.com/amitiuttarwar/b592ee410e1f02ac0d44fcbed4621dba implemented so the wallet has no access to relaying txs.
14307:57 <hugohn> since ATMP seems like the common entry point for all txs (external or internal, broadcast or rebroadcast)
14407:58 <jnewbery> BUT, this gets us closer to https://github.com/bitcoin/bitcoin/pull/16426, which is definitely a good change
14507:58 <jkczyz> got it, thanks!
14607:58 <harding> hugohn: that makes sense to me.
14707:58 <amiti> jkczyz: the relay param is needed mainly for `ResendWalletTransactions` to force the node relaying txns that would otherwise not be sent out due to the checks in `BroadcastTransaction`. my proposed changes to rebroadcast logic should make a cleaner separation between wallet & node and allow us to remove that relay param
14807:58 <amiti> delay in send... internet struggles
14907:59 <jonatack> amiti: nice :+1:
15008:00 <jnewbery> ok, let's wrap it up there. Reminder that next week we'll look at https://github.com/bitcoin/bitcoin/pull/15505. It's been closed, but it's still an interesting PR, and if enough of us review it, maybe we can get it opened again and merged!
15108:01 <jnewbery> thanks everyone!