#15713 refactor: Replace chain relayTransactions/submitMemoryPool by higher method (wallet)

https://github.com/bitcoin/bitcoin/15713

Host: jnewbery

Notes

  • The node-wallet interface was introduced in PR 14437. That PR added several interfaces, including the Chain interface, which is used by the wallet to access blockchain and mempool state.
  • PR 14437 and the following PRs (14711, 15288 and 10973) did not make significant changes to which node functions the wallet could call. From the code comments “TODO: Current chain methods are too low level, exposing too much of the internal workings of the bitcoin node, and not being very convenient to use. Chain methods should be cleaned up and simplified over time.”
  • Since that PR was merged, several PRs have been opened and merged to clean up and simplify the Chain interface:
    • PR 15670 combined two interface methods findFirstBlockWithTime and findFirstBlockWithTimeAndHeight
    • PR 15842 replaced two interface methods isPotentialtip and waitForNotifications with a higher-level method.
  • this is the next PR in that sequence, and replaces two interface methods relayTransactions and submitMemoryPool with a higher-level method.
  • the eventual end-point of this sequence of PRs is PR 16426 which removes the wallet’s ability to directly lock cs_main. That PR also reverses the locking order from cs_main before cs_wallet to cs_wallet before cs_main.
  • This PR also impacts work to cleanly separate the node and wallet responsibilities for rebroadcasting transactions. See issue 15734 for details.

Questions

  • Why is the wallet able to directly call functionality to broadcast transactions to peers?
  • What is a transaction rebroadcast? How (and how often) are they currently triggered?
  • What are some reasons that submitting a transaction to the mempool might fail?
  • What do we mean by locking order? Why must locks always be acquired in the same order?

Meeting Log

07:00 < jnewbery> hi
07:00  * bpo wave
07:00 < lightlike> hi
07:00 < ariard> hi
07:01 < emzy> hi
07:01  * jkczyz waves
07:01 < fjahr> hi
07:01 < jnewbery> Hi folks, we’re having internet issues at the office, so my connectivity might be intermittent
07:01 < jonatack> hi !
07:01 < jonatack>  * wipes sweat from sweltering brow in the heatwave
07:02 < amiti> hi
07:02 < davterra> hi all
07:02 < jnewbery> Before we begin, I'd like to encourage everyone to leave review comments on github. Review club should be here to help answer your questions, but the point is to give you the knowledge and tools to take part in the review process
07:03 < jnewbery> even if your comment is "I built and tested this PR by doing <x>", that's still helpful!
07:03 < jnewbery> ok, onto this week's PR: https://github.com/bitcoin/bitcoin/pull/15713
07:04 < jnewbery> Did people have a chance to build, review and read the notes/questions at https://bitcoin-core-review-club.github.io/15713.html ?
07:04 < jonatack> done
07:04 < bpo> I haven't but going to try to follow along as best I can
07:04 < amiti> q: with PRs like this that have multiple commits, do you often go through and make sure they all build & pass tests separately?
07:04 < amiti> seems ... tedious
07:05 < jkczyz> I briefly reviewed and read through comments
07:05 < lightlike> built it and looked at code
07:05 < jnewbery> amiti: I don't run tests on all commits, but I might build commits if I suspect there might be build issues
07:06 < jnewbery> first question: Why is the wallet able to directly call functionality to broadcast transactions to peers?
07:06 < jkczyz> It's running in the same process?
07:07 < jnewbery> jkczyz: that's true, but perhaps I should rephrase my question: "Why does the wallet call functionality directly?"
07:07 < jonatack> The wallet still relays txns
07:08 < jnewbery> jonatack: that's correct. The reason it does that is historic.
07:08 < jnewbery> if you look back through the git history, you'll see that the wallet and node functionality were much more mixed up in earlier versions
07:09 < jnewbery> we're trying to separate out the wallet from the node, and the fact that the wallet can tell the node to relay transactions to peers is really a hangover from when it was more mixed up
07:09 < jnewbery> Next question: What is a transaction rebroadcast? How (and how often) are they currently triggered?
07:11 < jonatack> * cue up amiti
07:11 < amiti> when a new block comes in, a node will rebroadcast any wallet txns that were not confirmed (with a variable delay)
07:11 < emzy> It is rebroadcasted if it is not included in a block.
07:11 < jnewbery> amiti: emzy: that's right!
07:12 < jnewbery> The code for that is here: https://github.com/bitcoin/bitcoin/blob/0626b8cbdf0aa971500eb5613c7ab4096c496966/src/wallet/wallet.cpp#L2327
07:13 < emzy> oh nice it is random.
07:13 < bpo> All wallets rebroadcast all transactions if the wallet sees that a new chain tip doesn't include the wallet's transactions?  Isn't that a large overhead?
07:13 < lightlike> why is the reboradcast done that often? Shouldn't the tx have propagated into enough peers' mempools after the first relay?
07:13 < jnewbery> Note that we won't rebroadcast every block. There's a random timer before each resend
07:14 < harding> As an aside, transaction rebroadcast is really useful if your node didn't have any peers at the time you created your transaction, perhaps because you were having issues with Internet at the office.  :-)
07:14 < bpo> lol
07:14 < hugohn> is it based on the arrival of a new block or a periodic schedule? I looked up who calls MabeResendWalletTxs() & it seems it's only called by the scheduler?
07:14 < hugohn> *scheduler
07:15 < jnewbery> bpo: lightlike: the rebroadcast logic right now isn't great. The main purpose it's serving is to make sure that your transaction doesn't get dropped. Like harding says, that's mostly useful if you tried to send the tx when you didn't have connectivity
07:15 < emzy> But sometimes you like to lose the transaction and don't rebroadcast.
07:15 < jnewbery> the downside is that it's terrible for privacy
07:16 < jnewbery> amiti has a proposal to improve rebroadcast logic here: https://gist.github.com/amitiuttarwar/b592ee410e1f02ac0d44fcbed4621dba
07:16 < bpo> jnewbery so I potential improvement would be to have logic that would estimate how likely that your txn was received by peers (If transaction was relayed when node had no peers, try again when I have peers)
07:16 < bpo> reviewing b592 now
07:17 < michaelfolkson> When the node and wallet are eventually entirely separated your node will rebroadcast other people's transactions as well as your own?
07:17 < jnewbery> hugohn: it's based on both. I can't remember the exact logic, but the idea is that we won't rebroadcast too frequently (that's the part that's handled by the scheduler), and we only rebroadcast after a block is received
07:17 < amiti> michaelfolkson: as per my proposal, yes
07:18 < jnewbery> michaelfolkson: we don't need to wait for full separation for that
07:18 < michaelfolkson> Just for the privacy benefit? Or because that helps other people's transactions be propagated (in the case that they have bad connectivity)?
07:18 < jnewbery> but yes, rebroadcasting transactions that aren't yours is how we preserve privacy
07:19 < jnewbery> Next question: What are some reasons that submitting a transaction to the mempool might fail?
07:19 < jnewbery> hint: take a look at AcceptToMemoryPoolWorker()
07:20 < emzy> invalid transaction
07:20 < hugohn> non-standard txs
07:20 < bpo> mempool overflow, network segmentation to miners
07:20 < lightlike> https://github.com/bitcoin/bitcoin/pull/15713#discussion_r306122680 :-)
07:20 < jnewbery> emzy: hugohn: yes, txs could be consensus-invalid or policy-invalid
07:20 < hugohn> RBF tx that violates RBF rules
07:21 < jonatack> No transactions are allowed below minRelayTxFee
07:21 < jnewbery> lightlike: yeah, that's pretty comprehensive :)
07:22 < michaelfolkson> MAX_FEE_EXCEEDED?!
07:22 < emzy> Dublicate transaction
07:22 < harding> A feerate below the node's current dynamic minimum feerate (BIP133).
07:22 < michaelfolkson> Didn't realize there was a max
07:23 < lightlike> there is some logic to prevent "absurd"ly high fees afaik
07:23 < jnewbery> michaelfolkson: Yeah, MAX_FEE_EXCEEDED is pretty weird. When we submit a transaction to the mempool locally, we set a max feerate to make sure we don't accidentally set a huge feerate.
07:23 < emzy> michaelfolkson: somtimes there was a transcations with way to much fee. like you forget to include one output
07:24 < harding> A transaction that's not valid now, but will be valid after a BIP9 soft fork transitions to activated.
07:24 < jnewbery> that's only for txs submitted to the mempool from an RPC like `sendrawtransaction` or by the wallet
07:24 < jonatack> git grep -ni absurd
07:24 < harding> Um, s/valid/policy allowed/g
07:24 < jnewbery> if we receive a tx over P2P, we don't reject it for 'absurdly high fee'
07:24 < michaelfolkson> Plenty of tales of people making that mistake and losing all their money to fees. Didn't realize there was this check
07:25 < michaelfolkson> But sorry I'm derailing discussion
07:25 < jnewbery> (I'd like to remove that argument from the AcceptToMemoryPool() interface and have the client check before submitting it. WIP PR here: https://github.com/bitcoin/bitcoin/pull/15810)
07:26 < jnewbery> harding: that's a good one. I think it's pretty rare that we make policy looser (ie more permissive), but one example is making segwit v1+ segwit txs standard recently
07:26 < harding> Another set of reasons would be because of ancestor limits.
07:27 < jnewbery> harding: yes ancestor/descendant size and count limits
07:27 < jnewbery> ok, so why in the interface method do we only care about a subset of those errors: https://github.com/bitcoin/bitcoin/pull/15713/commits/f2da7210c8e3f9ab5a5dac7b3c39bf9f0252faf7#diff-a6585739217ca647d3f10c40cccff7f1R305
07:29 < hugohn> because presumably when constructing the tx from the wallet, it should already perform all the other checks?
07:30 < jnewbery> hugohn: not quite. The answer is in the comment above the line I linked to
07:30 < jonatack> "Chain clients only care about failures to accept the tx to the mempool. Disregard non-mempool related failures" ?\
07:31 < jnewbery> This PR is a refactor, so we're trying to maintain behaviour from before the PR
07:33 < harding> So if the wallet created a consensus-invalid transaction, we don't worry about it here (doesn't matter---it'll never confirm anyway).  We only care about mempool errors that might be temporary and that we can get around by rebroadcasting?
07:33 < jnewbery> check where the new broadcastTransaction() interface method is called, and the code that it is replacing
07:34 < lightlike> Wouldn't it be better if the wallet knew and cared if its tx is accepted to the mempool but not relayed because P2P is not working?
07:34 < jnewbery> so if the tx gets into the mempool, but is not relayed to peers because P2P is disabled, the wallet still considers it successful
07:35 < jnewbery> lightlike: potentially yes, but we don't do anything about it now. This PR doesn't change that
07:35 < jnewbery> lightlike: what different behaviour would you like to see if the wallet knew about it?
07:36 < lightlike> jnewbery: maybe alert the user at least.
07:36 < jnewbery> yes, I could imagine that being useful
07:36 < jnewbery> ok, final question: What do we mean by locking order? Why must locks always be acquired in the same order?
07:36 < lightlike> jnewbery: and possibly some faster rebroadcast tries
07:37 < hugohn> potential deadlocks
07:37 < jkczyz> The order in which locks are acquired and released to avoid deadlocks
07:38 < jnewbery> lighlike: maybe. I think that would go away once amiti's proposal is implemented. It'd be the node/mempool's responsibility to rebroadcast once it has peers
07:38 < jonatack> for thread consistency
07:38 < jnewbery> hugohn: jkczyz: correct. If one thread acquires lock A then lock B, and another acquires lock B then lock A, then there could be a deadlock
07:38 < jonatack> unless the locks can be acquired in a single op with std::lock
07:39 < jnewbery> there's a build flag that will check lock orders. I can't remember what it's called, but we run it on at least one travis job
07:39 < ariard> --enable-debug
07:39 < jnewbery> thanks ariard!
07:40 < jnewbery> The last PR in this sequence is https://github.com/bitcoin/bitcoin/pull/16426 which actually reverses the cs_main / cs_wallet lock order
07:40 < hugohn> probably should have a wrapper locking method & make the individual locks private, to ensure the right order is always honored
07:40 < jnewbery> currently we have to take cs_main before we take cs_wallet. After 16426, it'll be the other way round
07:41 < jnewbery> Were there any other general questions about the PR?
07:41 < hugohn> or better yet, remove locks altogether! which ariard is working on 🙂
07:41 < hugohn> question about https://github.com/bitcoin/bitcoin/pull/15713/files#diff-b2bb174788c7409b671c46ccc86034bdL2024
07:42 < hugohn> I see we're removing GetDepthInMainChain() & IsCoinBase() checks in wallet.cpp
07:42 < amiti> I also have a question- can you help me understand the `NODISCARD` thing here? https://github.com/bitcoin/bitcoin/pull/15713/files?file-filters%5B%5D=.h#diff-3f841b5d557d579ef74c43ff582b16ebL22
07:42 < hugohn> is it because those are already checked by ATMP?
07:42 < jnewbery> hugohn: they're already checked by BroadcastTransaction() in node/transaction.cpp I think
07:43 < jnewbery> amiti: I believe it just means that the return value shouldn't be dropped
07:43 < hugohn> I see. I also see a IsCoinBase() check in ATMP. So perhaps redundant hmm?
07:44 < jnewbery> ie it's forbidden to call this function without assigning the return value to something
07:44 < harding> hugohn: ATMP is really meant to run on transactions you receive from other nodes, and you don't want other nodes sending you loose coinbase transactions.
07:45 < jnewbery> looks like NODISCARD was added here: https://github.com/bitcoin/bitcoin/pull/13815
07:46 < hugohn> harding: you mean we shouldn't rely on ATMP checks for our own txs? they both route to the same path though yeah? I suppose defensive programming doesn't hurt
07:46 < jonatack>  Question: Should there be tests for the changed code? SubmitMemoryPoolAndRelay? As someone who is in the habit of TDD, this seems somewhat... precarious :D
07:46 < amiti> jnewbery: cool, thanks.I found the definition in `attributes.h` but didn't know how to interpret what its doing
07:47 < jnewbery> jonatack: how would you go about testing this?
07:49 < jnewbery> some of the functionality is moving across the interface border from the wallet to the node (eg the IsCoinBase() and GetDepthInMainChain() checks that used to be in RelayWalletTransction())
07:49 < jonatack> Apologies, lost the connection
07:50 < jnewbery> So I don't think you could test no regressions in unit tests
07:50 < jnewbery> I think this code should be fairly well covered by the functional tests
07:51 < jonatack> For example, when I find myself writing code comments like "  // Note: this will need to be updated if BroadcastTransactions() is updated to return other non-mempool failures        // that Chain clients do not need to know about."
07:51 < harding> hugohn: sorry, looks like I misunderstood you.  I thought you were proposing to remove code from ATMP.
07:52 < jonatack> ... I try to add a test to ensure that.
07:53 < jonatack> (Perhaps the way the codebase is currently structured makes testing difficult, thus my question)
07:53 < jnewbery> Yeah, ajtowns had a suggestion for that here: https://github.com/bitcoin/bitcoin/pull/15713#discussion_r306122680
07:54 < jnewbery> I don't have any great suggestions
07:54 < jnewbery> ok, 5 minutes left. Any last minute questions?
07:54 < jkczyz> Could you clarify the outcome of the discussion around the addition of the relay parameter to BroadcastTransaction?
07:55 < jkczyz> err to SubmitMemoryPoolAndRelay?
07:55 < hugohn> harding: ah ok, yeah. not removing anything from ATMP, but perhaps leveraging ATMP logic to avoid redundant checks.
07:57 < jnewbery> jkczyz: I don't think the interface methods that we end up with in this PR are ideal. I'd prefer to see something like https://gist.github.com/amitiuttarwar/b592ee410e1f02ac0d44fcbed4621dba implemented so the wallet has no access to relaying txs.
07:57 < hugohn> since ATMP seems like the common entry point for all txs (external or internal, broadcast or rebroadcast)
07:58 < jnewbery> BUT, this gets us closer to https://github.com/bitcoin/bitcoin/pull/16426, which is definitely a good change
07:58 < jkczyz> got it, thanks!
07:58 < harding> hugohn: that makes sense to me.
07:58 < amiti> jkczyz: the relay param is needed mainly for `ResendWalletTransactions` to force the node relaying txns that would otherwise not be sent out due to the checks in `BroadcastTransaction`. my proposed changes to rebroadcast logic should make a cleaner separation between wallet & node and allow us to remove that relay param
07:58 < amiti> delay in send... internet struggles
07:59 < jonatack> amiti: nice :+1:
08:00 < jnewbery> ok, let's wrap it up there. Reminder that next week we'll look at https://github.com/bitcoin/bitcoin/pull/15505. It's been closed, but it's still an interesting PR, and if enough of us review it, maybe we can get it opened again and merged!
08:01 < jnewbery> thanks everyone!