The PR branch HEAD was e40ddf36be at the time of this review club meeting.
Notes
The node has multiple local clocks, including the system clock and a mockable
NodeClock.
The setmocktime regtest-only RPC sets the clock to a specific time in the past or future; the
time does not continue to tick forward after that. This allows us to test node behaviors that
happen over a long period of time, such as peer disconnection after timeouts.
The mockscheduler regtest-only RPC āfast-forwardsā scheduled jobs by making each time point
sooner. It does not modify the nodeās clock.
The wallet periodically rebroadcasts its transactions that havenāt been mined yet.
ResendWalletTransactions() is a task scheduled every 1000milliseconds. The task itself compares
the nodeās current (mockable) time and the walletās randomly chosen nNextResend time to decide
whether or not itās time to call SubmitTxMemoryPoolAndRelay(). As such, to trigger a rebroadcast,
the clock must pass two time points: CWallet::nNextResend and the time at which the scheduler
has MaybeResendWalletTxs() task set for.
PR #25768 addresses an issue in which the wallet
rebroadcasts multiple transactions with dependency relationships. It must submit a parent
transaction before its child, otherwise the mempool will reject it due to missing inputs.
Prior to this PR, the transactions are rebroadcast in whichever order they appear in
CWallet::mapWallet,
which is a std::unordered_map.
This PR reuses the logic in ReacceptWalletTransactions() to sort mapWallet transactions by
CWalletTx::nOrderPos.
Questions
Did you review the PR? What was your review approach?
Were you able to reproduce the issue? Does the test added in the PR adequately test this
behavior, i.e., does it succeed with this fix but fail without it?
Why is it difficult to write a test that reliably demonstrates that transactions may be
rebroadcast in the wrong order? How does std::unordered_map order its elements?
Approach
Can you think of any other methods of getting the wallet transactions sorted in this order?
Would it make sense to keep the wallet transactions sorted in this order? Why is mapWallet
implemented as a std::unordered_map; would another data structure make sense?
If you thought of an alternative approach, how does it compare? What is its runtime complexity?
How much {more, less} memory is allocated? Does it require a lot of code to be
(re)written? Is its complexity appropriate for this purpose?
Implementation
How is GetSortedTxs(), the function extracted from ReacceptWalletTransactions(), implemented?
How much memory is needed, as a function of the transactions in mapWallet?
GetSortedTxs() is
declared
as requiring the wallet lock to already be held. Why or why not does this make sense?
A structured binding declaration is used in
twoplaces
in GetSortedTxs(). What are the types of each variable (wtxid, wtx, _, and wtx) and what
do they represent? Why or why not is this better than declaring the types explicitly?
Can you find any behavior changes to ReacceptWalletTransactions(), or is it a move-only code
change?
The test
calls
both setmocktime and mockscheduler to trigger a rebroadcast. What is the difference between
these two calls, and why or why not is it necessary to call both of them?
The test
āevictsā
the transactions from the mempool by calling setmocktime. Why is this necessary, and why does it
work?
<glozow> we're looking at a wallet PR today, "Properly rebroadcast unconfirmed transaction chains." Notes in the usual place https://bitcoincore.reviews/25768
<kevkevin> Reviewd the code ran the functional tests but still need to do some manualy testing, still understanding the repo aswell as im still new here
<larryruane_> I'm running the functional test (wallet_resendwallettransactions.py) in the debugger, and watching the node's `debug.log` file ... but I haven't really got to the new test code that this PR adds, still trying to understand the first (existing) part of the test
<raj> I am guessing its about the situation where a child appears in `mapWallet` before its parent, and thus at the time of rebroadcasting the child would get rejected in mempoolcheck if mempool doesn't have the parent..
<stickies-v> `ResendWalletTransactions()` rebroadcasts transactions based on txid instead of their sequential order, which leads to child transactions not being broadcast if their parents txid is higher
<larryruane_> I think the problem is that if we (our node) broadcasts 2 transactions, A and B, and let's say B is a child of A (B spends an output of A), then if we happen to broadcast tx B first, then A, then our peers won't accept B into their mempool because it's not valid
<Zaidan> ResendWalletTranactions will fail to rebroadcast child transactions if their txids happen to be lexicographically less than their parent's txid (copy and paste)
<glozow> raj: stickies-v: larryruane_: Zaidan: good answers! indeed, the problem is rebroadcasting children before parents. the order is not "by txid" or "lexicographical" though, as it's a `std::unordered_map`
<larryruane_> in general, is there a long-term effort to replace `std::map`s with unordered maps where possible? aren't the latter considered more efficient?
<sipa> for sufficiently large size, unordered_map is faster, due to O(1) lookups, but they're also more complicated (you need a hash function, possibly a salted one depending on whether it can be under attacker influence) rather than just a comparator
<glozow> I think we've just answered the next question, "Why is it difficult to write a test that reliably demonstrates that transactions may be rebroadcast in the wrong order?"
<Zaidan> Don't answer this if it is out of scope I'm new: but why are these UTXO's transactions all seperate objects, and not wrapped up in a tree structure where their depth order can be easily maintained and referenced? It seems maintaining tx order is a pain.
<glozow> next question is about the functional test: Does the test added in the PR adequately test this behavior, i.e., does it succeed with this fix but fail without it?
<sipa> with a boost::multiindex you could have multiple indexes (one insertion ordered, another txid-based) simultaneously... but indeed I don't think that's useful.
<ishaanam[m]> On master the test fails around 50% of the time for me, though it would be nice if we had a higher chance of failing without the fix. I think glozow left a comment on the PR that would help with this?
<larryruane_> glozow: i didn't actually try it, but there's a great comment added to the functional test that explains that the new test will fail only about half of the time
<larryruane_> I have a question, would it be possible to run this entire new part of the functional test a few times (within a single run of the functional test)? To prevent CI from passing when it shouldn't?
<larryruane_> I'm just imagining that a future change breaks this, and the dev runs the test suite (or CI does), and it passes, so the dev thinks it's okay
<stickies-v> with std::sort you just need to allocate an additional vector instead of an additional vector and a map, but there's probably some downside to it I don't understand?
<glozow> larryruane_: AFAIK yes, parents always inserted before children. I suppose more precise would be if they cached their ancestor counts or something, but can't imagine why we wouldn't insert a parent before child.
<achow101> larryruane_: I think it is a reasonable assumption. I believe it is possible to mess with it using importprunedfunds though, however that only works on confirmed txs anyways
<larryruane_> glozow: I was wondering about that, `mapWallet` contains confirmed transactions, right? so it could be pretty large (so we're adding many entries to the temporary map in `GetSortedTxs()`)?
<ishaanam[m]> glozow: yeah, it looks like the main reason that the filtering is done after is because both of the functions that call GetSortedTxs() do different kinds of filtering? why is that?
<glozow> indeed. I'll say the inefficiency is probably not noticeable to users since this is only happing once a day, and there *usually* aren't that many transactions in mapWallet, but yes.
<glozow> ishaanam: good question, now is a good time to ask: What's the difference between `ResendWalletTransactions` and `ReacceptWalletTransactions`?
<raj> `Resend` sends all the unconfirmed tx 5 mins before the last known block.. `ReAccept` sends all the wallet transaction unconfirmed, not coinbase and not abandoned.
<achow101> they both do ~same thing. I'm still trying to figure whether we can de-duplicate here, but both of these functions have existed for very long time and git blaming to figure out why is difficult.
<glozow> Zaidan: right. rebroadcasting is done automatically, for transactions we're waiting to get confirmed but they haven't yet. reaccept is triggered once, at the start.
<stickies-v> I'm curious why for Reaccept we don't also check if we're in IBD or reindex, because that would raise the same concerns as mentioned for Resend?
<glozow> Zaidan: assuming you're talking about `cs_wallet`, the wallet lock doesn't have anything to do with the node's state. it's just a object used to make sure multiple threads don't try to access wallet members at the same time.
<pablomartin> glozow: achow101: is the resend being used? i see it checks fBroadcastTransactions which is set to false in wallet.h and I don't see it's changed anywhere else
<glozow> good questions everybody. I'll throw out a few more from the review club notes. The test calls both setmocktime and mockscheduler to trigger a rebroadcast. What is the difference between these two calls, and why or why not is it necessary to call both of them?
<stickies-v> `ResendWalletTransactions()` is exclusively called from the scheduler (every second). I suppose we need to mock the scheduler because otherwise it'd use system time instead of mocked time?
<stickies-v> yeah so first we advance time by 336h (2w) to trigger mempool eviction, then we run node.syncwithvalidationinterfacequeue() to sync, and then we advance by another 36h to ensure ResendWalletTransactions() executes
<larryruane_> I'm not quite clear on why we need `mockscheduler` to back up the scheduler items to an earlier time (IIUC), why can't we instead just advance mock time?
<raj> larryruane_, one reason could be `mockscheduler` advances by a delta, where `setmocktime` sets it at a specific timepoint.. So I think they are both essentially same?
<glozow> to answer the question of "why do we need to mock the scheduler too," there are a few ways of answering. One is to remove that line from the test and see what happens. Another is to look at how the timepoints in `CScheduler::taskQueue` are handled. What exactly triggers a task to be executed?
<achow101> on the question of resend vs reaccept: these functions were introduced in the first commit (or thereabouts) where everything was very tightly coupled together. At that point in time, the mempool was not in charge of deciding what got relayed and when. So Reaccept meant that the transactions would be re-added to the mempool after startup, but not necessarily rebroadcast. Resend meant that the transactions were actually being rebroadcast;
<achow101> it would send out invs. Over time, as the mempool began to be in charge of tx relay, adding to the mempool also meant potentially relaying, and resending turned into adding to the mempool. And so now they are not really different and could be deduplicated.