The PR branch HEAD was b2e2829 at the time of this review club meeting.
Notes
A wallet rescans the blockchain in order to find transactions related
to its keys. A wallet starts rescanning the blockchain upon the import
of keys, descriptors, etc. A rescan can also be invoked by calling
rescanblockchain.
Wallet rescans were previously discussed in review clubs
#15845 and #25957.
When a wallet is rescanning, it needs to be able to top-up its keypool.
A walletâs keypool is its reserve of unused keys of a denoted size (the
default size is 1000 keys). When a wallet is using BIP 32
to generate deterministic keys, an unused key is considered to be any key
whose BIP 32 index is greater than the index of the last key which was
provided to the user by the wallet or which received funds.
Wallets can optionally be encrypted using a passphrase. Only the
private key material of the wallet is encrypted using a user-provided
passphrase.
An encrypted wallet is locked by default;
it can be unlocked by calling walletpassphrase,
which then stores the passphrase in memory for a specified amount of
time before the wallet is locked again
Certain RPCs which make use of information relating to private keys, such as
sendtoaddress and signrawtransactionwithwallet
require the wallet to be unlocked in order to work correctly. If the wallet is
not unlocked, the RPC typically throws an error.
The cs_walletmutexguards access to the passphrase when stored in memory. For example, cs_wallet must be
held before the passphrase can be deleted, so
holding the cs_wallet mutex for the duration of the RPC ensures another thread
cannot delete the passphrase while it is in use.
The rescanblockchain and importdescriptors RPCs donât hold
cs_wallet during the entire rescan, only for certain relatively fast
operations, because a rescan can take over an hour, and the user might
want to use other RPCs in the meantime which require the cs_wallet
lock to be held.
Why can wallets that use unhardened derivation top up the keypool
without being unlocked? Would this problem occur when rescanning a
descriptor wallet with descriptors generated by Bitcoin Core?
Does the test
added to wallet_importdescriptors.py fail on master for you? Can you
think of any ways to make it fail on master more reliably?
What is this PRâs approach to fixing this bug? Can you think of any alternatives?
Why was the m_scanning_with_passphrase variable added to CWallet?
Why is wallet.m_scanning_with_passphrase only set to true in some
RPCs and not others like importpubkey and importprivkey?
What would happen if the first commit
of this PR was removed and a user tried to run walletlock while
performing a rescan that required the passphrase?
Why was another mutex added instead of just using cs_wallet?
Why is RecursiveMutex used here as opposed to Mutex?
Why does m_relock_mutex get locked twice in walletlock?
<LarryRuane> while either of the 2 long-running RPCs is running, the wallet lock can timeout (wallet locks), which prevents key-topoff (those long-running RPCs create a need for topoff)
<LarryRuane> one thing I'm unclear on is what happens (without the PR) if the key topoff wants to run but the wallet is locked? Does the long-running RPC immediately fail?
<LarryRuane> when rescanning, if you detect a payment to one of your addresses, you want to make sure you're looking for payments to the next 1000 addresses ... may need to generate more addresses
<rot13maxi> Will other signing operations with the wallet fail/error if the wallet is in the middle of rescanning? In other words, if my wallet normally re-locks after 30 seconds, but is "held open" because it's rescanning, could an RPC to send a transaction come in while its rescanning, or would that RPC fail because there's a scan-in-progress?
<ishaanam[m]> LarryRuane: yes, with unhardened derivation, the "keypool" is basically the xpub, because private information is not needed to derive additional keys.
<rot13maxi> if other signing RPCs don't currently fail/error when a rescan is in progress, then we might want to have the ` if (pwallet->IsScanningWithPassphrase())` guard on those other operations, right?
<ishaanam[m]> rot13maxi: yes, I wonder if that is a bug or if there is a deliberate reason for doing that. I think the main problem is that scanning RPCs don't hold cs_wallet, but signing RPCs do.
<LarryRuane> while rescanning, we're discovered UTXOs that we own (have keys for), but it should be okay to, for example, `sendmany` which would just use the UTXOs that we do know about? (rather than waiting for the rescan to complete?)
<ishaanam[m]> LarryRuane: I think so, I would have to double check with the code but I can't see any immediate problems on master with spending while rescanning
<pablomartin> i still have to perform the tests... but talking about it, is it worth to add more coverage regarding the signing RPCs during rescan and other situations/ combinations??
<ishaanam[m]> LarryRuane: Yeah, when writing this I had played around with using mocktime but it didn't seem to make a difference. However, if you can think of any ways to make this more reliable please let me know! Perhaps you can increase the "range" and "next_index" values in the descriptor object that the scan is being done for?
<LarryRuane> for anyone who may be newer here, you can see the PR changes many `LOCK` to `LOCK2` (to include the new `m_relocl_mutex`) -- the order of those `LOCK2` arguments is very important!
<ishaanam[m]> pablomartin: Currently it is somewhat difficult for the functional test suite to have a rescan going on in the backgroung while other RPCs are being run, but I think I remember there being a PR open for fixing that.
<ishaanam[m]> Right, so I think that the new recursive mutex is the most important part of this PR. Why is âm_relock_mutexâ added instead of just using âcs_walletâ?
<ishaanam[m]> Theoretically while âcs_walletâ could be used, âm_relock_mutexâ is added because rescanning a wallet can take a long time, and we would still want other RPCs that lock âcs_walletâ to be able to run.
<LarryRuane> RecursiveMutex allows the same thread that already owns the lock to lock it again ... it's only unlocked on the last unlock (that is, there's a lock recursion counter)
<LarryRuane> but there's a general push to move away from RecursiveMutex because it encourages sloppy code ... I'm not sure why a `Mutex` wouldn't work here (i saw that there's a comment on the PR that you replied to)
<ishaanam[m]> LarryRuane: it is because sometimes âcs_walletâ is locked, and then âCWallet::Lockâ is called, which locks âm_rescan_mutexâ and then âcs_walletâ, which is inconsistent from a few other places
<ishaanam[m]> glozow: yes, but if there is a strong preference towards Mutex then I can revisit the code and perhaps do it differently such that this is no longer a concern. This would require a few more changes then what is there currently from what I can tell.
<glozow> yes, from quick glance it seems `Lock()` could require `cs_wallet` to already be held and that could solve the problem. but i'm scope creeping
<ishaanam[m]> What would happen if the first commit of this PR was removed and a user tried to run walletlock while performing a rescan that required the passphrase?
<ishaanam[m]> Really what this question is asking is: if none of the encryption RPCs listed in the PR ran âIsScanningWithPassphraseâ, would happen when locking a wallet during a rescan?
<LarryRuane> i think those can be really tough questions in general ... if a request (probably an RPC let's say) can't be satisfied immediately, is it better to block or to return an error?
<LarryRuane> i don't know if bitcoin core has a convention for that ... maybe it's just case-by-case ... depending on how long the delay would typically be?