The PR branch HEAD was fac99dc at the time of this review club meeting.
Coin selection refers to the process of selecting UTXOs (or “coins”) from
a wallet’s available UTXO pool to fund a transaction. Generally, the goal is
to pick coins that will minimize fees for the user (now and in the long run),
help the transaction reliably confirm in a timely manner, and not leak
information about the wallet. This Stack Exchange
offers an overview of tradeoffs between different coin selection
strategies. We covered coin selection in a previous review club,
PR #17526 implements Single
Random Draw (SRD) as an additional fallback coin selection strategy. SRD is
fairly straightforward: it randomly picks OutputGroups from a pool of
eligible UTXOs until the total amount is sufficient to cover the payment and
fees. Any extra funds are put in a change output.
This means that, with this PR, our coin selection will have three different
solvers: Branch and Bound (BnB), Knapsack, and Single Random Draw (SRD). Note
that some randomness is used in coin selection, so we won’t always come up
with the same solution.
The overall strategy within
(including PR #17331, on which PR #17526 is built):
Coins manually selected by the user using CoinControl are added first.
All available coins (excluding the pre-selected ones) are gathered.
CoinEligibilityFilters are used to filter these coins within
SelectCoinsMinConf(). We have a clear hierarchy of which coin selection
solutions are preferred: we first try with a restriction of at least 6
confirmations on foreign UTXOs and 1 confirmation on our own UTXOs. If no
solution is found, we try with at least 1 confirmation on all UTXOs. If that
doesn’t work and the user allows spending unconfirmed change, we try that
(still requiring at least 1 confirmation on foreign UTXOs), gradually
increasing mempool ancestor limits on the unconfirmed change.
OutputGroups are created using the CoinEligibilityFilter. There is a
clear preference for BnB and we will only try Knapsack and SRD if that
fails. However, we’ll try both Knapsack and SRD together, picking the
solution with lower fees, breaking ties by number of UTXOs used.
Can you give a high-level description of the coin selection strategy
including the changes proposed in this PR?
Within SelectCoinsMinConf(), if we have both a Knapsack and an SRD
solution, how do we decide which one to use?
Why might we prefer to spend more inputs in the same transaction?
Quiz: Based on the coin selection scheme proposed here, let’s say that
Solutions A, B, and C exist (ignore the fact that we would exit early after
finding a solution we’re satisfied with). Which would we pick? (Hint: which
invocation of SelectCoinsMinConf() would each of these come from?)
Solution A: picked using Knapsack. Produces a change output, pays 100
satoshis in fees, and only uses confirmed UTXOs, each with 4 confirmations.
Solution B: picked using BnB. No change output, pays 95 satoshis in fees,
and uses one unconfirmed change output.
Solution C: picked using SRD. Produces a change output, pays 99 satoshis in
fees, and only uses confirmed UTXOs, each with 1 confirmation.
Why does SRD pick from output groups rather than from UTXOs?
What does calling GroupOutputs() with positive_only=true do (Hint: you
may want to review what effective values are)? What could happen if
SelectCoinsSRD() was called with all_groups instead of positive_groups?
What are some ways a deterministic coin selection algorithm might leak
information about the wallet’s UTXO pool? Why do we
shufflevCoins before creating OutputGroups?
Bonus: We’ve listed some qualitative (e.g. presence of a change output) and
quantitative (e.g. number of inputs used) ways to compare coin selection
solutions. Instead of returning as soon as SelectCoinsMinConf() finds a
solution, should we try multiple and then pick one? How might we design a
metric to decide which one to use?
<glozow> Note that this PR is built on top of #17331, which we covered last week in PR Review Club. It should still be follow-along-able if you weren't here, but those notes could be helpful: https://bitcoincore.reviews/17331
<larryruane_> i've always wondered, if the HD wallet sends change to a brand new address (as it should), does that harm privacy immediately? Or only later when the change is spent? (sorry if this is off-topic by now)
<glozow> alright, Quiz time! Based on the coin selection scheme proposed here, let's say that Solutions A, B, and C exist (ignore the fact that we would exit early after finding a solution we're satisfied with). Which would we pick?
<murch> So, Solution A and Solution C should be found in the second pass of the eligibility filter but C pays less fees. B uses an unconfirmed input which happens in a later iteration of SelectCoinsMinConf, i.e. vote for C
<glozow> Next question: What does calling `GroupOutputs()` with `positive_only=true` do (Hint: you may want to review what effective values are)? What could happen if `SelectCoinsSRD()` was called with `all_groups` instead of `positive_groups`?
<murch> lightlike: Does it group only UTXOs with positive values or does it only allow groups that are positive in sum or does it only create groups from scriptPubKeys that only have positively valued utxos?
<darius58> @glozow it would suck because there would be some cases where SRD would fail because it couldn't reach a high enough amount, even if it could have by not including negative-effective value UTXOs
<murch> michaelfolkson: People use "dust" to denominate various things. Some people refer to UTXOs that are uneconomic at 3 sat/vB as dust, some refer to UTXOs that are uneconomic at the current feerate as dust, some just seem to call anything below e.g. 2000 sats dust.
<murch> So, some coin selection strategies for example reveal the oldest UTXOs or the largest UTXOs. In the combination of address reuse or clustering via change output heuristics, or other wallet fingerprints, this can allow a sufficiently motivated adversary to guess the amount of coins, payment frequency and other things
<murch> lightlike: that's a good point, but generally you'll have a lot more smaller UTXOs than large UTXOs, so it would be somewhat uncommon. Also if you use a small input, you'll need multiple or get back tiny change, which are also suboptimal