The PR branch HEAD was fac99dc at the time of this review club meeting.
Notes
Coin selection refers to the process of selecting UTXOs (or âcoinsâ) from
a walletâs available UTXO pool to fund a transaction. Generally, the goal is
to pick coins that will minimize fees for the user (now and in the long run),
help the transaction reliably confirm in a timely manner, and not leak
information about the wallet. This Stack Exchange
post
offers an overview of tradeoffs between different coin selection
strategies. We covered coin selection in a previous review club,
#17331.
PR #17526 implements Single
Random Draw (SRD) as an additional fallback coin selection strategy. SRD is
fairly straightforward: it randomly picks OutputGroups from a pool of
eligible UTXOs until the total amount is sufficient to cover the payment and
fees. Any extra funds are put in a change output.
This means that, with this PR, our coin selection will have three different
solvers: Branch and Bound (BnB), Knapsack, and Single Random Draw (SRD). Note
that some randomness is used in coin selection, so we wonât always come up
with the same solution.
The overall strategy within
SelectCoins()
(including PR #17331, on which PR #17526 is built):
Coins manually selected by the user using CoinControl are added first.
All available coins (excluding the pre-selected ones) are gathered.
CoinEligibilityFilters are used to filter these coins within
SelectCoinsMinConf(). We have a clear hierarchy of which coin selection
solutions are preferred: we first try with a restriction of at least 6
confirmations on foreign UTXOs and 1 confirmation on our own UTXOs. If no
solution is found, we try with at least 1 confirmation on all UTXOs. If that
doesnât work and the user allows spending unconfirmed change, we try that
(still requiring at least 1 confirmation on foreign UTXOs), gradually
increasing mempool ancestor limits on the unconfirmed change.
Within
SelectCoinsMinConf(),
OutputGroups are created using the CoinEligibilityFilter. There is a
clear preference for BnB and we will only try Knapsack and SRD if that
fails. However, weâll try both Knapsack and SRD together, picking the
solution with lower fees, breaking ties by number of UTXOs used.
Questions
Can you give a high-level description of the coin selection strategy
including the changes proposed in this PR?
Within SelectCoinsMinConf(), if we have both a Knapsack and an SRD
solution, how do we decide which one to use?
Why might we prefer to spend more inputs in the same transaction?
Quiz: Based on the coin selection scheme proposed here, letâs say that
Solutions A, B, and C exist (ignore the fact that we would exit early after
finding a solution weâre satisfied with). Which would we pick? (Hint: which
invocation of SelectCoinsMinConf() would each of these come from?)
Solution A: picked using Knapsack. Produces a change output, pays 100
satoshis in fees, and only uses confirmed UTXOs, each with 4 confirmations.
Solution B: picked using BnB. No change output, pays 95 satoshis in fees,
and uses one unconfirmed change output.
Solution C: picked using SRD. Produces a change output, pays 99 satoshis in
fees, and only uses confirmed UTXOs, each with 1 confirmation.
What are
OutputGroups?
Why does SRD pick from output groups rather than from UTXOs?
What does calling GroupOutputs() with positive_only=true do (Hint: you
may want to review what effective values are)? What could happen if
SelectCoinsSRD() was called with all_groups instead of positive_groups?
What are some ways a deterministic coin selection algorithm might leak
information about the walletâs UTXO pool? Why do we
shufflevCoins before creating OutputGroups?
Bonus: Weâve listed some qualitative (e.g. presence of a change output) and
quantitative (e.g. number of inputs used) ways to compare coin selection
solutions. Instead of returning as soon as SelectCoinsMinConf() finds a
solution, should we try multiple and then pick one? How might we design a
metric to decide which one to use?
<glozow> Note that this PR is built on top of #17331, which we covered last week in PR Review Club. It should still be follow-along-able if you weren't here, but those notes could be helpful: https://bitcoincore.reviews/17331
<murch> Because it uses less fees, reduces the overall UTXO in the system, does not put any funds into flight in a change output, and breaks the change heuristic
<murch> A bunch of the chainalysis techniques revolve around guessing which output is the change returning excess funds from the input selection to the sender
<murch> by not returning any funds to the sender's wallet, there are future transactions directly related to this transaction (unless addresses are reused)
<larryruane_> i've always wondered, if the HD wallet sends change to a brand new address (as it should), does that harm privacy immediately? Or only later when the change is spent? (sorry if this is off-topic by now)
<darius58> reduces the memory demand for nodes keeping track of the UTXO set. I'm not sure why it's good for the user's wallet. Reduces future fees, plus less UTXOs to keep track of?
<murch> lightlike: Without a change output and under the assumption of no address reuse, all future inputs would be unrelated to this transaction's inputs.
<darius58> @murch my only question though is whether it is cheaper than another solution with less UTXOs in the input, if they had the same fee at the current moment?
<murch> larryruane_: It's a bit more complicated, but e.g. if two different scripts are used for recipient and change, it would still be obvious which one would be change if there is change
<glozow> alright, Quiz time! Based on the coin selection scheme proposed here, let's say that Solutions A, B, and C exist (ignore the fact that we would exit early after finding a solution we're satisfied with). Which would we pick?
<glozow> Solution A: picked using Knapsack. Produces a change output, pays 100 satoshis in fees, and only uses confirmed UTXOs, each with 4 confirmations.
<michaelfolkson> What's the importance of the foreign output? The danger it could be double spent? If it is an output we own we know we won't double spend ourselves?
<murch> So, Solution A and Solution C should be found in the second pass of the eligibility filter but C pays less fees. B uses an unconfirmed input which happens in a later iteration of SelectCoinsMinConf, i.e. vote for C
<glozow> Next question: What does calling `GroupOutputs()` with `positive_only=true` do (Hint: you may want to review what effective values are)? What could happen if `SelectCoinsSRD()` was called with `all_groups` instead of `positive_groups`?
<glozow> murch: good job attending last week's review club! could you break down for us, then, what it means for a coin to have a negative effective feerate?
<murch> lightlike: Does it group only UTXOs with positive values or does it only allow groups that are positive in sum or does it only create groups from scriptPubKeys that only have positively valued utxos?
<darius58> @glozow it would suck because there would be some cases where SRD would fail because it couldn't reach a high enough amount, even if it could have by not including negative-effective value UTXOs
<murch> michaelfolkson: People use "dust" to denominate various things. Some people refer to UTXOs that are uneconomic at 3 sat/vB as dust, some refer to UTXOs that are uneconomic at the current feerate as dust, some just seem to call anything below e.g. 2000 sats dust.
<michaelfolkson> glozow: Without giving specifics if a coin analysis/surveillance company knows how you are coin selecting (no randomness) they get a better idea of what output(s) are yours?
<murch> So, some coin selection strategies for example reveal the oldest UTXOs or the largest UTXOs. In the combination of address reuse or clustering via change output heuristics, or other wallet fingerprints, this can allow a sufficiently motivated adversary to guess the amount of coins, payment frequency and other things
<lightlike> murch: that sounds less than ideal to me for some use cases: if I want to pay for a small thing, I wouldn't necessarily want to make public how rich I am (if I have some large outgroups)
<murch> lightlike: that's a good point, but generally you'll have a lot more smaller UTXOs than large UTXOs, so it would be somewhat uncommon. Also if you use a small input, you'll need multiple or get back tiny change, which are also suboptimal