The PR branch HEAD was 32748da0f4 at the time of this review club meeting.
Notes
Coin Selection refers to the process of
picking UTXOs (or coins) from the wallet’s UTXO pool to fund a transaction. It is a complex
problem that involves minimizing short term and long term fees, working with non-guaranteed finality
of payments, and avoiding privacy leaks. We have covered coin selection in previous review clubs:
#17331, #17526, and #18418.
KnapsackSolver is Bitcoin Core’s legacy coin selector that reduces the problem to Subset
Sum and attempts to solve it through 1000 rounds of stochastic approximation. As long as
the wallet has enough funds to cover the transaction, KnapsackSolver always finds a
solution. It can overshoot, but the wallet just creates a change output to redeem the excess.
SelectCoinsBnB uses a Branch and Bound
algorithm to explore a bounded search tree of potential solutions, scoring them with a
metric called “waste.” Notably, the Branch and Bound algorithm looks for an exact
solution and never produces a change output. As such, it’s possible for SelectCoinsBnB
to fail even though the wallet has sufficient funds.
Other coin selection algorithms have also been proposed, such as Single Random Draw in
PR #17526.
The current behavior in
AttemptSelection()
unconditionally prefers the Branch and Bound solution, and only attempts to use KnapsackSolver if
SelectCoinsBnB fails. PR #22009 implements a
GetSelectionWaste() function and changes AttemptSelection() to try both solvers and pick the
solution with lower waste, breaking ties by preferring a greater number of inputs.
Waste is measured in satoshis and includes the cost of creating change, the excess selection
amount, and cost of spending inputs now as opposed to sometime in the future (when we
expect to be able to consolidate inputs).
Cost of change includes the fees paid on this transactions’ change output plus the fees
that will need to be paid to spend it later. If there is no change output, the cost is 0.
Excess selection amount refers to the difference between the sum of selected inputs and
the amount we need to pay (the sum of output values and fees). There shouldn’t be any
excess if there is a change output.
Cost of spending inputs now is the fee difference between spending our selected inputs at
the current feerate and spending them later at the “long term feerate.” This helps us
implement a long term fee-minimization strategy by spending fewer inputs in high feerate
times and consolidating during low feerate times.
PR #22009 also sets the default long term feerate
to 10 sats/vbyte, and creates a configuration option, -consolidatefeerate. The long term feerate
represents the feerate at which we would be happy to consolidate UTXOs in transactions.
Without looking at the waste calculation function, what properties would you consider when
comparing different coin selection solutions? How might you quantify them?
What is the waste calculation equation? How can we verify that the implementation in
GetSelectionWaste() is correct?
In what scenario would a coin selection solution have waste == 0?
Can a coin selection solution have waste < 0? How?
<glozow> First question (good for those who didn't review the PR as well): What properties would you consider when comparing different coin selection solutions? How might you quantify them?
<glozow> lightlike: that's an interesting one. people are sometimes surprised by how fees scale in bitcoin - what's scarce to us isn't liquidity but block space, so the fees scale with the size of the transaction rather than the amount being transacted
<larryruane> I learned this from the notes, but fascinating to think about how there may be a benefit to NOT spending a particular output now (assuming fees are currently high), given that we may be able to consolidate it later when fees are low
<raj> Its seems to me like the first part can be thought of as "Consolidation Factor" and the second is "Money burned in the process" so kind of a "Cost".
<sipa> glozow: privacy (avoiding merging and/or avoiding spending reused, to the extent possible) is another criterion for coin selection i think, but a hard to quanify one
<glozow> sipa: indeed. I would classify "not producing a change output" as slight win for privacy, and it would be interesting to see that quantified in a waste metric
<sipa> larryruane: i'd generally call "address reuse" the act of giving out the same address multiple times and/or peforming multiple (expected) payments to one
<murchandamus> larryruane: Bitcoin Core wallet goes out of its way to spend UTXOs associated with the same scriptPubKey together in one transaction so that there will not be multiple txns associated with the same scriptPubKey
<raj> glozow, I think maybe in this way "Waste = Opportunity Cost of waiting + Cost of Creation". Cost of Creation is always positive, while "Opportunity Cost of waiting" can be negative too..
<larryruane> would we somehow be able to capture the GOOD that a slight increase in fees does (in the case that we don't want to create a change output), in getting the tx mined more quickly? Lots of angles to all this!
<theStack> if we get a perfect solution via the BnB solver, both the cost of change and excess selection amount are always zero (since there is no change output). did i get that right?
<murchandamus> glozow: I was trying to differentiate between the actual cost of spending UTXOs and how they're scored by the waste metric. Not sure where you see "cost to spend".
<murchandamus> When actual feerate is equal to long term feerate, how does the number and type of inputs impact the waste score? What does that mean for the input count vs excess optimization?
<schmidty> Since waste was a metric introduced by BnB and this PR introduces a new GetSelectionWaste method, are there two different types of "waste" now? If the two types of waste are the same, should BnB use the new GetSelectionWaste for calculations?
<benthecarman> the bnb implementation wasn't touched in this PR to use `GetSelectionWaste`, would that be a good follow up PR to reduce code duplication
<murchandamus> glozow: I need to think more about it, but from the top of my gut I would say that when the waste score is equal, two changeless input set candidates would cost the same fees
<murchandamus> It's a bit more complicated when comparing a changeless solution with one that produces change, but there it would cause the changeless to be preferred, I think
<glozow> murchandamus: when you have long term feerate == effective feerate, two solutions can pick a different number of inputs and end up with the same excess, no?
<lightlike> i wonder why the long-term fee default value was changed to a fixed value. would the whole thing work less well with a dynamic estimate based on the last X blocks, as was in place before?
<raj> murchandamus, when you say the selction algo switches mode (consolidate or reduce fee) is that simple choice between BnB and KnapSack or something more going on?
<murchandamus> raj: It doesn't switch the preferred algorithm but it switches whether it prefers the solution with more inputs or fewer inputs (via the waste metric)
<murchandamus> raj: Since inputs get a negative score below the long term feerate, a candidate input set with more input size would be preferred over one with less input size at low feerates, whereas the opposite is true at higher feerates. This shifts some of the UTXO consumption to lower feerates overall saving cost
<glozow> benthecarman: kind of along those lines, maybe we can measure privacy based on the difference between the payment amount and change output amount
<raj> murchandamus, So would this be correct to say, just having a waste metric is not enough to ensure this tendency of the wallet, it also has to be used correctly?
<murchandamus> but we only pick from two results, so it's not like we build an actual consolidation transaction at low feerates and a minimized tx at high feerates. It's just a small bias.
<lightlike> if everyone in the world used the bitcoin core algorithm, would there be some reverting-to-the-mean effect stabilizing fees around 10sats/vbyte? if the current fee rate is below this, utxos are consolidated, leading to larger transactions and less block space, driving fees back up
<murchandamus> lightlike: Good question. I have been thinking about that too. I think that could happen if there were generally more continuous blockspace demand
<glozow> murchandamus: right, but i'm asking about replacing it, so instead of only trying coins with 1 confirmation when 6+ fails, you try both and pick the one with less waste
<larryruane> glozow: I ran `test/lint/commit-script-check.sh 935b3ddf72aa390087684e03166c707f5b173434~..935b3ddf72aa390087684e03166c707f5b173434` (but I know CI does that anyway), but to review, you study the script!
<theStack> would it make sense to support multiple change outputs in the future, just for the sake of confusing on-chain analysis and increasing privacy?
<larryruane> theStack: interesting idea! Also if somehow you know that in the future you'll need inputs with a specific amount, maybe make a change output with that exact amount?