Decide which coin selection solution to use based on waste metric (
wallet) Sep 8, 2021
PR author: achow101
The PR branch HEAD was 32748da0f4 at the time of this review club meeting.
refers to the process of
picking UTXOs (or coins) from the wallet’s UTXO pool to fund a transaction. It is a complex
problem that involves minimizing short term and long term fees, working with non-guaranteed finality
of payments, and avoiding privacy leaks. We have covered coin selection in previous review clubs:
Coin Selection #17331, #17526, and #18418.
The Bitcoin Core wallet currently implements
two coin selection
KnapsackSolver is Bitcoin Core’s legacy coin selector that reduces the problem to Subset
Sum and attempts to solve it through 1000 rounds of stochastic approximation. As long as
the wallet has enough funds to cover the transaction,
KnapsackSolver always finds a
solution. It can overshoot, but the wallet just creates a change output to redeem the excess.
SelectCoinsBnB uses a
Branch and Bound
algorithm to explore a bounded search tree of potential solutions, scoring them with a
metric called “waste.” Notably, the Branch and Bound algorithm looks for an exact
solution and never produces a change output. As such, it’s possible for
to fail even though the wallet has sufficient funds.
Other coin selection algorithms have also been proposed, such as Single Random Draw in
The current behavior in
unconditionally prefers the Branch and Bound solution, and only attempts to use
PR #22009 implements a
GetSelectionWaste() function and changes
AttemptSelection() to try both solvers and pick the
solution with lower waste, breaking ties by preferring a greater number of inputs.
Waste is measured in satoshis and includes the cost of creating change, the excess selection
amount, and cost of spending inputs now as opposed to sometime in the future (when we
expect to be able to consolidate inputs).
Cost of change includes the fees paid on this transactions’ change output plus the fees
that will need to be paid to spend it later. If there is no change output, the cost is 0.
Excess selection amount refers to the difference between the sum of selected inputs and
the amount we need to pay (the sum of output values and fees). There shouldn’t be any
excess if there is a change output.
Cost of spending inputs now is the fee difference between spending our selected inputs at
the current feerate and spending them later at the “long term feerate.” This helps us
implement a long term fee-minimization strategy by spending fewer inputs in high feerate
times and consolidating during low feerate times.
PR #22009 also sets the default long term feerate
to 10 sats/vbyte, and creates a configuration option,
-consolidatefeerate. The long term feerate
represents the feerate at which we would be happy to consolidate UTXOs in transactions.
Did you review the PR?
Concept ACK, approach ACK, tested ACK, or
Without looking at the waste calculation function, what properties would you consider when
comparing different coin selection solutions? How might you quantify them?
What is the waste calculation equation? How can we verify that the implementation in
GetSelectionWaste() is correct?
In what scenario would a coin selection solution have waste == 0?
Can a coin selection solution have waste < 0? How?
What does the
here attribute do?
How did you review the
(You did review it, right???)
What do you think of the 10 sat/vbyte consolidation feerate? What effect does this have on our
waste calculation method?
How does using
m_consolidate_feerate instead of the 1008-block feerate estimate change our
wallet’s coin selection behavior?
1 17:00 <glozow> #startmeeting
3 17:00 <glozow> Welcome to PR Review Club, everybody! Feel free to say hi, and let us know if it's your first time :)
10 17:00 <notmandatory> hi
14 17:01 <glozow> If you have any questions at any point in time, please ask!
17 17:02 <glozow> Did anyone get a chance to review the PR or look at the notes? y/n
21 17:02 <larryruane> only a little
22 17:02 <Azorcode> Hello everyone
23 17:02 <glozow> nice! :D
25 17:03 <emzy> n (just read the notes)
28 17:03 <chunkblob> also just read notes
30 17:03 <glozow> First question (good for those who didn't review the PR as well): What properties would you consider when comparing different coin selection solutions? How might you quantify them?
31 17:04 <notmandatory> transaction size
32 17:04 <theStack> first and foremost, minimizing fees, both short and long-term
33 17:05 <glozow> theStack: notmandatory: great answers
34 17:05 <svav> How much waste
35 17:05 <larryruane> not creating "dust" outputs?
36 17:05 <benthecarman> when to conslidate and when to conserve utxos
37 17:05 <glozow> larryruane: yes, absolutely!
38 17:06 <glozow> and for those who did review the PR, What is the waste calculation equation?
39 17:06 <lightlike> psychologically, minimizing the ratio of fees to total amount spent (even if it makes little sense on a technical level).
40 17:07 <glozow> lightlike: that's an interesting one. people are sometimes surprised by how fees scale in bitcoin - what's scarce to us isn't liquidity but block space, so the fees scale with the size of the transaction rather than the amount being transacted
41 17:07 <raj> waste = sum(current fee - long term fee) + Cost of spend/Excess paid in fees.
42 17:08 <benthecarman> (change_cost ==0 ? excess: change_cost )+ inputs * (effective_feerate - long_term_feerate)
43 17:08 <larryruane> I learned this from the notes, but fascinating to think about how there may be a benefit to NOT spending a particular output now (assuming fees are currently high), given that we may be able to consolidate it later when fees are low
44 17:08 <raj> Its seems to me like the first part can be thought of as "Consolidation Factor" and the second is "Money burned in the process" so kind of a "Cost".
46 17:09 <sipa> glozow: privacy (avoiding merging and/or avoiding spending reused, to the extent possible) is another criterion for coin selection i think, but a hard to quanify one
47 17:09 <sipa> i summoned a murch1 here
48 17:09 <glozow> sipa: indeed. I would classify "not producing a change output" as slight win for privacy, and it would be interesting to see that quantified in a waste metric
49 17:10 <larryruane> sipa: is spending reuse the same as address reuse?
50 17:10 <murchandamus> glozow: Not only a win for privacy, but also a reduction of current and overall fees
51 17:10 <glozow> larryruane: raj: yeah, i particularly liked how the waste metric captures the "feerates now vs when we'd want to consolidate" part
52 17:11 <glozow> murchandamus: yes, of course
53 17:11 <murchandamus> sorry, just catching up on previous convo
54 17:11 <sipa> larryruane: i'd generally call "address reuse" the act of giving out the same address multiple times and/or peforming multiple (expected) payments to one
55 17:11 <murchandamus> I see that this has been mentioned ':-)
56 17:12 <glozow> raj: benthecarman: 👌, would you be able to break that down into english for us? :)
57 17:12 <larryruane> sipa: thanks, i was just wondering if "spending reuse" is a different concept (which I haven't heard of)
58 17:13 <sipa> larryruane: say someone sends you a ton of dust to an address you've spent from already
59 17:13 <sipa> perhaps it's worth avoiding spending that dust, beyond the normal level that would otherwise be implied by fee minimization and waste metric
60 17:13 <glozow> people can send u money without your consent?!!?
61 17:13 <murchandamus> larryruane: Bitcoin Core wallet goes out of its way to spend UTXOs associated with the same scriptPubKey together in one transaction so that there will not be multiple txns associated with the same scriptPubKey
62 17:13 <glozow> even when ur offline???
63 17:14 <glozow> ok so back to the waste metric, can anyone tell me what "excess" is?
64 17:14 <theStack> glozow: :D
65 17:14 <emzy> glozow: hehe
66 17:14 <murchandamus> glozow: No, as the seeress Francis has established, you cannot receive while offline. Duh.
67 17:14 <raj> glozow, I think maybe in this way "Waste = Opportunity Cost of waiting + Cost of Creation". Cost of Creation is always positive, while "Opportunity Cost of waiting" can be negative too..
68 17:15 <raj> If current_fee < Long_term_fee , the "Opportunity Cost of waiting" is negative.. So we should not wait and do it now..
69 17:15 <raj> Not sure if it makes sense totally though.
70 17:15 <glozow> raj: yes, i agree with that breakdown. cost of creation being the excess or change cost
71 17:16 <lightlike> "excess" is if we don't make a change output and instead add the difference to the fees.
72 17:16 <theStack> excess = input_values - output_values - fees_needed
73 17:16 <glozow> lightlike: correcto
74 17:16 <glozow> and what is cost of change?
75 17:16 <murchandamus> right: since creating change costs money, we allow for a small overshoot that we drop to fees instead of creating change
76 17:16 <benthecarman> how much fees we are paying to create a change output
77 17:17 <murchandamus> glozow: Usually either a paradigm shift or a revolution
78 17:17 <glozow> benthecarman: just to create? :)
79 17:17 <raj> cost_of_change = Cost to "spend" the change? Or to "create" the change?
80 17:17 <benthecarman> and spend in the future
81 17:17 <benthecarman> change_cost = effective_feerate * change_output_size + long_term_feerate * change_spend_size
82 17:17 <glozow> benthecarman: right, exactly!
83 17:17 <murchandamus> glozow: the cost of creating the change at the current feerate, and the cost of later spending that UTXO
84 17:17 <raj> benthecarman, Ah thanks..
85 17:18 <larryruane> would we somehow be able to capture the GOOD that a slight increase in fees does (in the case that we don't want to create a change output), in getting the tx mined more quickly? Lots of angles to all this!
86 17:18 <glozow> so does it make sense to have both excess and cost of change be greater than 0?
87 17:18 <benthecarman> No, if you have a change output than your excess should be 0
88 17:18 <murchandamus> larryruane: Well, assuming that the feerate estimate was good, it's just an overpayment, but maybe the next block is a bit slow...
89 17:18 <glozow> larryruane: we set the feerate ahead of time
90 17:18 <murchandamus> Well, it's hard to calculate
91 17:19 <glozow> benthecarman: right, exactly
92 17:19 <glozow> In what scenario would a coin selection solution have waste == 0?
93 17:19 <murchandamus> I can think of at least two :)
94 17:19 <glozow> or should i say, scenarios - there are multiple ways this is possible of course
95 17:20 <benthecarman> if you have a change output and long term fee rate == fee rate, or if excess = 0 and long term fee rate == fee rate
96 17:20 <glozow> murchandamus haha jinx
97 17:20 <murchandamus> benthecarman: No, the change output cost would still increase the waste score
98 17:20 <glozow> benthecarman: yes to the second example
99 17:21 <glozow> i.e. the stars aligned and the BnB solver produced a perfect solution, AND the effective feerate == long term feerate
100 17:21 <benthecarman> oh right
101 17:22 <murchandamus> If the feerate is below the long term feerate and the inputs' score matches excess or cost of change you can also hit a 0
102 17:22 <glozow> or, it produced an imperfect solution, but subtract fees from outputs was on, and the excess was absorbed nicely...
103 17:22 <murchandamus> *negatively matches
104 17:22 <theStack> if we get a perfect solution via the BnB solver, both the cost of change and excess selection amount are always zero (since there is no change output). did i get that right?
105 17:22 <glozow> murchandamus: right, if cost to spend now is negative and equal to the excess or cost of change
106 17:22 <glozow> then they'll balance each other out
107 17:22 <murchandamus> theStack: Yes
108 17:23 <murchandamus> glozow: Nit: not the cost to spend, but the waste score
109 17:23 <murchandamus> You're still paying for the inputs ;)
111 17:23 <benthecarman> lightlike: haha looks like it
112 17:23 <murchandamus> lightlike: good catch
113 17:23 <glozow> murchandamus: i'm using "waste" as defined in GetSelectionWaste(), so it'd be confusing to call cost to spend waste scorfe?
114 17:24 <glozow> lightlike: indeed! haha
115 17:24 * murchandamus gets out his pitchfork starts looking for achow101
116 17:24 <glozow> How might we verify that `GetSelectionWaste` is implemented as specified?
117 17:24 <glozow> (how did you review it?)
118 17:25 <benthecarman> Tests!
119 17:25 <larryruane> unit tests with very specific inputs? (I didn't review it)
120 17:25 <raj> Printed the test results and matched by hand calculations..
121 17:26 <glozow> benthecarman: larryruane: raj: good answers
122 17:26 <murchandamus> glozow: I was trying to differentiate between the actual cost of spending UTXOs and how they're scored by the waste metric. Not sure where you see "cost to spend".
123 17:27 <glozow> as u can see, my method is to host a pr review club (u can too! contact jnewbery)
124 17:27 <glozow> we kind of already covered this, but: Can a coin selection solution have waste < 0? How?
125 17:27 <murchandamus> Indubitably!
126 17:27 * murchandamus leaves how to someone else
127 17:28 <benthecarman> if fee_rate < long_term_fee_rate
128 17:28 <lightlike> yep, if current ees are low enough compared to the long-term fee rate to overcome the excess or cost of change
129 17:28 <larryruane> murchandamus: never use a big word when a dimunitive one would do
130 17:28 <benthecarman> * and cost of change/excess doesn't bring it over
131 17:28 <raj> glozow, No Excess && No Chanage && Fee < Long term fee?
132 17:28 <glozow> yep yep!
133 17:29 <murchandamus> larryruane: "diminutive"? :p
135 17:29 <larryruane> murchandamus: 👍
136 17:29 <larryruane> ah, that's so the caller can't ignore the function return value! (without a warning at least)
138 17:29 <theStack> it's for telling the compiler that we'd like to get noticed if we don't use the return value
139 17:30 <murchandamus> Oooh, glozow I have another question, may I?
140 17:30 <glozow> larryruane: theStack: yes! :D
141 17:30 <benthecarman> your return value says notice me senpai
142 17:30 <glozow> murchandamus: yes go ahead
143 17:30 <larryruane> however you can cast the return to void .. what's the best way in c++? I'm used to c, where it's just `(void)`
144 17:30 <raj> benthecarman, haha.. They should put this in cpprefernce..
145 17:30 <murchandamus> When actual feerate is equal to long term feerate, how does the number and type of inputs impact the waste score? What does that mean for the input count vs excess optimization?
146 17:31 <benthecarman> ooh good question
147 17:31 <theStack> larryruane: according to the cppreference link glozow shared it seems to work the same way in C++... cast to void
148 17:31 <glozow> ooooh nice one. i also forgot to ask "how do we break ties when waste is equal?"
150 17:31 <raj> there goes my tomorrow morning.. :D
151 17:31 <larryruane> theStack: I was just wondering the preferred syntax to do that
152 17:32 <benthecarman> It shouldn't impact the waste score, so we should use results that have less total fees for tie breaks?
153 17:32 <theStack> larryruane: ah sorry, i misinterpreted your question. you mean like if there is something like "static_const<void>" or similar
154 17:33 <larryruane> theStack: yes
155 17:33 <murchandamus> benthecarman: Actually, I think it prefers the solution that uses more inputs
156 17:33 <glozow> benthecarman: right, the number of inputs wouldn't impact the waste score
157 17:33 <benthecarman> murchandamus: why is that?
158 17:33 <murchandamus> However, if the waste score is the same, how do the fees compare?
159 17:34 <larryruane> murchandamus: is that so we reduce the size of the UTXO set? to help the community?
160 17:34 <murchandamus> Yes, we err on being more consolidatory
161 17:34 <glozow> if the waste score is the same, you prefer the one that has more inputs, which you are indeed paying more fees on
162 17:34 <schmidty> Since waste was a metric introduced by BnB and this PR introduces a new GetSelectionWaste method, are there two different types of "waste" now? If the two types of waste are the same, should BnB use the new GetSelectionWaste for calculations?
163 17:35 <benthecarman> i guess that makes sense, if your long_term_fee_rate is what you expect to pay, then you would want to conslidate then
164 17:35 <glozow> schmidty: good question
165 17:35 <murchandamus> schmidty: It's the same waste metric, it has just been generalized to apply to all sorts of selection results
166 17:36 <benthecarman> the bnb implementation wasn't touched in this PR to use `GetSelectionWaste`, would that be a good follow up PR to reduce code duplication
167 17:36 <schmidty> Should BnB use that generalized method internally? murchandamus
168 17:36 <schmidty> benthecarman: yes that’s what Im getting at.
169 17:36 <murchandamus> glozow: I need to think more about it, but from the top of my gut I would say that when the waste score is equal, two changeless input set candidates would cost the same fees
170 17:37 <murchandamus> It's a bit more complicated when comparing a changeless solution with one that produces change, but there it would cause the changeless to be preferred, I think
171 17:37 <glozow> schmidty: indeed. it would be bad for those calculations to diverge
172 17:37 <benthecarman> murchandamus: if you were spending different output types they wouldnt be
173 17:37 <murchandamus> schmidty: If it does not yet, it definitely should
174 17:38 <glozow> murchandamus: when you have long term feerate == effective feerate, two solutions can pick a different number of inputs and end up with the same excess, no?
175 17:38 <glozow> or same change cost
176 17:38 <murchandamus> glozow: True in the case of being right on the boundary
177 17:40 <murchandamus> benthecarman: You sure? ;)
178 17:40 <benthecarman> now im not lol
179 17:41 <glozow> can anyone think of other things that could be added to the waste measurement?
180 17:41 <glozow> feel free to throw out ideas
181 17:41 <benthecarman> privacy, hard to quantify though
182 17:41 <glozow> for example, I wonder if we would want to weight cost of change vs excess differently, given that one has a change output and one doesn't
183 17:41 <murchandamus> benthecarman: Let's take it offline, it might take a while to pick apart
184 17:42 <murchandamus> glozow: What do you propose concretely? :)
185 17:42 <lightlike> i wonder why the long-term fee default value was changed to a fixed value. would the whole thing work less well with a dynamic estimate based on the last X blocks, as was in place before?
186 17:42 <glozow> e.g. if you scaled the cost of change by 1.1
187 17:43 <murchandamus> Would be wonderful if we had some heuristic to quantify privacy
188 17:43 <glozow> lightlike: i thought that was interesting as well, though i imagine that the long term fee estimate is usually about the same
189 17:43 <murchandamus> Hard, though, I think
190 17:43 <notmandatory> as I think sipa implied not spending all utxos to the same script could be a negative (in privacy terms)
191 17:44 <murchandamus> lightlike: the problem was that the 1008 block target is basically always 1 sat/vB if the mempool has cleared once in the last week
192 17:44 <glozow> mm, i think we already have a countermeasure to that type of dust attack
193 17:44 <murchandamus> So it would never actually switch between consolidatory and thrifty mode
195 17:45 <murchandamus> glozow: But the cost of change is already fairly dissuading since it counts both the creation and an assumed long term cost
196 17:46 <murchandamus> Although, yeah, avoiding change could definitely be encouraged ;)
197 17:46 <notmandatory> glozow: +1 thanks
198 17:46 <raj> murchandamus, when you say the selction algo switches mode (consolidate or reduce fee) is that simple choice between BnB and KnapSack or something more going on?
199 17:47 <benthecarman> maybe something you'd want to incorporate into waste is coins days destroyed, if you only want to spend recently received coins
200 17:48 <glozow> raj: murch means switching into "we want to spend more inputs to consolidate them" mode
201 17:48 <murchandamus> raj: It doesn't switch the preferred algorithm but it switches whether it prefers the solution with more inputs or fewer inputs (via the waste metric)
202 17:48 <glozow> benthecarman: that's interesting, like you always prefer to spend more recently received ones?
203 17:48 <raj> glozow, Ah.. And can you point me where exactly this switch logic happening?
204 17:49 <murchandamus> benthecarman: Why'd you want to do that?
205 17:49 <benthecarman> yeah maybe for privacy reasons or something
206 17:49 <glozow> raj: it's in the waste metric calculation, it depends on what the effective feerate is
207 17:49 <murchandamus> Wouldn't that mean that fewer funds are moved much more often?
208 17:49 <glozow> when effective feerate < long term feerate, the switch happens
209 17:49 <murchandamus> I.e. seems like privacy detriment more than a boon
210 17:49 <benthecarman> I'm not too sure, just throwing idea out for other metrics to add to waste
211 17:49 <glozow> you could also use coin control to manually pick the recent coins you want to use
212 17:50 <benthecarman> yeah I guess you would spend lots of the same funds often
213 17:50 <murchandamus> benthecarman: Sorry, brainstorm on!
214 17:51 <benthecarman> Another thing is maybe change output size, maybe you don't want to doxx your 50 bitcoin output when buying coffee
215 17:51 <benthecarman> change output value*
216 17:51 <notmandatory> murchandamus and benthecarman: maybe preferring oldest utxos is better, at least makes pruned nodes smaller?
217 17:51 <murchandamus> raj: Since inputs get a negative score below the long term feerate, a candidate input set with more input size would be preferred over one with less input size at low feerates, whereas the opposite is true at higher feerates. This shifts some of the UTXO consumption to lower feerates overall saving cost
218 17:52 <raj> murchandamus, Oh i see. So its kinda happens implicitly?
219 17:52 <murchandamus> raj: yes!
220 17:52 <larryruane> notmandatory: I don't think pruning is related to UTXO db
221 17:53 <murchandamus> raj: Well, BnB does it during it's search also, so it'll tend to find a waste score optimized solution among the possible ones
222 17:53 <glozow> benthecarman: kind of along those lines, maybe we can measure privacy based on the difference between the payment amount and change output amount
223 17:53 <glozow> one aspect of privacy i mean
224 17:53 <glozow> (just throwing out random ideas)
225 17:54 <raj> murchandamus, So would this be correct to say, just having a waste metric is not enough to ensure this tendency of the wallet, it also has to be used correctly?
226 17:54 <murchandamus> but we only pick from two results, so it's not like we build an actual consolidation transaction at low feerates and a minimized tx at high feerates. It's just a small bias.
227 17:54 <glozow> murchandamus: do you think we could use waste metric in place of the sequential calls to AttemptSelection()?
228 17:55 <glozow> well we could, but i mean to ask if you think it's a good idea*
229 17:55 <raj> murchandamus, got it.. thanks..
230 17:55 <lightlike> if everyone in the world used the bitcoin core algorithm, would there be some reverting-to-the-mean effect stabilizing fees around 10sats/vbyte? if the current fee rate is below this, utxos are consolidated, leading to larger transactions and less block space, driving fees back up
231 17:55 <murchandamus> oh, since knapsack was upgraded to use effective feerates, it should (almost) always find a solution if one is possible.
232 17:56 <murchandamus> So "attempt selection" should only fail if there are insufficient funds
233 17:56 <glozow> I wanted to ask this question from the notes: How did you review the scripted-diff commit? (You did review it, right???)
234 17:57 <murchandamus> lightlike: Good question. I have been thinking about that too. I think that could happen if there were generally more continuous blockspace demand
235 17:57 <glozow> murchandamus: right, but i'm asking about replacing it, so instead of only trying coins with 1 confirmation when 6+ fails, you try both and pick the one with less waste
236 17:57 <raj> glozow, I just looked.. And didn't knew what else to do.. Wanna know how to review them quickly..
237 17:57 <murchandamus> currently, we have a lot of gaps where the mempool actually empties out completely
238 17:57 <benthecarman> raj: there is a script you can run in the commit message
239 17:57 <larryruane> glozow: I ran `test/lint/commit-script-check.sh 935b3ddf72aa390087684e03166c707f5b173434~..935b3ddf72aa390087684e03166c707f5b173434` (but I know CI does that anyway), but to review, you study the script!
240 17:58 <murchandamus> glozow: Oh, I see, I guess one could
241 17:58 <glozow> larryruane: good yes
242 17:58 <glozow> CI will verify the script is correct for you, but you should review the script
243 17:58 <murchandamus> I thought "AttemptSelection" was the thing that looped Knapsack if the fees were insufficient after finding a solutino
244 17:59 <raj> benthecarman, ok in that way I can repro the changes? But then I have to manually see if all required changes are covered?
245 17:59 <theStack> would it make sense to support multiple change outputs in the future, just for the sake of confusing on-chain analysis and increasing privacy?
247 18:00 <murchandamus> glozow: I guess that would basically compound to just not preferring 6 confs especially
248 18:00 <glozow> so for instance, if the script was replacing `filter_standard` with `filter_confirmed` that would be wrong, even though the linter passed
249 18:00 <murchandamus> theStack: There are reasons to create multiple outputs occasionally, yes.
250 18:00 <larryruane> theStack: interesting idea! Also if somehow you know that in the future you'll need inputs with a specific amount, maybe make a change output with that exact amount?
251 18:01 <glozow> uhoh it's time already
252 18:01 <glozow> #endmeeting