Use effective values throughout coin selection (wallet)

Host: Xekyo  -  PR author: achow101

The PR branch HEAD was 4ac1add at the time of this review club meeting.


  • Wallets maintain their balance as a pool of Unspent Transaction Outputs (UTXOs). An OutputGroup collects all UTXOs that were sent to the same public key.

  • Coin selection refers to the process of picking UTXOs from the wallet’s UTXO pool to fund a transaction.

  • Beyond the primary goal of funding a transaction, the secondary goals of coin selection are:

    • Minimizing short term and long term fees: We have to balance two needs: we want to pay the smallest fee now to get a timely confirmation, but we have to keep in mind that all of the wallet’s UTXOs will need to be spent at some point. We shouldn’t over-optimize locally at the detriment of large future costs. E.g. always using largest-first selection minimizes the current cost, but grinds the wallet’s UTXO pool to dust.

    • Maintaining financial privacy: There are a number of heuristics that tracking companies employ to cluster payments and addresses. For example, using inputs with the same output script in two transactions indicates that the two transactions involved the same party. Similarly, it is usually assumed that all inputs of a transaction were controlled by the same entity. Sometimes the privacy and economic considerations are opposed. If you want to understand privacy considerations better, check out the Privacy article on the Bitcoin wiki.

    • Reliably move the payment(s) to finalization: Using unconfirmed inputs can make transactions unreliable. Unconfirmed transactions received from another wallet may time out or be replaced, making those funds disappear. Even using self-sent unconfirmed funds may delay the new transaction if the parent transaction has an extensive ancestry, is extraordinarily large, or used a lower feerate than targeted for the child transaction.

  • The process of creating a transaction in Bitcoin Core’s wallet roughly follows these steps:

    1. CreateTransaction() builds the header and recipient outputs, i.e. the parts of the transaction that will remain fixed throughout the coin selection process.

    2. The long_term_feerate, effective_feerate, and discard_feerate are established using GetMinimumFeeRate(), which falls back either to fee estimation or user input. The feerates remain constant for the remaining process. The amount to be raised from the inputs, nTargetValue is calculated from the sum of recipient amounts, the target feerate, and the size of the fixed transaction parts.

    3. The wallet’s UTXO pool is retrieved via AvailableCoins().

    4. Using the above fixed values, SelectCoins() is used to pick an input set to fund the transaction. If the user manually selected some UTXOs by means of the CoinControl feature, these UTXOs are used first. The remaining UTXOs are grouped into OutputGroups. The grouped UTXOs are then used in one or multiple rounds of SelectCoinsMinConf(), using increasingly permissive CoinEligibilityFilters to preselect OutputGroups until an input set is found.

    5. Determine whether to create a change output, finish assembling the transaction, and sign the transaction.

  • When a UTXO is selected as an input, the size of the transaction is increased, which in turn increases the necessary fees to maintain the targeted feerate. We can preempt the cost added by the input by considering each UTXO at its effective value, its value reduced by the UTXO’s input cost at the given feerate.

  • Bitcoin Core currently uses two different solvers:

    • The knapsack solver sorts the available UTXOs by effective value, and then combines candidate input sets by randomly selecting or skipping inputs until it has sufficient funds for the transaction. When it finds a viable solution, it keeps the solution if it produces a smaller change than the prior stored solution. Then, the algorithm deselects the last input and continues traversing the UTXO list as described, trying to converge on the smallest change output above MIN_CHANGE. This random process is repeated up to 1,000 times.

    • The Branch and Bound solver (BnB) deterministically searches the complete combination space to find an input set that will avoid the creation of a change output. It performs a depth-first search on a binary tree where each branch represents the inclusion or exclusion of a UTXO, exploring inclusion branches first, and backtracking whenever a subtree cannot yield a solution. It returns the first discovered input set that is an exact match for the funding requirement of the transaction. The qualifier “exact match” here refers here to an input set that overshoots the nTargetValue by less than the cost of a change output. The BnB algorithm is not guaranteed to find a solution even when there are sufficient funds since an exact match may not exist.

  • Using effective values in coin selection means that we can freely add and remove inputs until we have have raised enough funds to pay for the transaction’s fixed costs: the payment amounts, the transaction header and the output data. Whatever final input set we settle for, the increased transaction size and input costs have already been accounted for.

  • PR #17331 implements the use of effective values across all different coin selection strategies. Prior, it was only used for the BnB selection where it facilitated the search to be effectively performed. Introducing effective values into the knapsack solver means it no longer works on a moving target, and thus only needs to be run once. It also allows some of the pre-selection steps to be shared between BnB and knapsack.

  • This PR also changes the behavior of SelectCoinsMinConf() to try BnB first and then attempt knapsack only in the case that BnB did not find an input set. Before this PR, the function would only run one or the other strategy controlled by the boolean flag use_bnb.


  1. How does the effective value differ from the nValue of a UTXO?

  2. What is not_input_fees? What does this help with?

  3. What happens when m_subtract_fee_outputs is true?

  4. What does cost_of_change represent? How do we know the cost_of_change in advance?

  5. What are long_term_feerate, effective_feerate, and discard_feerate?

  6. Why are the OutputGroups calculated separately for BnB and knapsack?

  7. What purpose did the while-loop in CreateTransaction() serve? Why is it safe to remove it?

  8. Under which circumstances do resulting transactions not create a change output? (Hint: there are two cases.)

Meeting Log

  118:59 <murch> #startmeeting
  218:59 <jnewbery_> hi !
  318:59 ⚡ uncommon revs git engine
  418:59 <jonatack> hi :)
  518:59 <glozow> HI
  618:59 <murch> Hi everyone :)
  718:59 <uncommon> hi
  818:59 ℹ Guest41403 is now known as hernanmarino
  918:59 <schmidty> hi!
 1018:59 <hernanmarino> hi !
 1119:00 <lightlike> hey
 1219:00 <biteskola> hello
 1319:00 <svav> Hi
 1419:00 <b10c> hi
 1519:00 <marqusat> hi
 1619:00 <raj__> hi
 1719:00 <larryruane_> hi
 1819:00 <darius82> hi!
 1919:00 <michaelfolkson> hi
 2019:00 <murch> If we have anyone new today, please feel free to say "Hi" :)
 2119:01 <sipa> hi
 2219:01 <murch> hi sipa, welcome to PR Review Club ;)
 2319:01 ℹ jnewbery_ is now known as jnewbery
 2419:01 <hernanmarino> :D
 2519:01 <b10c> hi sipa! welcome!
 2619:01 <murch> So, we'll be talking about #17331 today
 2719:01 <sipa> I said "hi", not "Hi".
 2819:02 <uncommon> ;P
 2919:02 <sipa> (go on)
 3019:02 <murch> Did everyone have a chance to check out the notes and review the PR? How about a quick y/n from everyone
 3119:02 <glozow> y
 3219:02 <raj__> y
 3319:02 <dkf> y
 3419:02 <marqusat> y
 3519:02 <ccdle12> n
 3619:02 <svav> y
 3719:03 <sipa> i skimmed the notes
 3819:03 <larryruane_> y (mostly)
 3919:03 <lightlike> y
 4019:03 <jonatack> re-reviewing
 4119:03 <hernanmarino> y
 4219:03 <fodediop> hi
 4319:03 <darius82> y for notes but not the PR
 4419:03 <michaelfolkson> The notes were excellent imho
 4519:03 <hernanmarino> great notes, i agree
 4619:03 <murch> So, Coin Selection... What are we trying to achieve here in a sentence?
 4719:04 <murch> Thanks
 4819:04 <michaelfolkson> Selecting UTXOs to use to transfer Bitcoin ideally cheaply and not leaking unnecessary privacy
 4919:05 <pinheadmz> the cheapest possible transaction to get our job done + some privacy if we can
 5019:05 <fodediop> Optimize for fees paid while minimizing potential dust
 5119:05 <sipa> minizing fee costs for transactions created, now (but also in the future); some privacy considerations
 5219:05 <larryruane_> maybe also help the community by reducing the UTXO set? (less storage for everyone)
 5319:05 <jnewbery> notes and questions are here, by the way:
 5419:05 <murch> Right, so we need to fund the transaction, want to be thrifty, but also remain private. Not the easiest thing to do.
 5519:05 <hernanmarino> the use of effective values across different coin selection strategies
 5619:06 <murch> Thanks John
 5719:06 <murch> Good pointer, Hernan. So, how does the effective value differ from the `nValue` of a UTXO?
 5819:06 <marqusat> It will have effective spending fee subtracted from nValue when coin_selection_params.m_subtract_fee_outputs is true.
 5919:07 <glozow> effective value deducts the cost of including a UTXO from the `nValue` so that you're not using a moving target while selecting coins
 6019:07 <uncommon> effective value = subsequently spendable value
 6119:07 <darius82> It's the estimated value of the utxo after it has been redeemed by the next utxo
 6219:07 <darius82> effectiveValue = utxo.value − feePerByte × bytesPerInput
 6319:07 <glozow> if you just used `nValue`, after you pick a coin, the fees increase since the input increases the tx size
 6419:07 <murch> Right, so is effective value something that is always the same, @glozow?
 6519:08 <glozow> well, it depends on the feerate
 6619:08 <uncommon> murch in what domain? or are you saying in all domains?
 6719:08 <larryruane_> uncommon: excellent way to say it
 6819:08 <uncommon> s/saying/asking
 6919:08 <murch> Exactly, so the effective value is dependend on the context in which we are building the transaction, especially the targeted feerate.
 7019:09 <murch> so, what does using the `effective_value` vs the actual value of a UTXO help us do?
 7119:10 <glozow> only run knapsack once
 7219:10 <uncommon> determine of price efficient creating certain outputGroups are?
 7319:10 <uncommon> s/of/how
 7419:10 <darius82> it helps us create a transaction with a specific fee rate more easily
 7519:10 <murch> glozow: Yep, but why?
 7619:10 <murch> darius82: yeah
 7719:10 <raj__> question: are we using effective value when params.m_subtract_fee_outputs is false?
 7819:11 <murch> raj__: Great question, let's push that one back a little bit :)
 7919:11 <raj__> sure.. :)
 8019:11 <dkf> making sure we are creating a tx that is priced in such a way that it will be committed/propagated?
 8119:11 <glozow> murch: because, before, after you ran knapsack with nValues instead of effective values, you could be off by a little bit since you didn't take the inputs into account
 8219:12 <glozow> so you'd try again
 8319:12 <murch> dkf: Right, what we are all getting at here is that by calculating the effective value of UTXOs, we already account for the cost of the inputs when selectign them.
 8419:12 <sipa> it's making coin selection not a circular reasoning: because adding an extra input changes how much fee you have to pay, possbily necessitating trying coin selection again
 8519:12 <raj__> can we say, we need to run knapsack only once because subset sum isn't a moving target anymore?
 8619:13 <murch> sipa, raj__: yes, well put
 8719:13 <glozow> raj__: yeahhh
 8819:13 <murch> So, in this context, you may have seen `not_input_fees`:
 8919:13 <murch> What is included in that variable and why do we keep track of that?
 9019:13 <jonatack> src/wallet/wallet.cpp#L2402
 9119:14 <marqusat> Effective fee of static vsize overhead + outputs vsize. It’s needed to calculate selection_target, doesn’t need to be passed separately alongside nValue to SelectCoinsBnB.
 9219:14 <glozow> marqusat: all outputs? :)
 9319:14 <murch> marqusat: Yes, but which outputs specifically?
 9419:14 <murch> glozow: hey, that was my line.
 9519:14 <murch> haha
 9619:15 <lightlike> but isn't the knapsack algorithm non-deterministic with some randomness involved? In that case, couldn't running it multiple times still get a better solution, regardless of effective values used or not?
 9719:15 <murch> lightlike: It still does run 1,000 times
 9819:15 <lightlike> oh ok
 9919:15 <murch> It just doesn't run n*1,000 times
10019:15 <murch> but we'll get into that a bit later as well ;)
10119:16 <dkf> the fee for all extra block data that is not user-provided payload? I am not sure if this amount changes though due to unfamiliarity with this part
10219:17 <murch> mh, I'm not sure what you mean with "not user-provided"
10319:17 <uncommon> dkf elaborate on "extra block data that is not user-provided payload"
10419:18 <glozow> marqusat: murch: payment outputs only. excluding change outputs that may or may not be created.
10519:18 <murch> Right
10619:18 <dkf> it seems to me this looks for costs for things which are outside of a regular payload, that's my spontaneous impression.
10719:18 <murch> so, we collect the fixed costs of the transaction which will not change due to the results of the coin seletcion
10819:18 <lightlike> so all parts of the tx that are not influenced by coinselection?
10919:19 <murch> ^ that!
11019:19 <glozow> lightlike: yeah, that's how i think of it
11119:19 <dkf> non-user provided payload = protocol native payload
11219:19 <murch> Included are the costs to create the recipient outputs and the transaction header which we both need in any case
11319:19 <glozow> `fixed_fees` maybe
11419:19 <murch> not included are the inputs and the change output, because the latter is optional
11519:20 <murch> Alright, you may have noticed `m_subtract_fee_outputs`. When it is set to true, some of our calculations change. What is happening there?
11619:20 <raj__> murch, when you say "the recipient output" does that assuming a single output spend of some standard form?
11719:21 ℹ promag_ is now known as promag
11819:21 <marqusat> murch: Effective value is being considered.
11919:21 <murch> raj__: It could also be five payments in a single transaction, an OP_RETURN, or custom script. Basically all the things that are the intended product of the transaction
12019:22 <murch> marqusat: How is effective value considered differently when the flag is true?
12119:22 <raj__> when `m-subtract_fee_outputs` is false we use effective value, and nValue when its true. Thats what it seems to me from the code.
12219:23 <murch> yes, that's important, why do we no longer deduct the fees? Where do they go instead?
12319:23 <glozow> fees are deducted from the recipient output instead
12419:23 <darius82> they get subtracted from nValue of the output(s)?
12519:23 <murch> yep!
12619:24 <darius82> the recipients outputs as opposed to the change output
12719:24 <sipa> raj__: at the time coin selection is invoked we already have what you could call an "incomplete" transaction; it can have multiple inputs and outputs already (which must be included); the goal of coin selection is adding (a) additional inputs from the wallet and (b) optionally adding change so that the fee is as intended
12819:25 <raj__> sipa, thanks, now its clear..
12919:25 <lightlike> doesn't that mean that we should run knapsack multiple times again because we have a moving target again? (could have to substract less from the outputs if we had a better solution)
13019:25 <murch> The way this was implemented (now amended) in our snapshot we're looking, what subtle issue gets introduced here when we evaluate the effective value as being the whole value?
13119:25 <glozow> oh hey, `no_input_fees` doesn't include coin control inputs right?
13219:25 <murch> lightlike: I like how you're thinking here :)
13319:26 <murch> glozow: Great question, I hadn't actually thought about it. But I think that is resolved by the coin control inputs getting used first in either selection procedure and their fees getting accounted for then.
13419:27 <sipa> right; they are still inputs that can be selected; as long as they're treated like other ones in terms of accounting, they shouldn't be counted again throigh no_input_fees
13519:27 <murch> But essentially what lightlike and glozow both are getting here at is that we no longer filter UTXOs that are uneconomic and that the fees move back to the end of the transaction building
13619:28 <murch> if e.g. the total effective value of all UTXOs were negative, we could build a transaction that cannot pay for itself
13719:29 <murch> Let's establish some more terminology here.
13819:30 <murch> What does `cost_of_change` represent?
13919:30 <murch>
14019:30 <marqusat> The cost of adding change output and of spending it in the future (assuming discard feerate for the future spend).
14119:30 <murch> how do we know the `cost_of_change` in advance?
14219:30 <murch> marqusat: Yes, good!
14319:30 <glozow> we know the change output type in advance - it's configured across the wallet
14419:31 <glozow> (and thus we know the size of the input to spend it too)
14519:31 <murch> exactly
14619:32 <murch> Even when some wallets e.g. copy the output type of the recipient output(s), when we are starting the coni selection, we know what we are going for as an output type—if we need one in the first place
14719:32 <darius82> is the predicted fee for spending it in the future based on the current fee rate?
14819:32 <glozow> i guess we don't really _know_ it in advance, there's a bit of guessing for the spending feerate
14919:32 <glozow> darius82: uses an estimated long term feerate
15019:33 <murch> darius82: Great question. It's really hard to guess what feerate it will be spent at in the future.
15119:33 <murch> That actually brings us to the next point:
15219:33 <b10c> marqusat: what do you mean with "(assuming discard feerate for the future spend)"?
15319:33 <murch> We see a number of different feerates across this PR
15419:33 <murch> Some of them are already getting mentioned
15519:33 <darius82> glozow thanks!
15619:34 <murch> But what are `long_term_feerate`, `effective_feerate` and `discard_feerate`?
15719:34 <murch> (see: or
15819:34 <lightlike> effective rate is what we want to pay for the tx right now.
15919:35 <murch> b10c: Actually, that's a fair question
16019:35 <murch> lightlike: Yep!
16119:35 <murch> b10c, but it fits well to the current topic :)
16219:35 <glozow> conceptually,
16319:35 <glozow> effective feerate = base feerate we're aiming for in this transaction
16419:35 <glozow> long term feerate = upper bound for spending an output in the future
16519:35 <glozow> discard feerate = lower bound for spending an output in the future, any less and we'll call it dust
16619:37 <murch> I like to be a bit more precise for the long_term_feerate:
16719:37 <lightlike> "upper bound" probably not in a mathematical sense right? I mean how could we predict that feerates could go crazy
16819:37 <murch> an estimate of the maximum feerate that we will need to pay in the long term to use a UTXO. A reasonable upper bound on what we might need to pay for a low time preference transaction in the long term.
16919:38 <murch> Like, what can we get away with to free the value of that UTXO some time in the future
17019:38 <raj__> curious to know how we are calculating this.
17119:38 <murch> e.g. with a consolidation transaction or if we're willing to wait for a week
17219:38 <murch> raj__: We take the minimum of the 1000 block target and the arbitrary guess of 10 sat/vB. ;)
17319:38 <glozow> i guess i mean upper bound as in, an estimate so we're conservative about what feerate we expect to be able to get in the future?
17419:39 ⚡ murch hopes you weren't looking for something with more academic rigor here :sweatysmile:
17519:39 <dkf> is the assumption that the long term fee rate always increases compared to the effective rate? is there no way it could be cheaper than expected?
17619:39 <raj__> murch, fancy.. :D
17719:40 <darius82> murch why is long_term_feerate a maximum feerate, it sounds like a minimum feerate for 'what we can get away with'?
17819:40 <murch> dkf: No, it's basically, just accounting for the fact that we will have to spend money in the future to spend a UTXO
17919:40 <glozow> dkf: i thought opposite. we think we'll be able to spend at a lower feerate in the future?
18019:40 <glozow> l o w t i m e p r e f e r e n c e
18119:40 <b10c> darius82: agree
18219:41 <murch> It's a bit subtle, it has both elements of a minimum or a maximum
18319:42 <lightlike> do we calculate this only to determine whether we should create a change output or not bother and add it to the fees? or are there additional reasons?
18419:42 <murch> Like, "we know this will cost money", how much will we reasonably need to pay to use it in an input
18519:42 <murch> lightlike: It's only used to estimate the cost_of_chagne
18619:43 <lightlike> ok, but then my questions applies the same to the cost_of_change - to we calculate that for these reasons?
18719:44 <sipa> right, they"re not really upper bounds or lower bounds; they're both jusg estimates - but one is conservatively high, the other is conservatively low
18819:44 <sipa> *just
18919:44 <raj__> murch, it seems here cost_of_change is only dependent on `effective_rate` and `discard_rate`?
19019:44 <murch> lightlike: Let's pick that back up in a couple questions :)
19119:44 <lightlike> sure!
19219:45 <raj__> So it not `long_time_rate` ? or these three are interrelated somehow?
19319:46 <murch> raj__: mh, you appear to be right
19419:46 <murch> Aha, I believe the `discard_feerate = min(10, long_term_feerate)`
19519:47 <raj__> Ok.. that makes sense then..
19619:47 <b10c> murch: aahh
19719:48 <murch> Sorry, I guess we should have cleared that up earlier :)
19819:48 <darius82> @murch i guess that works because `long_term_feerate` will never be lower than the dust fee rate?
19919:49 <murch> mh. I think that might be a bit subtle to sort out
20019:49 <murch> Let's move on with the questions and get into that later?
20119:49 <darius82> :thumbs up:
20219:49 <murch> Why are OutputGroups calculated separately for BnB and Knapsack?
20319:50 <murch>
20419:50 <marqusat> To keep existing legacy behavior of knapsack spending dust outputs, so we don’t want to filter positive only for it.
20519:50 <murch> marqusat: Right!
20619:50 <murch> Actually, true story, I kinda broke that in 2014 with a tiny patch to coin selection that later got reverted when people found that it caused the UTXO set to bloat :p
20719:51 <murch> I made the Knapsack prefilter to only use economic inputs
20819:51 <murch> So, this PR leaves that behavior intact, to help keep wallet's UTXO pools slim
20919:52 <jnewbery> murch: that seems like it might be suboptimal for users
21019:52 <murch> so, what purpose did the while loop in CreateTransaction() serve? Why is it safe to remove?
21119:52 <murch>
21219:53 <lightlike> I don't see why there is a need to keep legacy behavior here. I mean, it's either better or worse to keep it, but why should we care how it used to be?
21319:53 <murch> jnewbery: I agree. We shouldn't be creatign dust in the first place, and BnB should also help use it constructively when it finds solutions.
21419:53 <jnewbery> do you expect a later PR to remove that legacy behaviour?
21519:54 <Murch> oops?!
21619:54 <sipa> you got Capitalized.
21719:54 <Murch> better than decapitated
21819:54 <Murch> um did you get my question?
21919:54 <Murch> about the while loop?
22019:54 <lightlike> yes
22119:55 <Murch> good
22219:55 <darius82> with the previous algorithm there was the moving target, but since we use the effective value we dont have that problem anymore?
22319:56 <Murch> jnewbery: Yes, I expect that Knapsack will go away altogether, but review in Wallet has been pretty slow, so there are efforts that have been waiting for literally years in that regard
22419:56 <glozow> it's the loop for knapsack solver finding a solution but not taking into account input costs, then needing to run again
22519:56 <Murch> darius82: Exactly!
22619:56 <Murch> glozow: right!
22719:56 <Murch> okay, last question:
22819:56 <Murch> So, when do we end up not creating any change output?
22919:57 <Murch> Hint:
23019:57 <glozow> (1) When the change output would be dust, drop it to fees
23119:57 <glozow> (2) When we are in the range of an “exact match,” i.e. the difference between the selected coins’ total effective value and the target value is lower than the cost of a change output
23219:58 <Murch> right
23319:58 <Murch> so, I think we actually didn't even talk about `exact_match`
23419:58 <Murch> and it ties into some questions from above ^
23519:58 <Murch> It means we're close enough to throw away the excess of what we have selected
23619:59 <Murch> because creating a change and spending that change later would cost more than dropping the remainder to the fees
23719:59 <Murch> that's what we use the `discard_feerate` for in estimating the future input cost of the change output
23819:59 <Murch> hui, that was a lot of content
23920:00 <Murch> We good? any questions?
24020:00 <jnewbery> We Good :)
24120:00 <larryruane_> question, if there's time, are there functional or unit tests that run through these various code paths? i like to watch code in action using a debugger
24220:01 <lightlike> so that's also to keep the utxo set small? As opposed to creating the change output and just not spending it / hoping for extremely low feerate times?
24320:01 <Murch> There are! Look for BnB in the tests :)