Use Single Random Draw in addition to knapsack as coin selection fallback (wallet)

Host: glozow  -  PR author: achow101

The PR branch HEAD was fac99dc at the time of this review club meeting.


  • Coin selection refers to the process of selecting UTXOs (or “coins”) from a wallet’s available UTXO pool to fund a transaction. Generally, the goal is to pick coins that will minimize fees for the user (now and in the long run), help the transaction reliably confirm in a timely manner, and not leak information about the wallet. This Stack Exchange post offers an overview of tradeoffs between different coin selection strategies. We covered coin selection in a previous review club, #17331.

  • PR #17526 implements Single Random Draw (SRD) as an additional fallback coin selection strategy. SRD is fairly straightforward: it randomly picks OutputGroups from a pool of eligible UTXOs until the total amount is sufficient to cover the payment and fees. Any extra funds are put in a change output.

  • This means that, with this PR, our coin selection will have three different solvers: Branch and Bound (BnB), Knapsack, and Single Random Draw (SRD). Note that some randomness is used in coin selection, so we won’t always come up with the same solution.

  • The overall strategy within SelectCoins() (including PR #17331, on which PR #17526 is built):

    • Coins manually selected by the user using CoinControl are added first.

    • All available coins (excluding the pre-selected ones) are gathered. CoinEligibilityFilters are used to filter these coins within SelectCoinsMinConf(). We have a clear hierarchy of which coin selection solutions are preferred: we first try with a restriction of at least 6 confirmations on foreign UTXOs and 1 confirmation on our own UTXOs. If no solution is found, we try with at least 1 confirmation on all UTXOs. If that doesn’t work and the user allows spending unconfirmed change, we try that (still requiring at least 1 confirmation on foreign UTXOs), gradually increasing mempool ancestor limits on the unconfirmed change.

    • Within SelectCoinsMinConf(), OutputGroups are created using the CoinEligibilityFilter. There is a clear preference for BnB and we will only try Knapsack and SRD if that fails. However, we’ll try both Knapsack and SRD together, picking the solution with lower fees, breaking ties by number of UTXOs used.


  1. Can you give a high-level description of the coin selection strategy including the changes proposed in this PR?

  2. Within SelectCoinsMinConf(), if we have both a Knapsack and an SRD solution, how do we decide which one to use?

  3. Why might we prefer to spend more inputs in the same transaction?

  4. Quiz: Based on the coin selection scheme proposed here, let’s say that Solutions A, B, and C exist (ignore the fact that we would exit early after finding a solution we’re satisfied with). Which would we pick? (Hint: which invocation of SelectCoinsMinConf() would each of these come from?)

    Solution A: picked using Knapsack. Produces a change output, pays 100 satoshis in fees, and only uses confirmed UTXOs, each with 4 confirmations.

    Solution B: picked using BnB. No change output, pays 95 satoshis in fees, and uses one unconfirmed change output.

    Solution C: picked using SRD. Produces a change output, pays 99 satoshis in fees, and only uses confirmed UTXOs, each with 1 confirmation.

  5. What are OutputGroups? Why does SRD pick from output groups rather than from UTXOs?

  6. What does calling GroupOutputs() with positive_only=true do (Hint: you may want to review what effective values are)? What could happen if SelectCoinsSRD() was called with all_groups instead of positive_groups?

  7. What are some ways a deterministic coin selection algorithm might leak information about the wallet’s UTXO pool? Why do we shuffle vCoins before creating OutputGroups?

  8. Bonus: We’ve listed some qualitative (e.g. presence of a change output) and quantitative (e.g. number of inputs used) ways to compare coin selection solutions. Instead of returning as soon as SelectCoinsMinConf() finds a solution, should we try multiple and then pick one? How might we design a metric to decide which one to use?

Meeting Log

  119:00 <glozow> #startmeeting
  219:00 <jnewbery> hi!
  319:00 <Guest13> hi
  419:00 <glozow> Welcome to PR Review Club everybody! We're doing Coin Selection, Part 2 this week!
  519:00 <larryruane_> hi!
  619:00 <glozow> Notes and questions are at
  719:00 <michaelfolkson> hi
  819:00 <hernanmarino> hi glozow and everyone !
  919:00 <glozow> hi hernanmarino!
 1019:00 <lightlike> hi
 1119:00 <kouloumos> hi
 1219:01 <glozow> PR for today is #17526: Use Single Random Draw in addition to knapsack as coin selection fallback
 1319:01 <murch> hi
 1419:01 <darius58> hi!
 1519:01 <emzy> hi
 1619:02 <Anthony85> 👋
 1719:02 <glozow> Note that this PR is built on top of #17331, which we covered last week in PR Review Club. It should still be follow-along-able if you weren't here, but those notes could be helpful:
 1819:02 <glozow> did y'all get a chance to review the PR? y/n
 1919:03 <michaelfolkson> y ~30 mins
 2019:03 <emzy> n
 2119:03 <mixoflatsixo> n
 2219:03 <dkf> n
 2319:03 <lightlike> a bit
 2419:04 <darius58> a bit
 2519:04 <jnewbery> y
 2619:05 <glozow> No problem :) let's do some high-level review for starters then. Would anyone like to tell us what Single Random Draw does?
 2719:05 <murch> y
 2819:05 <larryruane_> chooses a random set of UTXOs to use as the inputs to a transaction that we're creating
 2919:05 <hernanmarino> it randomly picks some outputs, until there's enough
 3019:05 <glozow> hernanmarino: yes!
 3119:06 <schmidty> hi!
 3219:06 <glozow> Can someone give a high-level description of the coin selection strategy including the changes proposed in this PR?
 3319:06 <michaelfolkson> If nothing else works let randomness save the day
 3419:07 <lightlike> BnB first - if it gives no solution, run both Knapsack and SRD and use whatever is "better".
 3519:07 <glozow> lightlike: yes!
 3619:07 <michaelfolkson> "better" defined as lower fees
 3719:07 <glozow> what constitutes "better" ?
 3819:07 <larryruane_> and I think try BnB first because if it is able to find a solution, there's no change output, which is nice
 3919:07 <glozow> Hint, Code here:
 4019:07 <darius58> and then if fees are equal, whichever spends more utxos
 4119:08 <glozow> larryruane_: yes! why is it good to have no change output?
 4219:08 <glozow> darius58: yep yep!
 4319:09 <jonatack> hi
 4419:09 <murch> Because it uses less fees, reduces the overall UTXO in the system, does not put any funds into flight in a change output, and breaks the change heuristic
 4519:10 <glozow> murch: indeed. wanna define "change heuristic" for us?
 4619:11 <michaelfolkson> The assumption that there is always an output which is effectively change back to the spender
 4719:11 <murch> A bunch of the chainalysis techniques revolve around guessing which output is the change returning excess funds from the input selection to the sender
 4819:12 <murch> When guessed correctly, this can be used to cluster future transactions with this one to build a wallet profile
 4919:12 <glozow> so there's a privacy component too!
 5019:12 <murch> by not returning any funds to the sender's wallet, there are future transactions directly related to this transaction (unless addresses are reused)
 5119:12 <glozow> back to the tie-breaking scheme, why might we want spend more utxos?
 5219:13 <darius58> to reduce the utxo bloat?
 5319:13 <lightlike> murch: are *no* future transactions?
 5419:13 <michaelfolkson> Taking advantage of a low fee environment to consolidate the number of UTXOs held
 5519:13 <glozow> darius58: yeah! why is that good for our wallet? and why is that good for the bitcoin ecosystem in general?
 5619:14 <Anthony85> I'm a complete newbie but maybe it just helps ensure that the entire ecosystem doesn't have to work as hard?
 5719:14 <michaelfolkson> Or maybe a large UTXO in the wallet is protected for privacy reasons and so a bunch of smaller UTXOs are used instead
 5819:14 <b10c> hi!
 5919:14 <larryruane_> i've always wondered, if the HD wallet sends change to a brand new address (as it should), does that harm privacy immediately? Or only later when the change is spent? (sorry if this is off-topic by now)
 6019:15 <darius58> reduces the memory demand for nodes keeping track of the UTXO set. I'm not sure why it's good for the user's wallet. Reduces future fees, plus less UTXOs to keep track of?
 6119:15 <glozow> Anthony85: good start, work as hard to do what?
 6219:15 <murch> lightlike: Without a change output and under the assumption of no address reuse, all future inputs would be unrelated to this transaction's inputs.
 6319:15 <murch> oooh
 6419:15 <murch> yes "+no"
 6519:16 <lightlike> murch: ok, then it makes perfect sense to me :-)
 6619:16 <glozow> darius58: yes good answer!
 6719:16 <sipa> larryruane_: the HD aspect is immaterial to your question
 6819:16 <glozow> keeping the UTXO set as small as possible is pretty important
 6919:17 <murch> darius58: Yes, not creating a change output saves cost now, and then also saves the future cost of spending that UTXO at a later time!
 7019:17 <glozow> for user's wallet, it's a bit less significant but a similar idea, it'd be nice to have to track/spend fewer UTXOs
 7119:17 <darius58> @murch my only question though is whether it is cheaper than another solution with less UTXOs in the input, if they had the same fee at the current moment?
 7219:17 <glozow> and as murch mentioned before, a change output is an additional unconfirmed UTXO we have to track (until the tx confirms)
 7319:18 <murch> larryruane_: It's a bit more complicated, but e.g. if two different scripts are used for recipient and change, it would still be obvious which one would be change if there is change
 7419:18 <glozow> alright, Quiz time! Based on the coin selection scheme proposed here, let's say that Solutions A, B, and C exist (ignore the fact that we would exit early after finding a solution we're satisfied with). Which would we pick?
 7519:18 <murch> darius58: Kinda depends, at low feerates you should probably prefer more inputs, at high feerates you should probably prefer less fees
 7619:18 <glozow> Solution A: picked using Knapsack. Produces a change output, pays 100 satoshis in fees, and only uses confirmed UTXOs, each with 4 confirmations.
 7719:18 <glozow> Solution B: picked using BnB. No change output, pays 95 satoshis in fees, and uses one unconfirmed change output.
 7819:18 <glozow> Solution C: picked using SRD. Produces a change output, pays 99 satoshis in fees, and only uses confirmed UTXOs, each with 1 confirmation.
 7919:19 <Anthony47> B?
 8019:19 <glozow> We have 1 vote for B. What do others think?
 8119:19 <lightlike> I'll try C.
 8219:19 <darius58> Does it depend if the unconfirmed output in B is an owned change output?
 8319:20 <sipa> does the answer not depend on predicted ftuture feerate?
 8419:20 <darius58> i'm guessing C
 8519:20 <glozow> they're all using the same feerates
 8619:20 <emzy> I guess B
 8719:20 <glozow> (sorry if that wasn't clear)
 8819:20 <michaelfolkson> C too. B's problem is it uses an unconfirmed change output?
 8919:21 <michaelfolkson> min confirmations by default is 1?
 9019:22 <glozow> darius58: would we spend any other type of unconfirmed output?
 9119:22 <glozow> michaelfolkson: right, the relevant code is here:
 9219:22 <glozow> The key parts to look at are the `CoinEligibilityFilter`s
 9319:23 <murch> larryruane_: let's chat later after the meeting
 9419:23 <glozow> (1, 6, 0)
 9519:23 <glozow> (1, 1, 0)
 9619:23 <glozow> (1, 2, 0)
 9719:23 <glozow> ...etc
 9819:23 <michaelfolkson> So prefer 6 confirmations but 1 is ok
 9919:23 <Anthony47> ahh interesting so B would be used if there was 1 confirmed?
10019:23 <glozow> we'll get to the answer to the quiz shortly (everyone feel free to give you answers as we discuss this)
10119:23 <kouloumos> michaelfolkson And it becomes 0 on a later invocation so according to the *Hint* I'll also say C
10219:23 <darius58> @glozow yeah my question didn't make sense lol, of course a change output means of course it was created by the user
10319:23 <glozow> michaelfolkson: correct, we try 6 confirmations on foreign outputs
10419:24 <glozow> so the first number in `CoinEligibilityFilter` corresponds to the # of confirmations on foreign outputs
10519:24 <glozow> wait no
10619:24 <glozow> that's our own outputs
10719:24 <glozow> the second number is foreign outputs, sorry
10819:25 <glozow> so the second number is always at least 1
10919:25 <larryruane_> I think B because if BnB finds a solution, that's it, we don't even run the other 2 algorithms
11019:25 <glozow> which means we always use confirmed foreign outputs
11119:25 <michaelfolkson> larryruane_: "(ignore the fact that we would exit early after finding a solution we're satisfied with"
11219:25 <glozow> ok, so it seems like we have 3 votes for B and 3 votes for C so far
11319:26 <Anthony47> I'll stick with my guns (B) but I'm starting to think it is C lol
11419:26 <glozow> hehe
11519:26 <lightlike> would be funny if the answer is A
11619:26 <michaelfolkson> What's the importance of the foreign output? The danger it could be double spent? If it is an output we own we know we won't double spend ourselves?
11719:26 <glozow> so, in this code block, which invocation of `SelectCoinsMinConf()` gets Solution B, and which one gets C?
11819:26 <sipa> michaelfolkson: exactly
11919:26 <glozow> this block:
12019:27 <glozow> You can answer by just saying which `CoinEligibilityFilter` it's called with
12119:27 <glozow> michaelfolkson: yep. if the foreign wallet creates another conflicting transaction and that gets confirmed, those UTXOs disappear
12219:29 <glozow> Ok, so I'll answer for Solution A since that's the least popular: we'd pick that using `CoinEligibilityFilter(1, 1, 0)`.
12319:29 <lightlike> imo, the second call (1, 1, 0) gets A and C, and uses C because of the lower feerate
12419:29 <glozow> lightlike: bingo!
12519:29 <murch> So, Solution A and Solution C should be found in the second pass of the eligibility filter but C pays less fees. B uses an unconfirmed input which happens in a later iteration of SelectCoinsMinConf, i.e. vote for C
12619:30 <glozow> yes, we get B from a call that uses `CoinEligibilityFilter(0, 1, *)`
12719:30 <glozow> because it has unconfirmed change outputs (that's the 0 in the filter)
12819:30 <glozow> so what's the answer to the quiz? :)
12919:30 <murch> C!
13019:30 <glozow> ding ding ding!
13119:30 <glozow> can everybody see how we got that answer? are there any questions?
13219:31 <glozow> (and yes, the next best answer would be A, then B)
13319:31 <darius58> great question and explanation!
13419:31 <Anthony47> Is there ever a scenario that B would get selected?
13519:31 <michaelfolkson> Fun question. The real winner here is glozow
13619:31 <glozow> Anthony47: yes, if an invocation of `SelectCoinsMinConf()` fails, we try again with a more permissive filter
13719:32 <Anthony47> gotcha, thanks!
13819:32 <glozow> but I guess this question is crafted as "all these solutions exist"
13919:32 <glozow> so it wouldn't fail
14019:32 <glozow> er, so, no
14119:32 <glozow> oops
14219:33 <glozow> (sorry, i just gave a really confusing answer to your question)
14319:33 <glozow> Ok let's continue with the questions
14419:33 <glozow> What are OutputGroups? Why does SRD pick from output groups rather than from UTXOs?
14519:34 <michaelfolkson> An unconfirmed output would be chosen if there was no alternative, I think the answer to Anthony47 question is
14619:34 <glozow> link to code here:
14719:34 <michaelfolkson> (assuming not foreign)
14819:34 <darius58> OutputGroups are UTXOs with the same scriptPubkey. And we'd rather spend those together for better privacy and security
14919:35 <glozow> michaelfolkson: yeah, good summary
15019:36 <glozow> darius58: yes, I think that's correct
15119:36 <larryruane_> i'm surprised an OutputGroup doesn't contain the scriptPubKey as a member
15219:37 <glozow> indeed, we compute these groups on the fly during coin selection using `GroupOutputs()`
15319:37 <glozow> you could pass in `separate_coins=true` which would put each UTXO in its own outputgroup
15419:38 <glozow> I wonder if it would be better to keep the UTXOs pre-sorted or something
15519:38 <glozow> anyway
15619:39 <glozow> Next question: What does calling `GroupOutputs()` with `positive_only=true` do (Hint: you may want to review what effective values are)? What could happen if `SelectCoinsSRD()` was called with `all_groups` instead of `positive_groups`?
15719:40 <glozow> does anyone who came to review club last week wanna tell us what effective values are?
15819:40 <darius58> does it not include UTXOs with a negative effective value?
15919:40 <glozow> darius58: jup
16019:41 <murch> effective values are the value of inputs after deducting the input cost at the current selection attempt's feerate
16119:41 <glozow> murch: good job attending last week's review club! could you break down for us, then, what it means for a coin to have a negative effective feerate?
16219:42 <lightlike> so we'd use groups that spend more fees than their UTXOs are worth and burn money.
16319:42 <glozow> lightlike: exactly. and why would this suck if we're doing a single random draw?
16419:42 <michaelfolkson> (inside joke, murch hosted last week's review club :) )
16519:43 <michaelfolkson>
16619:43 <murch> lightlike: Does it group only UTXOs with positive values or does it only allow groups that are positive in sum or does it only create groups from scriptPubKeys that only have positively valued utxos?
16719:43 <glozow> big perks for returning review clubbies, u get jokes 😛
16819:44 <lightlike> murch: I thought the criterium would be to be positive in sum, but I am not sure at all.
16919:45 <darius58> @murch the first one you said: group only UTXOs with positive value
17019:45 <michaelfolkson> So a negative effective feerate, it is effectively dust? It costs more to spend than the value of the output?
17119:46 <darius58> @glozow it would suck because there would be some cases where SRD would fail because it couldn't reach a high enough amount, even if it could have by not including negative-effective value UTXOs
17219:47 <murch> michaelfolkson: People use "dust" to denominate various things. Some people refer to UTXOs that are uneconomic at 3 sat/vB as dust, some refer to UTXOs that are uneconomic at the current feerate as dust, some just seem to call anything below e.g. 2000 sats dust.
17319:47 <murch> darius58: Excellent thinking :)
17419:47 <glozow> michaelfolkson: it's worse than dust, we'd need to select even more inputs to cover the extra cost
17519:48 <glozow> darius58: yes! you could literally be increasing the value you need to select while drawing
17619:48 <michaelfolkson> murch: I was thinking that. Need an authoritative definition of dust. Thanks
17719:48 <lightlike> also, we might spend a LOT of dust by repeately choosing negative ouputgroups if we have many, and pay a lot of unnecessary fees.
17819:48 <murch> michaelfolkson:
17919:49 <michaelfolkson> murch: Cool
18019:49 <glozow> let's talk a little bit about the benefits of single random draw
18119:49 <michaelfolkson> Lovely 2013 comment "Considering 0.01BTC as dust is probably outdated" lol
18219:50 <murch> lightlike: Or if the currentfeerate is just extremely high, a lot of the otherwise fine UTXOs are suddenly uneconomic
18319:50 <glozow> What are some ways a deterministic coin selection algorithm might leak information about the wallet's UTXO pool?
18419:50 <glozow> A fun read on coin selection algos and their tradeoffs:
18519:51 <murch> haha
18619:51 <murch> I was just going to link to that
18719:51 <michaelfolkson> StackExchange and coin selection is the perfect mix
18819:52 <glozow> sometimes when i type stackexchange it autocorrects to "Murch's blog"
18919:52 <lightlike> that link kind of answers that question :)
19019:53 <michaelfolkson> glozow: Without giving specifics if a coin analysis/surveillance company knows how you are coin selecting (no randomness) they get a better idea of what output(s) are yours?
19119:53 <murch> So, some coin selection strategies for example reveal the oldest UTXOs or the largest UTXOs. In the combination of address reuse or clustering via change output heuristics, or other wallet fingerprints, this can allow a sufficiently motivated adversary to guess the amount of coins, payment frequency and other things
19219:53 <glozow> oops, yeah. so SRD avoids any privacy leaks related to picking coins deterministically
19319:53 <murch> SRD selecting randomly muddles some of these
19419:54 <michaelfolkson> Some randomness kinda obfuscates things
19519:55 <murch> oh, 2014 murch thought that address reuse saved input costs. Gonna have to edit that.
19619:55 <michaelfolkson> Bad 2014 murch
19719:55 <glozow> To be thorough... Why do we shuffle `vCoins` before creating OutputGroups?
19819:55 <lightlike> if SRD encounters a large outgroup first by chance, will it just stop, use that and create a big change?
19919:55 <glozow> Hint:
20019:55 <murch> lightlike: Yes!
20119:56 <murch> oh yeah, that's another thing. Due to the random selection the distribution of change output sizes is not as homogenuous
20219:56 <glozow> Hint 2: Remember how we talked about the fact that SRD picks randomly from `OutputGroup`s, not singular UTXOs?
20319:56 <murch> E.g. Knapsack always tries to create the same change output size which is a terrible give-away
20419:56 <murch> Also, having many UTXO of the same size isn't really useful
20519:57 <michaelfolkson> Unless you want to do some coinjoining?
20619:57 <glozow> michaelfolkson: not with your own wallet
20719:57 <lightlike> murch: that sounds less than ideal to me for some use cases: if I want to pay for a small thing, I wouldn't necessarily want to make public how rich I am (if I have some large outgroups)
20819:58 <darius14> If they're not shuffled then the order of the inputs may reveal information about how they were sorted?
20919:58 <murch> glozow: Actually that kinda sounds like a bug, that means that outputgroups are more likely to be picked.
21019:58 <murch> It would be better to create the output groups first, then shuffle the output groups, then shuffle the UTXOs in the group
21119:58 <glozow> darius14: yep! we wouldn't want to make the groups deterministically if we had more than 11 UTXOs for the same SPK
21219:58 <michaelfolkson> glozow: No but maybe you are organizing your UTXOs in preparation to do some future coinjoining with other people's UTXOs. I dunno
21319:59 <darius14> is there something specific about '11' in that comment, or it just means for a lot of UTXOs?
21419:59 <glozow> murch: if you had 500 UTXOs for the same SPK, if you didn't shuffle them first, you'd always make the same groups
21519:59 <glozow> darius14: the maximum number of UTXOs for a `OutputGroup` is 10 right now
21619:59 <murch> lightlike: that's a good point, but generally you'll have a lot more smaller UTXOs than large UTXOs, so it would be somewhat uncommon. Also if you use a small input, you'll need multiple or get back tiny change, which are also suboptimal
21720:00 <glozow> relevant:
21820:00 <murch> darius14: I think that's a reference to outputgroups being limited to 10 UTXOs currently
21920:00 <murch> there is a PR open to push it to 100, though
22020:01 <darius14> @murch oh i see, and why are output groups limited to 10 UTXOs?
22120:01 <jonatack>
22220:01 <glozow> OMG
22320:01 <murch> glozow: oh good point. I guess shuffle UTXOs first, make OutputGroups, then shuffle OutputGroups is what I mean then.
22420:01 <glozow> WE RAN OUT OF TIME OOPS
22520:01 <glozow> #endmeeting
22620:01 <murch> uh oh
22720:02 <glozow> thanks for coming everyone!
22820:02 <murch> Thanks, glozow!
22920:02 <jonatack> o/
23020:02 <lightlike> thanks glozow!
23120:02 <darius14> thanks @glozow!!
23220:02 <jnewbery> Thanks glozow!
23320:02 <dulcedu> thanks so much Glowoz!
23420:02 <glozow> jnewbery: who's next week? hehe
23520:02 <svav> Thanks glozow
23620:02 <kouloumos> thanks glozow!
23720:02 <emzy> Thanks glozow!
23820:02 <murch> darius14: The PR that jonatack linked 18418 is the one I was referring to
23920:02 <biteskola> +1
24020:03 <Zero-1729> Thanks glozow!
24120:03 <dkf> thanks glozow, really like the link with all (!) the different coin sort algos
24220:03 <b10c> Thanks!