Coin selection refers to the
process of picking inputs to fund transactions. Bitcoin Core currently uses
four different algorithms that each produce candidate input sets, and then picks
one of those input sets to fund the transaction. Here is a rough sketch of how
the algorithms work:
“Knapsack”
Do 1000 random walks selecting and deselecting available coins, remembering the input set that
overshot the target the least. Knapsack will usually create a change output,
but can create a changeless transaction when the overshoot is exactly 0, or
all funds are used.
Single Random Draw (SRD)
Randomly pick UTXOs from the available coins until enough funds are selected
to build a transaction with a change output of a reasonable amount.
Branch’n’Bound (BnB)
Deterministically search all combinations of the available coins to find the
least wasteful input set that does not create a change output.
CoinGrinder
Deterministically search all combinations of the available coins to find the
input set with the lowest weight that does create a change output.
| Algorithm | Deterministic | Creates Change Outputs | Notes on Special Cases |
|---|---|---|---|
| Knapsack | No | Usually | Selects all available UTXOs if target is exceeded, but target + minChange is not reached. |
| Single Random Draw | No | Always | Will only return a selection if target + minChange is exceeded. |
| Branch and Bound | Yes | Never | Disabled when SFFO is active. Only returns a selection if a combination produces a changeless transaction (i.e. one combination sums to a value between target and target + cost_of_change). |
| CoinGrinder | Yes | Always | Disabled below 30 ṩ/vB. Will only return a selection if target + minChange is exceeded. |
This pull request is focused on the tests for the Branch and Bound algorithm.
Motivation
The author of this PR noticed
that many coin selection tests use coin selection parameters that generally do
not appear in production use. The unusual parameters could lead to unexpected
coin selection behavior.
The quirky outcomes would cause confusion when writing or reviewing new tests
because coin selection results could be heavier or lighter (in transaction weight) than expected,
selection could be unstable when it was expected to be deterministic, waste
scores could diverge from the obvious values, or changeful solutions could
unexpectedly be preferred to changeless solutions.
For example, when the cost_of_change parameter was set to 0, any selection
result would be interpreted as resulting in a changeless transaction, i.e.,
when cost_of_change was 0 and Single Random Draw picked a 1₿-input for a 0.1₿
selection target, the result would be evaluated as if dropping 0.9₿ to fees
instead of creating a ~0.9₿ change output.
Additionally, most coin selection tests use a feerate of 0 ṩ/vB, which results
in the amounts of UTXOs being equal to their effective
values (amount - input fee).
This makes it easy to predict the change amount from the input amounts, but it
means, e.g., that the fees for a heavy and a light transaction are the same
(namely 0).
The described issues were partially
fixed by removing the
double meaning of cost_of_change. This pull request is the start of an
effort to rewrite the coin selection tests to use non-zero feerates and to
create UTXOs per their effective values, so that coin selection happens at
non-zero feerates and more closely emulates production use.
How to review the changes
It is recommended to review commit by commit.
The first commit adds the new test suite coinselection_tests.cpp including
default coin selection parameters, and helper functions for creating tests.
Each commit after that transplants one set of tests for the BnB coin selection
algorithm from the old coin selection test suite coinselector_tests.cpp to
the new suite. The new tests should provide at least the same coverage as the
old tests; please keep an eye on whether the test coverage may have been
reduced.
If you feel adventurous, you might look into generating a test coverage report
before the first commit (before any tests are moved), and after the last commit
to verify that the test coverage was not adversely affected. You can find
instructions in the developer notes under Compiling for test
coverage.
My understanding is that you will want to follow the “Using LLVM/Clang
toolchain” instructions.
Advantages and Disadvantages
It was considered to move the tests first and then refactor them, but one
confusing aspect of the old test suite was the high number of overloaded helper
functions. It was thus decided that building up the new test suite step by step
would be easier to follow and might give opportunities for additional clean-up.
Why might so many of the original tests have used a feerate of 0 ṩ/vB?
How does the AddCoins(…)
function in the new test suite address the same concern?
What are the advantages and disadvantages of using a helper function like TestBnBSuccess(…)?
What does the “clone skipping test” test? How does the expected_result in the original test line up with what you would pick manually at low and at high feerates?
What is the motivation for the “Overshoot upper bound” to use different values than the original?
What characteristic makes a set of UTXOs “expensive to search” for BnB?
<monlovesmango> read some of the background, then reviewed commit by commit as suggested. was pretty easy to follow that way. I also rebased onto master
<monlovesmango> well since this was well structured I was first reading through the tests that were removed, and then looked for the same functionality on the tests that were added
<Murch[m]> So regarding the algorithms, BnB is the only algorithm that will only return changeless input sets (except Knapsack which does it only when the excess is exactly 0).
<monlovesmango> knapsack does 1000 random walks while remembering its best selection (overshoots target the least), selection must be between the target and target+minChange
<monlovesmango> coingrinder seems to be similar to BnB, except it finds the input set with the lowest weight that exceeds target+minChange (also disabled below 30s/vB)
<Murch[m]> Let’s move to the next question: One important concept in the context is the "effective_value" of a UTXO. What is that? How is it calculated?
<Murch[m]> monlovesmango: The problem is that it turns a few general ideas on its head and it is often forgotten in testing when new functionality is added
<Murch[m]> So, the effective_value is useful, because it gives us the amount that an input contributes toward the selection target after paying for itself
<Murch[m]> Before we introduced the concept, the selection target would shift with every input that got added, because the cost of the input needed to be added to the target
<Murch[m]> Now the bonus question: if all the algorithms provide separate candidate input sets, how does the transaction building process pick among the candidates the one to use to build the transaction?
<Murch[m]> One day, we should have a function that considers more things, like privacy, the wallet’s utxo composition and the wallet’s usage pattern, but those are dreams for the future
<Murch[m]> Why might so many of the original tests have used a feerate of 0 ṩ/vB? How does the AddCoins(…) function in the new test suite address the same concern?
<monlovesmango> I think it was probably bc of the problem you just mentioned, if there was a fee the selection target would shift with every input that gets added
<Murch[m]> In the old tests, the AddCoins function generated a UTXO with the amount that was provided. In the new tests, the AddCoins function adds the input’s fees, so that the provided value becomes the UTXO’s effective_value
<Santos> "why is it a problem to use a feerate of 0?" The tests should cover all cases (preferably close to real scenarios) - not only the zero feerate case
<stringintech> not sure if this is the answer but in some scenarios we need to test the algorithm sensitivity (like bnb) to different feerates. and we have to change the feerate
<Santos> What do you mean with "change complexity"? Without fees, change calculations are less complicated, allowing tests to verify coin selection predictably
<monlovesmango> 0 fee rate is unrealistic. also the behavior of BnB search is partially dependent on fee values so its probably better to have fees as part of the tests..?
<dzxzg38> The advantage is reuse, and one disadvantage is that your test failure happens far away from the interesting and individual details of the test case that fails
<Murch[m]> dzxzg38: Yeah, that was a big one on my list too: When you have a function that runs the checks, any failure will show up with the line number in the helper function! So you need additional information to figure out which test actually failed
<dzxzg38> I understood monlovesmango to mean generally that every test case needs to have at least as many "parameters" as the most complicated test case, even if you don't really care about some of the parameters in some of the cases
<monlovesmango> some of the simpler tests didn't necessarily need some of the additional parameters in TestBnBSuccess, but now they also need to account for these parameters
<Murch[m]> monlovesmango: My thinking in that regard was that the boiler plate always writes out the entire test, so it’s often hard to see between two tests what the unique changes are.
<Murch[m]> By making each test a single line and having defaults for most things, you can very easily see which values diverge from the defaults and it helps you get a sense what the test is about
<Murch[m]> I do agree that we might need to explicitly define more parameters for some tests, though, e.g., when the custom_spending_vsize is changed, we of course have to give all the prior as well
<monlovesmango> it add test to cover heavy txs, in low fee environment heavy txs should be preferred, and in high fee env lighter txs should be preferred
<Murch[m]> In the old tests, more weight was only a function of the count of inputs, but in the new tests, we additionally vary the weight via the "output type"
<monlovesmango> Murch: can you explain the "Originally these tests relied on SFFO to work" comment in the git message? since BnB is disabled with SFFO?
<Murch[m]> monlovesmango: It could happen that we tried to drop more to fees than the recipient was supposed to pay, and then they’d get more money than we tried to send in the first place
<dzxzg38> Clone skipping lets us quickly get to the answer of using a 2 cent and two 7 cents to get to 16, but without clone skipping we would search through the solutions including every "5 cent coin"
<monlovesmango> at high fee rates I think it would match what I would manually pick, but at low fee rates I think it would be preferable to use more utxos
<monlovesmango> also, i know this is probably irrelevant to the way the coin selection actually executes in BnB, but was thinking maybe 1 cent and two 7 cents should be used to get 15 instead? so that the 5 cent coins still factor to 15?
<Murch[m]> Since the question is orthogonal, let me also put out the last question: What characteristic makes a set of UTXOs “expensive to search” for BnB?
<Murch[m]> BnB has a hard time iterating through UTXO pools that have a lot of similar values. I.e., just like the "doppelganger pool" where there are a bunch of UTXOs that only differ by one sat.
<Santos> A set of UTXOs is “expensive to search” for BnB when it has: many coins with nearly identical values, a tight target value relative to the sum of available coins or weight characteristics that force frequent backtracking due to transaction size limits.
<stringintech> you can see it here; I should be comparing the coinselection.cpp coverage in the two reports right? (above for the PR branch and below the master)