Each bitcoin transaction output carries a specific
value.
Bitcoin Core defines dust as an output whose value is less than
what it would cost to spend this output.
An important goal of the Bitcoin network is decentralization, so
there are various development efforts to keep the resource costs
for running a fully-validating node to a minimum. One way to reduce
the storage requirement is to keep the size of the UTXO set small.
It would be inexpensive for an attacker, or a careless wallet, to create
many tiny-value UTXOs, bloating the UTXO set.
Whoever is able to spend these UTXOs (and thus remove them
from the UTXO set) would have little to no incentive to do so.
For this reason, Bitcoin Core has a policy of not accepting into its
mempool or relaying any transaction with a spendable dust output, that is, an
output whose value is below a dust limit.
Details
When validating an incoming transaction, policy code calculates the
fee, at a particular feerate, to “pay for” both the output and the (later)
spending input. This fee is proportional to the sum of the sizes,
measured in virtual bytes, of both the input and output.
If the output value is below this (hypothetical)
fee, it is considered
dust;
it would cost more to spend this output than its value.
An output’s virtual
size
is just its physical size, but an input typically
also includes witness data, which is discounted: four bytes of witness data
count as one virtual byte.
Rather than hardcoding the dust feerate, bitcoind provides a
configuration option, -dustrelayfee=<feerate>, to set it.
This feerate is in units of BTC per kvB (1000 virtual bytes).
The default is 0.00003 BTC per kvB (3000 sats/kvB or 3 sat/vB).
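To make the arithmetic concrete, here is a minimal Python sketch of the threshold
calculation at this default feerate. It mirrors the logic of GetDustThreshold() in
Bitcoin Core's policy code, but the function and the size constants below are
simplified approximations for illustration, not the actual implementation:

```python
# Rough sketch of the dust-threshold calculation (an approximation of
# GetDustThreshold() in src/policy/policy.cpp; names and constants simplified).

DEFAULT_DUST_RELAY_FEE = 3000  # sat per 1000 vbytes, i.e. 3 sat/vB

def dust_threshold(output_size, segwit, feerate=DEFAULT_DUST_RELAY_FEE):
    """Smallest non-dust value (in sats) for an output of `output_size`
    serialized bytes, assuming a typical later spending input."""
    if segwit:
        # 32 (prev txid) + 4 (prev index) + 1 (empty scriptSig length) + 4 (sequence),
        # plus ~107 bytes of witness data discounted 4:1.
        input_vsize = 32 + 4 + 1 + 4 + 107 // 4
    else:
        # Same skeleton, but the ~107 bytes of signature + pubkey live in the
        # scriptSig and are not discounted.
        input_vsize = 32 + 4 + 1 + 107 + 4
    return (output_size + input_vsize) * feerate // 1000

# A P2PKH output serializes to 34 bytes (8 value + 1 script length + 25 script);
# a P2WPKH output to 31 bytes (8 + 1 + 22).
print(dust_threshold(34, segwit=False))  # 546
print(dust_threshold(31, segwit=True))   # 294
```

At the default feerate this reproduces the familiar dust thresholds of 546 sats
for P2PKH outputs and 294 sats for P2WPKH outputs.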
There are several kinds of standard outputs (for example, P2PK, P2PKH). Some have
a characteristic size, both of the output itself and for the input that will
later spend it. Enforcing the dust limit requires the code to
estimate
the sizes of the various kinds of inputs and outputs.
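To get a feel for those characteristic sizes, the short sketch below builds two
standard output scripts with the test framework's key and script helpers (ECKey and
key_to_p2pkh_script / key_to_p2wpkh_script, which come up again in the discussion);
this is only an illustration, and the PR's test may construct its scripts differently:

```python
# Sketch: generate standard output scripts with the functional-test framework
# helpers and inspect their sizes (illustrative; not taken from the PR's test).
from test_framework.key import ECKey
from test_framework.script_util import key_to_p2pkh_script, key_to_p2wpkh_script

key = ECKey()
key.generate()
pubkey = key.get_pubkey().get_bytes()  # 33-byte compressed public key

p2pkh = key_to_p2pkh_script(pubkey)    # OP_DUP OP_HASH160 <20-byte hash> OP_EQUALVERIFY OP_CHECKSIG
p2wpkh = key_to_p2wpkh_script(pubkey)  # OP_0 <20-byte hash>

print(len(p2pkh))   # 25 bytes of output script
print(len(p2wpkh))  # 22 bytes of output script
```

These 25- and 22-byte scripts, together with the typical inputs that would later
spend them, are what the dust calculation above has to price in.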
The concept of dust was first introduced in
PR #2577.
This commit
from PR #9380
introduced the -dustrelayfee option.
Before that PR, the dust feerate was whatever -minrelaytxfee was set to.
Why is the concept of dust useful? What problems might occur if it didn’t exist?
A transaction with an output considered “dust” is classified as a valid,
but non-standard, transaction.
What is the difference between a valid transaction and a non-standard transaction?
Would it be better if transactions that don’t meet the dust threshold
were considered invalid?
Why does the dust feerate limit apply to each output individually, rather than
to all of a transaction’s outputs collectively?
Can you think of an anomalous case in which this policy conflicts with
being miner-incentive compatible?
Why is this feerate a configuration option, which makes it fairly static
(most node operators probably just accept the default), rather than having
it dynamically track the prevailing network feerate?
Why is -dustrelayfee a hidden (or debug) option?
Since -dustrelayfee is a per-node configuration option, what happens if various
nodes on the network set different values?
Can you see a future scenario where we’d want to change the default value of -dustrelayfee?
Would it more likely be increased or decreased? What does this depend on and which other
configuration options would then also very likely be adapted?
What does the largest possible output script that adheres to standardness rules look like?
Is it currently implemented in the functional test?
Which of the output scripts need to be inferred from an actual public key (derived from ECKey
in the test)? Could some of them also be created with only random data?
The P2TR output script to test is created with pubkey[1:].
What does this expression do and why is this needed?
Would that also work with an uncompressed pubkey?
(Idea: learn about pubkey encoding and the concept of x-only-pubkeys)
Can you give an example of an output script that is considered standard and is added
to the UTXO set (i.e. no null-data), but is still unspendable?
Bonus: is there a way to create such an output script where this unspendability
can even be mathematically proven?
<LarryRuane> By the way, if anyone has a suggestion for a PR to review, or if you'd like to volunteer to host a review club, please leave a comment here or DM me on IRC!
<LarryRuane> one mistake I used to make in writing tests is to make them too fragile ... if the test makes a very narrow requirement for the result, then it can break in the future when there's not really anything wrong
<LarryRuane> there's quite an art to writing a good test ... you want it to verify correct functionality (not leave something important unverified), but not be overly specific
<ishaanam[m]> andrewtoth_: another reason for NACKing could be if the suggested test could make more sense as a unit test instead of a functional test.
<LarryRuane> or even make code changes that make it *possible* to test with unit tests ... when something fails in a unit test, it's often much easier to narrow down where the problem is, because you're not running as much code
<LarryRuane> well, functional tests run one or more full nodes, and there are more chances for false failures due to timing windows or things like, shutting down and restarting nodes being less .... reliable?
<LarryRuane> also as I said, when something goes wrong, there's so much code that's being run by the test, it may be hard to tell where the problem actually is
<LarryRuane> schmidty_: +1 yes, unit tests can run MUCH faster than functional tests for a given amount of actual testing ... you don't have the delays in starting up the nodes, for example
<theStack> talking about speed, in an earlier version of the PR the test used multiple nodes, one for each config options (like it's currently done e.g. in mempool_datacarrier.py)... even with a small number of nodes, the test took significantly longer to be setup (non-surprisingly), so i changed to one node that is just restarted
<b_101_> It creates all possible script types including a couple of future SegWit versions, and tries each of these scripts with a list of arbitrary `-dustrelayfee` settings, including the default `-dustrelayfee` of 3000
<LarryRuane> formatted strings ... provides a more convenient way to, well, format strings! I think the .8f means floating point with 8 digits of precision (to the right of the decimal point)
<theStack> if someone is wondering why the `.8f` was needed, start up your python interpreter and type in "0.00000001". what you see as a result is a notation that bitcoind can't make sense of
<LarryRuane> there's a whole family of script generation functions such as `key_to_p2pk_script()` that are very interesting to examine .. @theStack added those in a previous PR, very helpful to both the tests and for understanding
<LarryRuane> feel free to continue previous discussions, but let's get to Q3, Why is the concept of dust useful? What problems might occur if it didn’t exist?
<theStack> the review club session LarryRuane is talking about was https://bitcoincore.reviews/22363, hosted by glozow. for anyone learning about scripts and output types, it's a great exercise to fill out this table there
<rozehnal_paul> dust could be used as an attack vector by enlarging the utxo set, as dust has to be accounted for by fullnodes, and if there are 50 million dust outputs to account for, then fullnodes get...tired.
<b_101_> This expression slices the bytes object, skipping the first byte; since `pubkey` is a compressed public key, the first byte indicates `0x02=even`, `0x03=odd`. This piece of data plus the `x` coordinate (bytes 2 to 33) is used to calculate `y` in compressed keys
<theStack> schmidty_: theoretically the output_key_to_p2tr_script helper could be adapted to accept both legacy and x-only-pubkeys, by looking at size, yes. not sure if we would really need it that often though, i think for using p2tr we usually create x-only-pubkeys from the start
<ishaanam[m]> rozehnal_paul: these transactions are technically still valid and can be mined into valid blocks. However these transactions are considered non-standard, which means that they are not relayed to other nodes.
<schmidty_> Thoughts on the concept of dust becoming less useful as p2tr gets more widely adopted (more folks using scripts and the cost to spent the output being largely unknown)?
<LarryRuane> schmidty_: That's a great point, the dust calculation has to sort of "guess" at how big the future spending input will be, and with p2tr, that gets harder
<LarryRuane> rozehnal_paul: "what stops a malicious miner from accepting dust" -- nothing! but by not forwarding, it's much less likely that a miner will ever see transactions with dust outputs
<ishaanam[m]> For Q4: I don't think that would be better because as mentioned previously, this is more of a "guess" so I don't think that it would make sense to hold all transactions to this partially arbitrary standard for validation.
<LarryRuane> I think the answer is (but others chime in), if we make dust part of consensus, then we could never lower it later, because that would be relaxing a rule, which would be a hardfork
<LarryRuane> it would make tx that were previously illegal, now legal ... we could *raise* the dust limit, that would be a softfork ... (do i have this right, anyone?)
<LarryRuane> this relates to Q9: "Can you see a future scenario where we’d want to change the default value of -dustrelayfee? Would it more likely be increased or decreased? What does this depend on and which other configuration options would then also very likely be adapted?"
<schmidty_> Im not sure how prevalent they are anymore, but protocols built on Bitcoin like counterparty allowed issuance of tokens. Some of those tokens could have a high $ value, but be stored in a low BTC value output.
<LarryRuane> schmidty_: interesting! so this might be a reason to lower the dust limit in the future? I was thinking if BTC became much more valuable per unit in the future, what's considered dust today would not be then
<rozehnal_paul> spitballing: if tx fees were somehow lowered in the future, we could lower the dust limit, as it would cost less to spend. not sure how this would affect other config.options
<theStack> thought experiment for Q9: let's say for years blocks are more or less constantly full with a minimum fee-rate of hundreds of sats/vbyte. would that be a reason to *increase* the default dust-limit at some point?
<LarryRuane> i think that's the answer we (or actually @theStack, who wrote this question) was looking for, the `incrementalrelayfee=` option might want to change also
<michaelfolkson> If demand for block space was permanently much higher then yeah you'd probably want to increase dust feerate as no chance of current dust getting into a block
<Jmy> Is it possible in theory to merge multiples UTXOs by using cross-input signature aggregation and then spend all these poorly managed UTXOs at once (paying also the tx-fee only once)?
<LarryRuane> theStack: what's the answer to Q10 "Q4 has been partially answered by ishaanam[m]: "What is the difference between valid transaction and a non-standard transaction?" ... i'm not sure!
<LarryRuane> What does the largest possible output script that adheres to standardness rules look like? Is it currently implemented in the functional test?
<schmidty_> Jmy: Absent cross input aggregation, in theory a miner during low fees could allow a block of dust to be spent with no fees to cleanup UTXO set as a service to node operators. Not sure where the output of all that dust would go though.
<LarryRuane> schmidty_: this is interesting, so a miner could advertise on twitter or somewhere, "directly send me zero-fee transactions, and if they spend dust and don't create MORE dust, i'll mine them" just to be nice? wouldn't the people sending those transactions decide where the output goes?
<schmidty_> "During the high-fees market of late 2017, 15–20% of all UTXOs had value densities below the lowest fee of 50–60 Satoshi/byte, making them almost impossible to spend. 40–50% of all UTXOs had value densities below the average fee of 600–700 Satoshi/byte, making them harder to spend."
<Jmy> schmidty_: iiuc this means that e.g. once a year people who know that they have some dust around could negotiate where to send all of that, maybe to donate somewhere? And the miner could then create a new output which then is no dust anymore and could be spent by someone else?
<LarryRuane> schmidty_: interesting, I heard a rumor that Roger Ver was behind the attack (if you want to think of it that way) to generate very high fees ... to make BCH look better
<rozehnal_paul> LarryRuane is your preference for unit > functional tests universal in software engineering, or specific to bitcoin? and is unit > functional testing preference controversial or pretty accepted?
<schmidty_> rozehnal_paul: Not Bitcoin specific. Back in my web engineering days the preference was similar. Unit test (run frequently) everything where possible (using mock objects if needed) and "integration" tests to test that everything is working together (run less frequently)
<LarryRuane> maybe i overstated it during review club ... it's important to have both, unit tests that zero in on one piece of logic, but also functional tests to make sure everything can work together
<LarryRuane> it's conceivable that unit tests covering two different subsystems both pass, but then when you put them together, one makes an unwarranted assumption about the other, so the functional test fails
<theStack> michaelfolkson: some of them really small refactorings also. i feel like the number of commits/PRs alone is kind of an inadequate measure
<theStack> anyone wants to give a shot at Q13 "Can you give an example of an output script that is considered standard and is added to the UTXO set (i.e. no null-data), but is still unspendable? Bonus: is there a way to create such an output script where this unspendability can even be mathematically proven?"? i thought this one was fun
<LarryRuane> but then the reader would have to understand how the loop works and confirm that it's correct... the way it's actually written is SO obvious
<LarryRuane> to go back to what i was saying about testing ... if the test is complex (to save lines of code, cpu time, memory, whatever), then you might feel the need to write a test for the test! we DON'T want to go there!!
<theStack> LarryRuane: yeah, for all those "theoretically possible" cases we can't mathematically prove that it's unspendable... but there are possibilities for output scripts that are guaranteed to be unspendable
<theStack> michaelfolkson: it doesn't prevent. i naively opened a PR a while ago trying to change those to be considered non-standard, but didn't consider the implications for e.g. exchanges: https://github.com/bitcoin/bitcoin/pull/24106
<theStack> i wonder if sending to pubkeys that are not on the curve would have been invalid since the beginning, if that would have prevented storing big amounts of data in bare multisig outputs
<instagibbs> there's an old (never deployed) gmax idea to gossip partial preimages of p2sh to force people to do a lot of hash work to store stuff in the utxo set, somewhere on bitcointalk...
<instagibbs> I suspect the idea was something like: in p2p, p2sh outputs must be accompanied by the ripemd160 preimage(sha2 hash of the redeemscript) in order to be propagated