Include coinbase transactions in receivedby RPCs (rpc/rest/zmq)

https://github.com/bitcoin/bitcoin/pull/14707

Host: larryruane  -  PR author: andrewtoth

The PR branch HEAD was 77da829 at the time of this review club meeting.

Background

  • The wallet RPC interface contains four “receivedby” methods:

    • getreceivedbyaddress "address" ( minconf )
    • getreceivedbylabel "label" ( minconf )
    • listreceivedbyaddress ( minconf include_empty include_watchonly "address_filter" )
    • listreceivedbylabel ( minconf include_empty include_watchonly )

    The results derive from non-coinbase transaction payments to addresses known to the local wallet.

  • The get variants return the sum of the payment amounts.

  • The list variants return a JSON array of objects, either per-address or per-label.

  • A label is a user-specified string that groups multiple addresses.

  • The minconf argument allows the RPC to exclude recent transactions (default 1).

  • The include_empty argument (default=false) reports addresses which have not received any payments.

  • The include_watchonly boolean argument includes payments for which the wallet has only the public keys.

This PR

  • Payments from coinbase transactions are no longer excluded from the results.

  • An optional include_immature argument is added to all of these RPCs.

    • getreceivedbyaddress "address" ( minconf include_immature )
    • getreceivedbylabel "label" ( minconf include_immature )
    • listreceivedbyaddress ( minconf include_empty include_watchonly "address_filter" include_immature )
    • listreceivedbylabel ( minconf include_empty include_watchonly include_immature )

Questions

  1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK? What was your review approach?

  2. What major data structures do these RPCs access? What locking is needed?

  3. How do these RPCs scale or not scale (“big-O” behavior)?

  4. What’s the purpose of the minconf argument?

  5. Why were coinbase payments excluded from the results in the past?

  6. Why does this PR add the include_immature argument? What does it do? Why is there a concept of immature coins to begin with? Do you agree with this argument’s default being false?

  7. This PR makes the logic that decides whether to include a transaction in the results more complex. Can you explain the various conditions?

  8. The functional test seems to exercise all of the various conditions. But how can you be sure? How can you “test the test” as part of your review?

  9. Are there any downsides to merging this PR?

  10. This PR changes the default behavior of these RPCs in ways that may break existing user’s tools. Is it a good idea to include a mechanism to revert the behavior of these PRs? Does this PR include such a mechanism? Is it tested? When can it be removed?

Meeting Log

  119:00 <larryruane_> #startmeeting
  219:00 <_andrewtoth_> hi
  319:00 <schmidty> hi
  419:00 <larryruane_> Hello review clubbers!
  519:00 <marqusat> hi
  619:00 <will__> hi!
  719:00 <jnewbery> hi!
  819:01 <larryruane_> today we have: https://bitcoincore.reviews/14707
  919:01 <larryruane_> is anyone here for the first time?
 1019:01 <lightlike> hi
 1119:01 <sishir> hi
 1219:01 <will__> yes, first time here
 1319:01 <svav> hi
 1419:01 <_andrewtoth_> thank you for taking an interest in my PR larryruane_
 1519:01 <larryruane_> will__: welcome!
 1619:02 <glozow> hi
 1719:02 <larryruane_> _andrewtoth_: you're very welcome -- thank you for creating such a nice PR!
 1819:02 <larryruane_> ok, couple of hints if it's your first time, and reminders if it's not: all questions are welcome! Everyone is here to learn.
 1919:02 <emzy> hi
 2019:02 <larryruane_> and 2: you don't have to ask to ask. If you have a question, just go right ahead and ask.
 2119:02 <larryruane_> tips here: https://bitcoincore.reviews/your-first-meeting
 2219:03 <will__> thanks
 2319:03 <larryruane_> ok, who had time to review the PR this week? (y/n)
 2419:03 <marqusat> y
 2519:03 <sishir> y
 2619:03 <glozow> 0.8y
 2719:03 <emzy> n
 2819:03 <svav> y
 2919:03 <jnewbery> y
 3019:03 <larryruane_> glozow: spoken like a true engineer!
 3119:04 <larryruane_> Okay we've got a good group here. Let's start with a very basic concept (I'm pretty new to all this myself!):
 3219:04 <larryruane_> This is one of many PRs that modifies the RPC interface. Are such changes considered consensus changes?
 3319:04 <glozow> nope
 3419:04 <schmidty> Nah
 3519:04 <marqusat> nope
 3619:05 <larryruane_> How can you tell if a change affects consensus?
 3719:05 <glozow> you don't 🤔 until it affects consensus
 3819:05 <sishir> what does "breaking change" mean?
 3919:05 <_andrewtoth_> glozow: then shouldn't the previous answer be "maybe"?
 4019:06 <larryruane_> Here is something I think about, correct me if I'm wrong: Imagine a different client (not bitcoin core) -- would that client need this change?
 4119:06 <b10c> hi
 4219:06 <larryruane_> (to stay in consensus)
 4319:06 <glozow> _andrewtoth_: heh. 99.9% confidence in my "nope"
 4419:07 <glozow> perhaps would have lower confidence if it was touching other code
 4519:07 <marqusat> changes something in bitcoin/src/consensus/ in a way of what is considered valid?
 4619:07 <glozow> afaik it's not _supposed_ to change consensus
 4719:07 <larryruane_> glozow: that's right, okay, so changes in the RPC layer aren't consensus.
 4819:08 <lightlike> i'd say that it wouldn't need to implement any rpc layer at all to stay in consensus.
 4919:08 <larryruane_> With RPC changes, what are some considerations related to backward compatibility?
 5019:08 <larryruane_> lightlike: very good, I agree
 5119:08 <larryruane_> that is, what do we have to worry about when changing an RPC?
 5219:08 <glozow> should get the same results if you call the RPC on an older version of Bitcoin Core. I think this answers your question too sishir
 5319:09 <schmidty> Parameter ordering
 5419:09 <sishir> i see thank glozow
 5519:09 <marqusat> to not break the existing contract
 5619:09 <marqusat> input and expected output
 5719:10 <larryruane_> yes, all good answers... imagine a script on some user's system that no one knows how to maintain ... it would be bad to break that script, it may be very inconvenient for the user to fix it
 5819:10 <larryruane_> Okay, moving on (unless there's anything else?)
 5919:11 <larryruane_> BTW, For the newer people here, I just want to mention that this PR is different than other recently-reviewed PRs -- it's quite simple and you don't need a lot of background knowledge. So please speak up! You'll be glad you did :)
 6019:11 <larryruane_> There are 4 RPCs that are affected by this PR. They all have "receivedby" in their names
 6119:12 <larryruane_> what major data structures do these RPCs access?
 6219:12 <glozow> `mapWallet`
 6319:13 <will__> maybe too silly, but this 'coinbase' transactions are related to Coinbase company? or i'm missing the coinbase concept
 6419:13 <sishir> JSON array?
 6519:14 <marqusat> CWallet
 6619:14 <svav> will__ : A coinbase transaction is the first transaction in a block. It is a unique type of bitcoin transaction that can be created by a miner. The miners use it to collect the block reward for their work and any other transaction fees collected by the miner are also sent in this transaction.
 6719:14 <larryruane_> glozow: yes, can you give us a quick summary of `mapWallet`'s purpose?
 6819:14 <sishir> will__ coinbase tx is tx that miner receive as reward. Hence first tx in block
 6919:14 <will__> thanks!
 7019:15 <_andrewtoth_> will__: coinbase the company took its name after a concept introduced by bitcoin. similar to blockchain the company.
 7119:15 <jnewbery> larryruane_: Something that we try to do to help with backwards compatibility is make the return a JSON object (rather than a array, string, or number). That means that additional key-values can be added without breaking clients that are receiving the return object.
 7219:15 <glozow> larryruane_: honestly not sure, I haven't looked at the wallet code that much. It's a map from txid to `CWalletTx`, I assume it's all the transactions that the wallet is interested in, e.g. sent and received
 7319:15 <larryruane_> marqusat: `CWallet` is a class ... `mapWallet` contains a bunch of those, indexed by ... ?
 7419:16 <glozow> `mapWallet` is a member of `CWallet` iiuc
 7519:17 <svav> Why does a wallet need the four "receivedby" methods in the first place?
 7619:17 <larryruane_> glozow: yes, that's correct, I stated it wrong I think, there's one `CWallet` instance (or now can there be multiple?), and it contains a `mapWallet`
 7719:17 <larryruane_> svav: there is a concept of a label, can anyone explain what a label is?
 7819:18 <svav> Labels are short, easy-to-remember words for long, complicated wallet addresses.
 7919:18 <svav> The label for an external bitcoin address is a name that you can use to keep track of the different BTC addresses for your own use. It is not a mandatory field.
 8019:19 <svav> Right?
 8119:19 <larryruane_> svav: yes! and also for grouping -- you can place multiple addresses in a single label
 8219:19 <_andrewtoth_> I think the use of labels in the wallet has largely been deprecated or discouraged though, right?
 8319:19 <svav> OK, so why are the 4 methods needed by a wallet?
 8419:19 <larryruane_> so labels are why there are 4 "receivedby" rpcs instead of 2 (address and label variants)
 8519:20 <larryruane_> _andrewtoth_: good question! I don't know!
 8619:20 <glozow> I suppose the `get` vs `list` could have been a verbosity option
 8719:20 <larryruane_> the other "dimension" in the "receivedby" RPC set is the "get" versus the "list" variants
 8819:20 <svav> getreceivedbyaddress See https://developer.bitcoin.org/reference/rpc/getreceivedbyaddress.html
 8919:21 <svav> getreceivedbyaddress - Returns the total amount received by the given address in transactions with at least minconf confirmations.
 9019:22 <larryruane_> so the `get` versions return just a simple amount, not JSON (see jnewbery's comment above about newer RPCs preferring to return JSON -- these `get` are older RPCs)
 9119:22 <svav> Can someone attempt a basic summary of the reason for this Pull Request? Why is it needed?
 9219:22 <larryruane_> svav: yes .... ah you asked the next question, perfect!
 9319:23 <larryruane_> (just to finish the thought) the `list` variants return an array of JSON objects
 9419:23 <marqusat> larryruane: mapWallet returns a map of CWalletTx indexed by tx id
 9519:24 <_andrewtoth_> svav: consider I received a coinbase output to one of my addresses from mining a block 100 blocks ago, and I wanted to see the balance received by that address. The current behaviour of getreceivedbyaddress would not show the value of the coinbase output, causing me to wonder where my money is.
 9619:24 <sishir> I believe its cause the getreceivedbyaddress and listreceivedbyaddress do not include amount from coinbase tx so this PR adds them
 9719:24 <jonatack> hi (lurking)
 9819:24 <glozow> Including coinbase outputs in the `receivedby` RPCs. I think of it as somewhat of a bug fix, since there's no reason to filter _all_ coinbase outputs
 9919:24 <larryruane_> marqusat: yes exactly! the local variable when iterating this container is typically called `wtx` which stands for "wallet tx" (not "witness")
10019:25 <larryruane_> good answers, everyone. Okay, now, next, what "technical difficulty" does including the coinbase in any RPC result introduce?
10119:26 <sishir> whether to include immature tx or not
10219:27 <marqusat> it's a behavior breaking change
10319:27 <glozow> I believe there's some considerations wrt user expectations about the spendability of their outputs. the number of confirmations is usually only relevant for confidence that this coin is indeed going to stay in our wallet, but for coinbases, it also affects whether or not we can spend it
10419:27 <svav> _andrewtoth_ : Under what circumstances would a coinbase output be received? Is this something just a successful miner receives? How does this affect the average Bitcoin user who has a wallet?
10519:27 <glozow> and yeah, I'd be extremely confused as a user if I made two calls to `getreceivedby*` on different versions and got different answers
10619:27 <larryruane_> marqusat: absolutely right, this is a change, as by default, the coinbase will be included, whereas it wasn't before. What did the PR author do to address this concern?
10719:28 <marqusat> added a hidden configuration option to revert to the previous behavior
10819:28 <larryruane_> sishir: right, immature coinbase is now an issue... what is immature coinbase, why is there such a concept?
10919:28 <svav> Can someone define coinbase output and how it differs from a standard UTXO?
11019:29 <larryruane_> marqusat: right, I believe the thinking is that although this is a change, it's one that most people won't mind ... but just in case someone does, there's an escape hatch (without having to recompile the client from source)
11119:30 <jnewbery> svav: A coinbase output is an output from the coinbase transaction. The coinbase transaction is the first transaction in the block, which contains the miner's reward. It's the only type of transaction that doesn't have inputs
11219:31 <marqusat> immature coinbase are funds that can be yet spent, require 101 confirmations
11319:31 <sishir> For a coinbase tx to be spendable they have to be mature i.e. wait till 100 confirmations, I believe this is because lets say a block is orphaned before it gets 100 blocks deep into the chain, then only the miner is affected.
11419:31 <_andrewtoth_> svav: In most cases yes, coinbase outputs are received by successful miners. Average Bitcoin users will likely not be affected by this bug, hence why I believe it has received little review. There are also cases like regtest where the wallet can receive coinbase outputs with generate RPCs, and for watchonly wallets that want to watch miner addresses.
11519:31 <marqusat> The coinbase maturity concept exists so that impact of forks that may naturally briefly occur in the network doesn’t have an effect on spendability.
11619:31 <glozow> a coinbase output looks identical to any other output though, so you'd need to see the transaction to identify that it's a coinbase (null input). that's why we store `fCoinbase` in coins?
11719:32 <sishir> But why the number 100? Why not 6 confirmation like Bitcoin does with regular tx validation?
11819:33 <larryruane_> why is it a _consensus rule_ that coinbase outputs aren't spendable for 100 blocks?
11919:33 <larryruane_> how do coinbase outputs fundamentally differ from regular UTXOs?
12019:34 <larryruane_> (just so we don't get too lost, all this ties back to the `include_immature` argument that this PR adds to these 4 RPCs)
12119:34 <glozow> idk if there are any reasons other than forks https://bitcoin.stackexchange.com/questions/1991/what-is-the-block-maturation-time
12219:35 <marqusat> it's a part of consensus because it affects the whole network and what is considered spendable atm
12319:35 <glozow> larryruane_: just the 100 confs requirement. no other differences
12419:36 <larryruane_> glozow: here is another similar link https://bitcoin.stackexchange.com/questions/20120/what-was-the-goal-of-creating-the-feature-of-immature-coins
12519:36 <_andrewtoth_> a normal reorg can cause some txs to be unconfirmed again, but they will all still be valid and re-enter the mempool. If coinbase txs were spendable and there were many ancestors, then all those ancestor txs would become invalid in the reorg.
12619:36 <larryruane_> _andrewtoth_: yes exactly!
12719:36 <sishir> Coinbase outputs are very identical to normal tx, the only difference is that they have exactly 1 input which must be of a special form which is This txin's  0000...0000.
12819:37 <_andrewtoth_> so the 100 block limit is protection against this
12919:37 <glozow> s/ancestors/descendants?
13019:37 <_andrewtoth_> glozow oops yes
13119:37 <larryruane_> coinbase UTXOs are different because if there is a reorg, they may disappear ... a normal UTXO will likely appear in the reorged chain
13219:39 <lightlike> but whats the different treatment there? if normal transaction became invalid, their descendants would become invalid just the same - so why a special rule for coinbase transactions?
13319:39 <lightlike> ah ok, thanks larryruane_
13419:39 <larryruane_> I think we're good on this point... so this is why the PR author (_andrewtoth_ ) has added the `include_immature`argument ... the default value is false, does that seem right?
13519:40 <sishir> Yea it makes sense
13619:40 <marqusat> It makes sense to have include_immature with default as false as that matches that by default minconf=1 so does not include funds from normal transactions which are not yet confirmed (spendable).
13719:40 <larryruane_> (a UTXO _could_ become spent after a reorg, essentially a double-spend, but it's unlikely)
13819:41 <glozow> i have gripes, I think the option should be `include_not_yet_spendable`
13919:41 <Murch> hi
14019:41 <svav> coinbase outputs are rewards for miners, and they might disappear if another chain surpasses the one the miner did. Hence, the coinbase output should not be spendable until 100 more blocks have passed, to ensure the miner's reward won't disappear, and to prevent the minder spending money that subsequently gets taken from them. Correct?
14119:41 <larryruane_> hi Murch, welcome ... glozow that's a very interesting idea, in effect combine the two reasons
14219:42 <_andrewtoth_> glozow: that does make sense to me
14319:42 <glozow> hehe
14419:43 <larryruane_> do we want to discuss glozow's idea here? or save it for the PR? I think it's definitely worth considering
14519:43 <Murch> If new bitcoin got created with a block and were spendable immediately, reorgs would get extremely messy
14619:44 <Murch> because not only would those funds disappear, but all descending txs would become invalid
14719:44 <_andrewtoth_> glozow: if we had include_not_yet_spendable, then we would also have to include non-final txs if that option was set
14819:44 <glozow> I don't mind either way, I'd prefer both minimum confirmations + spendability config
14919:44 <glozow> _andrewtoth_: yeah. I think immature coinbases and non-final txs should be treated the same
15019:45 <larryruane_> Also I recall reading somewhere that immediately spendable coinbase would make coins _nonfungible_ ... receivers would have a strong incentive to check whether they were being paid using (immature) coinbase or not
15119:45 <glozow> i.e. "not spendable yet"
15219:45 <glozow> and number of confirmations on the tx is a separate concern
15319:45 <larryruane_> and some receivers (light clients?) may not even be able to do that
15419:45 <jonatack> maybe look at the naming breakdown in rpc getbalances
15519:45 <glozow> larryruane_: ohhh interesting
15619:46 <Murch> why the maturation period is 100, I don't know, beyond it's sufficiently deep that a reorg is extremely unlikely
15719:46 <sishir> Murch this was discussed a lil while ago. Follow up link https://bitcoin.stackexchange.com/questions/20120/what-was-the-goal-of-creating-the-feature-of-immature-coins
15819:47 <glozow> teehee, Murch wrote that answer
15919:47 <jonatack> sishir: look who replied to it
16019:47 <glozow> We're phoning a friend
16119:48 <sishir> oops 😅
16219:48 <larryruane_> Great discussion, now if we're done with this subtopic, I did have one more thing to bring up... https://github.com/bitcoin/bitcoin/pull/14707#discussion_r613366384
16319:48 <larryruane_> there I suggested refactoring some slightly complex logic so it's not nearly-duplicated ...
16419:49 <larryruane_> General question, how should we decide when to refactor to eliminate redundant code, versus allowing the redundant code?
16519:49 <glozow> Murch: do you know if the 100conf rule is older than UTXOSet?
16619:49 <larryruane_> (BTW, if anyone would like to discuss earlier topics, please feel free!!)
16719:50 <Murch> glozow: I think it was in the original relese
16819:50 <Murch> *release
16919:51 <jonatack> larryruane_: depends on if the code could use test coverage and complexity vs lines saved and the value of having a single point of truth, but personally i like smart, lean code but don't mind dumb verbose tests
17019:51 <marqusat> larryruane_: https://en.wikipedia.org/wiki/Rule_of_three_(computer_programming)
17119:53 <lightlike> Murch: definitely older than 2010, https://bitcointalk.org/index.php?topic=661.msg7293#msg7293
17219:53 <larryruane_> Great points, TIL! I love your comment about verbose tests ... I once worked at a company that had a test framework that almost required a PhD to understand! Just to minimize test code. There's no reason to minimize test code, it must be understandable above all!
17319:54 <jonatack> yes, because you don't have tests for the tests! (usually)
17419:54 <larryruane_> exactly, oh, time's almost up, but you reminded me of one more question:
17519:55 <larryruane_> this logic is somewhat complicated, but there are good tests included in this PR
17619:55 <larryruane_> how can we test th tests? how can we make sure the tests are covering the various cases?
17719:55 <marqusat> Code coverage. Additional way to “test the test” can be mutation testing.
17819:56 <larryruane_> code coverage anaysis is great .. can you elaborate on mutation testing?
17919:56 <jonatack> it's an automated version of manually changing the code and making sure the tests catch it
18019:56 <marqusat> https://en.wikipedia.org/wiki/Mutation_testing
18119:57 <Murch> lightlike: good find!
18219:57 <larryruane_> perfect... I was going to say "break the code intentionally" but I didn't realize there are automated ways to do that!
18319:57 <larryruane_> ok anything else? this has been super fun as my first time hosting, thank you all so much!
18419:57 <marqusat> thank you!
18519:58 <glozow> thanks larryruane_!
18619:58 <_andrewtoth_> thanks larryruane_!
18719:58 <lightlike> thank you!
18819:58 <svav> Thanks
18919:59 <jnewbery> thank you larryruane! Great meeting
19019:59 <jnewbery> #endmeeting
19119:59 <jonatack> thanks larryruane_ ! great job, looking forward to seeing you do this again
19220:00 <larryruane_> thank you marqusat glozow _andrewtoth_ lightlike svav jnewbery Murch jonatack and anyone else I missed, TIL a lot!
19320:00 <emzy> thank you larryruane_!
19420:00 <schmidty> Thanks larryruane_ !