add getorphantxs (rpc/rest/zmq)

https://github.com/bitcoin/bitcoin/pull/30793

Host: glozow  -  PR author: tdb3

Notes

  • An orphan transaction is a transaction with missing inputs. The p2p code uses a TxOrphanage to store orphan transactions, to be reconsidered later if/when its parent transaction(s) are known.

    • We have discussed TxOrphanage in previous review club meetings n21527 and n30000.
  • PR #30793 adds a new RPC, getorphantxs, to return the contents of the node’s orphanage at that moment.

    • Its format is similar to the getrawmempool RPC, which also returns information on all transactions in the mempool. Lower verbosity returns the txids, and higher verbosity returns fields about each entry.

    • Its purpose is similar to that of getrawaddrman. Most likely, developers will be the main users.

Questions

  1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK? What was your review approach?

  2. What is an orphan transaction? At what point do transactions enter the orphanage (can you find the code)?

  3. What command can you run to get a list of available RPCs?

  4. What is the benefit of the first commit, which creates a public OrphanTxBase and extends that in a protected struct OrphanTx?

  5. What is the difference between public, protected, and private? When should you use each for a class member/method?

  6. If an RPC has a non-string argument, does anything special need to be done to handle it?

  7. What specifically does it mean that the RPC is “hidden”? Why hidden and not net?

  8. Why can’t we just directly access the orphanage from the RPC code? Why don’t we just add a PeerManager function that returns a reference to the TxOrphanage, which would be more extensible?

  9. What is the maximum size of the result from this RPC? Is there a limit to how many orphans are retained? Is there a limit to how much time orphans can stay in the orphanage?

  10. Bonus question: Since when has there been a maximum orphanage size (can you find the commit or PR using git log, git blame, or github search)?

  11. These two items suggest that the RPC can be called with a boolean verbose or an integer verbosity. What does True correspond to, and what does False correspond to, in the function ParseVerbosity?

  12. Using this RPC, would we be able to tell how long a transaction has been in the orphanage? If yes, how would you do it?

  13. Using this RPC, would we be able to tell what the inputs of an orphan transaction are? If yes, how would you do it?

  14. Does the functional test cover the new code thoroughly? How did you evaluate coverage?

  117:00 <glozow> #startmeeting
  217:00 <kevkevin> hi
  317:00 <glozow> Hi everyone! This is the PR Review Club meeting
  417:00 <Guest27> hi
  517:00 <monlovesmango> heyy
  617:01 <danielabrozzoni> hi :)
  717:01 <dzxzg> hi!
  817:01 <glozow> We are looking at "add getorphantxs" today. Notes and questions can be found at https://bitcoincore.reviews/30793
  917:01 <glozow> Did anybody get a chance to review the PR or look at the notes?
 1017:01 <monlovesmango> yes a bit
 1117:01 <kevkevin> was able to breifly
 1217:01 <dzxzg> Reviewed code, didn't have a chance to test
 1317:02 <danielabrozzoni> yes, still need to finish my review
 1417:02 <glozow> Great! :)
 1517:03 <glozow> We'll jump in to the questions now, but feel free to ask anything related to the PR at any time. This meeting is for learning.
 1617:03 <glozow> What is an orphan transaction? At what point do transactions enter the orphanage (can you find the line of code)?
 1717:03 <monlovesmango> orphan transaction is a transaction that has missing inputs
 1817:04 <kevkevin> an orphaned transaction is one where the parents/inputs are unknown
 1917:04 <monlovesmango> is this it..? https://github.com/bitcoin/bitcoin/blob/master/src/net_processing.cpp#L82
 2017:04 <monlovesmango> https://github.com/bitcoin/bitcoin/blob/master/src/net_processing.cpp#L4685
 2117:04 <danielabrozzoni> in PeerManagerImpl::ProcessMessage, when a TX message is received, the transaction is validated; if it fails with TX_MISSING_INPUTS, parents are evaluated and it might be added to the orphanage
 2217:04 <monlovesmango> (ignore first link)
 2317:04 <glozow> monlovesmango: kevkevin: correct definition!
 2417:05 <glozow> monlovesmango: danielabrozzoni: yes exactly
 2517:05 <glozow> Here's a permalink: https://github.com/bitcoin/bitcoin/blob/6a370435526875a441001be8c44c9b94a2361c8c/src/net_processing.cpp#L4678
 2617:05 <glozow> What command can you run to get a list of available RPCs?
 2717:06 <monlovesmango> bitcoin-cli -help is all I know of
 2817:06 <Naiyoma> bitcoin-cli help
 2917:06 <kevkevin> can't you just run bitcoin-cli -help
 3017:06 <glozow> correct
 3117:06 <kevkevin> I think maybe you can add something for the hidden rpc's
 3217:06 <danielabrozzoni> is there a way to get the hidden rpcs in help?
 3317:06 <Guest27> "-help" is for the "client" and "help" is for the RPCs, right?
 3417:06 <glozow> oh, good question...
 3517:07 <glozow> I'm not sure if you can get a list of hidden ones
 3617:07 <kevkevin> Guest27 ya your right it should be bitcoin-cli help
 3717:08 <glozow> yeah, no dash for the RPCs
 3817:08 <glozow> What is the benefit of the first commit, which creates a public OrphanTxBase and extends that in a protected struct OrphanTx?
 3917:09 <monlovesmango> weird, i need the dash with bitcoin-cli
 4017:09 <glozow> monlovesmango: what error do you get without the dash?
 4117:10 <monlovesmango> that the server isn't running
 4217:10 <glozow> monlovesmango: is your node not running maybe?
 4317:10 <kevkevin> maybe you need bitcoind running?
 4417:10 <monlovesmango> oh ok i see thank you!
 4517:11 <dzxzg> the public OrphanTxBase makes PeerManager able to handle OrphanTxBase
 4617:12 <glozow> dzxzg: right, that's part of it. peerman isn't able to see the definition of `struct OrphanTx`. But why not just make that public?
 4717:13 <danielabrozzoni> i think the idea is to expose only some info about orphan txs, but not all of them, so there is a public structure and a protected one. i'm not sure why OrphanTx extends OrphanTxBase, maybe to avoid changing a lot of code to use the new Base structure?
 4817:13 <dzxzg> +1 I think we don't want to expose any more members of OrphanTx than are necessary
 4917:13 <tdb3> danielabrozzoni: dzxzg: bingo
 5017:13 <glozow> danielabrozzoni: that is my understanding as well
 5117:14 <glozow> More general c++ question: What is the difference between public, protected, and private? When should you use each for a class member/method?
 5217:15 <Naiyoma> public, means that attributes and methods are accessible everywhere
 5317:16 <dzxzg> private members are not accessible by any functions outside of the class
 5417:16 <kevkevin> public any thing can access its values, private only the class functions can access the values and protected I'm not sure tbh
 5517:16 <dzxzg> protected makes members accessible as private to descendants
 5617:16 <monlovesmango> public is for things that can be accessed externally, protected is for when only children can access, and private is for no external access
 5717:16 <glozow> great answers
 5817:16 <glozow> when should you use protected instead of private?
 5917:17 <Guest27> when you really really want to give access to subclasses, but nothing else.
 6017:17 <monlovesmango> when children should be able to access but no one else should be able to
 6117:18 <Naiyoma> when you don't need to completely restrict access, to derived classes
 6217:18 <kevkevin> use protected when you're expecting children to access the data but you dont want anything else to
 6317:18 <Guest27> if in doubt - use private.
 6417:18 <glozow> I suppose there isn't 1 right answer to this question. I've seen "use private as much as possible" and "private for members, protected for internal methods"
 6517:19 <kevkevin> So this means that if we extend OrphanTx then that class should be able to access the variables defined there? Where if it was private we would not be able to?
 6617:19 <monlovesmango> i guess protected for anything that should be inherited then?
 6717:19 <glozow> You mean subclasses of `TxOrphanage`?
 6817:20 <Guest27> if one uses private by default, it is clear that subclasses don't need those parts. you don't need to grep around.
 6917:20 <kevkevin> sorry ya subclasses of TxOrphanage
 7017:21 <glozow> kevkevin: yes. I haven't tried this, but you can try editing and the `TxOrphanageTest` class in orphanage_tests.cpp should complain
 7117:21 <kevkevin> ok sounds good thanks!
 7217:21 <glozow> If an RPC has a non-string argument, does anything special need to be done to handle it?
 7317:25 <Naiyoma> yes, I think during RPCHelpMan declaration the argument type is defined,  for this rpc its RPCArg::Type::NUM
 7417:26 <luisschwab> You have to add it to the vRPCConvertParams list
 7517:26 <Naiyoma> but also this checks ParseVerbosity for a bool or a int
 7617:27 <kevkevin> I think this comment tells a little bit about that https://github.com/bitcoin/bitcoin/blob/b6368fc285bf00b3033061dcd4e29298b227c6df/src/rpc/client.cpp#L25
 7717:27 <kevkevin> we need to add the arg to src/rpc/client.cpp
 7817:28 <glozow> Looks correct to me... tdb3 did you have anything to add?
 7917:29 <tdb3> nothing for now
 8017:29 <tdb3> (for that question)
 8117:29 <glozow> We already mentioned this, but What specifically does it mean that the RPC is “hidden”? Why hidden and not net?
 8217:29 <glozow> https://github.com/bitcoin-core-review-club/bitcoin/commit/8ec094959dc6afd46a709190d2deb58a50593fb7#diff-9c5b83de6dc84af277e352c88b9291aa44340a3c75f572a0b51661eb0a838de9R1131
 8317:30 <luisschwab> It doesn't show up on `bitcoin-cli help`. Maybe because this RPC is more aimed at developers instead of end users.
 8417:30 <kevkevin> because the rpc is meant to be used by developers
 8517:31 <monlovesmango> bc its mostly for devs and this also allow room for future changes
 8617:31 <glozow> Right, we don't want to overwhelm everyday users with a big list of RPCs that are unlikely to be relevant to them
 8717:31 <kevkevin> is there a reason why in `bitcoin-cli help` there is no way to show the hidden rpcs with a arg like `bitcoin-cli help -hidden`
 8817:32 <glozow> Why can’t we just directly access the orphanage from the RPC code? Why don’t we just add a PeerManager function that returns a reference to the `TxOrphanage`, which would be more extensible?
 8917:33 <glozow> kevkevin: idk. I suppose if you want the hidden RPCs, it's probably equally convenient for you to look at the source code for it
 9017:33 <glozow> Maybe there is a way to get the list I'm just not aware of
 9117:34 <dzxzg> I think similarly to the above discussion about why TxOrphanage remains protected, it's always best to expose as little interface as possible
 9217:34 <dzxzg> or not always, but as a rule of thumb
 9317:34 <glozow> dzxzg: yep agreed
 9417:35 <glozow> What is the maximum size of the result from this RPC (I'm looking for some rough math)?
 9517:35 <glozow> Let's start with verbosity = 0. What's the max size, roughly?
 9617:36 <tdb3> glozow: by max size, are we discussing bytes or number of elements returned?
 9717:36 <glozow> Hint: Is there a limit to how many orphans are retained?
 9817:36 <glozow> tdb3: rough bytes
 9917:37 <danielabrozzoni> By default, it's 100 orphans max (DEFAULT_MAX_ORPHAN_TRANSACTIONS)
10017:37 <luisschwab> 100 orphans limit, 32 bytes per, ~3200 bytes (plus brackets and commas)
10117:37 <glozow> danielabrozzoni: luisschwab: yep! that's the kind of answer I was looking for
10217:38 <glozow> verbosity = 1 adds some more bytes on top of that, but still 100 entries max. What about verbosity = 2?
10317:38 <kevkevin> ya looks like here we limit orphan amount https://github.com/bitcoin/bitcoin/blob/6a370435526875a441001be8c44c9b94a2361c8c/src/net_processing.cpp#L4687C49-L4687C63
10417:38 <glozow> kevkevin: yep, great find
10517:39 <glozow> Bonus question for the line that kevkevin sent (exercise your `git log` and `git blame` skills): Since when has there been a maximum orphanage size? What about maximum orphan size?
10617:39 <kevkevin> is there a limit to size returned if we change max_orphan_txs?
10717:40 <glozow> kevkevin: it'd still be a multiple of that number
10817:41 <kevkevin> glozow oh ya thats true
10917:42 <glozow> for verbosity = 2, I was looking for answer along the lines of "max size of the hex is 400kB since orphan transactions can't be larger than that"
11017:42 <glozow> Anybody do some git log/blameing?
11117:43 <kevkevin> I'm trying to do `git log -L 4686,4688:./src/net_processing.cpp` rn
11217:43 <luisschwab> glozow: since 2023-07-25 for orphanage size
11317:43 <glozow> luisschwab: are you sure?
11417:43 <Naiyoma> found this commit for MAX_ORPHAN_TRANSACTIONS https://github.com/bitcoin/bitcoin/commit/142e604184e3ab6dcbe02cebcbe08e5623182b81#diff-910d89612d74e91ae70ed40289b3910b1c1a09b1f5a1bb0b15849f70760cbba2R36
11517:44 <dzxzg> 100 * (txid(4) + wtxid(4) + size (4 bytes) + virtualsize ( 8 bytes) + weight (4 bytes) + expiration (8 bytes) + 8 bytes (nodeid) + longest allowed transaction is 100,000 bytes)
11617:44 <glozow> Naiyoma: fantastic :)
11717:44 <glozow> Took me a while. 2012!
11817:45 <dzxzg> ~1 MB
11917:45 <dzxzg> did I make a mistake?
12017:45 <kevkevin> oh wow 2012 thats a while ago
12117:45 <glozow> dzxzg: looks pretty good! but remember that it's 100k virtual bytes, so 400kB
12217:46 <dzxzg> Oh!
12317:47 <glozow> https://github.com/bitcoin/bitcoin/blob/6a370435526875a441001be8c44c9b94a2361c8c/src/txorphanage.cpp#L30
12417:47 <instagibbs> "git log -S MAX_ORPHAN_TRANSACTIONS --source --all" then paging all the way down is how I find these historical things
12517:47 <luisschwab> Oh, I was checking the last update on that line, it seems he made a style change last year
12617:48 <glozow> instagibbs: great tip, great for finding stuff in an old branch too
12717:49 <glozow> luisschwab: right, you don't want the last time the line was touched, you want to keep digging until you find the original commit it was introduced
12817:49 <glozow> Using the `getorphantxs` RPC, would we be able to tell how long a transaction has been in the orphanage? If yes, how would you do it?
12917:49 <dzxzg> you can also do git blame <commit>^ to see the blame for the parent of a commit
13017:50 <glozow> I like the vim-fugitive plugin: https://github.com/tpope/vim-fugitive
13117:51 <glozow> It's very quick for figuring out which commit to --fixup as well
13217:52 <luisschwab> It has a 20 minute life in the orphanage, and the RPC returns the expiration timestamp.
13317:52 <danielabrozzoni> yes, you can calculate when a transaction was inserted by looking at the expiration (using verbosity = 1) and subtracting ORPHAN_TX_EXPIRE_TIME
13417:52 <glozow> danielabrozzoni: yep!
13517:52 <kevkevin> I see NodeSeconds nTimeExpire maybe that can be used to tell how long its been there
13617:52 <glozow> btw, orphan expiration has been around since 2016: https://github.com/bitcoin/bitcoin/commit/11cc143895e730002749f0881c4c95635fa54bd5
13717:53 <dzxzg> (I use this lua script to also be able to quickly open bitcoin core PR's when using fugitive in neovim: https://gist.github.com/davidgumberg/50c42abd59214a444b2117beb8648369)
13817:53 <Naiyoma> + using expiration timestamp
13917:53 <glozow> dzxzg: niiiiice
14017:55 <glozow> Using this RPC, would we be able to tell what the inputs of an orphan transaction are? If yes, how would you do it?
14117:55 <monlovesmango> with verbosity 2 you can get the hex?
14217:56 <luisschwab> deserialize the hex
14317:56 <Naiyoma> decoderawtransaction "hexstring"
14417:56 <glozow> yep
14517:56 <glozow> Btw, did anybody try 0xb10c's visualizer? https://observablehq.com/d/a481f4ced64b7975
14617:59 <luisschwab> yeah, pretty cool
14717:59 <glozow> Ok y'all, we are about out of time. Make sure you review the tests as well. And post your reviews on the PR!
14817:59 <tdb3> thanks all
14917:59 <monlovesmango> thanks glozow and tdb3!
15017:59 <Emc99> Thanks
15117:59 <danielabrozzoni> thanks for hosting, i learned a lot!
15217:59 <dzxzg> Thanks!
15317:59 <Naiyoma> thanks
15417:59 <Guest27> thanks!
15517:59 <glozow> A good way to test is of course to try it on mainnet, and look at the transactions in your orphanage. I sanity checked against mempool.space to see that the sizes for example, because it was a little bit suspicious how many were vsize = 141 :P
15618:00 <glozow> Thanks for coming!
15718:00 <kevkevin> Thank you!!!!
15818:00 <glozow> #endmeeting