Add getblockbyheight method / support @height in place of blockhash for getblock etc (rpc/rest/zmq)

https://github.com/bitcoin/bitcoin/pull/16345

Host: jnewbery  -  PR author: emilengler

Notes

  • The getblock RPC method takes a block hash as its first argument.
  • Users often want to get the block details for a block at a specific height, and so need to run getblockhash <height> and then getblock <hash> with the hash returned.
  • There have been several user requests and PRs to change the getblock RPC to accept a block height and return information about that block at that height in the main chain.
  • PR 16345 adds a new RPC method getblockatheight, which duplicates the logic in getblock, but takes a height instead of a block hash as its first argument.
  • PR 16317 instead allows all RPC methods arguments that take a block hash to instead take a block height when prefixed with an @ (eg getblock @123456 would get information about the block at height 123456).

Questions

  • What other solutions have been proposed to this user request?
  • What are the problems with overloading RPC arguments to be interpreted as different types?
  • What test cases have been added to these two PRs? What additional test cases could be added?
  • Which approach do you like best?

Meeting Log

  113:00 <jnewbery> hi
  213:00 <kanzure> hi
  313:00 <emilengler> Hi
  413:00 <emilengler> Does it start now :)
  513:00 <lightlike> hello
  613:00 <ajonas> hi
  713:00 <Emzy> Hi
  813:01 <jnewbery> notes/questions for today's meeting: https://bitcoin-core-review-club.github.io/16345.html
  913:01 <digi_james> Hello ..
 1013:01 <emilengler> Oh that is my pull request
 1113:01 <ccdle12> hello
 1213:01 <jnewbery> Welcome everyone! I'll start with my normal reminder: The point of the review club is to give participants the tools and knowledge they need to take part in the Bitcoin Core review process on github.
 1313:02 <jnewbery> This meeting is for you to ask any questions you want about the review process, so you can test/review/leave comments on github.
 1413:03 <jnewbery> one other announcement before we begin. I recently became aware of this repo from fanquake: https://github.com/fanquake/core-review
 1513:03 <jnewbery> it's got a bunch of tools for helping with Bitcoin Core review. Might be worth taking a look at.
 1613:03 <jnewbery> ok, let's get started
 1713:04 <jnewbery> from the notes: "Users often want to get the block details for a block at a specific height, and so need to run getblockhash <height> and then getblock <hash> with the hash returned."
 1813:04 <jnewbery> First question: What other solutions have been proposed to this user request?
 1913:04 <emilengler> Two PRs
 2013:04 <emilengler> And others which overload arguments at the getblock method
 2113:04 <Emzy> To use a prefix for the hight
 2213:04 <jkczyz> Use parameter type; string for hash and integer for height
 2313:05 <emilengler> Like getblock 42 to get block 42 or getblock HASHOFBLOCK42
 2413:05 <jnewbery> any others?
 2513:05 <fjahr> 1. plain overloading 2. overloading with prefix 3. new rpc call
 2613:05 <emilengler> jnewbery: iirc no
 2713:05 <jkczyz> JSON RPC promiose pipelinig
 2813:06 <jnewbery> jkczyz: yes, that was proposed by wumpus I think
 2913:06 <jnewbery> the comment here: https://github.com/bitcoin/bitcoin/pull/16439#issuecomment-514038924 gives a good summary of different approaches
 3013:06 <lightlike> also leave as is and leave it up to the users to create aliases etc. locally
 3113:06 <jkczyz> yep, to avoid the round trip but still define the interface in terms of primitives
 3213:07 <jnewbery> ok, next question: What are the problems with overloading RPC arguments to be interpreted as different types?
 3313:07 <emilengler> It might be interpreted wrong
 3413:07 <jkczyz> Might be ambiguous as to whether the user intended a height or hash
 3513:08 <Emzy> Introducing parsing bugs.
 3613:08 <emilengler> especially in plain overloading
 3713:08 <jnewbery> jkczyz: right. The block hash is in hex, so it's possible (although unlikely) that every hex character is a digit, and the RPC would interpret it as a height
 3813:08 <jnewbery> Emzy: yes, it makes parsing logic more complex
 3913:08 <PaulTroon> script could fail in a way that gives bad input and then call returns wrong value instead of error
 4013:08 <emilengler> This wouldn't happen with a prefix character
 4113:09 <hanh> Changing a widely-used interface may break software depending on it.
 4213:09 <jkczyz> parameter names could be confusing if named as hash but given a height
 4313:09 <emilengler> +1
 4413:10 <jnewbery> There is already one RPC that overloads an argument in this way. Does anyone know which one?
 4513:10 <jnewbery> actually more than one
 4613:11 <jnewbery> getblockstats has a hash_or_height argument that can take a blockhash or height: https://github.com/bitcoin/bitcoin/blob/e5fdda68c6d2313edb74443f0d1e6d2ce2d97f5e/src/rpc/blockchain.cpp#L1770
 4713:12 <jnewbery> and getblock has a 'verbosity' argument that can take a number or bool: https://github.com/bitcoin/bitcoin/blob/e5fdda68c6d2313edb74443f0d1e6d2ce2d97f5e/src/rpc/blockchain.cpp#L877
 4813:12 <emilengler> Is a hash a num?
 4913:12 <jnewbery> one issue with these is that it makes type-checking a bit messier
 5013:12 <jnewbery> emilengler: hash is always a string in RPC methods
 5113:12 <digi_james> How is parsing ambuity solved with getblockstats?
 5213:12 <emilengler> But why is it RPCARG::Type::NUM then
 5313:13 <jnewbery> you can see here: https://github.com/bitcoin/bitcoin/blob/e5fdda68c6d2313edb74443f0d1e6d2ce2d97f5e/src/rpc/blockchain.cpp#L1826 if isNum() {} else {}
 5413:14 <jnewbery> emilengler: To allow the case where a block height is passed
 5513:14 <achow101> range in importmulti can also be multiple types
 5613:14 <achow101> as can timestamp
 5713:14 <emilengler> jnewbery: But why is it a num then?
 5813:14 <emilengler> This is also a problem of overloading
 5913:14 <emilengler> Could be problematic at autocomplete
 6013:15 <emilengler> And how do you transmit the hash? It is just weird if params:["hash"]
 6113:15 <jnewbery> yeah - overloading can make client code more complex
 6213:16 <emilengler> And why is getblockstats using a NUM and not a STR
 6313:16 <emilengler> A string can contain ASCII numbers which can be converted to real numbers then
 6413:16 <emilengler> But vice versa it is a bit more complicated
 6513:16 <emilengler> So a STR would be probably a better choice then
 6613:17 <emilengler> And there we have the problems of overloading...
 6713:17 <jnewbery> The proposal in https://github.com/bitcoin/bitcoin/pull/16439 allows a string argument to represent either a hash or a height (when prefixed with @). What are people's thoughts about that?
 6813:18 <Emzy> Better because no dublicated code.
 6913:18 <jkczyz> I'd prefer it to be done at the client side rather than server side
 7013:19 <PaulTroon> would prevent accidental entry interpreted as height
 7113:19 <lightlike> it is nice that is solves the problem for many RPCs, not just for getblock
 7213:19 <jnewbery> Emzy: yeah, I think the duplicate code is a problem in 16345. It could be re-implemented with less duplicate code though
 7313:19 <emilengler> Worser because overloading which I don't think is agood approach
 7413:19 <jkczyz> While the extra round trip is not desirable, once you get a block you can get the next and previous hash easily
 7513:20 <emilengler> Maybe it is also causing backwards comptabitlity issues
 7613:20 <jnewbery> jkczyz: what do you think about ryanofsky's comment "Can someone be specific about an actual advantage that would come from moving logic from server to client, or be clear about what types of logic should be moved?" ...
 7713:20 <Emzy> I think it would be harder to optimize it. You would simply do it in sequence.
 7813:20 <emilengler> I general don't like to change existing RPC calls except new parameters are being added
 7913:20 <jnewbery> "If I'm writing a rust client or any other type of client, it would seem like the more logic there is in bitcoin-cli, the more work I have to do in rust to get access to the same functionality, and the greater chance I have of introducing bugs or inconsistencies."
 8013:21 <jnewbery> lightlike: yeah, I agree. It's nice that a small amount of code can add this functionality for many RPC methods
 8113:22 <jnewbery> emilengler: I can't see how this would be an issue. This is making the RPC method more permissive. It shouldn't impact any clients that are already using those methods
 8213:22 <PaulTroon> old code can't accidentally trigger new behavior, but new code can use it with out adding a new set of RPC calls to support
 8313:22 <jnewbery> PaulTroon: exactly
 8413:22 <emilengler> jnewbery: Are you sure? It is changing a parameter type
 8513:22 <jnewbery> the parameter type is a string in all cases
 8613:23 <jnewbery> either the hex string for the block hash or a string "@<height>" for a block height
 8713:23 <PaulTroon> changing the behavior of getblockstats would be problematic though, so it would probably have to continue to use a different system
 8813:24 <jnewbery> the only (slight) change in behaviour is for the height_or_hash parameter in getblockstats
 8913:24 <emilengler> jnewbery: Yeah but is now using RPCArg::Type::BLOCK_REF
 9013:24 <emilengler> And not RPCArg::Type::STR_HEX
 9113:24 <PaulTroon> perhaps could continue to use the deprecated form with out @
 9213:24 <jnewbery> you can see that aj has maintained that parameter, but placed it behind a deprecation flag: https://github.com/bitcoin/bitcoin/pull/16439/commits/792c2a5dbca346e1b693446accd68b6a5448292f
 9313:24 <emilengler> Maybe this causing issues but I'm not 100% sure
 9413:24 <jkczyz> jnewbery: Generally, I'm in favor if keeping the interface here simple given the hash at given block height can change near head. Otherwise, I would lean more towards hiding the complexity from the client.
 9513:25 <jnewbery> we talked about RPC deprecation in some detail a few weeks ago: https://bitcoin-core-review-club.github.io/15996.html
 9613:26 <jnewbery> jkczyz: that's a good point. One response on twitter to this PR review club topic was "Ugh, I see this causing people to do a lot of stupid things that don't handle forks and orphans correctly."
 9713:26 <jnewbery> can anyone give more detail or examples of how that could cause issues?
 9813:27 <emilengler> The overloading?
 9913:27 <lightlike> getblock<hash> will stay the same during a reorg, while getblock<height> will change
10013:27 <jnewbery> lightlike: exactly
10113:27 <digi_james> lightlike: +1
10213:28 <jnewbery> you could imagine someone doing `getblock<@height>` followed by `getblockstats<@height>` or similar during a re-org
10313:28 <Emzy> this sould be expected if you are at the tip of the chain.
10413:28 <jnewbery> and expecting the results to be for the same block
10513:29 <Emzy> ok. that a good example, for using the hash instead.
10613:30 <jkczyz> Also, traversing the chain should ideally follow next/prev links rather than incrementing the height
10713:30 <jnewbery> but often, I use `get block $(getblockhash(<height>))`. That would also change if I ran it across a re-org
10813:30 <emilengler> jnewbery: Why
10913:31 <emilengler> It would still work
11013:31 <jnewbery> because the inner call to getblockhash would return a different hash
11113:32 <jnewbery> if I'm not inspecting or storing the hash returned and just using it immediately in a call, then it's basically equivalent to just calling the method with a height
11213:32 <emilengler> jnewbery: Is getblockhash also being changed?
11313:32 <jnewbery> yes, getblockhash() will return the hash of the block at that height in the main chain
11413:32 <jnewbery> which can change during a re-org
11513:32 <emilengler> Yeah I know
11613:32 <emilengler> Oh know I got what you mean
11713:32 <emilengler> now*
11813:33 <emilengler> But then using heights as identifiers is general a bad approach
11913:33 <emilengler> No matter which implementation
12013:34 <jnewbery> right, near the tip, height shouldn't be used as a unique identifier for blocks
12113:34 <jnewbery> Next question: What test cases have been added to these two PRs? What additional test cases could be added?
12213:34 <jkczyz> getblock(getblockhash(<height>)) == getblock(@<height>)
12313:35 <aj> (hmm, what if we made "@589082" give an error "height is too near the tip" ?)
12413:35 <jnewbery> jkczyz: yeah, I think that's the test case added in 16439
12513:35 <jkczyz> Could add tests for exceptional cases (height is out of bounds)
12613:35 <jnewbery> a wild aj appears!
12713:35 <jnewbery> aj: I think I like that
12813:36 <jnewbery> jkczyz: I agree. The tests are only testing the happy case. I don't think any of the failure modes are tested
12913:36 <jnewbery> ie giving a too-high height or a -ve height
13013:36 <emzy> woul'd be a warning better?
13113:37 <hanh> Could add parsing exceptional cases, something like "@xyz"
13213:37 <jnewbery> warnings are often ignored. I tend to think of them as mostly useless
13313:37 <jnewbery> hanh: +1
13413:37 <emilengler> hanh: +1
13513:38 <emilengler> Yeah good question what happens when you enter something invalid to getblock @height
13613:38 <emilengler> Like a string
13713:38 <emilengler> Even if its a string technically already
13813:38 <jnewbery> emilengler: it should always be a string
13913:39 <lightlike> aj, jnewbery: wouldn't it be necessary to be able get the block of a near-tip height, even if you know that it might change in a reorg? or do you suggest to refer to the old getblockhash(height) in this case?
14013:39 <PaulTroon> parsing strings is always a rich source of errors
14113:40 <emilengler> PaulTroon: +1, I think tests should be done dynamically
14213:40 <jnewbery> 16439 actually contains two proposals for the price of one: using '@height' and parsing on the server; or using '%height' and parsing on the client. The last two commits implement the '%height' in the client approach
14313:40 <aj> lightlike: if you want the tip, you don't know the exact height and will expect change
14413:40 <aj> lightlike: if you want the tip, you don't know the exact height and will expect changing results -- i mean
14513:41 <digi_james> Near the tip: If a client makes multiple (different) rpc calls using height, and would like to ensure that the return relates to the same chain, it would have to check tip/branch before, and after these rpc calls, to ensure they relate to the same chain. Actually reorgs can happen anytime, this doesn' t help.
14613:41 <jnewbery> lightlike: I'd suggest making the error text descriptive: "height given is close to the tip. Call getblockhash(height) to get the current blockhash at heigh <x>, but be aware that it could change with a shallow reorg." or something similar
14713:42 <aj> lightlike: i think if you want stuff near the tip you probably want to walk backwards from the tip manually, so heights are irrelevant? the times i'd want to say "tell me about block at height X" is for things well in the past where i just don't want to copy-and-paste a full hash...
14813:42 <digi_james> * correlating calls only works with blockhash nvrmind.
14913:43 <jnewbery> ok, final question: Which approach do you like best?
15013:43 <emilengler> I like mine (adding a new RPC call) because I'm skeptical about overloading parameters
15113:44 <jkczyz> Neither :) Actaully, @height in client (e.g., bitcoin-cli) I'd prefer
15213:44 <aj> jkczyz: (that's the %height one jnewbery mentioned)
15313:44 <jkczyz> yep
15413:44 <emzy> The one with the rpc-client change.
15513:45 <emzy> %height in short.
15613:45 <fjahr> It has been said that this was requested multiple times but do we know anything more about this? Who would use this frequently but not frequently enough to use a script/alias?
15713:45 <lightlike> I'm undecided between leave-as-is and the on by aj.
15813:46 <PaulTroon> I like the @height change if consistently used by rest of the RPC, otherwise emilengler's is cleaner
15913:46 <jnewbery> fjahr: good question. I think it's just convenience. People don't like those extra few keystrokes.
16013:47 <emilengler> Also thought about this to be honest :P
16113:47 <jnewbery> did anyone have any other questions about this PR or review in general in the last few minutes?
16213:47 <lightlike> fjahr: also, this is one of the first things many new users will do with bitcoind. at least it was for me.
16313:47 <PaulTroon> any sort of standards document around RPC interface?
16413:47 <emilengler> lightlike: Yeah
16513:48 <emilengler> PaulTroon: standards? This is a Bitcoin Core only thing
16613:48 <hanh> fjahr my experience with this RPC is it's heavily used in blockchain extraction software.
16713:48 <PaulTroon> emilengler: I mean convention within Bitcoin Core, like there is for coding
16813:48 <emilengler> hanh: Like blockexplorers?
16913:49 <emilengler> PaulTroon: I think the manpages are fine for this :)
17013:49 <hanh> That's one example.
17113:49 <hanh> Anywhere you need to extract blokchain data into an external storage
17213:49 <jnewbery> PaulTroon: we've been trying to rationalize the RPC methods. Take a look at RPCHelpMan and RPCArg::Type for example
17313:49 <jkczyz> hanh: yes, but once a block is retrieved the next/prev links could be followed
17413:50 <jnewbery> those are new, and impose some structure on the methods and arguments
17513:50 <PaulTroon> jnewbery: +1
17613:50 <hanh> jkczyz Yes.
17713:50 <emilengler> jkczyz: For this we have the prevHash field and the getblockcount
17813:51 <jkczyz> emilengler: yeah, I think we're on the same page
17913:52 <hanh> emilengler The problem is if you have a pipeline continuously extracting blockchain, you will keep pinging the tip of the chain. Of course, once you get the block, you can use prevHash.
18013:52 <emilengler> Do you going from the top to the bottom
18113:53 <emilengler> mean*
18213:53 <jnewbery> ok, let's wrap it up there. Thanks everyone. Next week is #16115 "On bitcoind startup, write config args to debug.log (config)"
18313:53 <emilengler> Ok, see you next week. Was funny today :)
18413:54 <PaulTroon> +1
18513:54 <digi_james> Thanks!
18613:54 * jkczyz waves
18713:54 <emzy> Thanks jnewbery and all others!
18813:54 <jnewbery> if anyone has suggestions for future PRs to cover or if you want to host a future meeting, please messge me or comment on https://github.com/bitcoin-core-review-club/bitcoin-core-review-club.github.io/issues/14
18913:54 <jnewbery> thanks all!
19013:54 <hanh> Thanks jnewbery
19113:55 <lightlike> thanks!
19213:56 <emilengler> Yes thank you all especially you jnewbery