#16345 Add getblockbyheight method / #16439 support @height in place of blockhash for getblock etc (rpc)

https://github.com/bitcoin/bitcoin/16345

Host: jnewbery

Notes

  • The getblock RPC method takes a block hash as its first argument.
  • Users often want to get the block details for a block at a specific height, and so need to run getblockhash <height> and then getblock <hash> with the hash returned.
  • There have been several user requests and PRs to change the getblock RPC to accept a block height and return information about that block at that height in the main chain.
  • PR 16345 adds a new RPC method getblockatheight, which duplicates the logic in getblock, but takes a height instead of a block hash as its first argument.
  • PR 16317 instead allows all RPC methods arguments that take a block hash to instead take a block height when prefixed with an @ (eg getblock @123456 would get information about the block at height 123456).

Questions

  • What other solutions have been proposed to this user request?
  • What are the problems with overloading RPC arguments to be interpreted as different types?
  • What test cases have been added to these two PRs? What additional test cases could be added?
  • Which approach do you like best?

Meeting Log

13:00 < jnewbery> hi
13:00 < kanzure> hi
13:00 < emilengler> Hi
13:00 < emilengler> Does it start now :)
13:00 < lightlike> hello
13:00 < ajonas> hi
13:00 < Emzy> Hi
13:01 < jnewbery> notes/questions for today's meeting: https://bitcoin-core-review-club.github.io/16345.html
13:01 < digi_james> Hello ..
13:01 < emilengler> Oh that is my pull request
13:01 < ccdle12> hello
13:01 < jnewbery> Welcome everyone! I'll start with my normal reminder: The point of the review club is to give participants the tools and knowledge they need to take part in the Bitcoin Core review process on github.
13:02 < jnewbery> This meeting is for you to ask any questions you want about the review process, so you can test/review/leave comments on github.
13:03 < jnewbery> one other announcement before we begin. I recently became aware of this repo from fanquake: https://github.com/fanquake/core-review
13:03 < jnewbery> it's got a bunch of tools for helping with Bitcoin Core review. Might be worth taking a look at.
13:03 < jnewbery> ok, let's get started
13:04 < jnewbery> from the notes: "Users often want to get the block details for a block at a specific height, and so need to run getblockhash <height> and then getblock <hash> with the hash returned."
13:04 < jnewbery> First question: What other solutions have been proposed to this user request?
13:04 < emilengler> Two PRs
13:04 < emilengler> And others which overload arguments at the getblock method
13:04 < Emzy> To use a prefix for the hight
13:04 < jkczyz> Use parameter type; string for hash and integer for height
13:05 < emilengler> Like getblock 42 to get block 42 or getblock HASHOFBLOCK42
13:05 < jnewbery> any others?
13:05 < fjahr> 1. plain overloading 2. overloading with prefix 3. new rpc call
13:05 < emilengler> jnewbery: iirc no
13:05 < jkczyz> JSON RPC promiose pipelinig
13:06 < jnewbery> jkczyz: yes, that was proposed by wumpus I think
13:06 < jnewbery> the comment here: https://github.com/bitcoin/bitcoin/pull/16439#issuecomment-514038924 gives a good summary of different approaches
13:06 < lightlike> also leave as is and leave it up to the users to create aliases etc. locally
13:06 < jkczyz> yep, to avoid the round trip but still define the interface in terms of primitives
13:07 < jnewbery> ok, next question: What are the problems with overloading RPC arguments to be interpreted as different types?
13:07 < emilengler> It might be interpreted wrong
13:07 < jkczyz> Might be ambiguous as to whether the user intended a height or hash
13:08 < Emzy> Introducing parsing bugs.
13:08 < emilengler> especially in plain overloading
13:08 < jnewbery> jkczyz: right. The block hash is in hex, so it's possible (although unlikely) that every hex character is a digit, and the RPC would interpret it as a height
13:08 < jnewbery> Emzy: yes, it makes parsing logic more complex
13:08 < PaulTroon> script could fail in a way that gives bad input and then call returns wrong value instead of error
13:08 < emilengler> This wouldn't happen with a prefix character
13:09 < hanh> Changing a widely-used interface may break software depending on it.
13:09 < jkczyz> parameter names could be confusing if named as hash but given a height
13:09 < emilengler> +1
13:10 < jnewbery> There is already one RPC that overloads an argument in this way. Does anyone know which one?
13:10 < jnewbery> actually more than one
13:11 < jnewbery> getblockstats has a hash_or_height argument that can take a blockhash or height: https://github.com/bitcoin/bitcoin/blob/e5fdda68c6d2313edb74443f0d1e6d2ce2d97f5e/src/rpc/blockchain.cpp#L1770
13:12 < jnewbery> and getblock has a 'verbosity' argument that can take a number or bool: https://github.com/bitcoin/bitcoin/blob/e5fdda68c6d2313edb74443f0d1e6d2ce2d97f5e/src/rpc/blockchain.cpp#L877
13:12 < emilengler> Is a hash a num?
13:12 < jnewbery> one issue with these is that it makes type-checking a bit messier
13:12 < jnewbery> emilengler: hash is always a string in RPC methods
13:12 < digi_james> How is parsing ambuity solved with getblockstats?
13:12 < emilengler> But why is it RPCARG::Type::NUM then
13:13 < jnewbery> you can see here: https://github.com/bitcoin/bitcoin/blob/e5fdda68c6d2313edb74443f0d1e6d2ce2d97f5e/src/rpc/blockchain.cpp#L1826 if isNum() {} else {}
13:14 < jnewbery> emilengler: To allow the case where a block height is passed
13:14 < achow101> range in importmulti can also be multiple types
13:14 < achow101> as can timestamp
13:14 < emilengler> jnewbery: But why is it a num then?
13:14 < emilengler> This is also a problem of overloading
13:14 < emilengler> Could be problematic at autocomplete
13:15 < emilengler> And how do you transmit the hash? It is just weird if params:["hash"]
13:15 < jnewbery> yeah - overloading can make client code more complex
13:16 < emilengler> And why is getblockstats using a NUM and not a STR
13:16 < emilengler> A string can contain ASCII numbers which can be converted to real numbers then
13:16 < emilengler> But vice versa it is a bit more complicated
13:16 < emilengler> So a STR would be probably a better choice then
13:17 < emilengler> And there we have the problems of overloading...
13:17 < jnewbery> The proposal in https://github.com/bitcoin/bitcoin/pull/16439 allows a string argument to represent either a hash or a height (when prefixed with @). What are people's thoughts about that?
13:18 < Emzy> Better because no dublicated code.
13:18 < jkczyz> I'd prefer it to be done at the client side rather than server side
13:19 < PaulTroon> would prevent accidental entry interpreted as height
13:19 < lightlike> it is nice that is solves the problem for many RPCs, not just for getblock
13:19 < jnewbery> Emzy: yeah, I think the duplicate code is a problem in 16345. It could be re-implemented with less duplicate code though
13:19 < emilengler> Worser because overloading which I don't think is agood approach
13:19 < jkczyz> While the extra round trip is not desirable, once you get a block you can get the next and previous hash easily
13:20 < emilengler> Maybe it is also causing backwards comptabitlity issues
13:20 < jnewbery> jkczyz: what do you think about ryanofsky's comment "Can someone be specific about an actual advantage that would come from moving logic from server to client, or be clear about what types of logic should be moved?" ...
13:20 < Emzy> I think it would be harder to optimize it. You would simply do it in sequence.
13:20 < emilengler> I general don't like to change existing RPC calls except new parameters are being added
13:20 < jnewbery> "If I'm writing a rust client or any other type of client, it would seem like the more logic there is in bitcoin-cli, the more work I have to do in rust to get access to the same functionality, and the greater chance I have of introducing bugs or inconsistencies."
13:21 < jnewbery> lightlike: yeah, I agree. It's nice that a small amount of code can add this functionality for many RPC methods
13:22 < jnewbery> emilengler: I can't see how this would be an issue. This is making the RPC method more permissive. It shouldn't impact any clients that are already using those methods
13:22 < PaulTroon> old code can't accidentally trigger new behavior, but new code can use it with out adding a new set of RPC calls to support
13:22 < jnewbery> PaulTroon: exactly
13:22 < emilengler> jnewbery: Are you sure? It is changing a parameter type
13:22 < jnewbery> the parameter type is a string in all cases
13:23 < jnewbery> either the hex string for the block hash or a string "@<height>" for a block height
13:23 < PaulTroon> changing the behavior of getblockstats would be problematic though, so it would probably have to continue to use a different system
13:24 < jnewbery> the only (slight) change in behaviour is for the height_or_hash parameter in getblockstats
13:24 < emilengler> jnewbery: Yeah but is now using RPCArg::Type::BLOCK_REF
13:24 < emilengler> And not RPCArg::Type::STR_HEX
13:24 < PaulTroon> perhaps could continue to use the deprecated form with out @
13:24 < jnewbery> you can see that aj has maintained that parameter, but placed it behind a deprecation flag: https://github.com/bitcoin/bitcoin/pull/16439/commits/792c2a5dbca346e1b693446accd68b6a5448292f
13:24 < emilengler> Maybe this causing issues but I'm not 100% sure
13:24 < jkczyz> jnewbery: Generally, I'm in favor if keeping the interface here simple given the hash at given block height can change near head. Otherwise, I would lean more towards hiding the complexity from the client.
13:25 < jnewbery> we talked about RPC deprecation in some detail a few weeks ago: https://bitcoin-core-review-club.github.io/15996.html
13:26 < jnewbery> jkczyz: that's a good point. One response on twitter to this PR review club topic was "Ugh, I see this causing people to do a lot of stupid things that don't handle forks and orphans correctly."
13:26 < jnewbery> can anyone give more detail or examples of how that could cause issues?
13:27 < emilengler> The overloading?
13:27 < lightlike> getblock<hash> will stay the same during a reorg, while getblock<height> will change
13:27 < jnewbery> lightlike: exactly
13:27 < digi_james> lightlike: +1
13:28 < jnewbery> you could imagine someone doing `getblock<@height>` followed by `getblockstats<@height>` or similar during a re-org
13:28 < Emzy> this sould be expected if you are at the tip of the chain.
13:28 < jnewbery> and expecting the results to be for the same block
13:29 < Emzy> ok. that a good example, for using the hash instead.
13:30 < jkczyz> Also, traversing the chain should ideally follow next/prev links rather than incrementing the height
13:30 < jnewbery> but often, I use `get block $(getblockhash(<height>))`. That would also change if I ran it across a re-org
13:30 < emilengler> jnewbery: Why
13:31 < emilengler> It would still work
13:31 < jnewbery> because the inner call to getblockhash would return a different hash
13:32 < jnewbery> if I'm not inspecting or storing the hash returned and just using it immediately in a call, then it's basically equivalent to just calling the method with a height
13:32 < emilengler> jnewbery: Is getblockhash also being changed?
13:32 < jnewbery> yes, getblockhash() will return the hash of the block at that height in the main chain
13:32 < jnewbery> which can change during a re-org
13:32 < emilengler> Yeah I know
13:32 < emilengler> Oh know I got what you mean
13:32 < emilengler> now*
13:33 < emilengler> But then using heights as identifiers is general a bad approach
13:33 < emilengler> No matter which implementation
13:34 < jnewbery> right, near the tip, height shouldn't be used as a unique identifier for blocks
13:34 < jnewbery> Next question: What test cases have been added to these two PRs? What additional test cases could be added?
13:34 < jkczyz> getblock(getblockhash(<height>)) == getblock(@<height>)
13:35 < aj> (hmm, what if we made "@589082" give an error "height is too near the tip" ?)
13:35 < jnewbery> jkczyz: yeah, I think that's the test case added in 16439
13:35 < jkczyz> Could add tests for exceptional cases (height is out of bounds)
13:35 < jnewbery> a wild aj appears!
13:35 < jnewbery> aj: I think I like that
13:36 < jnewbery> jkczyz: I agree. The tests are only testing the happy case. I don't think any of the failure modes are tested
13:36 < jnewbery> ie giving a too-high height or a -ve height
13:36 < emzy> woul'd be a warning better?
13:37 < hanh> Could add parsing exceptional cases, something like "@xyz"
13:37 < jnewbery> warnings are often ignored. I tend to think of them as mostly useless
13:37 < jnewbery> hanh: +1
13:37 < emilengler> hanh: +1
13:38 < emilengler> Yeah good question what happens when you enter something invalid to getblock @height
13:38 < emilengler> Like a string
13:38 < emilengler> Even if its a string technically already
13:38 < jnewbery> emilengler: it should always be a string
13:39 < lightlike> aj, jnewbery: wouldn't it be necessary to be able get the block of a near-tip height, even if you know that it might change in a reorg? or do you suggest to refer to the old getblockhash(height) in this case?
13:39 < PaulTroon> parsing strings is always a rich source of errors
13:40 < emilengler> PaulTroon: +1, I think tests should be done dynamically
13:40 < jnewbery> 16439 actually contains two proposals for the price of one: using '@height' and parsing on the server; or using '%height' and parsing on the client. The last two commits implement the '%height' in the client approach
13:40 < aj> lightlike: if you want the tip, you don't know the exact height and will expect change
13:40 < aj> lightlike: if you want the tip, you don't know the exact height and will expect changing results -- i mean
13:41 < digi_james> Near the tip:  If a client makes multiple (different) rpc calls using height, and would like to ensure that the return relates to the same chain, it would have to check tip/branch before, and after these rpc calls, to ensure they relate to the same chain. Actually reorgs can happen anytime, this doesn' t help.
13:41 < jnewbery> lightlike: I'd suggest making the error text descriptive: "height given is close to the tip. Call getblockhash(height) to get the current blockhash at heigh <x>, but be aware that it could change with a shallow reorg." or something similar
13:42 < aj> lightlike: i think if you want stuff near the tip you probably want to walk backwards from the tip manually, so heights are irrelevant? the times i'd want to say "tell me about block at height X" is for things well in the past where i just don't want to copy-and-paste a full hash...
13:42 < digi_james> * correlating calls only works with blockhash nvrmind.
13:43 < jnewbery> ok, final question: Which approach do you like best?
13:43 < emilengler> I like mine (adding a new RPC call) because I'm skeptical about overloading parameters
13:44 < jkczyz> Neither :) Actaully, @height in client (e.g., bitcoin-cli) I'd prefer
13:44 < aj> jkczyz: (that's the %height one jnewbery mentioned)
13:44 < jkczyz> yep
13:44 < emzy> The one with the rpc-client change.
13:45 < emzy> %height in short.
13:45 < fjahr> It has been said that this was requested multiple times but do we know anything more about this? Who would use this frequently but not frequently enough to use a script/alias?
13:45 < lightlike> I'm undecided between leave-as-is and the on by aj.
13:46 < PaulTroon> I like the @height change if consistently used by rest of the RPC, otherwise emilengler's is cleaner
13:46 < jnewbery> fjahr: good question. I think it's just convenience. People don't like those extra few keystrokes.
13:47 < emilengler> Also thought about this to be honest :P
13:47 < jnewbery> did anyone have any other questions about this PR or review in general in the last few minutes?
13:47 < lightlike> fjahr: also, this is one of the first things many new users will do with bitcoind. at least it was for me.
13:47 < PaulTroon> any sort of standards document around RPC interface?
13:47 < emilengler> lightlike: Yeah
13:48 < emilengler> PaulTroon: standards? This is a Bitcoin Core only thing
13:48 < hanh> fjahr my experience with this RPC is it's heavily used in blockchain extraction software.
13:48 < PaulTroon> emilengler: I mean convention within Bitcoin Core, like there is for coding
13:48 < emilengler> hanh: Like blockexplorers?
13:49 < emilengler> PaulTroon: I think the manpages are fine for this :)
13:49 < hanh> That's one example.
13:49 < hanh> Anywhere you need to extract blokchain data into an external storage
13:49 < jnewbery> PaulTroon: we've been trying to rationalize the RPC methods. Take a look at RPCHelpMan and RPCArg::Type for example
13:49 < jkczyz> hanh: yes, but once a block is retrieved the next/prev links could be followed
13:50 < jnewbery> those are new, and impose some structure on the methods and arguments
13:50 < PaulTroon> jnewbery: +1
13:50 < hanh> jkczyz Yes.
13:50 < emilengler> jkczyz: For this we have the prevHash field and the getblockcount
13:51 < jkczyz> emilengler: yeah, I think we're on the same page
13:52 < hanh> emilengler The problem is if you have a pipeline continuously extracting blockchain, you will keep pinging the tip of the chain. Of course, once you get the block, you can use prevHash.
13:52 < emilengler> Do you going from the top to the bottom
13:53 < emilengler> mean*
13:53 < jnewbery> ok, let's wrap it up there. Thanks everyone. Next week is #16115 "On bitcoind startup, write config args to debug.log (config)"
13:53 < emilengler> Ok, see you next week. Was funny today :)
13:54 < PaulTroon> +1
13:54 < digi_james> Thanks!
13:54  * jkczyz waves
13:54 < emzy> Thanks jnewbery and all others!
13:54 < jnewbery> if anyone has suggestions for future PRs to cover or if you want to host a future meeting, please messge me or comment on https://github.com/bitcoin-core-review-club/bitcoin-core-review-club.github.io/issues/14
13:54 < jnewbery> thanks all!
13:54 < hanh> Thanks jnewbery
13:55 < lightlike> thanks!
13:56 < emilengler> Yes thank you all especially you jnewbery