Use query parameters to control resource loading (build system, rpc/rest/zmq)

https://github.com/bitcoin/bitcoin/pull/24098

Host: stickies-v  -  PR author: stickies-v

The PR branch HEAD was 6628e8f at the time of this review club meeting.

Notes

This PR is a followup to #17631, which we discussed in a previous review club.

PR description

In RESTful APIs, typically path parameters (e.g. /some/unique/resource/) are used to represent resources, and query parameters (e.g. ?sort=asc) are used to control how these resources are being loaded through e.g. sorting, pagination, filtering, etc.

As first discussed in #17631, the current REST api contains two endpoints /headers/ and /blockfilterheaders/ that rather unexpectedly use path parameters to control how many (filter) headers are returned in the response. While this is no critical issue, it is unintuitive and we are still early enough to easily phase this behaviour out and ensure new endpoints (if any) do not have to stick to non-standard behaviour just for internal consistency.

In this PR, a new HTTPRequest::GetQueryParameter method is introduced to easily parse query parameters, as well as two new /headers/ and /blockfilterheaders/ endpoints that use a count query parameter are introduced. The old path parameter-based endpoints are kept without too much overhead, but the documentation now points to the new query parameter-based endpoints as the default interface to encourage standardness.

The COUNT parameter controls how many headers or blockfilterheaders are returned for a single API request, and defaults to 5 if unspecified.

Using the REST API

We’ll test the REST API on signet. To do so, first start a bitcoind instance with the -rest flag enabled. To use the blockfilterheaders endpoint, you’ll also need to set -blockfilterindex=1: ./bitcoind -signet -rest -blockfilterindex=1

As soon as bitcoind is fully up and running, you should be able to query the API, for example by using curl on the command line: curl "127.0.0.1:38332/rest/chaininfo.json" As a response, you should get:

{"chain":"signet","blocks":78071,"headers":78071,"bestblockhash":"000000b3e98c0de440154f42819b56586ed36bad0baa2db8ba5d0950e416dcad","difficulty":0.002873067874458486,"time":1645188967,"mediantime":1645188099,"verificationprogress":0.9999934105995945,"initialblockdownload":false,"chainwork":"000000000000000000000000000000000000000000000000000000dc187f6fa0","size_on_disk":373918704,"pruned":false,"softforks":{"bip34":{"type":"buried","active":true,"height":1},"bip66":{"type":"buried","active":true,"height":1},"bip65":{"type":"buried","active":true,"height":1},"csv":{"type":"buried","active":true,"height":1},"segwit":{"type":"buried","active":true,"height":1},"taproot":{"type":"bip9","bip9":{"status":"active","start_time":-1,"timeout":9223372036854775807,"since":0,"min_activation_height":0},"height":0,"active":true}},"warnings":"This is a pre-release test build - use at your own risk - do not use for mining or merchant applications"}

Questions

  1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK?

  2. In HTTP requests, what is the difference between a path and a query parameter?

  3. What are the benefits of changing the COUNT parameter from a path parameter to a query parameter?

  4. What are the drawbacks of changing the COUNT parameter from a path parameter to a query parameter?

  5. (general discussion) If the effort of implementing a change is already done, what could be some good reasons for rejecting an (unharmful) code change anyway? Do you (dis)agree with this comment?

  6. Does this PR change any of the existing function signatures? If so, why? Can this cause any behaviour change?

  7. Can you list all the commits that introduce behaviour change(s)? Do you feel comfortable about these behaviour change(s)?

  8. Consider the request (signet) GET /rest/blockfilterheaders/basic/2/0000004c6aad0c89c1c060e8e116dcd849e0554935cd78ff9c6a398abeac6eda.json?count=1. What would the response be prior to this PR? What would the response be after this PR? Which (modified) function is responsible for this behaviour change? Try reasoning about it before verifying experimentally.

  9. One reviewer raises the view that /rest/headers/ is a collection endpoint instead of a single resource endpoint. Do you agree? If so, would you change the PR to implement this? What would be the drawbacks of doing that?

  10. Why does this PR introduce both HTTPRequest::GetQueryParameter and GetQueryParameterFromUri? Couldn’t we just put all the logic in HTTPRequest::GetQueryParameter?

  11. Specifically with regards to the structure of the endpoints, do you see any further improvements that could be made?

Meeting Log

  117:00 <stickies-v> #startmeeting
  217:00 <svav> Hi
  317:00 <kouloumos> hi
  417:00 <theStack> hi!
  517:00 <stickies-v> Welcome everyone! Today we're looking at #24098 (https://bitcoincore.reviews/24098) which aims to improve the endpoint logic of the REST API.
  617:00 <glozow> hi!
  717:00 <bitplebpaul> hi!
  817:01 <bitcoin1o1> hi all
  917:01 <jnewbery> hi!
 1017:01 <effexzi> Hi every1
 1117:01 <larryruane> hi
 1217:01 <michaelfolkson> hi
 1317:01 <OliverOffing> hi all!
 1417:01 <stickies-v> Lots of old timers I see - do we also have any first timers with us today?
 1517:01 <Dame> hello
 1617:01 <jaonoctus> hi
 1717:01 <Kaizen_Kintsugi_> Hello!
 1817:02 <schmidty> hi
 1917:02 <galv> stickies-v I'm a first timer.
 2017:02 <sipa> hi
 2117:02 <svav> Can I ask the first timers how they heard of this review club?
 2217:02 <stickies-v> Welcome galv ! It's a very open format here so please feel free to engage in the discussion for as much as you like.
 2317:03 <glozow> welcome galv!
 2417:03 <svav> Welcome galv
 2517:03 <stickies-v> As you may have deduced from some of the later questions, there are still some open discussion topics in this PR. Please don't feel shy to chip in with your thoughts, it helps to know what the community consensus is.
 2617:03 <stickies-v> Who got the chance to read review the PR or read the notes? Can I get a quick y/n? If you tried, did you run into any issues running and testing the REST API?
 2717:03 <galv> I heard about it through someone at a local meetup in the bay area (not crypto-related, but effective altruism related) from someone who works on bitcoin futures contracts.
 2817:03 <willcl_ark> hi
 2917:04 <svav> OK cool thanks galv
 3017:04 <bitplebpaul> y, concept ack
 3117:04 <svav> I read the notes and had a look at the code
 3217:04 <theStack> y (read PR description and discussion, didn't look at the commits yet)
 3317:05 <svav> Was this PR done because people using the functionality were complaining that it was non-standard? Or did the developer just decide it would be better if updated?
 3417:05 <bitcoin1o1> y, concept ack
 3517:05 <kouloumos> y, concept ack, further in the review than usual but still haven't look into the approach
 3617:05 <glozow> just read the notes, learned about parameter types
 3717:06 <OliverOffing> just read the notes and skimming through the code now
 3817:06 <stickies-v> svav I've not heard of any complaints. My main motivation to do it now is to minimize the overhaul needed if/when we have more endpoints in the future. It makes it easier for devs to start using the API the more standard it is.
 3917:06 <stickies-v> Easy one to get started, just to make sure everyone understands the concept of what we're changing here. In HTTP requests, what is the difference between a `path` and a `query` parameter?
 4017:07 <svav> A path parameter is part of the url and identifies where a resource is. They appear to the left of the ?
 4117:07 <svav> A query parameter appears to the right of the ? and controls how the resource is queried, e.g. controls sorting or filtering.
 4217:07 <svav> Can anyone give typical usage of the REST functionality? Who is using it and what for?
 4317:08 <michaelfolkson> Do you use REST over JSON-RPC stickies-v?
 4417:08 <stickies-v> svav very good, although to be pedantic the `query string` appears to the right of the `?`, and the `query string` can consist of multiple `query parameters` separated by an `&`
 4517:09 <jaonoctus> The query parameters are used to sort/filter resources. On the other hand, path parameters are used to identify a specific resource or resources.
 4617:09 <jaonoctus> e.g. /users?sort=asc&name=Joao (where /users is the recource and the rest are the query params)
 4717:10 <OliverOffing> A URL/URI is composed of different parts: <protocol>:<domain>/<path>?<query>, e.g. https://github.com/bitcoin/bitcoin?page=5. The path is generally used to represent objects/resources whereas the query usually represent filters or arguments that shape which of those objects are returned, in which order, and containing which fields
 4817:10 <larryruane> and this is a worldwide convention, so we're just trying to have a more standard interface
 4917:10 <stickies-v> michaelfolkson I'm not sure if I understand your question 100% right, but REST and RPC are different paradigms on how to organize your API, they don't strictly specify the communication protocol. REST is almost always done over HTTP(S), but this is not required
 5017:11 <larryruane> if you start the node with `-rest`, the interface is available only locally? Or available to anyone who can reach the IP addr and port?
 5117:11 <stickies-v> jaonoctus OliverOffing exactly!
 5217:12 <stickies-v> larryruane I suppose this depends on your networking settings?
 5317:12 <stickies-v> Just a few things to add to what's already been answered:
 5417:12 <stickies-v> Well actually just one haha, everything else is covered already, but path parameters are positional, query parameter are key-val structures
 5517:12 <stickies-v> (this becomes important later)
 5617:13 <bitplebpaul> <> /users?sort=asc&name=Joao <> would give all users named Joao, in ascending order?
 5717:14 <stickies-v> bitplebpaul well each API is free to implement its logic however it wants to, but that sounds like what you'd expect (assuming a GET request to this endpoint). This is a collection endpoint.
 5817:14 <stickies-v> I'll move on to the next question already, but in general - always feel free to continue the discussion on previous questions/topics
 5917:14 <stickies-v> What are the [benefits/drawbacks] of changing the `COUNT` parameter from a `path` parameter to a `query` parameter?
 6017:15 <bitplebpaul> benefit -> standardization with the rest of the web & RESTful practices
 6117:15 <willcl_ark> are paths also required, but queries are optional?
 6217:15 <theStack> obvious benefit: following best practices
 6317:15 <svav> The COUNT parameter is best described as a query parameter, and therefore the intuitive and conventional implementation would be to have it as a query parameter.
 6417:15 <bitplebpaul> drawbacks: none? since we are only deprecating and not eliminating the old way.
 6517:16 <OliverOffing> Benefits: "standardized", least developer confusion, more organized
 6617:16 <OliverOffing> Drawbacks: need to change code, need maintain two routes to keep backwards compatibility (for a while)
 6717:16 <stickies-v> bitplebpaul theStack yes exactly, and this is also the main benefit for the current endpoints. There are additional general functional advantages that we can benefit from in the future too, though
 6817:17 <stickies-v> willcl_ark path parameters are indeed required, unless of course you construct multiple endpoints where some path parameters are omitted, but this can become really confusing both for users and for developers as there will often be ambiguity
 6917:17 <svav> Do we know how much the REST interface is being used at the moment? And for what purposes?
 7017:17 <stickies-v> OliverOffing that's indeed the only and main drawback I see (cc bitplebpaul - dev burden is always something to consider)
 7117:19 <svav> For my understanding, are the headers we are referring to Version, Previous Block Hash, Merkle Root, Timestamp, Difficulty Target and Nonce? So e.g. a COUNT of 2, would return the first two of these?
 7217:20 <stickies-v> svav to my knowledge there is no data collection whatsoever, including on usage, so this is difficult to answer. My gut feeling says it's much less used than the RPC, also because it's unauthenticated and only contains a subset of the functionality. Main purpose over RPC is that it's much easier to use (less overhead), so if you don't need any of the RPC endpoints REST is probably the best choice. Also in general
 7317:20 <stickies-v> there's a ton more generic tooling available for REST that can make your life much easier.
 7417:21 <larryruane> svav: "... REST interface is being used..." I thinkthere are general-purpose monitoring mechanisms like Grafana that can easily be plugged into a REST interface, while it would take more work (if even possible) to use the RPC interface
 7517:22 <stickies-v> and also really cool tooling like OpenAPI, if we ever decide to start using that - which is probably a bit of a pipe dream of mine haha
 7617:23 <stickies-v> Alright let's dive into the code. Does this PR change any of the existing function signatures? If so, why? Can this cause any behaviour change?
 7717:23 <theStack> i guess the REST api is potentially useful for browser client-side stuff, e.g. javascript?
 7817:24 <svav> What is meant by a "function signature"?
 7917:25 <Kaizen_Kintsugi_> I think I'm seeing some function signatures changed in tests so far
 8017:25 <Kaizen_Kintsugi_> svav I believe its if the arguments to a function change
 8117:25 <larryruane> there is the scripted-diff rename, maybe that counts as a function signature change?
 8217:25 <stickies-v> theStack in general, it's much less overhead to interface with a HTTP JSON REST API because requests and (de)serialization are super straightforward. Imo, if the REST API has the endpoints you need for your purpose, pretty much any (programmatic) use should be easier instead of RPC I think.
 8317:26 <kouloumos> does ParseDataFormat becoming non-static counts as such change?
 8417:26 <Kaizen_Kintsugi_> Yea I remember that from a previous review, RPC seriealizes and has more overhead
 8517:26 <theStack> stickies-v: thanks, that makes sense!
 8617:27 <stickies-v> larryruane from my understanding function and parameter names are not part of the function signature?
 8717:28 <stickies-v> kouloumos on the money! And do you think that can cause any behaviour change?
 8817:28 <larryruane> stickies-v: "...should be easier instead of RPC..." But isn't the REST interface more restrictive? You can't for example `sendmany` using it, right? I thought REST was in effect readonly
 8917:28 <stickies-v> larryruane exactly, which is why I disclaimed it with "if the REST API has the endpoints you need for your purpose"
 9017:29 <kouloumos> I think not, but I am still sharpening my C++
 9117:30 <stickies-v> svav this doesn't seem to be an official source, but from quick glance this gives some more understanding on function signature: https://www.cs.unm.edu/~storm/C++/ProgrammingTerms/FunctionSignatures.html
 9217:30 <larryruane> stickies-v: ah right, got it +1
 9317:30 <stickies-v> anyone got any ideas on the behaviour change of ParseDataFormat becoming non-static?
 9417:32 <stickies-v> https://github.com/bitcoin/bitcoin/pull/24098/commits/833803e9aa4a107d9b48d0fb51b360b1b3df3b21#diff-590507deaf686d6571f73979df5f3c6da6013a13e597f390b8facce39f7c69d1R135-R136
 9517:32 <Kaizen_Kintsugi_> I only see it being added and not changed
 9617:32 <Kaizen_Kintsugi_> derp
 9717:32 <Kaizen_Kintsugi_> I dont think it would
 9817:32 <larryruane> Not sure if this counts as a behavior change, but when a function is `static`, it can be inlined, but otherwise it can't (unless it's implemented in a header file)
 9917:33 <sipa> it becoming non-static just means no function with the same name can occur in other compilation units
10017:33 <sipa> larryruane: It can still be inlined, but not inlined _away_.
10117:33 <Kaizen_Kintsugi_> but that is a legit function signature change correct
10217:34 <stickies-v> Kaizen_Kintsugi_ I'm actually not 100% sure but I did consider that a signature change yes haha, if not hope someone can correct me
10317:35 <stickies-v> the reason ParseDataFormat had to become static is so it became accessible in the unit test in rest_tests.cpp
10417:35 <stickies-v> *non-static sorry
10517:35 <Kaizen_Kintsugi_> ah
10617:35 <kouloumos> If it's static is only visible to functions in the same file right? so I understand that you did that to be able to use that function in test/rest_test.cpp
10717:36 <stickies-v> kouloumos technically it's only visible within the same translation unit, which in practice should mean same file but I think there are exceptions
10817:36 <Kaizen_Kintsugi_> ah i did not know that static did that
10917:36 <stickies-v> I'm glad to see no one confused the staticness of member functions with non-member functions, hoorah!
11017:37 <Kaizen_Kintsugi_> from the reference "static -- this function can only be seen in this file? No, this means that this function can be called without an instantiated object, as normally member functions (methods) must be called using an instantiation of the class, though with this keyword, you don't need it."
11117:37 <larryruane> and confusingly, a static method within a class means something different, it means you don't need an instance of that object to call the method
11217:37 <Kaizen_Kintsugi_> I see I see
11317:37 <Kaizen_Kintsugi_> that is confusing
11417:37 <larryruane> keyword overloading :)
11517:37 <Kaizen_Kintsugi_> srs
11617:37 <stickies-v> Behaviour change is always something to be extra careful with. Can you list all the commits that introduce behaviour change(s)? Do you feel comfortable about these behaviour change(s)?
11717:38 <bitplebpaul> @larry
11817:38 <bitplebpaul> woops
11917:38 <larryruane> definitely commit 4 is the main one, also commit 2 reinstates the backwards compatibility?
12017:39 <Kaizen_Kintsugi_> 5?
12117:39 <larryruane> (depending on how you count :) i mean as git log shows it)
12217:39 <stickies-v> larryruane 0-indexed? :D
12317:40 <larryruane> ah sorry, 1-indexed
12417:41 <larryruane> (commit 4 = 833803, commit 2 = 395f78)
12517:42 <stickies-v> larryruane yes exactly those two introduce behaviour change, even though those should be commits 3 and 5 so Kaizen_Kintsugi_ yes you're also right on 5
12617:43 <stickies-v> but have a closer look at 833803e9a Handle query string when parsing data format - it modifies ParseDataFormat to return `param` without the query string, so effectively this changes how the API responds. Even though previously we weren't *expecting* people to add query parameters, they still could
12717:44 <stickies-v> Alright we've already partly covered the next question, but just to see if anyone has issues running/testing this I'll quickly cover it anyway
12817:44 <stickies-v> Consider the request (signet) `GET /rest/blockfilterheaders/basic/2/0000004c6aad0c89c1c060e8e116dcd849e0554935cd78ff9c6a398abeac6eda.json?count=1`. What would the response be prior to this PR? What would the response be after this PR? Which (modified) function is responsible for this behaviour change? Try reasoning about it before verifying experimentally.
12917:45 <Kaizen_Kintsugi_> The count parameter of 1 is ignored and the count of 2 is taken after basic?
13017:46 <stickies-v> Kaizen_Kintsugi_ correct, we first check for path length, and only look at the query parameter if the path parameter is missing, as you can see in https://github.com/bitcoin/bitcoin/pull/24098/files#diff-590507deaf686d6571f73979df5f3c6da6013a13e597f390b8facce39f7c69d1R369-R379
13117:47 <bitplebpaul> and post PR the query parameter takes precedence?
13217:47 <stickies-v> Well, we skipped part of the question. Let's address that first. "What would the response be prior to this PR?"
13317:49 <svav> Prior, returns 2 headers of type basic.
13417:49 <Kaizen_Kintsugi_> Well I dont think it handled the splitting of the ?, so I think you would just get back nonesense
13517:50 <bitplebpaul> +1 kaizen
13617:50 <kouloumos> +1 svav
13717:52 <larryruane> without the PR, it ignores the query string (`count=1`)
13817:52 <stickies-v> Kaizen_Kintsugi_ yes! Prior, the ParseDataFormat would consider the query string as part of the format suffix (.json?count=5), which is an unrecognized. Moreover, it would try to look up the blockhash suffixed with the formatt suffix and query parameter (<somehash>.json?count=5), which of course doesn't exist, so the API would return
13917:52 <stickies-v> `Invalid hash: 0000004c6aad0c89c1c060e8e116dcd849e0554935cd78ff9c6a398abeac6eda.json?count=1`
14017:53 <svav> OK I see
14117:53 <stickies-v> svav larryruane nope, prior to this PR the API does not handle the query string at all, it is just considered as part of the path, so it would fail most requests
14217:55 <stickies-v> One reviewer raises (https://github.com/bitcoin/bitcoin/pull/24098#issuecomment-1027755825) the view that `/rest/headers/` is a collection endpoint instead of a single resource endpoint. Do you agree? If so, would you change the PR to implement this? What would be the drawbacks of doing that?
14317:55 <larryruane> oh i see, right, that's why this is not quite a completely backwards-compatible change
14417:56 <stickies-v> larryruane exactly - but as I've described in the Behaviour Change section of the PR, I can't anticipate any situations where this would be undesirable or unexpected, until someone proves me wrong...
14517:56 <bitplebpaul> 3 questions, 3 minutes, we got this.
14617:57 <stickies-v> glozow is making hosts do push ups for every missed question, help me out guys...
14717:57 <glozow> 😂
14817:59 <bitplebpaul> I'm not sure
14917:59 <bitplebpaul> I don't have a bitcoind to play/test with right now
15018:00 <Kaizen_Kintsugi_> damn I dont know the difference between collection and single resource
15118:00 <stickies-v> Kaizen_Kintsugi_ well basically you could e.g. query GET /users/ to get all users, and that would be a collection endpoint. When querying GET /users/5/, you would be querying a single resource and expect to get details on user 5
15218:00 <bitplebpaul> I think the difference between collection and single resource just refers to GETting a single block or a collection of blocks
15318:01 <bitplebpaul> yeah, +1 stickies
15418:01 <stickies-v> Alright unfortunately we're out of time though, if anyone has any further questions or comments I'm always happy to engage!
15518:01 <stickies-v> #endmeeting