Benchmark Chainstate::ConnectBlock duration (resource usage, tests)

Apr 2, 2025

https://github.com/bitcoin/bitcoin/pull/31689

Host: davidgumberg - PR author: Eunovo

Notes

Bitcoin Core uses the nanobench library for a suite of “microbenchmarks” that measure the performance of individual components or functions in idealized conditions.
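As a rough illustration of what a microbenchmark harness does, the sketch below times a function over several epochs and reports the median cost per call. It is a hypothetical stand-in written for these notes, not nanobench's actual measurement algorithm or API; the name `bench` and its parameters are assumptions.

```python
import statistics
import time


def bench(func, *, epochs=11, iters_per_epoch=1000):
    """Return the median time per call of func, in nanoseconds.

    Hypothetical helper: each epoch runs func a fixed number of times,
    and the median across epochs damps the effect of noisy epochs.
    """
    per_call_ns = []
    for _ in range(epochs):
        start = time.perf_counter_ns()
        for _ in range(iters_per_epoch):
            func()
        elapsed = time.perf_counter_ns() - start
        per_call_ns.append(elapsed / iters_per_epoch)
    return statistics.median(per_call_ns)


if __name__ == "__main__":
    # Time a small, self-contained workload in isolation.
    ns = bench(lambda: sum(range(100)))
    print(f"{ns:.1f} ns/op")
```

Reporting a median over several epochs is one simple way to reduce sensitivity to scheduling noise; real harnesses typically also choose the iteration count adaptively and warn when results are unstable.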
Chainstate::ConnectBlock() does double-duty: it is partly responsible for validating blocks being connected to the node’s tip, and partly responsible for applying their effects to the node’s view of the UTXO set (CCoinsViewCache).
- One of the most “expensive” checks performed by ConnectBlock() is CheckInputScripts: which ensures that every input script of every transaction succeeds.
In the course of evaluating scripts, signature checks are often required, sometimes explicitly with opcodes like OP_CHECKSIG and OP_CHECKMULTISIG, and sometimes implicitly with output types like P2WPKH.
- In pre-SegWit and SegWit version 0 outputs, signatures are generated and validated using ECDSA over the secp256k1 curve. Taproot introduced the version 1 SegWit output type, which uses Schnorr signatures over the same curve. BIP-0340 describes the way signatures are generated and evaluated for taproot outputs.
  - One of the advantages of Schnorr signatures over ECDSA signatures is that they can be verified in batches. A simplified description of batch verification is that instead of needing to prove that signature $A$ is valid for input $X$, signature $B$ is valid for input $Y$, and that signature $C$ is valid for input $Z$, we can add up signatures $A$, $B$, and $C$, to produce signature $D$, and add inputs $X$, $Y$, and $Z$ to produce input $W$, and then only perform a single verification, that signature $D$ is valid for input $W$.
Although in principle Schnorr signatures can be validated in batches, Bitcoin Core today validates them individually, just like ECDSA signatures. There is an open PR, #29491, that implements batch validation in Bitcoin Core. The motivation for #31689 is to establish a baseline for signature validation performance in Bitcoin Core today, which can then be used to validate and potentially quantify the performance improvements of batch validation.
- #31689 introduces three ConnectBlock benchmarks: one for a block where all spent inputs use ECDSA signatures, one where all spent inputs use Schnorr signatures, and one where some use Schnorr and some use ECDSA.
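The simplified description of batch verification above can be made slightly more concrete. The sketch below is a toy: it uses a tiny multiplicative group modulo a prime instead of secp256k1, and a plain SHA-256 challenge instead of the BIP-0340 tagged hash, so every parameter and function name here is an assumption chosen for illustration and offers no security. What it does show is the usual shape of the check: each signature equation is scaled by a random coefficient, and the scaled equations are combined into one.

```python
import hashlib
import secrets

# Toy group (illustration only, NOT secure): the order-Q subgroup of the
# integers modulo P, generated by G. P = 2*Q + 1 with both P and Q prime.
P = 2039
Q = 1019
G = 4


def challenge(r_point, pubkey, msg):
    """Hash the nonce point, public key, and message to a scalar mod Q."""
    data = f"{r_point}|{pubkey}|".encode() + msg
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def sign(seckey, msg):
    """Produce a toy Schnorr signature (R, s) with s = k + e*x mod Q."""
    k = secrets.randbelow(Q - 1) + 1
    r_point = pow(G, k, P)
    pubkey = pow(G, seckey, P)
    e = challenge(r_point, pubkey, msg)
    return r_point, (k + e * seckey) % Q


def verify(pubkey, msg, sig):
    """Check a single signature: G^s == R * pubkey^e (mod P)."""
    r_point, s = sig
    e = challenge(r_point, pubkey, msg)
    return pow(G, s, P) == (r_point * pow(pubkey, e, P)) % P


def batch_verify(items):
    """Check many (pubkey, msg, sig) triples with one combined equation.

    Each equation is raised to a random nonzero coefficient a_i (the
    first is fixed to 1) so that invalid signatures cannot cancel out.
    """
    lhs_exp = 0
    rhs = 1
    for i, (pubkey, msg, (r_point, s)) in enumerate(items):
        a = 1 if i == 0 else secrets.randbelow(Q - 1) + 1
        e = challenge(r_point, pubkey, msg)
        lhs_exp = (lhs_exp + a * s) % Q
        rhs = (rhs * pow(r_point * pow(pubkey, e, P), a, P)) % P
    return pow(G, lhs_exp, P) == rhs
```

In this toy group the combined check is no faster than checking each signature separately. On an elliptic curve, the right-hand side becomes a single multi-scalar multiplication, which is where the savings of batch verification are expected to come from.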

Questions

Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK? What was your review approach?
Did you run the benchmarks? What did you observe?
What is TestChain100Setup? What does 100 mean? Why 100?
The notes above suggest that CheckInputScripts() is “expensive”. Is it? Why?
Some reviewers (and a code comment) observed that in their testing ConnectBlockMixed was the slowest of the three benchmarks. Is that possible?
Some reviewers disagreed about whether the ‘mixed’ case should be 50/50 Schnorr/ECDSA, or some mixture of Schnorr and ECDSA that would be likely to appear in a block. What are the tradeoffs of each approach?
What is the purpose of the first transaction that gets created in CreateTestBlock()? Why couldn’t this transaction be created in the for loop like all the other transactions?
ConnectBlock does a lot more than just checking input scripts. Is this PR introducing a ConnectBlock() benchmark or a signature validation benchmark? Why use ConnectBlock() instead of benchmarking CheckECDSASignature() and CheckSchnorrSignature() directly?
Do you think the tests added here are sufficient in scope or are there other cases that should have been added in this PR? What additional benchmarks of ConnectBlock() would be good to have in a follow-up PR?

Meeting Log

16:59

<dzxzg> #startmeeting

16:59

<dzxzg> hi

16:59

<glozow> hi

17:00

<sliv3r__> hi

17:00

<dzxzg> Hi everyone, thanks for coming to this benchmarking review club :)

17:00

<monlovesmango> hey

17:00

<stringintech> Hello

17:00

<dzxzg> We're looking at #31689, and there are notes here: https://bitcoincore.reviews/31689

17:00

<dzxzg> If you have a question or something to say, please feel free to jump in at any time

17:00

<stickies-v> hi

17:01

<enochazariah> hello everyone

17:01

<dzxzg> Diid you have a chance to review the PR, and/or look at the notes?

17:01

<dzxzg> s/you/anyone

17:01

<monlovesmango> yes

17:01

<janb84> somewhat :)

17:02

<sliv3r__> yes, tried to answer the questions in the notes

17:02

<Novo__> hi

17:02

<glozow> y

17:02

<oxfrank> y

17:02

<stringintech> y

17:03

<dzxzg> Awesome! I guess it's slightly unusual to review a PR *after* it's been merged, but I think it's still important and helpful, this code is just beginning it's life in Bitcoin Core!

17:04

<monlovesmango> I learned a lot about benchmarking :)

17:04

<dzxzg> Did you run the benchmarks? What did you observe?

17:04

<janb84> Yes, and my results are lot less verbose than the results shown in the comments

17:05

<glozow> Nothing wrong with reviewing a PR after merge! Presumably if you take a look at the batch validation PRs and use the benches to measure the performance changes, you should also know what the benches are doing :)

17:05

<monlovesmango> yes... I have a few comments but dont want to get ahead of the questions...

17:05

<sliv3r__> yes, mixed blocks a bit slower

17:05

<stringintech> mixed block the slowest and all schnorr fastest for me

17:05

<glozow> For me, in order from fastest to slowest: `ConnectBlockAllSchnorr`, `*AllEcdsa`, `*MixedEcdsaSchnorr`.

17:06

<dzxzg> janb84: what do you mean your results are less verbose?

17:06

<monlovesmango> yeah I did observe mixed was slowest consistently

17:06

<stickies-v> Same for me! But I got an instability warning for ConnectBlockMixedEcdsaSchnorr

17:06

<stickies-v> (I'm getting them for a fair amount of other benches too so it might be me)

17:06

<janb84> @dzxzg i'm missing the column cyc/block, IPC / , BRA/block

17:07

<sliv3r__> I have same results as glozow

17:07

<sliv3r__> I'm also missing thosecolumns @janb84

17:07

<oxfrank> MixedEcdsaSchnorr slower than all

17:07

<dzxzg> What is TestChain100Setup? What does 100 mean? Why 100?

17:08

<monlovesmango> sets up new chain and mines 100 blocks

17:08

<janb84> pre-creates a 100-block REGTEST-mode block chain

17:08

<monlovesmango> 100 bc thats how long it take coinbase to mature before you can spend it

17:08

<oxfrank> 100 mean no of blocks in test environment

17:09

<sliv3r__> as per a code comment: texting fixture that pre-creates 100 blocks in regtest mode to get the coinbase mature

17:09

<dzxzg> Yeah, all answers that make sense to me! Super useful, and appears all over the place in benchmarking and test code

17:10

<dzxzg> I'm going to quote something from the review notes in case anyone didn't have a chance to read them.

17:10

<dzxzg> "One of the most “expensive” checks performed by ConnectBlock() is CheckInputScripts: which ensures that every input script of every transaction succeeds."

17:10

<sipa> hi

17:10

<dzxzg> So, the author of the notes suggested that CheckInputScripts() is “expensive”. Is it? Why?

17:11

<sliv3r__> they normally contain signatures to verify and that's an expensive task

17:11

<monlovesmango> bc input scripts generally need signarture verification

17:12

<oxfrank> CheckInputScripts() most computationally intensive part of validation because of cryptographic signatures

17:13

<stickies-v> The CheckInputScripts docstring mentions "This involves ECDSA signature checks so can be computationally intensive." does it not do Schnorr signature checks or did the docstring just not get updated?

17:13

<sipa> That sounds outdated.

17:14

<dzxzg> +1

17:14

<sipa> All signature checks are done through the script interpreter, which is invoked from CheckInputScripts.

17:15

<dzxzg> Some reviewers (and a code comment) observed that in their testing ConnectBlockMixed was the slowest of the three benchmarks. Is that possible?

17:16

<dzxzg> I also noticed that there was a recent comment from someone present here on the PR

17:16

<janb84> I made the same observation in my test runs.

17:16

<dzxzg> That shed some new light

17:16

<sliv3r__> Yes! Because the two different types are used in the same transaction so they have to be hashed multiple times due to differences in the signature digest algorithm

17:17

<monlovesmango> I ahve a question about this one. when I run benchmarks as is I do see that mixed is slowest, but it also has 5 keys/outputs rather than 4 like the other 2. could that be why?

17:17

<monlovesmango> when I test all 3 with 5 keys/outputs they are all fairly similar

17:18

<janb84> @monlovesmango interesting !

17:18

<monlovesmango> but I was not able to run the 'pyperf system tune' bc it errored on my machine

17:18

<monlovesmango> so i was using the min-time arg to get consistent results

17:19

<sliv3r__> May I ask how did you realize that?

17:19

<monlovesmango> not sure if that is sufficient

17:20

<oxfrank> but why did they go with 5 keys/outputs in mixed?

17:20

<monlovesmango> bc in the code ConnectBlockAllSchnorr creates 4 schnorr keys/outputs, ConnectBlockAllEcdsa creats 4 ecdsa keys/outputs, and ConnectBlockMixedEcdsaSchnorr creates 1 schnorr and 4 ecdsa

17:20

<dzxzg> (https://github.com/bitcoin/bitcoin/blob/639279e86a6edd6cb68e8cf077d14337bcd13959/src/bench/connectblock.cpp#L110-L132)

17:20

<monlovesmango> oxfrank: bc they wanted 80/20 ratio

17:20

<monlovesmango> i'm assuming

17:21

<dzxzg> Yeah, but I think not bumping the others up to 5 was a mistake!

17:21

<sliv3r__> oh right is hardcoded

17:21

<dzxzg> I was able to reproduce the result monlovesmango had

17:21

<dzxzg> when I changed the number of inputs in all of the tests so that they all had 5 inputs, the mixed block didn't stand out any more as the slowest!

17:22

<sliv3r__> so the assumtiom that some users had about having to hash multiple times bc of the signature digest algorithm is wrong?

17:22

<monlovesmango> dzxzg: nice!

17:23

<oxfrank> monlovesmango I think so too

17:24

<dzxzg> sliv3r__: I'm not sure, I thought that explanation made sense when I wrote the notes, but it seems that at the very least even if it wasn't wrong about extra work needed for validating transactions with mixed inputs, but it seems to have been wrong about how significant that would be!

17:26

<dzxzg> Nice find monlovesmango, I think a PR to address this would be nice! Another feather in the cap of never trusting explanations for poor performance until you've measured them :)

17:26

<sipa> i haven't run the numbers, but i'm curious how the block verification times compare with the raw pubkey decompression + signature checking numbers

17:27

<monlovesmango> dzxzg: do you think it would be alright if I opened a pr for this?

17:28

<sliv3r__> manlovesmango: it make sense to fix

17:29

<dzxzg> monlovesmango: I don't know how other reviewers would feel, but it's probably a Concept ACK for me

17:29

<dzxzg> In the same vein as sipa's remark above: ConnectBlock does a lot more than just checking input scripts. Is this PR introducing a ConnectBlock() benchmark or a signature validation benchmark? Why use ConnectBlock() instead of benchmarking CheckECDSASignature() and CheckSchnorrSignature() directly?

17:31

<sipa> FWIW, the numbers i have on my system are 31.0 us/sig for ecdsa, 31.8 us/sig for schnorr, and ~3.1 us/key for pubkey decompression

17:31

<monlovesmango> it seemed like one goal was to assess performance with a mixed back of sig types, which can't be done with CheckECDSASignature() or CheckSchnorrSignature() alone

17:31

<monlovesmango> that was the best reason I could think of

17:32

<sliv3r__> also this will allow us to benchmark batch verification when it's implemented

17:32

<sliv3r__> so I guess we should see an improvement on schnorr

17:33

<sipa> yeah, the PR is a preparation for batch validation, which is applicable to schnorr signatures, but not ECDSA, so to get a realistic benchmark, it may make sense to see how it impacts a block with a mix of both (which, for the time being, is likely what we'll need to expect)

17:33

<sipa> i assume

17:35

<Novo__> batch verification implementation will modify connectblock a lot, so we also want to see if that our changes don't negatively impact overall conectblock performance even if it speeds up CheckSchnorrSignature

17:35

<sipa> Novo__: good point

17:36

<dzxzg> Some reviewers disagreed about whether or not the ‘mixed’ case should be 50/50 schnorr/ecdsa, or if it should be some mixture of Schnorr and ECDSA that would be likely to appear in a block, what are the tradeoffs of each approach?

17:37

<dzxzg> And to add to that question, or maybe this would be part of the tradeoffs, how would we decide what a "representative" mixture would be?

17:37

<sliv3r__> I don't have a strong opinion on that tbh but some argue that 80/20 is the actual ratio now while 50/50 is probably what we will have in a future

17:38

<monlovesmango> yeah what sliv3r said :)

17:38

<janb84> sliv3r__: agree

17:39

<dzxzg> What is the purpose of the first transaction that gets created in CreateTestBlock()? Why couldn’t this transaction be created in the for loop like all the other transactions?

17:40

<monlovesmango> honestly might be good to have a few tiers, 80/20, 50/50, 20/80, just so we have a variety of benchmarks to compare changes against? or would this be redundant

17:41

<monlovesmango> the first transaction is spending the coinbase and setting up the outputs that will used for the bench mark. so this tx is different than the others, and this way the benchmark is only measuring the specific sig checks we are interested in (bc first tx is excluded from bench)

17:42

<sliv3r__> I guess if it was redundant there would not be discussion about the ratio so I guess it makes sense

17:42

<dzxzg> monlovesmango: re: coinbase transaction, yep!

17:43

<sliv3r__> agree with @monlovesmango also because testchain100setup is the one who decides the conditions on that tx

17:43

<sliv3r__> so we don't have control on it

17:43

<monlovesmango> yep that too

17:46

<dzxzg> The "What ratio should we use question?" makes me think of a bigger question, when should your measurement try to as closely as possible approximate the real situation of interest, like in this case maybe, real nodes connecting blocks to their tips, and when should you try to create idealized conditions that might exaggerate, or be focused on some tiny element which rarely constitutes much of the real task

17:47

<dzxzg> but you get the advantage of interpretability, when you exaggerate one element, it's really easy to interpret the outcome of a benchmark, if it's faster it's probably that thing, if it's slower it's probably that thing

17:47

<monlovesmango> as far as benchmarking goes, I feel like that question should come after all the data is in

17:48

<dzxzg> Okay, final question: Do you think the tests added here are sufficient in scope or are there other cases that should have been added in this PR? What additional benchmarks of ConnectBlock() would be good to have in a follow-up PR?

17:49

<monlovesmango> like we should want to know each scenario performs, and then make decisions about whether real use or idealized use should be given more significance

17:50

<monlovesmango> I think in the pr josie had mentioned testing mixed block composition (so instead of mixed transactions, each transaction would only have one type of signature but the block would have mixed bag of transactions)

17:51

<monlovesmango> which might make sense depending on how batch verification is implemented

17:51

<janb84> the mixed ratio seems arbitrary, would love to see if other ratios would change the outcome much

17:51

<oxfrank> I think Possible follow-up benchmarks are still necessary i.e different script types (P2PKH, P2SH-wrapped SegWit), different block sizes, ..

17:51

<sliv3r__> re: addition bencharmks - As this wants to benchmark batch validations I'm not sure how other parts of connectblock gets affected by that so...

17:54

<sliv3r__> if we want to benchmark unrelated to batch validation we could get some numbers on how fast we update the utxo set or even how some of the changes from CC like nLockTime validation for coinbase tx affects here (that's not implemented yet)

17:56

<dzxzg> Is there anything else that anyone wanted to say or ask that didn't fit into the topics we've talked about so far?

17:56

<dzxzg> Or to add to any of the topics we have discussed?

17:57

<Novo__> possible follow-up could also include TR script-path spend since batch verification should ultimately affect this too

17:58

<monlovesmango> oh i did have one question actually, on line 55 the variable name is taproot_tx but it really is taproot or edcsa right?