Bitcoin Core uses the nanobench library for a suite of “microbenchmarks” that measure the performance of individual components or functions in idealized conditions.
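As a rough illustration of what a microbenchmark harness does, the Python sketch below times one function over several epochs and reports the median per-call time. It is not nanobench and not Bitcoin Core's benchmark code. The name `bench` and its parameters are made up for this example.

```python
import statistics
import time


def bench(fn, epochs=11, iters_per_epoch=1000):
    """Hypothetical helper: median per-call time of fn in nanoseconds."""
    per_call_ns = []
    for _ in range(epochs):
        start = time.perf_counter_ns()
        for _ in range(iters_per_epoch):
            fn()
        elapsed = time.perf_counter_ns() - start
        # Average within an epoch, then take the median across epochs
        # so that a single noisy epoch does not dominate the result.
        per_call_ns.append(elapsed / iters_per_epoch)
    return statistics.median(per_call_ns)
```

Calling `bench(lambda: sum(range(100)))` returns a single number for that one function in isolation, which is the sense in which such a measurement is "idealized": nothing else the program would normally be doing is included.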
Chainstate::ConnectBlock() does double-duty: it is partly responsible for validating blocks being connected to the node’s tip, and partly responsible for applying their effects to the node’s view of the UTXO set (CCoinsViewCache).
One of the most “expensive” checks performed by ConnectBlock() is CheckInputScripts(), which ensures that every input script of every transaction succeeds.
In the course of evaluating scripts, signature checks are often required, sometimes explicitly through opcodes like OP_CHECKSIG and OP_CHECKMULTISIG, and sometimes implicitly through output types like P2WPKH, whose spending rules include a signature check even though no signature opcode appears in the output script.
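To make "implicit" concrete: a P2WPKH scriptPubKey is just a version byte and a 20-byte hash, yet BIP-0143 has it evaluated roughly as if it were a pay-to-pubkey-hash script ending in OP_CHECKSIG. The Python sketch below derives that implied script. The helper name is hypothetical and this is an illustration, not Bitcoin Core's C++ code.

```python
# Opcode values as defined by the Bitcoin script language.
OP_0 = 0x00
OP_DUP = 0x76
OP_HASH160 = 0xA9
OP_EQUALVERIFY = 0x88
OP_CHECKSIG = 0xAC


def implied_p2wpkh_script(script_pubkey: bytes) -> bytes:
    """Hypothetical helper: expand a P2WPKH scriptPubKey (OP_0 <20-byte hash>)
    into the P2PKH-style script that is implicitly checked when it is spent."""
    if len(script_pubkey) != 22 or script_pubkey[0] != OP_0 or script_pubkey[1] != 20:
        raise ValueError("not a P2WPKH scriptPubKey")
    pubkey_hash = script_pubkey[2:]
    return (
        bytes([OP_DUP, OP_HASH160, 20])  # 20 is the push length of the hash
        + pubkey_hash
        + bytes([OP_EQUALVERIFY, OP_CHECKSIG])
    )
```

The output script never mentions OP_CHECKSIG, but the expanded form ends with it, which is why spending such an output still costs a signature verification.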
In pre-SegWit and SegWit version 0 outputs, signatures are generated and validated using ECDSA over the secp256k1 curve. Taproot introduced the version 1 SegWit output type, which uses Schnorr signatures over the same curve. BIP-0340 describes the way signatures are generated and evaluated for taproot outputs.
One of the advantages of Schnorr signatures over ECDSA signatures is that they can be verified in batches. A simplified description of batch verification is that instead of separately proving that signature $A$ is valid for input $X$, signature $B$ is valid for input $Y$, and signature $C$ is valid for input $Z$, we can add up signatures $A$, $B$, and $C$ to produce signature $D$, add up inputs $X$, $Y$, and $Z$ to produce input $W$, and then perform a single verification: that signature $D$ is valid for input $W$. (The batch verification algorithm in BIP-0340 additionally multiplies each signature after the first by a random coefficient before summing, so that invalid signatures cannot be crafted to cancel each other out.)
Although in principle Schnorr signatures can be validated in batches, Bitcoin Core today validates them individually, just like ECDSA signatures. There is an open PR, #29491, that implements batch validation in Bitcoin Core. The motivation for the PR under review here, #31689, is to establish a baseline for signature validation performance in Bitcoin Core today, which can then be used to validate and potentially quantify the performance improvements of batch validation.
#31689 introduces three ConnectBlock benchmarks: one for a block where all spent inputs use ECDSA signatures, one where all spent inputs use Schnorr signatures, and one where some use Schnorr and some use ECDSA.
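The idea can be sketched with a toy Schnorr scheme in Python. Everything here is an assumption made for illustration: the group is a tiny subgroup of the integers modulo 2039 rather than secp256k1, the hashing is not BIP-0340's tagged hashing, and the function names are invented. The one structural detail borrowed from BIP-0340 is that each signature after the first is weighted by a random coefficient before the signatures are combined.

```python
import hashlib
import secrets

# Toy parameters (insecure, illustration only): P_MOD = 2*Q + 1 is a safe
# prime and G generates the subgroup of prime order Q.
P_MOD = 2039
Q = 1019
G = 4


def _hash_to_scalar(*parts):
    data = b"|".join(str(p).encode() for p in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def keygen(seed):
    """Derive a (private, public) key pair deterministically from a seed."""
    x = 1 + _hash_to_scalar("key", seed) % (Q - 1)
    return x, pow(G, x, P_MOD)


def sign(x, msg):
    pub = pow(G, x, P_MOD)
    k = 1 + _hash_to_scalar("nonce", x, msg) % (Q - 1)  # deterministic nonce
    r = pow(G, k, P_MOD)
    e = _hash_to_scalar(r, pub, msg)
    return r, (k + e * x) % Q


def verify(pub, msg, sig):
    """Single verification: G^s == r * pub^e."""
    r, s = sig
    e = _hash_to_scalar(r, pub, msg)
    return pow(G, s, P_MOD) == (r * pow(pub, e, P_MOD)) % P_MOD


def batch_verify(items):
    """One combined check over a list of (pub, msg, sig) tuples."""
    lhs_exp = 0  # running sum of a_i * s_i
    rhs = 1      # running product of r_i^a_i * pub_i^(a_i * e_i)
    for i, (pub, msg, (r, s)) in enumerate(items):
        a = 1 if i == 0 else 1 + secrets.randbelow(Q - 1)  # random weight
        e = _hash_to_scalar(r, pub, msg)
        lhs_exp = (lhs_exp + a * s) % Q
        rhs = (rhs * pow(r, a, P_MOD) * pow(pub, (a * e) % Q, P_MOD)) % P_MOD
    return pow(G, lhs_exp, P_MOD) == rhs
```

In this toy setting the saving is that `batch_verify` performs one exponentiation of `G` for the whole batch instead of one per signature. If the batch check fails, it only says that at least one signature is invalid, not which one.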
What is TestChain100Setup? What does 100 mean? Why 100?
The notes above suggest that CheckInputScripts() is “expensive”. Is it? Why?
Some reviewers (and a code comment) observed that in their testing ConnectBlockMixed was the slowest of the three benchmarks. Is that possible?
Some reviewers disagreed about whether the ‘mixed’ case should be 50/50 Schnorr/ECDSA, or some mixture of Schnorr and ECDSA that would be likely to appear in a real block. What are the tradeoffs of each approach?
What is the purpose of the first transaction that gets created in CreateTestBlock()? Why couldn’t this transaction be created in the for loop like all the other transactions?
ConnectBlock does a lot more than just checking input scripts. Is this PR introducing a ConnectBlock() benchmark or a signature validation benchmark? Why use ConnectBlock() instead of benchmarking CheckECDSASignature() and CheckSchnorrSignature() directly?
Do you think the tests added here are sufficient in scope or are there other cases that should have been added in this PR? What additional benchmarks of ConnectBlock() would be good to have in a follow-up PR?
<dzxzg> Awesome! I guess it's slightly unusual to review a PR *after* it's been merged, but I think it's still important and helpful, this code is just beginning its life in Bitcoin Core!
<glozow> Nothing wrong with reviewing a PR after merge! Presumably if you take a look at the batch validation PRs and use the benches to measure the performance changes, you should also know what the benches are doing :)
<dzxzg> "One of the most “expensive” checks performed by ConnectBlock() is CheckInputScripts: which ensures that every input script of every transaction succeeds."
<stickies-v> The CheckInputScripts docstring mentions "This involves ECDSA signature checks so can be computationally intensive." does it not do Schnorr signature checks or did the docstring just not get updated?
<dzxzg> Some reviewers (and a code comment) observed that in their testing ConnectBlockMixed was the slowest of the three benchmarks. Is that possible?
<sliv3r__> Yes! Because the two different types are used in the same transaction so they have to be hashed multiple times due to differences in the signature digest algorithm
<monlovesmango> I have a question about this one. when I run benchmarks as is I do see that mixed is slowest, but it also has 5 keys/outputs rather than 4 like the other 2. could that be why?
<monlovesmango> bc in the code ConnectBlockAllSchnorr creates 4 schnorr keys/outputs, ConnectBlockAllEcdsa creates 4 ecdsa keys/outputs, and ConnectBlockMixedEcdsaSchnorr creates 1 schnorr and 4 ecdsa
<dzxzg> when I changed the number of inputs in all of the tests so that they all had 5 inputs, the mixed block didn't stand out any more as the slowest!
<dzxzg> sliv3r__: I'm not sure, I thought that explanation made sense when I wrote the notes, but it seems that, even if it wasn't wrong about extra work being needed for validating transactions with mixed inputs, it was wrong about how significant that would be!
<dzxzg> Nice find monlovesmango, I think a PR to address this would be nice! Another feather in the cap of never trusting explanations for poor performance until you've measured them :)
<sipa> i haven't run the numbers, but i'm curious how the block verification times compare with the raw pubkey decompression + signature checking numbers
<dzxzg> In the same vein as sipa's remark above: ConnectBlock does a lot more than just checking input scripts. Is this PR introducing a ConnectBlock() benchmark or a signature validation benchmark? Why use ConnectBlock() instead of benchmarking CheckECDSASignature() and CheckSchnorrSignature() directly?
<monlovesmango> it seemed like one goal was to assess performance with a mixed bag of sig types, which can't be done with CheckECDSASignature() or CheckSchnorrSignature() alone
<sipa> yeah, the PR is a preparation for batch validation, which is applicable to schnorr signatures, but not ECDSA, so to get a realistic benchmark, it may make sense to see how it impacts a block with a mix of both (which, for the time being, is likely what we'll need to expect)
<Novo__> batch verification implementation will modify connectblock a lot, so we also want to see that our changes don't negatively impact overall connectblock performance even if it speeds up CheckSchnorrSignature
<dzxzg> Some reviewers disagreed about whether or not the ‘mixed’ case should be 50/50 schnorr/ecdsa, or if it should be some mixture of Schnorr and ECDSA that would be likely to appear in a block, what are the tradeoffs of each approach?
<sliv3r__> I don't have a strong opinion on that tbh but some argue that 80/20 is the actual ratio now while 50/50 is probably what we will have in a future
<dzxzg> What is the purpose of the first transaction that gets created in CreateTestBlock()? Why couldn’t this transaction be created in the for loop like all the other transactions?
<monlovesmango> honestly might be good to have a few tiers, 80/20, 50/50, 20/80, just so we have a variety of benchmarks to compare changes against? or would this be redundant
<monlovesmango> the first transaction is spending the coinbase and setting up the outputs that will be used for the benchmark. so this tx is different than the others, and this way the benchmark is only measuring the specific sig checks we are interested in (bc first tx is excluded from bench)
<dzxzg> The "What ratio should we use question?" makes me think of a bigger question, when should your measurement try to as closely as possible approximate the real situation of interest, like in this case maybe, real nodes connecting blocks to their tips, and when should you try to create idealized conditions that might exaggerate, or be focused on some tiny element which rarely constitutes much of the real task
<dzxzg> but you get the advantage of interpretability, when you exaggerate one element, it's really easy to interpret the outcome of a benchmark, if it's faster it's probably that thing, if it's slower it's probably that thing
<dzxzg> Okay, final question: Do you think the tests added here are sufficient in scope or are there other cases that should have been added in this PR? What additional benchmarks of ConnectBlock() would be good to have in a follow-up PR?
<monlovesmango> like we should want to know how each scenario performs, and then make decisions about whether real use or idealized use should be given more significance
<monlovesmango> I think in the pr josie had mentioned testing mixed block composition (so instead of mixed transactions, each transaction would only have one type of signature but the block would have mixed bag of transactions)
<sliv3r__> re: additional benchmarks - As this wants to benchmark batch validation I'm not sure how other parts of connectblock get affected by that so...
<sliv3r__> if we want to benchmark unrelated to batch validation we could get some numbers on how fast we update the utxo set or even how some of the changes from CC like nLockTime validation for coinbase tx affects here (that's not implemented yet)