Please also try to read the BIP PR and, if possible, review it in tandem with the PR. The BIP and the PR should be in sync with each other and ideally the BIP should be a clear description of the key parts of the PR.
Bitcoin Core supports three test networks out of the box: Regtest, Testnet, and Signet. There also exist some custom Signet variants like Mutinynet. At this point, the current Testnet has been running for 12 years. However, the current Testnet is actually Testnet 3. It was introduced in PR #1392. Documentation on how exactly Testnet 1 and 2 broke is not available, but it appears that they fell victim to high fluctuation in mining power. Remember that around this time the first ASIC miners entered the market while Testnet was probably still mostly mined by CPUs and maybe the occasional GPU.
Testnet 3 features a Proof of Work exception rule, known as the 20-min exception. This rule was designed to prevent the chain from getting stuck again due to hash power fluctuation. However, a bug in this exception leads to so-called block storms, large numbers of blocks being mined in quick succession. This is the main reason Testnet 3 is so far ahead of mainnet even though it started much later. The bug was recently exploited on purpose for an extended period of time to highlight the issue.
Testnet 4 still includes the 20-min exception but adds a mitigation for the block storm issue.
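The interaction between the exception and the difficulty retarget can be made concrete with a toy model. The sketch below is a simplification and not Bitcoin Core's code: difficulty is a plain integer instead of compact `nBits`, the difficulty period is shortened to 4 blocks (2016 on the real networks), and the names `next_difficulty`, `storm_fix`, and `demo` are hypothetical. It assumes that the Testnet 3 retarget scales the difficulty of the last block of the period, and that the Testnet 4 mitigation scales the difficulty of the first block of the period instead.

```python
from dataclasses import dataclass

MIN_DIFFICULTY = 1            # stand-in for the proof-of-work limit ("difficulty 1")
TARGET_SPACING = 10 * 60      # ten-minute block target, in seconds
INTERVAL = 4                  # blocks per difficulty period (toy value, assumption)
TARGET_TIMESPAN = INTERVAL * TARGET_SPACING


@dataclass
class Block:
    height: int
    time: int
    difficulty: int


def next_difficulty(chain, new_time, storm_fix):
    """Difficulty required for the block that would follow chain[-1]."""
    prev = chain[-1]
    if (prev.height + 1) % INTERVAL == 0:
        # Retarget: scale a base difficulty by how long the period took.
        first = chain[prev.height - (INTERVAL - 1)]
        # Assumed bug: the base is the last block, which may be min-difficulty.
        # Assumed mitigation: the base is the first block of the period.
        base = first.difficulty if storm_fix else prev.difficulty
        actual = prev.time - first.time
        actual = max(TARGET_TIMESPAN // 4, min(actual, TARGET_TIMESPAN * 4))
        return max(MIN_DIFFICULTY, base * TARGET_TIMESPAN // actual)
    if new_time > prev.time + 2 * TARGET_SPACING:
        # 20-min exception: a late block may be mined at minimum difficulty.
        return MIN_DIFFICULTY
    # Otherwise walk back past min-difficulty blocks to the last real difficulty.
    i = prev.height
    while i > 0 and i % INTERVAL != 0 and chain[i].difficulty == MIN_DIFFICULTY:
        i -= 1
    return chain[i].difficulty


def demo(storm_fix):
    """Build one period whose last block is late, then retarget."""
    chain = [Block(0, 0, 1000)]
    for gap in (600, 600, 1201):  # the last gap is just over 20 minutes
        t = chain[-1].time + gap
        chain.append(Block(len(chain), t, next_difficulty(chain, t, storm_fix)))
    return chain, next_difficulty(chain, chain[-1].time + 1, storm_fix)
```

In this model, `demo(storm_fix=False)` ends with the whole next period at minimum difficulty after a single late block, while `demo(storm_fix=True)` keeps the next period close to the difficulty the period started with.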
The pull request also includes a fix for the timewarp attack, an attack that is still possible on mainnet today. A fix for this was proposed as part of the Great Consensus Cleanup, but that proposal has so far not gained the support needed to be deployed as a soft fork.
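As a rough illustration of the kind of rule a timewarp fix adds, the sketch below assumes the form described in BIP 94: the first block of a difficulty period may not carry a timestamp more than 600 seconds earlier than its predecessor's. The function name `passes_timewarp_rule` is hypothetical, and both constants are assumptions that should be checked against the PR and the BIP.

```python
DIFFICULTY_ADJUSTMENT_INTERVAL = 2016  # blocks per difficulty period
MAX_TIMEWARP = 600                     # seconds; assumed value, verify against the BIP


def passes_timewarp_rule(height, block_time, prev_block_time):
    """Return True if a block at `height` satisfies the assumed timewarp rule.

    Other timestamp rules (for example median-time-past) are out of scope here.
    """
    if height % DIFFICULTY_ADJUSTMENT_INTERVAL != 0:
        # Only the first block of a difficulty period is restricted.
        return True
    return block_time >= prev_block_time - MAX_TIMEWARP
```

The restriction matters only at period boundaries because that is where the timestamps that feed the difficulty calculation meet; blocks elsewhere in the period are unaffected by this check.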
Why reset Testnet in the first place? Were there any arguments against the reset?
What is the message in the Genesis block in Testnet 3 and why (reference the code)?
Aside from the consensus changes, what differences do you see between Testnet 4 and Testnet 3, particularly the chain params?
Pick a single chain param that you don’t know/remember the meaning of. Look up what it does and explain it in one sentence.
How does the 20-min exception rule work in Testnet 3? How does this lead to the block storm bug? Please try to reference the code.
How is the block storm bug fixed in the PR? What other fixes were discussed in the PR?
Why was the time warp fix included in the PR? Hint: This came up in the PR discussion.
How does the time warp fix work? Where does the fix originate from? Can you think of any other ways to fix it?
How do you start your node with Testnet 4? What happens when you start it just with -testnet=1 after Testnet 4 is included?
The PR and ML discussions included many further concerns and ideas that were not addressed in the code of the PR. Pick the one you found most interesting and give a short summary. Do you think this is still a concern and should be addressed?
Do you have ideas for additional test cases? What makes Testnet 4 features tricky to test?
Why is it interesting to embed special scripts into the chain as test cases? What makes this useful beyond Bitcoin Core?
What expectations do you have for such a change before you would include it in a release? For example, would you reset the genesis block one more time?
<GregTonoski> #29520 add -limitdummyscriptdatasize option - I'm suggesting discussion about that PR in the next Bitcoin Core review monthly meeting, stickies-v and glozow. I'm contacting you in order to host the meeting (per instruction at https://bitcoincore.reviews).
<fjahr> I will get started with the rest of the questions because I think there are some interesting learnings even if you haven't reviewed everything.
<fjahr> lightlike: Right! The deployment heights of the past softforks are all set to 1, i.e. they are active from the beginning. While this might seem kind of trivial, these could have also been set to some later value allowing for some potential testing of deployment mechanisms, but there wasn’t that much appetite in that from what I remember.
<lightlike> unrelated question: looking at the existing testnet4 chain, according to mempool.space, blocks 10000 and 20000 were mined just 5 hours apart. Was someone just pointing a ridiculous amount of hash power at testnet, or was there still some funny stuff going on?
<fjahr> I don't know about funny stuff, the difficulty has to ramp up initially and if someone pointed an ASIC at the chain that doesn't seem ridiculous. But still interesting to check.
<stickies-v> I had to look up `fPowNoRetargeting` during review, forgot about regtest not doing difficulty adjustments (luckily). So that's what it does: when `true`, don't adjust the required pow difficulty
<fjahr> I kind of failed at this Q and looked at something that's only in Testnet 3 but I found it interesting: The BIP16 exception (script_flag_exceptions). BIP16 standardized P2SH transactions and defined 3 rules that transactions cannot violate. The blockhash in the exception is block 394 in Testnet. I didn’t have time to check which transaction exactly violates which of the rules though.
<fjahr> Ok, so let's get to the meat: How does the 20-min exception rule work in Testnet 3? How does this lead to the block storm bug? Please try to reference the code.
<lightlike> if there is no block for 20 minutes, difficulty goes to 1 for the next block. For the ones after the next block, it goes back to whatever the difficulty was before the 20 minutes had passed.
<stickies-v> if the last block in a difficulty period is min-difficulty, then the next block (i.e. the first of the next epoch) won't have any "lookback window" to find the true difficulty, so it'll just take the previous difficulty, which is min-difficulty
<stickies-v> this can be exploited ad infinitum, right? so i guess the only reason block storms stop is because eventually attacker/trolls just decide to do so?
<fjahr> Yeah, like jameson lopp did recently on Testnet 3. I don't know how long it was, 2-3 weeks maybe? But it only got back to normal because he stopped.
<lightlike> don't you have to wait for 20 minutes regularly to avoid the difficulty from going back up (and in these 20 minutes, someone else could mine a block)?
<fjahr> Alright, the alternative fix I wanted to mention is just disallowing the last block in the difficulty period to be min-difficulty. I think almost everyone was kind of indifferent between this and the look-back solution.
<fjahr> Right, the 20 min exception exploits in combination with this still are pretty annoying and test running it was the second entry in the pro column :)
<lightlike> is there any best practice on how to make exceptions for testnet? It used to be a flag (fPowAllowMinDifficultyBlocks), in the PR the genesis block is compared, elsewhere we use "chainparams.GetChainType() != ChainType::REGTEST" - are some ways better than others?
<fjahr> lightlike: sjors gave the feedback that we should introduce a new helper method, I will do that probably when I retouch or as a follow-up. Using the hash was just an easy first step when I opened the PR.
<stickies-v> fjahr: i wonder why we don't check that with `if (nHeight % consensusParams.DifficultyAdjustmentInterval() == 0)` instead of using the `pindexPrev->nHeight`? is it because of the genesis block handling?
<fjahr> stickies-v: Hm, I haven't thought about it to be honest. I used the code from Bluematt without unnecessary changes because it had many eyes on it already and I didn't think about this in particular
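For any block that has a predecessor, the two ways of spotting the first block of a difficulty period are arithmetically the same. The small check below assumes that the `pindexPrev->nHeight` variant compares the remainder against the interval minus one; the exact expression in the PR is not quoted here, and the function names are hypothetical.

```python
INTERVAL = 2016  # blocks per difficulty period


def starts_period_by_height(height):
    """Variant suggested in the discussion: test the new block's own height."""
    return height % INTERVAL == 0


def starts_period_by_prev_height(prev_height):
    """Assumed shape of the variant based on the previous block's height."""
    return prev_height % INTERVAL == INTERVAL - 1


# Compare both variants for every height that has a predecessor (height >= 1).
mismatches = [
    h for h in range(1, 3 * INTERVAL + 1)
    if starts_period_by_height(h) != starts_period_by_prev_height(h - 1)
]
```

`mismatches` stays empty over the tested range. Only the genesis block, which has no predecessor, falls outside this comparison.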
<fjahr> Q: How do you start your node with Testnet 4? What happens when you start it just with -testnet=1 after Testnet 4 is included? Do you think that choice is sensible?
<fjahr> But this is a good one I think: Why is it interesting to embed special scripts into the chain as test cases? What makes this useful beyond bitcoin core?
<stickies-v> I don't really understand the second part of the question, but having a single place to go to battle test your software for all kinds of weird cases is pretty helpful for devs
<fjahr> Right, maybe I didn't formulate it well: I think what makes this particularly interesting is that we force other implementations and tools to parse these transactions and scripts if they want to validate the chain. That means we are getting some tests for the whole ecosystem, not just bitcoin core.
<fjahr> Alright, I think the last one is also pretty bike-sheddy so I would say we finish up unless anyone has a comment on the last question or the ones we skipped :)
<lightlike> did that embedding happen in the existing testnet4 chain? seems like a bit of work to come up with all kind of special scripts that might be interesting and create txns for them.
<fjahr> lightlike: Yeah, that isn't done and it's a project that is on my list but where I would also be interested to collaborate with someone else. Volunteers welcome :) There are a lot of ideas for sources, like the Taproot functional test, the fuzzing body, existing scripts on Testnet 3 etc.