Move calculation logic out from `CheckSequenceLocksAtTip()` (refactoring, validation, mempool)

https://github.com/bitcoin/bitcoin/pull/23897

Host: stickies-v  -  PR author: hebasto

The PR branch HEAD was ed2d714bd1130291a2f781e644f787d421cdf26e at the time of this review club meeting.

Notes

  • BIP68 introduced relative timelocks. They are relative to the block(s) in which this transaction’s parent transaction(s) were mined. The timelock can be expressed as block height or time difference.

  • Mastering Bitcoin by Andreas Antonopoulos has helpful background on relative timelocks.

  • Mempool policy does not allow transactions that can not be mined in the next block because they’re not BIP68-final.

  • LockPoints are used to represent the block or time when a relative lock is satisfied. Since reorgs can affect when (or if) a transaction’s input were confirmed, LockPoints can become invalid.

  • #23897 makes it easier to understand and reason about this code by separating the lockpoints calculation logic from the lockpoints validation logic.

Questions

  1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK?

  2. What is prevheights, and why do we need to calculate it?

  3. What is the difference between CalculateSequenceLocks and CalculateLockPointsAtTip? Are LockPoints and SequenceLocks the same thing, or how do they differ?

  4. CalculateLockPointsAtTip returns a std::optional<LockPoints>. Under which circumstance(s) will this be a std::nullopt? Do you see any alternative approaches?

  5. What’s the point of having LockPoints::maxInputBlock when we already have LockPoints::height?

  6. Consider the function CheckSequenceLocksAtTip. Prior to this PR, is LockPoints* lp an in, out, or in,out parameter? What about after this PR?

  7. How does this PR change the behaviour of CheckSequenceLocksAtTip()?

  8. Which, if any, observable behaviour change does this PR introduce?

Meeting Log

  117:00 <stickies-v> #startmeeting
  217:00 <pablomartin> hello!
  317:00 <stickies-v> hi everyone!
  417:00 <LarryRuane> hi
  517:00 <theStack> hi
  617:00 <hernanmarino> hi !
  717:01 <hebasto> hi (my mobile connection is quite poor though)
  817:01 <stickies-v> today we're looking at a refactoring PR authored by hebasto, the notes are available at https://bitcoincore.reviews/23897
  917:01 <stickies-v> thank you for joining us hebasto!
 1017:01 <_andrewtoth_> hi
 1117:01 <stickies-v> do we have any newcomers today? feel free to just say hi and lurk around or participate as per your preference
 1217:03 <stickies-v> regulars only today it seems, also fun. if you drop in late, don't be shy to say hi!
 1317:04 <stickies-v> who was able to have a look at the notes and/or the PR (y/n)?
 1417:04 <LarryRuane> y
 1517:04 <pablomartin> y
 1617:04 <theStack> n
 1717:05 <_andrewtoth_> could be because DST changed times for a lot of people
 1817:05 <hernanmarino> not really, I'm just lurking today. I only read the notes a few minutes ago, but I didn't have time to look at the code or PR
 1917:05 <stickies-v> ah yes, DST fun never gets old
 2017:06 <stickies-v> if you did review, would you give it a Concept ACK, approach ACK, tested ACK, or NACK?
 2117:06 <LarryRuane> _andrewtoth_: good point, we could have a bunch of people showing up in about an hour from now!
 2217:07 <LarryRuane> ACK on all (I'll review the actual PR later today), this is a nice PR!
 2317:08 <stickies-v> I agree with that sentiment LarryRuane ! a nice improvement in maintainability
 2417:08 <stickies-v> alright, moving on to the questions
 2517:08 <pablomartin> concept ack, havent had the chance to review the code much
 2617:08 <stickies-v> What is prevheights, and why do we need to calculate it?
 2717:08 <stickies-v> (link: https://github.com/bitcoin-core-review-club/bitcoin/blob/ed2d714bd1130291a2f781e644f787d421cdf26e/src/validation.cpp#L193)
 2817:09 <LarryRuane> There's a fair amount of code that moved, and there's a type of git diff that shows that (`--colorMoved` and `--colorMovedWs` i believe), that helps simplify the diff
 2917:10 <stickies-v> good point, I should have included that in the notes actually - thanks for the suggestion!
 3017:10 <LarryRuane> stickies-v: For a given transaction, it's the per-input height of the block that contains the output that this input is attempting to spend
 3117:11 <stickies-v> LarryRuane that sums it up perfectly
 3217:11 <LarryRuane> I think we need to calculate it to know if the timelock (if any) for each tx input has been satisfied
 3317:13 <stickies-v> exactly - the point of BIP68 relative timelocks is that they are relative to a previous transaction (being included in a block)
 3417:13 <LarryRuane> So if that prevheight (for a particular input) is a low number, then the difference between that height and the current height would be a large number (many blocks ago), then perhaps the timelock has been satisfied (depending on the value of the timelock, obviously)
 3517:14 <LarryRuane> but if prevheight is recent, then the timelock may not be satisfied yet
 3617:14 <stickies-v> does anyone know where we store relative timelocks?
 3717:15 <stickies-v> (store as in: where are they defined)
 3817:15 <stickies-v> (as in: which field)
 3917:15 <hebasto> also prevheight can be changed after reorg
 4017:16 <LarryRuane> each tx input has an `nSequence` field (4 bytes), and it's stored there, I think? that field is often 0xffffffff (no timelock), but otherwise, it is a timelock of some kind
 4117:16 <hernanmarino> Larryruane: good suggestion about git diff !
 4217:17 <theStack> not sure if i understand the question correctly, but isn't the relative timelock part of the locking script (i.e. argument for OP_CSV)?
 4317:17 <stickies-v> LarryRuane: yes! so... can a single transaction have different `nSequence` values? how do we deal with that?
 4417:17 <LarryRuane> hebasto: +1 good point, it can either increase or decrease (i think)
 4517:18 <stickies-v> theStack: OP_CSV is on the script level and only visible upon spending (BIP112)
 4617:19 <LarryRuane> theStack: yes that's right, it's actually stored in both places! in the output (or technically the locking script), and in nSequence .. I don't really understand why (there have to be both)
 4717:19 <theStack> oh, that's a different bip
 4817:19 <stickies-v> nLocktime is to OP_CTLV what nSequence is to OP_CSV - we have script-based and transaction-based timelocks for both relative and absolute timelcks
 4917:21 <LarryRuane> stickies-v: "can a single transaction have different `nSequence` values?" -- yes, I think the timelocks for ALL inputs must be satisfied, or else the tx isn't "finalized" yet (able to accept to mempool and relayed)
 5017:22 <theStack> stickies-v, LarryRuane: thanks
 5117:23 <stickies-v> LarryRuane: glad you mention it, because it's a very common source of confusion. for relative timelocks, you don't NEED to use OP_CSV. you can just use nSequence without any kind of script level locking, if you only want the transaction to be spendable when the transactions parents have enough ancestors
 5217:24 <stickies-v> however, OP_CSV allows you to (relatively) timelock _outputs_ (without showing that to the world until they're spent) instead of transactions
 5317:24 <LarryRuane> I did have a question about different inputs (nSequence) having different timelocks ... let's say the timelock is satisfied for one input but not another ... could the sender then steal back that output (spend it back to himself)?
 5417:24 <stickies-v> but interestingly, the validation of OP_CSV relies on the nSequence fields: the transaction _spending_ and OP_CSV output needs to set its nSequence field (on that input) sufficiently high so it satisfies the OP_CSV script pending path
 5517:26 <stickies-v> mmm I'm not sure I understand your question. nSequence is on the transaction level, OP_CSV is on the output/script level
 5617:27 <stickies-v> and re multiple `nSequence` values: indeed, a transaction can have a different `nSequence` value for each of its inputs, and _all_of them need to be satisfied in order for the transaction to be final
 5717:27 <LarryRuane> We can go on :) ... if I figure out if my question makes sense (i'm probably just confused), i'll bring it up later if there's time
 5817:27 <stickies-v> yes! next question:
 5917:27 <stickies-v> What is the difference between CalculateSequenceLocks and CalculateLockPointsAtTip?
 6017:28 <stickies-v> (links: https://github.com/bitcoin/bitcoin/blob/50422b770a40f5fa964201d1e99fd6b5dc1653ca/src/consensus/tx_verify.cpp#L39 and https://github.com/bitcoin-core-review-club/bitcoin/blob/ed2d714bd1130291a2f781e644f787d421cdf26e/src/validation.cpp#L179-L181)
 6117:28 <theStack> so IIUC now, you can either only use relative transaction-based timelocks with nSequence fields (bip 68), OR relative script-based timelocks using OP_CSV, but then the transaction-based timelocks also have to match (bip68+bip112)... but not bip112 alone
 6217:29 <LarryRuane> stickies-v: CalculateSequenceLocks() looks across inputs and prevheights, figures out the earliest (min) height and time (that the tip has to reach) so that this tx is finalized (timelocks satisfied), could be in the future or the past
 6317:29 <stickies-v> theStack: no you can combine them. Look at it this way: nSequence limits when the transaction itself becomes final, OP_CSV limits when the transaction *spending it* can become final
 6417:31 <LarryRuane> CalculateLockPointsAtTip() creates and populates the prevheights list, each entry is the height that the output was mined that this input wants to spend, Then it passes this to CalculateSequenceLocks() ... so CalculateLockPointsAtTip is a higher-level function that uses CalculateSequenceLocks
 6517:31 <theStack> stickies-v: gotcha
 6617:32 <LarryRuane> stickies-v: "no you can combine them" -- thanks, that's very helpful!
 6717:32 <stickies-v> LarryRuane: yes exactly, `CalculateLockPointsAtTip` is just a wrapper that first calculates the `prevheights`, and then after passing that to `CalculateSequenceLocks` also calculates the `max_input_height` (which we'll discuss in a bit)
 6817:33 <stickies-v> (imo: all of this would be much more readable if we'd start writing smaller functions with narrower scope)
 6917:34 <stickies-v> `CalculateLockPointsAtTip` returns a `std::optional<LockPoints>`. Under which circumstance(s) will this be a `std::nullopt`? Do you see any alternative approaches?
 7017:34 <stickies-v> (link: https://github.com/hebasto/bitcoin/blob/ed2d714bd1130291a2f781e644f787d421cdf26e/src/validation.cpp#L179-L181)
 7117:34 <LarryRuane> Yes I agree, this PR makes all this code more understandable, but it still seems pretty messy, as in, if you were writing all this from scratch today, it could be a lot simpler (but you never know until you try) ... but such a rewrite would be very hard to get merged (understandably so, easy to break things)
 7217:35 <stickies-v> to be clear: with my last "imo..." comment I meant this code in general, not this PR (which is indeed moving in the right direction!)
 7317:36 <LarryRuane> stickies-v: if I'm reading the code correctly, std::nullopt only if one of the transaction inputs refers to an output that can't be found in the coins database (UTXO set).
 7417:36 <stickies-v> yes! does the coins database include only unconfirmed or confirmed UTXOs, or both?
 7517:36 <LarryRuane> but I was a little surprised by this, because wouldn't this be checked elsewhere already (where timelocks aren't being used at all, every input must refer to a UTXO)
 7617:37 <stickies-v> (hint: look at who calls CalculateLockPointsAtTip, not how the function is implemented)
 7717:38 <LarryRuane> stickies-v: I think you can get different kind of utxo views? Like, include mempool or not, as you wish? I've always been somewhat confused about that
 7817:42 <LarryRuane> I think at both CalculateLockPointsAtTip callers, the mempool view does include unconfirmed utxos
 7917:42 <stickies-v> yeah exactly, it just takes a `CCoinsView` - however, it seems that in all callsites it's passed with a view that contains both unconfirmed and confirmed UTXOs
 8017:43 <stickies-v> it seems, however, that we never actually expect `GetCoin` to fail
 8117:44 <stickies-v> so... is using an `std::optional` appropriate?
 8217:46 <LarryRuane> there's some fancy class-derivation going on with these different views! for example https://github.com/bitcoin/bitcoin/blob/master/src/txmempool.h#L915
 8317:47 <stickies-v> yeah, and also e.g. here (https://github.com/bitcoin/bitcoin/blob/9dce30194bc07463d634072251a8bf83e1b10ff9/src/validation.cpp#L746) where we start with a `CCoinsViewCache` but then chuck the mempool in there anyway
 8417:48 <LarryRuane> Hmm if `GetCoin` can never fail, then I guess the `std::optional` isn't needed ... maybe it's still a good future-proof measure?
 8517:49 <stickies-v> I think it definitely can fail if we're passing it the wrong `CCoinsView`. However, if we're not expecting it to happen, shouldn't we just assert (or `Assume`) that to happen instead of just logging an error and move on? These silent failures always worry me. And perhaps making the return type unnecessarily convoluted
 8617:49 <stickies-v> But, there is probably some nuance I'm missing
 8717:50 <stickies-v> I'll move on the next question already:
 8817:50 <stickies-v> What’s the point of having `LockPoints::maxInputBlock` when we already have `LockPoints::height`?
 8917:51 <LarryRuane> is Assume a no-op in a non-debug build?
 9017:51 <stickies-v> yep, it's like `Assert` but it'll only fail in debug builds
 9117:51 <stickies-v> (no-op in non-debug, like you say)
 9217:52 <LarryRuane> thanks.. one thing I think keeps many of us awake at night is, what if some tx or block relay or whatever causes every node in the world to `assert` (just a few is okay, but all???)
 9317:54 <stickies-v> yeah, that would be a nightmare
 9417:54 <LarryRuane> stickies-v: "What’s the point of having `LockPoints::maxInputBlock" -- @hebasto referred to this earlier, reorgs!
 9517:55 <LarryRuane> If a reorg occurs, the block referred to by `maxInputBlock` may no longer be part of the best chain, whereas a block of any given height often will still exist.
 9617:55 <LarryRuane> I think the idea is, when we accept a tx into the mempool (it passes its timelock check), we also cache its `LockPoints`, so then later, if there's a reorg, we don't have to re-check the timelocks if the `LockPoints` is still valid (performance improvement only, i think?)
 9717:57 <stickies-v> "performance improvement only, i think?" yes exactly! when reorging, we only need to invalidate those `LockPoints` that have a `maxInputBlock` that is higher than how deep the reorg is
 9817:58 <stickies-v> because relative timelocks always refer to when a previous transaction was confirmed, and a reorg can change when a transaction was confirmed (or even make it unconfirmed)
 9917:58 <stickies-v> alright last question before wrapping up:
10017:58 <LarryRuane> is a tx's `LockPoints` stored with the tx in the mempool? (I didn't look into it closely enough to know)
10117:59 <stickies-v> oh good question, I'm not actually sure
10218:00 <stickies-v> mm actually not enough time left for the final questions, so let's wrap it up here
10318:00 <stickies-v> #endmeeting