BIP 141 (SegWit) introduced a new structure called a witness to each transaction.
Transactions’ witnesses are committed to blocks separately from the
transaction merkle tree. Witnesses contain data required to check transaction
validity, but not required to determine transaction effects (output
consumption/creation). In other words, witnesses are used to validate the
blockchain state, not to determine what that state is.
Witnesses are committed to by placing the root of the witness merkle tree in
the block’s coinbase transaction. By doing so, the witnesses are also committed to
in the transaction merkle tree through the coinbase transaction. The witness
commitment was nested in the coinbase transaction to make SegWit
soft-fork compatible.
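The commitment described above can be sketched in a few lines of Python. This is a minimal illustration based on BIP 141, not Bitcoin Core's code: the function names are hypothetical, wtxids are assumed to be 32-byte values in internal byte order, and the witness reserved value is assumed to be 32 zero bytes.

```python
from __future__ import annotations
import hashlib


def sha256d(data: bytes) -> bytes:
    """Double SHA-256, the hash used for Bitcoin merkle trees."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Bitcoin-style merkle root: duplicate the last hash on odd-sized levels."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [sha256d(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def witness_commitment_script(
    non_coinbase_wtxids: list[bytes],
    reserved_value: bytes = bytes(32),  # assumption: all-zero reserved value
) -> bytes:
    """Build the coinbase output script that carries the witness commitment."""
    # The coinbase's own wtxid is defined as 32 zero bytes in the witness tree.
    leaves = [bytes(32)] + list(non_coinbase_wtxids)
    commitment = sha256d(merkle_root(leaves) + reserved_value)
    # OP_RETURN, push of 36 bytes, 4-byte commitment header, 32-byte commitment.
    return bytes.fromhex("6a24aa21a9ed") + commitment
```

Because this script sits in a coinbase output, it is hashed into the coinbase txid and therefore into the regular transaction merkle root in the block header.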
Assume-valid is a node setting that makes the node skip some transaction
validity checks (signature and script checks) prior to a pre-determined
“known to be good” block (assume-valid point). The default assume-valid point
is set by the developers and is updated every release, but users have the
ability to set their own assume-valid point through the -assumevalid
setting.
The assume-valid feature does not significantly change Bitcoin’s security
assumptions. If developers (and everyone reviewing the code changes) were
to conspire with miners to build a more-work chain with invalid signatures
(and go undetected for weeks), and then include it as the default
assume-valid point, they could get the network to accept an invalid chain.
However, those same people already have that power by just changing the
code - which would be much less obvious.
(quoted)
Additionally, as long as the full chain history remains available for
auditing it would be hard for such an attack to go unnoticed.
It is also important to note that the configured assume-valid point does not
dictate which chain a node follows. The node still does proof-of-work checks,
meaning that a large reorg would be able to orphan (parts of) the
assumed-valid chain.
Nodes in prune mode (enabled by the -prune setting) fully download and
validate the chain history to build a UTXO set enabling them to fully
validate any new transaction, but only store a (configurable) portion of the
recent history.
PR #27050 proposes to skip
downloading the witnesses for blocks prior to the configured assume-valid
point, for nodes running in prune mode. The rationale for this change is that
pruned nodes currently download witnesses but then (prior to the assume-valid
point) don’t validate them and delete them shortly after. So why not skip
downloading those witnesses and save some bandwidth?
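The idea can be sketched as a decision about which inventory type to put in a block request. This is a simplified illustration, not the PR's actual code: `block_fetch_type` and its parameters are hypothetical names, while the two constants are the protocol values from BIP 144.

```python
MSG_BLOCK = 2                # inventory type for a block
MSG_WITNESS_FLAG = 1 << 30   # OR-ed in to request witness serialization


def block_fetch_type(
    peer_supports_witness: bool,
    prune_mode: bool,
    block_is_assumed_valid: bool,
) -> int:
    """Return the inventory type to use when requesting a block (hypothetical helper)."""
    inv_type = MSG_BLOCK
    # Only ask for witnesses if the peer can serve them and we would
    # actually validate or keep them.
    if peer_supports_witness and not (prune_mode and block_is_assumed_valid):
        inv_type |= MSG_WITNESS_FLAG
    return inv_type
```

Under these assumptions, a pruned node requesting an assumed-valid block gets `MSG_BLOCK` (2), and every other combination with a witness-capable peer gets `MSG_BLOCK | MSG_WITNESS_FLAG` (0x40000002).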
How much bandwidth is saved, i.e.,
what is the cumulative size of all witness data up to block
0000000000000000000013a20dcc8577282e1eabd430592bb8afdd5fe544c05a? (Hint:
the getblock RPC returns the size and strippedsize (size excluding
witnesses) for each block).
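One way to approach this, following the hint, is to sum `size - strippedsize` over all blocks up to the target. The sketch below only shows the arithmetic: `witness_bytes` is a hypothetical helper, and the sample numbers are made up for illustration rather than taken from real blocks. Note that the difference also counts the SegWit marker and flag bytes, not only the witness stacks.

```python
from typing import Iterable, Mapping


def witness_bytes(blocks: Iterable[Mapping[str, int]]) -> int:
    """Sum the bytes that only appear in the witness serialization of each block."""
    return sum(block["size"] - block["strippedsize"] for block in blocks)


# Made-up sample data shaped like the relevant fields of getblock output.
sample_blocks = [
    {"size": 285, "strippedsize": 285},                # no witness data
    {"size": 1_200_000, "strippedsize": 900_000},
    {"size": 1_500_000, "strippedsize": 850_000},
]
total = witness_bytes(sample_blocks)
```

In practice the input would come from calling `getblock` for every block hash from genesis up to the assume-valid block and feeding the decoded JSON objects to the function.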
The end goal of the PR can be achieved with very few changes to the code
(ignoring edge case scenarios). It essentially only requires two changes,
one to the block request logic and one to block validation. Can you (in your
own words) describe these two changes in more detail?
Without this PR, script validation is skipped under assume-valid, but other
checks that involve witness data are not skipped. What other witness-related
checks exist as part of validation on master?
With this PR, all additional witness-related checks (Q4) will be skipped for
assumed-valid blocks. Is it ok to skip these additional checks? Why or why not?
The PR does not include an explicit code change for skipping all the extra
checks from Q4. Why does that work out?
Peter Todd left a comment concerning a reduction in security with the changes
made in the PR. Can you in your own words summarize his concerns? Do you
agree/disagree with them?
<pakaro> concept clarification - if prune=0 & and av=1 we still need the witness data because eventually the witness-validity will be checked, perhaps once the node has caught up entirely?
<_aj_> pakaro: or the node operator might run -reindex with -noassumevalid, or they might lookup a post-segwit tx via getrawtransaction and want to see the witness data
<dergoegge> How much bandwidth is saved, i.e., what is the cumulative size of all witness data up to block 0000000000000000000013a20dcc8577282e1eabd430592bb8afdd5fe544c05a?
<dergoegge> amirreza: i can't list them all but to name a few: making sure inputs exist in the utxo set, checking the proof of work, inflation checks, ...
<dergoegge> The end goal of the PR can be achieved with very few changes to the code (ignoring edge case scenarios). It essentially only requires two changes, one to the block request logic and one to block validation. Can you (in your own words) describe these two changes in more detail?
<lightlike> If pruning and block is assumed valid: 1) In SendMessages, remove MSG_WITNESS_FLAG from fetch flags so our peers don't send us the witness data. 2)In validation, skip witness merkle tree checks because we don't have the witness.
<dergoegge> Without this PR, script validation is skipped under assume-valid, but other checks that involve witness data are not skipped. What other witness related checks exist as part of validation on master?
<pakaro> is there a separate check to ensure that there is no witness data in 1'ordinary' transactions and 2'coinbase' transactions, or does one check suffice?
<pakaro> in my understanding a coinbase tx is very similar to a normal tx, really just with nblocktime spending rules , nvalue, etc, so one check should suffice?
<dergoegge> The PR does not include an explicit code change for skipping all witness related checks. It only explicitly skips the witness merkle root check. Why does that work out?
<pakaro> dergoegge I dont think there are individual limits because there was that jpg-wizard spend and that file was 4MB, therefore unless the limit was the same was block weight, which would render the rule meaningless anyway
<dergoegge> It turns out that all the extra checks *just* pass when you don't have any witnesses. Which makes sense considering that segwit was a soft-fork. With the PR, we are essentially just pretending like we are a pre-segwit node (up to the assume-valid point).
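As one illustration of that last point, consider the block weight limit from BIP 141. This is a sketch of the arithmetic only, not validation code: `block_weight` is a hypothetical helper. A block received without witnesses has a total size equal to its stripped size, so its weight is four times the stripped size, and the pre-SegWit 1,000,000-byte limit on the stripped block keeps that within the weight limit.

```python
MAX_BLOCK_WEIGHT = 4_000_000      # BIP 141 weight limit
MAX_STRIPPED_SIZE = 1_000_000     # pre-SegWit block size limit in bytes


def block_weight(stripped_size: int, total_size: int) -> int:
    """BIP 141 block weight: base size * 3 + total size."""
    return 3 * stripped_size + total_size


# A witness-less block at the largest allowed stripped size still passes.
weight_without_witness = block_weight(MAX_STRIPPED_SIZE, MAX_STRIPPED_SIZE)
passes = weight_without_witness <= MAX_BLOCK_WEIGHT
```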