The CheckInputs() function is called when checking the inputs to a transaction. Each input has a script which is verified. That verification can be run:
sequentially and synchronously, in which case CheckInputs() will return true if all scripts pass verification or pass back the error for the first script that failed verification; or
in parallel, in which case CheckInputs() will emplace the script check onto a queue for future verification.
This PR enables the input checks to be performed in parallel when a transaction is entering the mempool.
It does so by replacing calls to CheckInputs() with RunCheckInputsMaybeParallel(), which will push the input checks to an existing CCheckQueue worker queue.
There is a significant performance gain resulting from this optimization.
This PR also changes behavior when transactions are denied mempool acceptance. RunCheckInputsMaybeParallel (and by extension CheckInputs()) no longer set TX_NOT_STANDARD during input check failure, but only a consensus failure state.
Note: This has an effect on peer connections.
Current behaviour: A peer is disconnected when a consensus-invalid transaction is received, but remains connected if the transaction is only policy-invalid (non-standard).
Preposed behaviour: A peer is no longer disconnected, even if a consensus-invalid transaction is received.
Questions
When does RunCheckInputsMaybeParallel() still potentially call CheckInputs()?
How are the input check jobs dispatched to the thread pool? How are the check results returned after execution?
Does RunCheckInputsMaybeParallel() set the same “invalid reason” on input check failure as CheckInputs()?
<jnewbery> It's a bit of work to prepare to host, but I think it's really worth it. You'll get a lot out of really digging into the PR and understanding it well enough to talk about. We already have lightlike and elichai2 lined up to host future meetings.
<jnewbery> The PR this week is really interesting. Even though the number of lines touched is fairly small, it touches on some very important concepts: peer-to-peer, DoS protection and caching. I hadn't reviewed it before James suggested it for PR club, so thanks for pointing me at it!
<digi_james> Thanks John. This was a bit of a tricky PR for me to understand, but I hope you all will appreciate the nuances and perhaps even discover new ones today.
<digi_james> I suppose on the surface it would appear to be a performance optimization, verifying inputs fo new transactions being accepted to the mempool in parallel.
<lf> It's not clear to me why tx input check can be parallelized: is it not the case that if I create a transaction that uses two identical inputs a parallelized function will return false-positive validity
<lf> jnewbery so a broad assesment of the change is: "lets separate CheckTxInputs function into more indepedently functional pieces s.t. performance can increase"?
<jnewbery> this PR is simply changing AcceptToMemoryPool (ATMP) to call CheckInputs() with a checkqueue so that script verification for the different inputs happens in parallel
<hugohn> Q: is there any scenario where CCheckQueue (scriptcheckqueue) can contain more than one tx to be validated? Or is it guaranteed that the queue can only contain one pending tx at a time?
<hugohn> I see. yes that makes sense. If check queue accepts more than 1 tx at a time, !control.Wait() would not make sense because the failure could be ambiguous (not sure which tx is invalid)
<ariard> hugohn: IMO CCheckQueue let you do that, but you have a wrapper around CCheckQueueControl to let verification showing up as atomic for callers
<jnewbery> hugohn: I believe during block validation, the inputs from all the txs are enqueued onto a single CCheckQueueControl and then dispatched, but like you and james have said, that can't happen for checking txs for mempool entry
<hugohn> ariard: isn't CCheckQueueControl only to ensure all vChecks in the queue must be finished? it doesn't protect you against 2 concurrent txs in the queue
<digi_james> jnewbery: can CCheckQueueControl allow two callers to dispatch separate job sets, returning the completion of each separately to each caller as ariard suggested?
<ariard> digi_james: you can call ATMP for multiple transactions concurrently but given there is only one set of ScriptCheck threads, CCheckQueueControl constructor should be pass the scriptcheckqueue to ensure there isn't conflict
<digi_james> As jonatack mentioned, RunCheckInputsMaybeParallel can no longer differentiate between policy invalid and consensus invalid failures during input checks. Rather it just sets the invalid state to Consensus.
<digi_james> CheckInputs previously would run two script checks during failure. One with policy flags active, so checks for standardness, and if this failed, a second check would be run with consensus flags only.
<digi_james> Ok, one way to look at is in the CheckQueue design, which was originally written to be called when blocks are connected, there only consensus verification matters, standardness applies to mempool acceptance only
<jnewbery> digi_james: I think we could call CheckInputs() twice after this PR, but it'd require some code rearrangement, and we'd lose some of the performance benefit of the PR
<jnewbery> CheckInputs() is a bit weird right now. We call it and say "please run the scripts with these flags". CheckInputs() runs the scripts with the flags we pass it, but if those fails, it runs the scripts _again_ with different flags
<jnewbery> "The motivation for this is to mitigate adversarial behavior: in order to determine whether a transaction is invalid according to consensus rules (rather than policy) we end up validating a transaction's scripts twice, so an adversary trying a CPU DoS could construct transactions that only fail policy checks, but otherwise tie up our CPU for twice as long."
<jonatack> Unsure how it relates but Suhas also wrote: "AcceptToMemoryPoolWorker() already invokes CheckInputs multiple times, with different script flags, to detect certain kinds of segwit validation errors."
<digi_james> btw I realize this PR review may be difficult to follow, it's not easy to understnad and as you can see there are remaining nuances I haven't fully processed myself. Sorry about that.
<hugohn> nit: the (nScriptCheckThreads && !single_threaded_validation) check is used 3 times in RunCheckInputsMaybeParallel(). it seems suboptimal & makes things a bit hard to read.
<jnewbery> (but in fact we do also check against consensus rules later (in CheckInputsFromMempoolAndCache(), but that's for slightly different and obscure reasons, which I don't think we have time to get into!)
<jonatack> jnewbery: how critical is the caching issue you mention: "I believe this breaks script caching - CheckInputs() will only cache script success if running non-parallel."
<jnewbery> script caching means that once we've accepted the tx to our mempool, we don't need to revalidate the scripts when we receive a block with that tx
<jnewbery> that's one of the obscure reasons that we call CheckInputsFromMempoolAndCache() later - it's so we verify the scripts according to the consensus rules of the next block
<jnewbery> ariard: yes that's particularly obscure, but just the fact that we're calling CheckInputs() again seems pretty weird until you realise it's specifically for caching
<ariard> I mean CheckInputsFromMempoolAndCache seems only an optimization in case of softfork activation, which doesn't happen often,so you add a burden for every tx getting into memppol
<fjahr> Before everyone leaves, question on code structure/naming: I would like to understand the reasoning for wrapping CheckInputs() in RunCheckInputsMaybeParallel(). Rather than that I would probably have put that logic into ChecksInputs() itself and then have it delegate to CheckInputsParallel() or so. I would like to understand if there are other factors at play that I am not seeing, like keeping the changes in