Parallelize CheckInputs() in AcceptToMemoryPool() (mempool), Jul 17, 2019
The CheckInputs() function is called when checking the inputs to a transaction. Each input has a script which is verified. That verification can be run:

- sequentially and synchronously, in which case CheckInputs() will return true if all scripts pass verification, or pass back the error for the first script that failed verification; or
- in parallel, in which case CheckInputs() will emplace the script checks onto a queue for future verification.
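A minimal sketch of these two modes (with assumed, simplified types — the real CScriptCheck carries a coin, a transaction, and verification flags rather than a bare closure):

```cpp
#include <functional>
#include <utility>
#include <vector>

// Simplified stand-in for Bitcoin Core's CScriptCheck: a deferred
// script verification that reports success or failure when invoked.
struct ScriptCheck {
    std::function<bool()> run;
    bool operator()() const { return run(); }
};

// Sketch of the two CheckInputs() modes described above: if pvChecks is
// provided, the checks are emplaced onto it for later (parallel)
// execution and the function optimistically returns true; otherwise
// each check runs sequentially and the first failure aborts.
bool CheckInputsSketch(std::vector<ScriptCheck> checks,
                       std::vector<ScriptCheck>* pvChecks)
{
    if (pvChecks) {
        for (auto& check : checks) pvChecks->emplace_back(std::move(check));
        return true;  // failures only surface when the queue is drained
    }
    for (const auto& check : checks) {
        if (!check()) return false;  // sequential: stop at first bad script
    }
    return true;
}
```

Note how the deferred path cannot report a per-input error at call time: the caller only learns the outcome once the queue is run.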
Previously, CheckInputs() would always check inputs sequentially when accepting a transaction to the mempool, even though transaction inputs can be checked in parallel during block validation (note the vChecks argument when CheckInputs() is called from ConnectBlock()).
This PR enables the input checks to be performed in parallel when a transaction is entering the mempool.
It does so by replacing calls to CheckInputs() with RunCheckInputsMaybeParallel(), which will push the input checks to an existing CCheckQueue worker queue.
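How the queued checks then get executed can be sketched with a toy version of the CCheckQueue pattern. This is an illustrative simplification, not Bitcoin Core's actual implementation (which keeps a persistent pool of script-check threads and batches jobs under a lock):

```cpp
#include <atomic>
#include <functional>
#include <thread>
#include <vector>

// Toy version of the CCheckQueue / CCheckQueueControl pattern: script
// checks are queued by the validating thread, executed by a pool of
// workers, and Wait() returns the logical AND of every check, so the
// caller only learns "all scripts passed" or "something failed".
class CheckQueueSketch
{
public:
    void Add(std::function<bool()> check) { m_jobs.push_back(std::move(check)); }

    // Run all queued checks on n_threads workers; true iff every check passed.
    bool Wait(int n_threads)
    {
        std::atomic<bool> all_ok{true};
        std::atomic<size_t> next{0};
        std::vector<std::thread> workers;
        for (int t = 0; t < n_threads; ++t) {
            workers.emplace_back([&] {
                // Each worker atomically claims the next unclaimed job.
                for (size_t i = next++; i < m_jobs.size(); i = next++) {
                    if (!m_jobs[i]()) all_ok = false;  // one failure fails the batch
                }
            });
        }
        for (auto& w : workers) w.join();
        m_jobs.clear();  // the queue is reusable for the next transaction
        return all_ok;
    }

private:
    std::vector<std::function<bool()>> m_jobs;
};
```

The single combined verdict from Wait() is central to the behaviour changes discussed below: the parallel path cannot say which input failed, or with which flags.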
There is a significant performance gain resulting from this optimization.
This PR also changes behavior when transactions are denied mempool acceptance.
RunCheckInputsMaybeParallel() (and by extension CheckInputs()) no longer sets TX_NOT_STANDARD during input check failure, but only a consensus failure state.
Note: This has an effect on peer connections.
Current behaviour: A peer is disconnected when a consensus-invalid transaction is received, but remains connected if the transaction is only policy-invalid (non-standard).
Proposed behaviour: A peer is no longer disconnected, even if a consensus-invalid transaction is received.
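The behaviour being removed can be sketched as follows. The flag constants and the ClassifyFailure() helper are hypothetical illustrations of the old two-pass pattern, in which a failure under policy flags triggered a second script run under consensus-only flags to decide between a standardness and a consensus failure:

```cpp
#include <cstdint>

// Hypothetical flag bits for illustration. In Bitcoin Core the policy
// (standardness) flag set is a strict superset of the consensus flags,
// so anything that fails consensus checks also fails policy checks.
enum : uint32_t {
    FLAGS_CONSENSUS = 0x1,
    FLAGS_POLICY = 0x1 | 0x2,
};

enum class TxFailure { NONE, NOT_STANDARD, CONSENSUS };

// Old two-pass pattern: run scripts with policy flags; on failure, run
// them again with consensus flags only. That second, consensus-only run
// is the extra CPU cost the PR removes, at the price of reporting every
// script failure as a consensus failure.
template <typename VerifyFn>
TxFailure ClassifyFailure(VerifyFn verify_scripts)
{
    if (verify_scripts(FLAGS_POLICY)) return TxFailure::NONE;
    if (verify_scripts(FLAGS_CONSENSUS)) return TxFailure::NOT_STANDARD;
    return TxFailure::CONSENSUS;
}
```

An adversary could exploit the second pass by submitting transactions that fail only policy checks, making the node verify every script twice.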
Note: RunCheckInputsMaybeParallel() still potentially calls CheckInputs() synchronously, e.g. when the single_threaded_validation argument is set.
How are the input check jobs dispatched to the thread pool? How are the check results returned after execution?
Can RunCheckInputsMaybeParallel() set the same “invalid reason” on input check failure as the sequential CheckInputs() previously did?
If not, why is this not possible?
How does this change affect RPC methods such as testmempoolaccept?
What effect does the modified input check failure behaviour have on peer connections?
What was the reason to modify the behavior in AcceptToMemoryPoolWorker() when input checks fail? Should this be a separate PR?
1 12:57 <digi_james> Hello everyone :)
2 12:58 <jonatack> Hi James :)
3 12:59 <digi_james> jonatack: :)
12 13:02 <jnewbery> It's a bit of work to prepare to host, but I think it's really worth it. You'll get a lot out of really digging into the PR and understanding it well enough to talk about. We already have lightlike and elichai2 lined up to host future meetings.
13 13:02 <sosthene> wow, that's great
14 13:02 <jnewbery> The PR this week is really interesting. Even though the number of lines touched is fairly small, it touches on some very important concepts: peer-to-peer, DoS protection and caching. I hadn't reviewed it before James suggested it for PR club, so thanks for pointing me at it!
15 13:02 <jnewbery> Ok, over to James.
16 13:03 <digi_james> Thanks John. This was a bit of a tricky PR for me to understand, but I hope you all will appreciate the nuances and perhaps even discover new ones today.
17 13:04 <digi_james> I suppose on the surface it would appear to be a performance optimization, verifying inputs of new transactions being accepted to the mempool in parallel.
18 13:04 <digi_james> So I suggest we start with that, and then move onto the parts which change behaviour afterwards ...
19 13:05 <digi_james> So perhaps I'll start out by asking how the parallelisation is achieved?
20 13:06 <digi_james> Is it new code that is queuing the check jobs and executing these?
21 13:07 <pinheadmz> no I think it is using something from block validation
22 13:07 <digi_james> pinheadmz: exactly
23 13:07 <ariard> well it's now used scriptcheckqueue to parallelize script verification, scriptcheckqueue was only used by ConnectBlock before
24 13:07 <ariard> s/used/using/g
25 13:08 <pinheadmz> this applies to each tx entering the mempool 1 at a time though?
26 13:08 <pinheadmz> so a tx with 100 inputs can be checked in parallel
27 13:08 <digi_james> pinheadmz: yes exactly, parallel input checks for a given transaction
28 13:09 <digi_james> Is there a difference in validation logic between the script check queue and previously?
29 13:10 <digi_james> so if you look at RunCheckInputsMaybeParallel
31 13:11 <jnewbery> The comment says "Check whether all inputs of this transaction are valid (no double spends, scripts & sigs, amounts)
32 13:11 <jnewbery> historically, that was true (it'd check amounts and doublespends). Now it only checks scripts and sigs.
33 13:11 <jnewbery> That's what's being done in parallel in this PR - script verification
34 13:12 <lf> It's not clear to me why tx input check can be parallelized: is it not the case that if I create a transaction that uses two identical inputs a parallelized function will return false-positive validity
35 13:12 <digi_james> Thx John I wasnt aware it didnt do contextual verification
36 13:13 <lf> ok I think I understand: this is just verify valid inputs, not verify valid transaction
37 13:14 <jnewbery> yeah, it used to call CheckTxInputs that does some checking of amounts, etc.
38 13:14 <lf> Verify valid inputs aka assert that script conditions being met are allowed
40 13:14 <digi_james> John: If there are no contextual checks, how can it be parallelized? Inputs from a given tx double spending for example?
41 13:15 <jnewbery> All it's checking is the scripts. Those are completely independent from contextual information
42 13:16 <jnewbery> contextual checks and amount checks are done elsewhere
43 13:16 <lf> jnewbery so a broad assessment of the change is: "let's separate CheckTxInputs function into more independently functional pieces s.t. performance can increase"?
44 13:16 <digi_james> I see
46 13:16 <lf> e.g. "before CheckTxInputs can complete, CheckInputs needs to finish and that can be parallelized"?
47 13:17 <jnewbery> lf: no, that change that I linked to is from an old PR
48 13:17 <lf> jnewbery ty, reviewing now
49 13:17 <jnewbery> this PR is simply changing AcceptToMemoryPool (ATMP) to call CheckInputs() with a checkqueue so that script verification for the different inputs happens in parallel
50 13:18 <hugohn> Q: is there any scenario where CCheckQueue (scriptcheckqueue) can contain more than one tx to be validated? Or is it guaranteed that the queue can only contain one pending tx at a time?
51 13:19 <digi_james> The ATMP function takes a TX as an argument
52 13:19 <digi_james> so it is the inputs of that transaction which are pushed to the CCheckQueue
53 13:19 <hugohn> yes, all inputs of a given tx
54 13:20 <digi_james> When the queue is then run, it processes all queued input checkscript jobs of a given transaction.
55 13:20 <hugohn> my question is is it possible to queue up vChecks for more than 1 tx into scriptcheckqueue? I assume the answer is no, but want to confirm
56 13:20 <digi_james> I think ATMP locks up cs_main so that would prevent other tx inputs to be queued in the meantime
57 13:21 <digi_james> Sorry it locks the cs.mempool
58 13:22 <digi_james> Ok, in the interest of time, I suggest also covering the behavioural changes this PR introduces, what are they?
59 13:22 <hugohn> I see. yes that makes sense. If check queue accepts more than 1 tx at a time, !control.Wait() would not make sense because the failure could be ambiguous (not sure which tx is invalid)
60 13:22 <digi_james> hugohn: Thats a good point
61 13:23 <ariard> hugohn: IMO CCheckQueue let you do that, but you have a wrapper around CCheckQueueControl to let verification showing up as atomic for callers
62 13:24 <jnewbery> hugohn: I believe during block validation, the inputs from all the txs are enqueued onto a single CCheckQueueControl and then dispatched, but like you and james have said, that can't happen for checking txs for mempool entry
63 13:24 <PaulTroon> hugohn: can a blockvalidation use the queue at the same time as a inputs from a tx is queued?
64 13:24 <digi_james> ariard: But can you call ATMP for multiple transactions concurrently?
65 13:24 <digi_james> It would seem the mempool lock acquisition prevents this
66 13:24 <hugohn> ariard: isn't CCheckQueueControl only to ensure all vChecks in the queue must be finished? it doesn't protect you against 2 concurrent txs in the queue
67 13:25 <jnewbery> ATMP holds the main lock, so it can only be running in one thread
68 13:26 <digi_james> jnewbery: can CCheckQueueControl allow two callers to dispatch separate job sets, returning the completion of each separately to each caller as ariard suggested?
69 13:26 <ariard> digi_james: you can call ATMP for multiple transactions concurrently but given there is only one set of ScriptCheck threads, CCheckQueueControl constructor should be pass the scriptcheckqueue to ensure there isn't conflict
70 13:27 <jonatack> ariard: bypass?
71 13:27 <ariard> I mean CCheckQueueControl constructor is locking the passed CCheckQueue
72 13:27 <ariard> jonatakc: bypass what you mean ? Using the raw scriptcheckqueue without CCheckQueueControl instanciation ?
73 13:27 <jonatack> ah, be passed
74 13:28 <ariard> yes sorry
75 13:29 <digi_james> Ok, I think we should move to the behavioural change regarding invalid transactions
76 13:29 <jnewbery> ariard: ATMP holds the main lock, so it can't be called concurrently
77 13:29 <jonatack> behavior changes: CheckInputs invokes scripts only once and always returns CONSENSUS for script failures. Ban behavior.
78 13:30 <digi_james> jonatack: Thank you!
79 13:30 <hugohn> jnewbery: sorry still a bit confused. I see 2 locks used in ATMP, cs_main and pool.cs. which lock prevents you against 1+ txs in the queue?
80 13:30 <ariard> jnewbery: ah thanks but so why you need to lock scriptcheckqueue in this case ?
81 13:30 <hugohn> cs_main right?
83 13:30 <jnewbery> ariard: not sure, perhaps just to be extra safe!
84 13:30 <jnewbery> let's move on to behavioural changes
85 13:31 <ariard> hugohn: in fact in ConnectBlock transactions are mixed, you may have multiple txs in the queue
86 13:31 <ariard> IMO it's ok because you want atomic validity at block granularity
87 13:31 <ariard> and not a transaction granularity like in ATMP
89 13:32 <hugohn> ariard: it does make sense for ConnectBlock I think? because a block is invalid as long as any tx in the block is invalid
91 13:32 <jnewbery> hugohn ariard: please, let's move on
92 13:32 <ariard> yes we agree on this
93 13:32 <digi_james> As jonatack mentioned, RunCheckInputsMaybeParallel can no longer differentiate between policy invalid and consensus invalid failures during input checks. Rather it just sets the invalid state to Consensus.
94 13:33 <digi_james> CheckInputs previously would run two script checks during failure. One with policy flags active, so checks for standardness, and if this failed, a second check would be run with consensus flags only.
95 13:33 <digi_james> Why cant we do this in the new parallelized case?
96 13:35 <digi_james> Ok, one way to look at it is in the CheckQueue design, which was originally written to be called when blocks are connected, where only consensus verification matters; standardness applies to mempool acceptance only
97 13:36 <jnewbery> digi_james: I think we could call CheckInputs() twice after this PR, but it'd require some code rearrangement, and we'd lose some of the performance benefit of the PR
98 13:37 <digi_james> john: Interesting. So both with all inputs in vChecks?
99 13:37 <jnewbery> (ie call CheckInputs() with policy flags, and then a second time with consensus flags)
100 13:37 <jnewbery> right
101 13:38 <digi_james> I am thinking out loud. So this would potentially be more scriptchecks than in the single threaded case, albeit parallelized.
102 13:38 <jnewbery> CheckInputs() is a bit weird right now. We call it and say "please run the scripts with these flags". CheckInputs() runs the scripts with the flags we pass it, but if those fail, it runs the scripts _again_ with different flags
103 13:38 <michaelf_> <digi_james>: What does "when blocks are connected" mean?
104 13:38 <jnewbery> this PR removes that second script verification
105 13:38 <digi_james> michaelf_: I mean when a block is connected to the tip of the chain.
106 13:39 <michaelf_> Ah ok thanks
107 13:39 <lf> chain tip update*?
108 13:39 <jnewbery> but it could also add back a second call to CheckInputs() with the consensus flags
110 13:39 <ariard> jnewbery: or CheckInputs() may also push the script verification in pvChecks
111 13:39 <jnewbery> "The motivation for this is to mitigate adversarial behavior: in order to determine whether a transaction is invalid according to consensus rules (rather than policy) we end up validating a transaction's scripts twice, so an adversary trying a CPU DoS could construct transactions that only fail policy checks, but otherwise tie up our CPU for twice as long."
112 13:42 <digi_james> jnewbery: If we add a second CheckInputs run (in the new parallelized case) as you suggested above, how would this be resolved?
113 13:42 <jnewbery> how would what be resolved?
114 13:43 <jonatack> Unsure how it relates but Suhas also wrote: "AcceptToMemoryPoolWorker() already invokes CheckInputs multiple times, with different script flags, to detect certain kinds of segwit validation errors."
115 13:43 <digi_james> Transactions failing policy checks, but tying up CPU for a second round of checks
116 13:43 <jnewbery> the fact that a malicious peer can make us verify twice at no cost to themselves?
118 13:43 <digi_james> jnewbery: exactly
119 13:43 <jnewbery> that's an open question. We probably want to deprioritize traffic from such peers
120 13:44 <jnewbery> jonatack: yes, that's right. There are 4 places that ATMP can call CheckInputs() I believe:
121 13:44 <jnewbery> first, when verifying against policy
122 13:45 <digi_james> btw I realize this PR review may be difficult to follow, it's not easy to understand and as you can see there are remaining nuances I haven't fully processed myself. Sorry about that.
123 13:45 <jnewbery> if that fails, it calls CheckInputs twice more, to see if the failure was due to a segwit witness mutation
124 13:46 <jnewbery> if it succeeds, we call CheckInputs() again from CheckInputsFromMempoolAndCache()
125 13:46 <jnewbery> digi_james: don't apologise. It's a great PR to review!
126 13:47 <jonatack> jnewbery: thank you!
127 13:47 <digi_james> Are there any remaining questions in regards to the current PR state which we can cover?
128 13:47 <ariard> digi_james: that's fine you pick up a really nice one :)
129 13:47 <sosthene> I wonder why we check against policy before consensus rules? Wouldn't it make more sense the other way around?
130 13:47 <digi_james> policy is tighter than consensus
131 13:48 <digi_james> so consensus is a second check, with loosened rules if you will
132 13:48 <jnewbery> exactly right digi_james. The mainline case is that policy checking succeeds, in which case we don't need to check against consensus
133 13:48 <digi_james> a consensus failure will always be a policy failure, but not the other way around
134 13:48 <hugohn> nit: the (nScriptCheckThreads && !single_threaded_validation) check is used 3 times in RunCheckInputsMaybeParallel(). it seems suboptimal & makes things a bit hard to read.
135 13:49 <jonatack> digi_james: agree, great choice of PR :+1:
136 13:49 <sosthene> oh ok, I don't know why intuitively I was thinking it was the other way around (consensus tighter than policy)
137 13:49 <digi_james> jonatack: Thanks for joinin gus
138 13:49 <digi_james> *us
139 13:49 <jnewbery> (but in fact we do also check against consensus rules later (in CheckInputsFromMempoolAndCache(), but that's for slightly different and obscure reasons, which I don't think we have time to get into!)
141 13:51 <sosthene> digi_james: thanks it was very interesting, it probably wasn't easy to get ready for this one :)
142 13:51 <sosthene> gotta go now, goodby everyone
143 13:51 <jonatack> jnewbery: how critical is the caching issue you mention: "I believe this breaks script caching - CheckInputs() will only cache script success if running non-parallel."
144 13:51 <jnewbery> jonatack: it's pretty bad!
145 13:52 <hugohn> I thought suhas addressed that?
146 13:52 <jnewbery> script caching means that once we've accepted the tx to our mempool, we don't need to revalidate the scripts when we receive a block with that tx
148 13:53 <ariard> jnewbery: assuming script flags don't change right ?
150 13:54 <jnewbery> ariard: exactly correct. The cache entry includes which script flags were used in verification
151 13:55 <jnewbery> that's one of the obscure reasons that we call CheckInputsFromMempoolAndCache() later - it's so we verify the scripts according to the consensus rules of the next block
152 13:55 <jnewbery> so according to that blog post, block validation is ~40-50% faster with script caching
153 13:56 <jonatack> How does this change affect RPC methods such as testmempoolaccept?
154 13:56 <jnewbery> (or put differently, breaking script caching would ~double block validation time)
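The caching concern can be illustrated with a toy cache. Bitcoin Core's scriptExecutionCache actually stores a salted hash derived from the transaction's witness hash and the verification flags; this simplified version stores the pair directly, but the key idea is the same: a hit is only valid for the exact flags used at verification time.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <utility>

// Toy script-execution cache. A successful verification is recorded
// against (wtxid, flags); block validation can skip re-running scripts
// only when it finds an entry for the same transaction verified under
// the flags it needs, which is why the discussion above has
// CheckInputsFromMempoolAndCache() re-verifying under the next block's
// consensus flags before caching.
struct ScriptCacheSketch {
    std::set<std::pair<std::string, uint32_t>> entries;

    void Insert(const std::string& wtxid, uint32_t flags)
    {
        entries.emplace(wtxid, flags);
    }
    bool Contains(const std::string& wtxid, uint32_t flags) const
    {
        return entries.count({wtxid, flags}) > 0;
    }
};
```

Under this model, only caching successes from the non-parallel path means most mempool acceptances never populate the cache, forcing full script re-verification at block connection time.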
155 13:57 <ariard> Hmmm verifying the scripts according to the consensus rules of the next block, it's really obscure it's in case of softfork ?
156 13:57 <digi_james> jonatack: as of the current PR, it will return consensus failure
157 13:57 <digi_james> with script errors, but the 2nd check for policy failure has been removed.
158 13:58 <digi_james> But the rpc call adds a single_threaded_validation argument, so the inputs are checked in sequence
159 13:58 <jonatack> thanks!
160 13:58 <digi_james> ....which allows the caller to receive specific script errors, unlike the parallel case
161 13:59 <digi_james> Previously, you would get policy or consensus failures
162 13:59 <jnewbery> ariard: yes that's particularly obscure, but just the fact that we're calling CheckInputs() again seems pretty weird until you realise it's specifically for caching
163 13:59 <digi_james> That's no longer the case
164 13:59 <ariard> I mean CheckInputsFromMempoolAndCache seems only an optimization in case of softfork activation, which doesn't happen often, so you add a burden for every tx getting into the mempool
166 14:00 <jnewbery> ariard: no, it's an optimization for all txs
167 14:00 <hugohn> scriptExecutionCache.insert(hash_cache_entry); is called in the running parallel case
168 14:00 <jnewbery> before CheckInputsFromMempoolAndCache we haven't validated the scripts according to consensus rules
169 14:00 <ariard> but couldn't we cache at once in the first calls of CheckInputs ?
170 14:01 <ariard> oh gotcha it's because the check against consensus rules is only in case of standardness check failure
171 14:01 <jnewbery> ariard: no, because the first time round, if the scripts succeed verification with policy rules, we don't check against consensus rules
172 14:01 <jnewbery> exactly
173 14:01 <ariard> that's weird
174 14:02 <fjahr> Before everyone leaves, question on code structure/naming: I would like to understand the reasoning for wrapping CheckInputs() in RunCheckInputsMaybeParallel(). Rather than that I would probably have put that logic into CheckInputs() itself and then have it delegate to CheckInputsParallel() or so. I would like to understand if there are other factors at play that I am not seeing, like keeping the changes in
175 14:02 <fjahr> CheckInputs() as minimal as possible for example. Then the chosen structure would make more sense to me.
177 14:03 <jnewbery> I also thought that was a bit weird, but didn't look into alternative ways of doing it
178 14:04 <digi_james> Ok, if you don't mind, Id suggest wrapping it up here.
179 14:04 <jnewbery> Thanks digi_james!
181 14:04 <digi_james> jnewbery: Cheer!
182 14:04 <digi_james> *Cheers
183 14:04 <jnewbery> Tough PR this week. Great job uncovering the subtleties.
184 14:04 <hugohn> wouldn't one reason be that CheckInputs() is also called by BlockConnected(), which already does things in parallel?
185 14:05 <jonatack> Thanks digi_james and everyone!