A cluster is a group of transactions where some transactions depend on others, forming parent-child relationships. The simplest cluster is a single transaction, and most clusters consist of just a few transactions. However, there can be more complex configurations that form large directed acyclic graphs (with grandparents, siblings, uncles, etc.). The current limit on cluster size is 64 transactions.
At a high level, cluster linearization is the process of ordering the transactions within a given cluster so that dependencies are respected (parents before children) while the fee rates of the “chunks” of the resulting ordered list are maximized. The goal is to produce an optimal linearization: an ordering with the best possible sequence of chunk fee rates (the fee rate profile). This linearization allows miners to include the most profitable transactions first. (for more in-depth reading, see here)
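To make chunking concrete, here is a minimal sketch of the merging rule, assuming an illustrative `FeeFrac` struct (Bitcoin Core's real `FeeFrac` lives in util/feefrac.h and differs in detail): each transaction starts its own chunk, and a chunk is merged into its predecessor whenever it pays a higher fee rate.

```cpp
// A minimal sketch (not the actual ChunkLinearization code) of grouping a
// linearized list of (fee, size) transactions into chunks whose fee rates
// are non-increasing: the fee rate profile.
#include <cstdint>
#include <vector>

struct FeeFrac {
    int64_t fee{0};
    int64_t size{0};
    // Compare fee rates a.fee/a.size > b.fee/b.size without division.
    friend bool operator>(const FeeFrac& a, const FeeFrac& b)
    {
        return a.fee * b.size > b.fee * a.size;
    }
};

std::vector<FeeFrac> ChunkFeerates(const std::vector<FeeFrac>& linearized)
{
    std::vector<FeeFrac> chunks;
    for (const auto& tx : linearized) {
        chunks.push_back(tx); // each tx starts as its own chunk
        // Merge backwards while the newest chunk beats its predecessor:
        // pulling the higher-feerate chunk forward raises the predecessor.
        while (chunks.size() >= 2 && chunks.back() > chunks[chunks.size() - 2]) {
            chunks[chunks.size() - 2].fee += chunks.back().fee;
            chunks[chunks.size() - 2].size += chunks.back().size;
            chunks.pop_back();
        }
    }
    return chunks; // fee rates are now monotonically non-increasing
}
```

For example, if child B (fee 3, size 1) depends on parent A (fee 1, size 1), the only valid ordering is [A, B], and since B's fee rate beats A's they merge into one chunk with fee rate 2: the child's fees lift the parent, which is exactly what a miner filling a block wants to see first.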
The data structures and algorithms for cluster linearization can be found in cluster_linearize.h.
The linearization code is complex and therefore relies on fuzz testing to ensure correctness. Using the wide variety of data generated by the fuzzer to build dependency graphs (DepGraph instances) and produce/update linearizations is probably our best bet for finding any subtle bugs or edge cases in the code.
The cluster_linearize fuzz test includes a few helper classes and functions that are used to compare against the actual implementation. The motivation for this PR is to construct new fuzz targets that are used to make sure these helpers work properly. By having dedicated targets that test the test, we can be even more confident that our cluster linearization code is bug-free. Woo!
Note: the current cluster linearization algorithm may be replaced by a new and improved spanning-forest linearization (SFL) algorithm in the future. SFL is essentially a drop-in replacement, so while the underlying algorithm changes, it doesn’t change the tests much. It’s outside of the scope of this review club, but you can look at PR 32545 or this delving post if you’re interested.
Exercise
Try fuzzing the clusterlin_linearize target. You can refer to the fuzzing docs for assistance. Using --preset=libfuzzer-nosan will likely make building easier.
Now change chunking to simple_chunking at this line and run the target again.
After a few minutes, you should see a crash. Welcome to fuzzing. This is a simple example of a tiny bug in the test itself which fuzz testing catches quickly. A lot of the time, a crash happens because something is wrong in the fuzz harness itself, as opposed to a bug in the production code. This is why it can be useful to have targets or assertions that ensure that the fuzz test itself is correct.
What are the two new fuzz targets introduced and what are they testing? In terms of what they’re testing, how are they different from the other targets?
What are the benefits of separating out the fuzzing of the internal test helpers?
Were you able to run the clusterlin_linearize target? How many iterations (the first number libFuzzer shows at each line) did it take to produce a crash after making the s/chunking/simple_chunking/ change?
In clusterlin_simple_finder, when finding a topologically valid subset to remove from the graph, why is it important to set non_empty to true? What could happen if we allowed empty sets?
In clusterlin_simple_linearize, why does the code only verify against all possible permutations when depgraph.TxCount() <= 8? What would happen if we tried to do this for larger transaction counts?
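(For scale: 8! = 40,320 orderings is trivial to enumerate, but the count grows factorially: 16! ≈ 2.1 × 10^13, and a maximal 64-transaction cluster would have 64! ≈ 1.3 × 10^89 orderings.)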
In commit 1e4f345, when a non-topological permutation is found, the code now fast-forwards by reversing part of the permutation. Why does this optimization work, and how many permutations can it skip in the best case?
In SimpleCandidateFinder::FindCandidateSet(), the algorithm keeps track of two sets for each work unit: inc (transactions definitely included) and und (transactions that can still be added). Why does it need to track both sets instead of just tracking undecided transactions?
<marcofleon> What are the two new fuzz targets introduced and what are they testing? In terms of what they’re testing, how are they different from the other targets?
<sliv3r__> here I have a question bc I thought some were testing the functionality and the others were testing the tests, but when I checked the functions I saw that they were just simpler re-implementations of the ones found in cluster_linearize.h
<monlovesmango> clusterlin_simple_finder and clusterlin_simple_linearize. they are different than other targets bc they are fuzz testing functions implemented for fuzz tests
<sipa> sliv3r__: the short answer is that SimpleCandidateFinder, ExhaustiveCandidateFinder, SimpleLinearize are part of the test suite, not of production code, even though they mimic the functionality of production code functions/classes
<sipa> one tool that we (or i, as author, in this case) have available to make it easier on reviewers is writing a simpler, more obviously correct, but potentially slower/less feature-rich reimplementation that does the same thing, plus tests that the two are identical
<sipa> so then if you as a reviewer have confidence in the reimplementation + the equivalence test, you also gain confidence in the original implementation
<sipa> sliv3r__: well that is what the fuzz tests do: run a scenario (controlled by the fuzzer) of operations on both the production version and the simpler version, and see that they produce the same result
<sipa> and in this case, even the "simpler" version is fairly nontrivial, so in addition, there is a third, *even* simpler and *even* slower version implemented
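This pattern can be sketched as a fuzz target. Everything below except the FUZZ_TARGET/FuzzedDataProvider boilerplate is hypothetical; the real harnesses in the cluster_linearize fuzz test are considerably more involved.

```cpp
// A hypothetical equivalence harness: run the production implementation and
// the simple reference side by side and assert they agree. DecodeScenario,
// ProductionLinearize, SimpleLinearize, and FeeRateProfile are illustrative
// stand-ins, not real functions.
#include <test/fuzz/FuzzedDataProvider.h>
#include <test/fuzz/fuzz.h>

#include <cassert>

FUZZ_TARGET(equivalence_sketch)
{
    FuzzedDataProvider provider(buffer.data(), buffer.size());

    // Decode a scenario (e.g. a dependency graph) from the fuzz input.
    const auto scenario{DecodeScenario(provider)};

    // Fast-but-complex production code vs. slow-but-obvious reference.
    const auto fast{ProductionLinearize(scenario)};
    const auto slow{SimpleLinearize(scenario)};

    // Any disagreement means a bug in one of the two implementations
    // (and, as noted earlier, often in the test helper itself).
    assert(FeeRateProfile(fast) == FeeRateProfile(slow));
}
```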
<monlovesmango> I think that makes sense for clusterlin_simple_finder bc it doesn't seem to compare with other prod code, but for clusterlin_simple_linearize it is comparing to ChunkLinearization which doesn't necessarily feel simpler
<sipa> marcofleon: there is the candidate-finding logic, where we have SearchCandidateFinder (prod), SimpleCandidateFinder (simpler), ExhaustiveCandidateFinder (simplest)
<sipa> and there is the linearization logic on top, where there are also 3 versions: Linearize (prod), SimpleLinearize (simpler), and the std::next_permutation based exhaustive search (simplest, but not a separate function)
<sipa> abubakarsadiq: there is no SearchCandidateFinder anymore post-SFL; SFL is a replacement for Linearize directly, which doesn't involve candidate finding anymore
<sipa> SimpleCandidateFinder and ExhaustiveCandidateFinder stay, but they're just used in/for SimpleLinearize, which the SFL-based Linearize is then compared against
<marcofleon> monlovesmango if you ran the fuzz tests a bit, question 4: Were you able to run the clusterlin_linearize target? How many iterations (the first number libFuzzer shows at each line) did it take to produce a crash after making the s/chunking/simple_chunking/ change?
<sipa> abubakarsadiq: but i don't think it gains us very much, as it's pretty non-trivial anyway, so the "added confidence" isn't as much as compared with the much simpler reimplementations gives us
<monlovesmango> marcofleon: I actually had a question about that. I did get the failure, but I was using the python test runner and it doesn't show any output with the counts, just the error
<sipa> sliv3r__: it is the input to the fuzz harness, which interprets bytes from it (each of the provider.ConsumeIntegral... functions decodes some bytes from it)
<monlovesmango> sliv3r__: thank you I see that example now... I think I didn't really know what that param did bc I didn't realize 'process_message' was a test
<monlovesmango> it was moved bc otherwise it decrements for splits that don't actually need to be considered (which I imagine might also mess with the assumption that the result is optimal)
<sliv3r__> If I didn't understand wrong we want to only reduce the counter when we add a new "valid" subset. So the number of iterations is proportional to the number of different connected subsets that can be created, instead of being related to the total number of elements in the queue
<sipa> so the idea is that we want some kind of "cost metric" to control how much time is spent inside SimpleCandidateFinder, just so that it does not run forever, but can stop it after "too much" work has been performed
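A rough sketch of the kind of bounded search sipa describes, assuming a hypothetical TxMask bitset and moving only single transactions where the real SimpleCandidateFinder moves whole ancestor/descendant sets:

```cpp
// Illustrative work-queue search tracking, per work item, both inc
// (transactions committed to the candidate) and und (still undecided).
// TxMask and all names here are hypothetical stand-ins for the real bitsets.
#include <cstdint>
#include <vector>

using TxMask = uint64_t; // one bit per transaction (clusters hold <= 64 txs)

struct WorkItem {
    TxMask inc; // definitely included in this branch's candidate set
    TxMask und; // may still be added in this branch
};

void BoundedSearch(TxMask all_txs, uint64_t max_iterations)
{
    std::vector<WorkItem> queue{{/*inc=*/0, /*und=*/all_txs}};
    while (!queue.empty() && max_iterations > 0) {
        WorkItem item{queue.back()};
        queue.pop_back();
        if (item.und == 0) continue; // nothing left to decide on
        --max_iterations; // cost metric: budget is spent per actual split

        TxMask tx{item.und & (~item.und + 1)}; // lowest undecided bit
        // Branch 1: include tx (the real code also pulls in its ancestors).
        queue.push_back({item.inc | tx, item.und & ~tx});
        // Branch 2: exclude tx (the real code also drops its descendants).
        queue.push_back({item.inc, item.und & ~tx});
        // Each branch's candidate is evaluated from inc: und alone cannot
        // tell us *which* transactions a branch committed to, so neither
        // feerates nor topology could be checked from a work item alone.
    }
}
```

Decrementing the budget only at a real split, rather than per queue pop, ties the work bound to the number of distinct subsets explored, which is the point made in the two messages above.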
<marcofleon> In clusterlin_simple_finder, when finding a topologically valid subset to remove from the graph, why is it important to set non_empty to true? What could happen if we allowed empty sets?
<sipa> abubakarsadiq: (bonus) how likely would it be that we *keep* removing nothing, because just occasionally not removing anything wouldn't be an infinite loop?
<marcofleon> In clusterlin_simple_linearize, why does the code only verify against all possible permutations when depgraph.TxCount() <= 8? What would happen if we tried to do this for larger transaction counts?
<abubakarsadiq> I was wondering why you chose to evict randomly. For testing weird and all kinds of behaviors, yeah. Because the found txs can also be the read transaction
<marcofleon> We can throw in the next question, it's related. In commit 1e4f345, when a non-topological permutation is found, the code now fast-forwards by reversing part of the permutation. Why does this optimization work, and how many permutations can it skip in the best case?
<sipa> so the number of permutations tried is equal to the number of topologically-valid ones, plus the number of topologically-invalid minimal prefixes
<sipa> sliv3r__: every dependency is a "parent must come before child in topological orderings" constraint, so each additional dependency reduces the number of valid ones
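The fast-forward can be sketched like this, assuming a simplified single-parent dependency table (parent_of) in place of the real DepGraph ancestor sets:

```cpp
// Illustrative permutation enumeration with fast-forwarding. Because
// std::next_permutation walks orderings in lexicographic order, all
// permutations sharing a prefix form one contiguous block; reversing the
// (ascending) suffix jumps to the last member of the block, so the next
// std::next_permutation call leaves the whole block behind.
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical: parent_of[t] is t's single parent, or -1 for none.
std::vector<int> parent_of;

// Length of the shortest prefix that places a tx before its parent,
// or perm.size() if the whole permutation is topologically valid.
size_t InvalidPrefixLength(const std::vector<int>& perm)
{
    std::vector<bool> seen(perm.size(), false);
    for (size_t i = 0; i < perm.size(); ++i) {
        int parent{parent_of[perm[i]]};
        if (parent != -1 && !seen[parent]) return i + 1;
        seen[perm[i]] = true;
    }
    return perm.size();
}

void EnumerateTopological(std::vector<int> perm)
{
    std::sort(perm.begin(), perm.end()); // start at the first permutation
    do {
        const size_t bad{InvalidPrefixLength(perm)};
        if (bad < perm.size()) {
            // Invalid prefix: skip every permutation that shares it.
            std::reverse(perm.begin() + bad, perm.end());
            continue; // proceeds to the next_permutation call below
        }
        // ... evaluate this valid linearization's fee rate profile ...
    } while (std::next_permutation(perm.begin(), perm.end()));
}
```

With this in place, the loop visits exactly the topologically valid permutations plus one representative per minimal invalid prefix (matching sipa's count above): for an n-element permutation with an invalid prefix of length k, one reversal skips the remaining (n−k)! − 1 permutations of that block.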
<sipa> monlovesmango: maybe, but nothing really changes there... take Search/Simple/ExhaustiveCandidateFinder... before we had one fuzz test that compared all three (subject to cluster-not-too-large constraints etc)
<sipa> if you're already confident in SimpleCandidateFinder, you have no "need" to waste computation effort on running the Simple/Exhaustive comparison too
<sliv3r__> My idea when checking the algorithm was that, for inclusion, if you don't have inc you may try to add a tx without an ancestor being in the anc set
<marcofleon> sure, maybe Abubakar has a clearer way to look at it, but I sort of initially thought that it was possible to get the inc with only having und
<monlovesmango> one of the sets that you add back to the queue has und - descendants(split), so I don't know how you would figure out inc from that queue set
<sliv3r__> I think keeping inc explicitly is needed bc txs have dependencies between them; not explicitly knowing what is inc may cause you to try to add a tx without some ancestor in the inc
<marcofleon> And that's good to know actually, I'll keep that in mind. I did realize this review club was trying to do two things at once, which didn't help
<sliv3r__> The answer to the last question was: just release the code and let users complain and say that Core introduces bugs, right (as that's the new trend)? :P
<marcofleon> glad you two were able to fuzz it a bit. But yeah monlovesmango I recommend doing the individual targets like how we said earlier. The test runner is fine for running through a corpus of inputs once