💡

Bitcoin Core PR Review Club

Add MuHash3072 implementation (math and cryptography)

Oct 21, 2020

https://github.com/bitcoin/bitcoin/pull/19055

Host: jnewbery - PR author: fjahr

The PR branch HEAD was e19e500134 at the time of this review club meeting.

There was a previous review club meeting on PR 19055 when the PR included calculating the Muhash of the UTXO set. That review club session focused on the high-level concepts of using Muhash as a rolling hash for the UTXO set.

The scope of the PR has since been reduced to only include the implementation of the Muhash algorithm in C++. In this review club meeting, we’ll dig into the cryptographic code in detail.

Notes

This PR is an implementation of the Muhash algorithm, which was first described in the paper A New Paradigm for Collision-free Hashing: Incrementality at Reduced Cost by Bellare and Micciancio. Pieter Wuille wrote a mailing list post in 2017 on Rolling UTXO set hashes, which compared Muhash with Elliptic Curve Multiset Hash (another possible way of implementing rolling hashes). You should read Wuille’s mailing list post before starting to review the code. You can also look at the Bellare-Micciancio paper, but it’s more detail than you’ll need in order to review the implementation.
A Python implementation of the Muhash algorithm was merged in PR 19105 last month. We discussed that in a previous PR review club meeting. As you review the new code, you may find it helpful to compare it with the Python implementation. Python’s built-in support for bignums and modular inverses are much easier to follow than the optimized C++ code.
The new code is in the src/crypto directory, which also includes implementations of other frequently used cryptographic functions. Take a look at the SHA256, SHA3 and SipHash implementations, and you’ll notice some similarities in the way the interfaces are designed.

Questions

Specification

How much state is stored inside the MuHash3072 rolling hash object? How much data is returned to a user requesting the set hash?
Why was 3072 bits chosen as the size of the group?
Can the Muhash of a single object (eg a transaction) be calculated and cached? Would we do this in practice?
What is the most expensive operation to carry out in the rolling hash? What can we do to reduce the number of times we need to carry out this operation?
How can we test for membership in the Muhash set?

Implementation

What public methods does the MuHash3072 object expose to clients? What is the Span<> class template that’s used in some of those public methods?
What does the #ifdef HAVE___INT128 code in muhash.h do?
How is a MuHash3072 object constructed and initialized? What happens if the ChaCha20 output is larger than the modulus?
What happens if the result of a multiplication or division is larger than the modulus?
The Finalize() method has a comment “Does not change this object’s value.” Why is the function not marked const?
In some of the multiplication helper functions, we see lines like:

c1 = t >> LIMB_SIZE;

e.g. here.

What are those lines doing?
In some of the helper functions, we see some ternary operators like:

th += (c0 < tl) ? 1 : 0;

e.g. here.

What is th here? Why does it need to be incremented by 1 if c0 < tl?
Both the Multiply() and Square() functions have the following code at the end of the function:
```
    /* Perform a potential third reduction. */
    if (c0) FullReduce(in_out);
```
Why is that necessary? What is it doing?
Did you review the Inverse() function? Did AJ’s comment help?
How is this new code tested? Can you think of other ways that it could be tested?

Meeting Log

17:00

<@jnewbery> #startmeeting

17:00

<@jnewbery> Hi folks! Welcome to Bitcoin Core PR Review Club.

17:00

<emzy> hi

17:00

<felixweis> hi!

17:00

<kanzure> hi

17:00

<glozow> hi jnewbery!

17:00

<stacie> hi

17:00

<willcl_ark> hi

17:00

<jesseposner> hi

17:00

<@jnewbery> Feel free to say hi to let us all know you're here.

17:00

<fjahr> hi

17:00

<blueskies> hi

17:00

<elle> hi!

17:00

<sipa> hi

17:00

<lightlike> hi

17:00

<michaelfolkson> hi

17:00

<jcg> hi

17:00

<@jnewbery> Anyone here for the first time?

17:00

<buzz08> hi

17:01

<blueskies> first time for me

17:01

<jcg> me

17:01

<@jnewbery> welcome blueskies and jcg :)

17:01

<jcg> thanks

17:01

<blueskies> thank you :)

17:01

<@jnewbery> notes and questions are in the normal place: https://bitcoincore.reviews/19055-2

17:02

<@jnewbery> A reminder of the format: I've prepared some questions to guide the discussion a bit, but feel free to jump in at any point if you have a question

17:02

<@jnewbery> All questions are welcome. This is a place for us to all learn from each other

17:02

<gentile> hi

17:03

<@jnewbery> ok, who had a chance to review the PR this week? (y/n)

17:03

<jesseposner> y

17:03

<felixweis> y

17:03

<willcl_ark> y

17:03

<jcg> y

17:03

<blueskies> n

17:03

<michaelfolkson> y

17:03

<glozow> 0.5y

17:03

<stacie> 50% - got through the concept/spec, but not so much the code

17:03

<emzy> n

17:03

<lightlike> y

17:03

<buzz08> n

17:04

<elle> y-ish

17:04

<jonatack> hi

17:04

<@jnewbery> that's great. Any first impressions? I really enjoyed getting stuck into the low level cryptographic code. It's not something I look at very much

17:04

<willcl_ark> some of the operations def appear quite inaccessible at first look

17:05

<glozow> i finally got to use math

17:05

<@jnewbery> willcl_ark: I agree. Definitely took a lot of thinking to work out what's going on

17:05

<@jnewbery> First question: How much state is stored inside the MuHash3072 rolling hash object? How much data is returned to a user requesting the set hash?

17:05

<felixweis> yes i spend quite some more time on the python implementation because its a lot easier to understand

17:05

<willcl_ark> but they get a _bit_ better once you try to match them up with the crypto scheme itself

17:05

<@jnewbery> felixweiss: yeah, it's nice that python just does bignums and modular inverses for you

17:06

<glozow> Just a Num3072 which is an array of “limbs” (represented as uints size depending on what system supports) for a total = 3072 bits. Output is a 384b hash

17:06

<stacie> The rolling hash object stores 2 pieces of information - (1) the product of hashes for all the elements that have entered the set, and (2) the product of hashes for all elements that have been removed from the set. (at least that's what I gathered from the mailing list post :) ) When a user requests the hash, only a 256 bit hash is returned (but I see in the code that a 384 byte hash is returned).

17:06

<willcl_ark> looks like MuHash3072 stores 3072 bits and returns a 256 bit hash to the user

17:06

<jesseposner> Finalize returns a 384-byte hash, however, the spec calls for a compression step that reduces it to 256 bits

17:06

<willcl_ark> TBH I was confused by the user bit -- the C++ code looked like 384 bit hash, but the python code looked like a 256 bit hash

17:07

<sipa> 384 *bytes*

17:07

<sipa> aka 3072 bits

17:07

<sipa> which is the observable state

17:07

<sipa> the hash of that is the output hash

17:07

<felixweis> wasn't there both a numerator and denumerator before finalization?

17:08

<sipa> yes, but those aren't observable

17:08

<@jnewbery> very good everyone! Yes, the question was a bit vague. The MuHash class returns a 3072 bit object to the client code. When that gets integrated into Bitcoin Core, it'd be hashed down to 256 bits before returning it to the end user

17:08

<sipa> it's just an optimization to delay divisions/inverses as long as possible

17:08

<felixweis> which is quite smart considering how much slower div is to mul

17:09

<sipa> so instead of computing a1/d1*a2/d2*a3/d3 (for 3 adds and 3 deletes), it's computing (a1*a2*a3)/(d1*d2*d3)

17:09

<willcl_ark> ^ which is the answer to a later question IIRC

17:09

<sipa> sorry.

17:09

<@jnewbery> stacie: Storing a numerator and denominator would be an optimization. In the actual implementation we just store one Num3072 (the numerator/denominator)

17:09

<@jnewbery> Why was 3072 bits chosen as the size of the group?

17:09

<jesseposner> 3072-bit DL has about 128 bits of

17:10

<jesseposner> security

17:10

<jesseposner> https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2017-May/014337.html

17:10

<glozow> I try answer: security = we’re trying to prevent collisions. Usually we’d like something like 128 bits of security, i.e. O(2^128) time to break. Muhash incremental hashing also makes it possible to attack slightly more efficiently than brute force, i.e. in O(2^(2*sqrt(n)-1) time. So 2*sqrt(3072) - 1 ~= 128 bits of security

17:10

<@jnewbery> (that's question 2, not a question for sipa :) )

17:10

<CD36> I am new to this code base. I am just curious if you guys thought of using ISO CPP guideline and Google Cpp Style Guide.

17:10

<willcl_ark> 3072-bit DL has about 128 bits of security when used modulo a sufficiently large prime or an elliptic curve group

17:10

<stacie> I don’t have a better answer to why 3072 bits other than it’s best practice. Wagner’s Birthday Problem/k-sum paper shows that when MuHash uses a sufficiently large prime for the modulo step, it is provably secure under the discrete log assumption. Common guidelines state that 3072 bits is enough security.

17:10

<sipa> in ECC 256-bit is enough for 128 bits of security

17:10

<jesseposner> and 128 bits of security is sufficient until the next revolutionary breakthrough in either mathematics or technology

17:11

<@jnewbery> jesseposner glozow willcl_ark stacie: yes!

17:11

<sipa> in modulo prime groups, 3072 bits is needed

17:11

<sipa> jesseposner: also, bitcoin has 128-bit security for lots of things already - it doesn't make sense to target higher

17:11

<@jnewbery> 3. Can the Muhash of a single object (eg a transaction) be calculated and cached? Would we do this in practice?

17:11

<stacie> jnewbery ah! noted. I thought the numerator/denominator thing was so cool

17:11

<willcl_ark> sipa: is that because of the RIPMD160?

17:11

<michaelfolkson> Yes. No

17:12

<sipa> willcl_ark: no... secp256k1 has 128 bit security, and 256-bit hashes for transactions have 128 bit collision resistance

17:12

<willcl_ark> jnewbery: because we can add or delete in any order, there's no need to cache individual elements

17:12

<@jnewbery> CD36: That's a bit off topic for now, but you can find our style guide at https://github.com/bitcoin/bitcoin/blob/master/doc/developer-notes.md#coding-style-general

17:12

<jesseposner> You could calculate a single object but if you don't need the homomorphic properties then there is no reason to use it because it's slow and complex when compared with sha256

17:12

<jonatack> CD36: see https://github.com/bitcoin/bitcoin/blob/master/doc/developer-notes.md ... the guides you refer to are indeed sometimes referenced by reviewers

17:12

<michaelfolkson> Caches are large

17:13

<michaelfolkson> (768 bytes)

17:13

<sipa> willcl_ark: P2SH has only 80 bits of collision resistance which matters for some uses, which is why P2WSH only has a 256-bit script hash mode

17:13

<willcl_ark> sipa: thanks, that's very helpful!

17:14

<@jnewbery> michaelfolkson: right (althought with this implementation, it'd be 384 bytes since we're not storing numerator and denominator)

17:14

<sipa> it depends on whether you'd be willing to do the inverse for every transactions

17:15

<@jnewbery> again, the question was a bit vague. The idea here is that it doesn't make sense to precompute the muhash of individual transactions in the mempool, since the cached muhash of the transaction would be large

17:15

<sipa> yeah, 384 or 768, it's both pretty big compared to average transaction-in-memory sizes

17:16

<@jnewbery> under a different scheme like ECMH, the cache would only be 64 bytes per object

17:16

<michaelfolkson> "slow and complex when compared with sha256" This isn't the case I don't think. Or at least not specific jesseposner

17:16

<sipa> michaelfolkson: it is many times slower than just SHA256'ing the whole UTXO set

17:17

<@jnewbery> sipa: if you were pre-caching, perhaps you'd want to store 384 bytes and then have half the multiplications and no inverse on the critical path?

17:17

<sipa> jnewbery: or you'd use ECMH instead :)

17:17

<@jnewbery> indeed

17:17

<@jnewbery> ok, next question has already been answered. What is the most expensive operation to carry out in the rolling hash? What can we do to reduce the number of times we need to carry out this operation?

17:18

<@jnewbery> (partly answered)

17:18

<jesseposner> the modular inverse is the most expensive operation so we wait to compute an inverse until the final hash is needed

17:18

<willcl_ark> calculating the modular inverse

17:18

<stacie> The most expensive operation to carry out in the rolling hash is computing the modular inverse. The rest of my answer to the rest is based on what I learned from the mailing list post, but as of 5 min ago I learned the actual implementation stores just one Num3072 for the numerator/denominator. To minimize the amount of times the inverse computation has to happen, the running hash is stored as a fraction. Newly

17:18

<stacie> created UTXOs (aka new elements in the set) are multiplied into the numerator, and spent UTXOs (aka elements needing removal from the set) are multiplied into the denominator. The inverse operation is only performed when a final hash is needed.

17:18

<michaelfolkson> That is very quick typing stacie

17:18

<@jnewbery> jesseposner willcl_ark stacie: exactly right!

17:18

* michaelfolkson needs time to read

17:19

<lightlike> In the python implementation, the numerator and denominator are part of the internal state, while here, it seems that the user is responsible for keeping track of them. Why the difference?

17:19

<jesseposner> I was wondering about that as well

17:19

<glozow> lightlike what do you mean user is responsible for keeping track?

17:19

<@jnewbery> lightlike: what do you mean by the user keeping track of them? The user just passes objects to add to and remove from the set

17:19

<stacie> michaelfolkson haha I did (some) of my homework this time ;) but I'm reaching the end of my answer sheet :P

17:20

<jesseposner> the numerator/denominator logic seems like it maps to the limb arithmetic

17:21

<lightlike> glozow: i thought the user would have a local var for both, and multiply the right one for adding/removing, and only doing division at the end - like it is done in crypto_test.cpp unit test.

17:21

<sipa> i think the reason is just that in python the bignum arithmetic is trivial, and doesn't deserve a class on its own - it's just a number

17:21

<willcl_ark> I would also say I was surprised to see the optimisation left on the table for follow up; was it much harder to implement?

17:21

<@jnewbery> jesseposner: The limb arithmetic is to implement wide (3072 bit) integers. Python allows you to do arithmetic on arbitrarily large ints, so it's not needed in the Python implementation

17:21

<felixweis> my guess is limb arithmetic is used because unlike python it doesn't do infinitly large integers

17:22

<sipa> so the data structure there is a higher-level thing doing something more high-level

17:22

<jesseposner> jnewbery: ah that makes sense, thanks!

17:22

<sipa> in the C++ code, MuHash3072 is really just a specialized 3072-bit bignum implementation

17:22

<@jnewbery> lightlike: That's just for the unit test. In normal usage, the client code would just pass objects to the MuHash object to add and subtract from the set

17:22

<sipa> if we'd want you could write a higher-level wrapper around that for the actual multiset hash scheme

17:23

<@jnewbery> Next question: How can we test for membership in the Muhash set?

17:23

<glozow> nope

17:23

<willcl_ark> You can't?

17:23

<michaelfolkson> Trick question

17:23

<jesseposner> We can't. In fact, it's not a set, but rather a hash representing a set: https://github.com/bitcoin/bitcoin/pull/19105/files#r438847029

17:23

<@jnewbery> very good. None of you fell for the trick question

17:23

<sipa> it's trivial, you recompute it from the entire set ;)

17:23

<jesseposner> :-)

17:23

<lightlike> jnewbery: I thought the user would use the "/" operator as rarely as possible because it involves a costly "Inverse" call. instead, they would keep a local running variable for the denominator, and only divide numerator/denominator at the end.

17:24

<michaelfolkson> Though sipa did say in the mailing list post that it doesn't support compact proofs of existence

17:24

<sipa> Here's why to do if you want to compactly test for membership in MuHash: lie down, cry, cry a lot

17:24

<@jnewbery> yes, this is a hashed representation of a set, not an accumulator

17:24

<felixweis> so for c++ we do the combination of numerator and denominator later outside of 2 MuHash3072 instances

17:25

<@jnewbery> lightlike: that'd be quite an imposition to place on the client code I think

17:25

<lightlike> jnewbery: but otherwise, if we'd call "/" immediately, it seem really inefficient

17:25

<sipa> that's what the C++ code does, though?

17:26

<sipa> let the user maintain numerator and denominator explicitly

17:26

<@jnewbery> It'd be trivial to enhance the implementation to store both the numerator and denominator

17:26

<fjahr> I am doing it in the implementation of the index right now, yes

17:26

<@jnewbery> sipa: oh! Is that how it's supposed to be used?

17:26

<sipa> jnewbery: yes

17:26

<willcl_ark> lightbulb.gif

17:27

<sipa> it would be totally reasonable to provide a higher-level class that does that for you

17:27

<sipa> matching the Python code

17:27

<@jnewbery> lightlike: I apologise. You were right. I haven't looked at how this code is integrated into the UTXO index

17:27

<sipa> the MuHash3072 class in C++ is *just* a specialized 3072-bit bignum implementation; it's not really the full set hash scheme (as it excludes the ChaCha20 and SHA256 too)

17:28

<sipa> ah no, it does the ChaCha20'ing, but not the hashing

17:29

<willcl_ark> a little confusingly-named if it's not the full implementation, but makes a lot of sense now

17:29

<sipa> it may make sense to refactor that

17:29

<sipa> into a Num3072 class that's just the numbers, and MuHash3072 that does num/denom/chacha20/sha256

17:29

<willcl_ark> what would you call the higher-level class otherwise

17:29

<willcl_ark> I think that would be nice

17:29

<gentile> agreed

17:30

<@jnewbery> sipa: Don't we already have that? Num3072 is the bignum class

17:30

<jesseposner> +1

17:30

<sipa> jnewbery: eh, it's just the storage without operations

17:30

<sipa> but good point

17:31

<@jnewbery> so you're saying move all the functions into class methods in Num3072?

17:31

<PatrickLemke> sipa Is there a specific reason why you implemented your own bignum support for C++?

17:31

<sipa> though in some cases you don't actually need the num/denom trick (if you're computing the muhash3072 from the utxo set in full, you're never deleting)

17:31

<sipa> PatrickLemke: avoiding a dependency for something that's possibly consensus critical at some point

17:32

<@jnewbery> we have quite a lot of questions to get through, so let's keep moving. What public methods does the MuHash3072 object expose to clients? What is the Span<> class template that’s used in some of those public methods?

17:32

<willcl_ark> The *= and /= operators

17:32

<jesseposner> *=, /=, Finalize

17:32

<nehan> Span is sort of like a slice in Go. It points to some range in another datastructure

17:33

<nehan> hi

17:33

<felixweis> i wasn't quite sure about the span because the INPUT_SIZE is supposedly fixed

17:33

<@jnewbery> hi nehan!

17:33

<felixweis> makes sense if we hash in the instanciation tho

17:33

<nehan> i wasn't clear on why Span either

17:33

<@jnewbery> yes, we have multiplication and division operators, a constructor and a Finalize() function

17:34

<willcl_ark> A span is just a template for a sequence of value of the same type I think

17:34

<@jnewbery> We talked about Spans in a previous review club meeting: https://bitcoincore.reviews/18468

17:34

<sipa> willcl_ark: it's not a template

17:34

<sipa> it's a way of passing a (pointer to array, length of array) conveniently, so it works with any container that stores sequential elements of that type

17:35

<@jnewbery> they're a lightwight, non-owning reference to some contiguous sequence of objects (bytes in this case)

17:35

<sipa> but it's a concrete data type, not a way of just making your function templated in the type of the passed object (which is an alternative that also works0

17:35

<jonatack> A Span represents a vector-like view to a range of contiguous elements in memory analogous to std::span in C++20

17:35

<jonatack> (one more definition ;)

17:36

<@jnewbery> Next question. What does the #ifdef HAVE___INT128 code in muhash.h do?

17:36

<jonatack> they're becoming quite ubiquitous in the codebase

17:36

<glozow> HAVE___INT128 = whether your system supports 128b integers?

17:36

<willcl_ark> If you have BigInt support, you can use MuHash with fewer, larger limbs to make up your 3072 bits. Does this require fewer inverse/multiplications when computing the hash?

17:37

<@jnewbery> willcl_ark: not fewer inverse/multiplications, but each inverse/multiplication requires fewer operations

17:37

<glozow> I imagine bigger integers = fewer operations = more optimized ?

17:37

<@jnewbery> glozow: right

17:37

<jesseposner> the HAVE___INT128 branch is optimized for 64-bit multiplication hardware (https://github.com/bitcoin/bitcoin/pull/19055/files#r507238291)

17:37

<willcl_ark> jnewbery: ah right, thanks

17:37

<glozow> How would one test consistency between HAVE___INT128 and !HAVE___INT128 platforms?

17:38

<sipa> willcl_ark: not fewer inverses, but each inverse (and 3072-bit multiplication) consists of 4x fewer limb multiplications if they're twice as big

17:38

<felixweis> https://www.youtube.com/watch?v=FRkJCvHWdwQ from 45:25 has a good short overview of <span> in c++20. Span behaves similar

17:39

<@jnewbery> fjahr helpfully did some benchmarking of the performance difference between using int128 and not: https://github.com/bitcoin/bitcoin/pull/19055#discussion_r508093977

17:39

<syrex> Which C++ standard is used?

17:39

<@jnewbery> glozow: very good question. What do people think?

17:39

<willcl_ark> That's a decent speedup

17:40

<nehan> well you need the hardware, but presumably you could #define it off and compare the results in a test?

17:40

<sipa> syrex: bitcoin core master is C++11 right now, but we'll transition to C++17 probably in the next few weeks (after 0.21 branch off)

17:40

<jonatack> syrex: c++11, migration to 17 is planned during the next release or so.

17:41

<@jnewbery> nehan: yeah, that sounds sensible. It'd be quite nice to do it automatically though

17:41

<@jnewbery> 4. How is a MuHash3072 object constructed and initialized? What happens if the ChaCha20 output is larger than the group order?

17:41

<willcl_ark> glozow: perhaps you exercise each mathmatical operation with and without it using some edge-sized values?

17:42

<michaelfolkson> You take the output modulo the group order..?

17:43

<willcl_ark> Looks like it must be initialised with a Span of type `unsigned char` containing one 32B key otherwise `assert(key32.size() == INPUT_SIZE)` will trigger

17:43

<sipa> michaelfolkson: you're confused; there is no group here (except the multiplicative group we're working in)

17:43

<@jnewbery> michaelfolkson: that's a good guess but it's not the right answer

17:44

<sipa> it's just taken modulo the modulus

17:44

<sipa> that modulus is not the order of some other group

17:44

<@jnewbery> sipa: I don't understand

17:45

<sipa> MuHash3072 is doing arithmetic in the field of integers modulo 2^3072 - 1103717

17:45

<sipa> in particular, the multiplication subgroup of that field is what we use

17:46

<sipa> but 2^3072 - 1103717 isn't the order of some other group (like scalar arithmetic is arithmetic modulo the order of an EC group)

17:46

<sipa> it's just a constant we've chosen

17:46

<jesseposner> This comment refers to the "order of the group" (perhaps it should be revised?): https://github.com/bitcoin/bitcoin/pull/19055/files#diff-ccad840af4d1bda6dda986297fdd142a8cf433cd4ab4222eea20fe1fd229a158R16

17:47

<sipa> no, that's correct

17:47

<sipa> actually, it's not

17:47

* Murch is waiting for sipa's computing process to halt

17:47

<sipa> it is the order of the additive group of integers modulus that number, trivially

17:48

<sipa> but that additive group isn't relevant to us here; we only multiply and divide

17:48

<sipa> it should just say "is chosen as modulus"

17:48

<fjahr> noted :)

17:48

<@jnewbery> ah! I think that's what's confused me

17:48

<sipa> jesseposner: thanks for pointing that out, i guess past-me made the same mistake

17:49

<lightlike> hmm, where exactly is the modulus taken? No "%" exists in all of muhash.cpp

17:49

<sipa> lightlike: thankfully! modulus operations are slow

17:49

<@jnewbery> ok, so let me revise my question. What happens if the output of the chacha20 is greater than the modulus?

17:49

<sipa> lightlike: like 100x slower than a multiplication

17:50

<sipa> lightlike: and the modulus is computed inside the Multiply and Square functions, simultaneously with the multiplication

17:50

<lightlike> sipa: ah, thanks!

17:51

<gentile> 9 minutes left

17:51

<@jnewbery> right, so if the output of the chacha20 is greater than the modulus, we don't do anything, because we reduce in the Finalize() function

17:51

<sipa> lightlike: there is a name for this technique, but if you want to compute (x mod (2^N - C)), where x is up to 2N bits, you can observe that it's equal to (x_low) + (x_high * C), where _low and _high are the bottom and upper half of x

17:52

<@jnewbery> Next question: The Finalize() method has a comment “Does not change this object’s value.” Why is the function not marked const?

17:52

<felixweis> I probably have to read this 2^N times to understand it...

17:53

<glozow> Wale, the effective value is never changed, but Finalize might call FullReduce() in case IsOverflow(), which mutates the Num3072 object?

17:53

<@jnewbery> glozow: exactly right!

17:53

<sipa> felixweis: what is 1437 mod 99?

17:53

<sipa> quick

17:54

<Murch> 51?

17:54

<sipa> indeed, why?

17:54

<Murch> 37+14

17:54

<sipa> bingo; 1400 mod 99 = 14

17:54

<glozow> 🤯

17:54

<sipa> what is 1437 mod 97?

17:54

<@jnewbery> FullReduce() doesn't change the value modulo 2^3072 - 1103717, but it may change the internal Num3072 representation

17:55

<Murch> 79

17:55

<sipa> yup, why?

17:55

<glozow> i needed some time to multiply 14*3 tho

17:55

<sipa> :D

17:55

<Murch> 3*14+37 < 97

17:55

<sipa> exactly, it's the same here: by having a modulus close to a power of two, we can reduce by multiplying the top half by C, and adding to the lower half

17:56

<@jnewbery> ok, we're almost out of time, so I'm going to skip to the last question. How is this new code tested? Can you think of other ways that it could be tested?

17:56

<felixweis> interesting

17:56

<felixweis> TIL

17:57

<glozow> we can test by having sipa do it in his head and compare with the code's results

17:57

<buzz08> :-D

17:57

<sipa> or use the python code

17:57

<sipa> it's faster

17:58

<@jnewbery> sipa: good answer!

17:58

<michaelfolkson> There are some unit test cases and some fuzzing.

17:58

<buzz08> sipa: good comeback

17:58

<@jnewbery> I think it'd be really cool if we integrated something like this: https://github.com/bitcoin/bitcoin/pull/19841#issuecomment-687667841

17:59

<@jnewbery> Where the python implementation and c++ implementation are tested against the same random input and checked that they arrive at the same result

17:59

<buzz08> a functional test case with sample I/O values ?

17:59

<Murch> jnewbery: Of course that only proves that they are coming to the same result, not that the result is correct ;)

18:00

<willcl_ark> Murch: we only need consensus though, right :P

18:00

<@jnewbery> buzz08: yeah, we do have test vectors, but it's also nice to have more random coverage

18:00

<Murch> willcl_ark: touche

18:00

<@jnewbery> ok, that's time. Thanks everyone! Hope you enjoyed looking at some lower level code this week

18:00

<sipa> if there is a bug in this code, it'll likely result in different results on 32-bit and 64-bit code

18:00

<willcl_ark> thanks jnewbery, sipa!

18:01

<stacie> thanks for hosting jnewbery!

18:01

<felixweis> great thanks jnewbery for hosting :-)

18:01

<glozow> thanks jnewbery!

18:01

<buzz08> great info guys, thanks to all

18:01

<nehan> thanks!

18:01

<blueskies> this was fascinating. thank you all!

18:01

<fjahr> thanks jnewbery :)

18:01

<thomasb06> (order of a group G, say o(G), implies that for all g in G, g * g * ... * g done o(G) times always gives the neutral element of G, say 1 with multiplicative notation. But by definition, the order of a group is the number of element of the group. The g * g * ... * g = 1 is a consequence. For example the group of symmetries that keep a cube invariant is of order 48, that is there are 48 symmetries in the group. Bu

18:01

<thomasb06> it takes g * g * ... * g done 48 times to get the neutral element, here the identity transformation. If you consider only the rotational symemetries, the group is of order 24 only.)

18:01

<lightlike> thanks!

18:01

<willcl_ark> Who is compiling the review-club compendium of factoids anyway

18:02

<thomasb06> thanks jnewberry

18:02

<michaelfolkson> "do you plan to add tests using the just-added Python MuHash3072 implementation?" jonatack

18:02

<gentile> thanks jnewbery

18:02

<elle> thanks!

18:02

<sipa> thomasb06: groups can be larger than their order

18:03

<michaelfolkson> Thanks jnewbery

18:03

<jonatack> michaelfolkson: is that from my review last May?

18:03

<fjahr> michaelfolkson: they are there in the follow-up pr

18:03

<michaelfolkson> Yup

18:03

<michaelfolkson> Ah nice fjahr

18:03

<sipa> thomasb06: the addition group of GF(2^2) has order 2, but size 4

18:03

<emzy> Tnx anyone

18:03

<jonatack> thanks jnewbery and fjahr, I'll re-review completely after branch-off

18:04

<jonatack> I have a fun favor to ask while you are here: if you haven't seen it yet, check out this tweet by Murch https://twitter.com/murchandamus/status/1318898781618917378

18:04

<jonatack> and cast your vote in these two twitter polls about bitcoin feerate units https://twitter.com/jonatack/status/1318890833131823104

18:05

<thomasb06> sipa well, as far as I remember, the order was the number of element when talking of a group

18:05

<michaelfolkson> Surely for users it has to be sat/VB... (don't want to swing poll)

18:05

<Murch> jonatack: So far people seem to be fairly unambiguous

18:05

<thomasb06> (my computer is swaping)

18:06

<jonatack> Murch: the first poll, definitely

18:08

<jonatack> michaelfolkson: seems so but i'm unsure how much of that preference is habit or unfamiliarity. for now proceeding with sat/vB

18:09

<michaelfolkson> Right thomasb06

18:10

<michaelfolkson> https://en.wikipedia.org/wiki/Order_(group_theory)

18:12

<thomasb06> hehe, the memories were not bad. It starts to be far though...

18:16

<michaelfolkson> I've been flailing around in EC world too recently. Have to remember which world you are operating in

18:18

<glozow> wait so group order != size?

18:19

<thomasb06> glozow: when talking of the set, no

18:19

<thomasb06> but in the case of sepc256k1, the group is cyclic so it's the same

18:20

<thomasb06> G = <g>

18:20

<thomasb06> so o(G) = o(g)

18:22

<thomasb06> michaelfolkson: yes, the ECs are defined over finite fields, it's always confusing

18:24

<glozow> the group = the underlying set + the binary operation, in this case mod multiplication?

18:26

<thomasb06> glozow: leave the mod multiplication, takes the symmetries that keep a cube invariant. It's a simple group

18:26

<thomasb06> *take

18:30

<sipa> glozow: the group we're working over here is the multiplication modulo 2^3072-1103717

18:30

<sipa> the _order_ of that group is (2^3072-1103717)-1, as it excludes the 0 element

18:30

<sipa> (which is not considered part of the multiplicative group)

18:31

<sipa> so the integers modulo P=2^3072-1103717 has P elements, but only P-1 of those participate in the multiplicative group

18:32

<sipa> that also means that if you take any number (except 0), and raise it to the power P-1, modulo P, you get 1

18:32

<sipa> which is the identity in that group

18:32

<sipa> as it's a cyclic group, its size is equal to its order, but that order is _not_ the modulus

18:39

<glozow> sipa that makes sense, thank you

19:04

<jesseposner> So if I'm getting this right, the modulus, 2^3072 - 1103717 (the largest 3072-bit safe prime number), is used to define a finite field. The order of that finite field is the modulus. However, the non-zero elements of the finite field form a multiplicative group, and thus the order of the group is (2^3072-1103717)-1 because it excludes the 0 element.

19:06

<thomasb06> jesseposner: it sounds good but I don't know what a modulus is... The number of elements?

19:07

<jesseposner> It is equal to the number of elements in the finite field, but is not equal to the number of elements in the group (because the group excludes the zero element of the field).

19:07

<thomasb06> jesseposner: order is for groups rather, so a finite field has a modulus but no order

19:08

<jesseposner> I believe both a group and a finite field have an order.

19:08

<thomasb06> then, the field is of modulus 2^3072 - 1103717

19:08

<jesseposner> yes

19:09

<thomasb06> and the invertibles are a group of order 2^3072 - 1103717 - 1

19:09

<thomasb06> (not used to the order term for fields though)

19:10

<jesseposner> I believe so

19:11

<sipa> jesseposner: yes, "order of a field" is really "order of its additive group"

19:11

<sipa> jesseposner: and the field we have here has an additive group (which we don't use!) of order MODULUS, and a multiplicative group of order MODULUS-1

19:11

<sipa> that is the case only because MODULUS is a prime

19:12

<sipa> if MODULUS was not a prime, then (integers mod MODULUS) would not be a field (it would be a ring instead), and its multiplicative group would have an order less than MODULUS-1

19:12

<jesseposner> ah, interesting!

19:13

<sipa> this is used in RSA, for example, where the ring (integers modulo p*q) is used; that ring has additive group of order p*q, but multiplicative group of order (p-1)*(q-1)

19:13

<sipa> and security relies on attackers not being able to figure out the order (p-1)*(q-1) given just p*q

19:13

<sipa> (obviously, if you have p and q individually, that's trivial, but they don't)

19:15

<sipa> the fact that the multiplicative group for us has order MODULUS-1 is used to compute inverses, btw: if x^(MODULUS-1)=1 for all x != 0, then x^(MODULUS-2) must be x^-1

19:15

<sipa> just divide both sides by x