BIP 350: Implement Bech32m and use it for v1+ segwit addresses (utils/log/libs)

https://github.com/bitcoin/bitcoin/pull/20861

Host: glozow  -  PR author: sipa

The PR branch HEAD was 835ff6b at the time of this review club meeting.

In this PR Review Club meeting, we’ll discuss BIP350 and Bech32m.

Notes

  • An invoice address (aka output address, public address, or just address), not to be confused with a public key, IP address, or P2P Addr message, is a string of characters that represents the destination for a Bitcoin transaction. Since users copy or transcribe these addresses when sending bitcoins, and an incorrect address can result in unspendable coins, addresses include checksums to help detect human errors such as missing characters, swapped characters, mistaking a q for a 9, etc.

  • Bech32 was introduced in BIP173 as a new standard for native segwit output addresses. For more background on Bech32, this video describes Bech32 checksums and their error correction properties.

  • Bech32 had an unexpected weakness, leading to the development of Bech32m, described in BIP350.

  • PR #20861 implements BIP350 Bech32m addresses for all segwit outputs with version 1 or higher. Note that such outputs are not currently supported by mainnet so this does not pose a compatibility problem for current users. It intentionally breaks forward compatibility for future software to prevent accidentally sending to an unspendable v1 output.
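The weakness mentioned in the notes can be reproduced with a short Python sketch. It follows the shape of the BIP173/BIP350 reference code (checksum over GF(32), final constant 1 for Bech32 and 0x2bc830a3 for Bech32m), but the helper names (`encode`, `is_valid`, `string_ending_in_p`, `insert_q`) and the "test" human-readable part are assumptions made for illustration, and the input validation a real decoder needs is omitted.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
BECH32_CONST = 1
BECH32M_CONST = 0x2bc830a3
GENERATOR = [0x3b6a57b2, 0x26508e6d, 0x1ea119fa, 0x3d4233dd, 0x2a1462b3]


def polymod(values):
    """Checksum remainder over GF(32), as in the BIP173 reference code."""
    chk = 1
    for value in values:
        top = chk >> 25
        chk = ((chk & 0x1ffffff) << 5) ^ value
        for i in range(5):
            if (top >> i) & 1:
                chk ^= GENERATOR[i]
    return chk


def hrp_expand(hrp):
    """Spread the human-readable part into 5-bit values."""
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]


def encode(hrp, data, const):
    """Append the 6-symbol checksum for the given final constant."""
    mod = polymod(hrp_expand(hrp) + data + [0] * 6) ^ const
    checksum = [(mod >> 5 * (5 - i)) & 31 for i in range(6)]
    return hrp + "1" + "".join(CHARSET[d] for d in data + checksum)


def is_valid(string, const):
    """Checksum check only; assumes well-formed lowercase input."""
    hrp, _, rest = string.rpartition("1")
    data = [CHARSET.index(c) for c in rest]
    return polymod(hrp_expand(hrp) + data) == const


def string_ending_in_p(hrp, prefix, const):
    """Vary the last data symbol until the encoded string ends in 'p'."""
    for last in range(32):
        candidate = encode(hrp, prefix + [last], const)
        if candidate.endswith("p"):
            return candidate
    raise ValueError("no candidate found")


def insert_q(string):
    """Insert a 'q' immediately before the final character."""
    return string[:-1] + "q" + string[-1]


old = string_ending_in_p("test", [0, 1, 2], BECH32_CONST)
new = string_ending_in_p("test", [0, 1, 2], BECH32M_CONST)
print(is_valid(insert_q(old), BECH32_CONST))    # Bech32 still accepts it
print(is_valid(insert_q(new), BECH32M_CONST))   # Bech32m rejects it
```

Under these assumptions the first check should print True and the second False: inserting a q before a trailing p maps p(x) to x·(p(x) - 1) + 1, which leaves a remainder of 1 unchanged but not the Bech32m constant.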

Questions

  1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK? What was your review approach?

  2. Can you describe the length extension mutation issue found in Bech32? Does it affect Bitcoin addresses? Why or why not?

  3. How does Bech32m solve this length extension mutation issue?

  4. Which addresses will be encoded using Bech32, and which ones with Bech32m? How does this affect the compatibility of existing software clients?

  5. What are the three components of a Bech32m address encoding?

  6. How does Decode check whether an address is encoded as Bech32 or Bech32m? Can a string be valid in both formats?

  7. The space in this test string is not an accident. What does it test?

  8. For fun: Is Bech32 case-sensitive? (Hint: Why is “A12UEL5L” valid but “A12uEL5L” not?)
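For questions 6 to 8, a minimal Python sketch of a decoder may help. It is modelled on the BIP350 reference code rather than on the C++ in the PR, and the `Encoding` enum and `decode` names here are illustrative assumptions. It shows the three pieces of a string (human-readable part, separator "1", data plus 6-symbol checksum), the single checksum computation whose result selects the encoding, the mixed-case rule, and the rejection of characters such as a space.

```python
from enum import Enum

CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
GENERATOR = [0x3b6a57b2, 0x26508e6d, 0x1ea119fa, 0x3d4233dd, 0x2a1462b3]


class Encoding(Enum):
    BECH32 = 1             # final checksum constant from BIP173
    BECH32M = 0x2bc830a3   # final checksum constant from BIP350


def polymod(values):
    chk = 1
    for value in values:
        top = chk >> 25
        chk = ((chk & 0x1ffffff) << 5) ^ value
        for i in range(5):
            if (top >> i) & 1:
                chk ^= GENERATOR[i]
    return chk


def hrp_expand(hrp):
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]


def decode(string):
    """Return (encoding, hrp, data) or None if the string is invalid."""
    # Only printable US-ASCII 33..126 is allowed, so a space (0x20) fails here.
    if any(ord(c) < 33 or ord(c) > 126 for c in string):
        return None
    # All-lowercase and all-uppercase are fine; mixed case is not.
    if string.lower() != string and string.upper() != string:
        return None
    string = string.lower()
    pos = string.rfind("1")
    if pos < 1 or pos + 7 > len(string) or len(string) > 90:
        return None
    hrp, rest = string[:pos], string[pos + 1:]
    if any(c not in CHARSET for c in rest):
        return None
    data = [CHARSET.index(c) for c in rest]
    # One computation; the result can equal at most one of the two constants.
    check = polymod(hrp_expand(hrp) + data)
    for encoding in Encoding:
        if check == encoding.value:
            return encoding, hrp, data[:-6]
    return None


print(decode("A12UEL5L"))   # uppercase Bech32 string from question 8
print(decode("A12uEL5L"))   # mixed case
print(decode("A1LQFN3A"))   # Bech32m test vector from BIP350
print(decode(" 1nwldj5"))   # leading space in the human-readable part
```

Because the checksum is computed on the lowercased string, an all-uppercase string decodes to the same result as its lowercase form, while any mixed-case string is rejected before the checksum is even evaluated.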

Meeting Log

  18:00 <glozow> #startmeeting
  18:00 <jnewbery> hi
  18:00 <glozow> Welcome to PR Review Club everyone!!!
  18:00 <amiti> hi!
  18:00 <maqusat> hi
  18:00 <AnthonyRonning> hi
  18:00 <glozow> Anyone here for the first time?
  18:00 <michaelfolkson> hi
  18:00 <pinheadmz> wuddup
  18:00 <willcl_ark_> hi
  18:00 <lightlike> hi
  18:00 <AsILayHodling> hi
  18:00 <b10c> hi
  18:00 <glozow> Today, we're looking at #20861 BIP 350: Implement Bech32m and use it for v1+ segwit addresses
  18:00 <glozow> Notes: https://bitcoincore.reviews/20861
  18:00 <glozow> PR: https://github.com/bitcoin/bitcoin/pull/20861
  18:00 <cguida> hi
  18:00 <cguida> my first time
  18:01 <glozow> Welcome cguida! :)
  18:01 <AnthonyRonning> cguida: welcome!
  18:01 <cguida> thanks! :)
  18:01 <glozow> Did y'all get a chance to review the PR and/or BIPs? What was your review approach?
  18:01 <cguida> Didn't get to running the code yet, but did some reading
  18:02 <jnewbery> 0.2y
  18:02 <glozow> link to BIP350: https://github.com/bitcoin/bips/blob/master/bip-0350.mediawiki
  18:02 <AnthonyRonning> browsed a bit, not familiar with the concept at all yet
  18:02 <amiti> mostly just looked through the review club notes & relevant sections in bips / code, didn't do a proper review.
  18:02 <nehan> hi
  18:02 <michaelfolkson> I'd Concept ACK, Approach ACKed a while ago. So looking at code, running tests etc
  18:03 <maqusat> just had time to glance over
  18:03 <sipa> hi
  18:03 <pinheadmz> read bip and ML posts, havent tried code yet
  18:03 <b10c> looked over the BIP and the reviews page
  18:03 <emzy> hi
  18:03 <glozow> Alrighty, maybe we could start with a light conceptual question: what is Bech32 used for exactly?
  18:03 <pinheadmz> encoding data with error correction!
  18:03 <pinheadmz> using a set of 32 characters
  18:04 <glozow> pinheadmz: yes! what are we encoding, in the context of Bitcoin?
  18:04 <pinheadmz> ok, segwit addresses
  18:04 <pinheadmz> a segwit version followed by some amount of data
  18:04 <jnewbery> *error detection and correction
  18:04 <pinheadmz> could be a public key hash, script hash or in the case of taproot, a bare public key
  18:04 <b10c> addresses, but 'invoice' addresses and not IP addresses etc
  18:04 <cguida> with a focus on character transcription errors
  18:04 <sipa> (but despite supporting error correction, you should absolutely never do that - if you detect errors, you should tell the user to go ask for the real address again)
  18:04 <pinheadmz> jnewbery thank u
  18:05 <jonatack> hi
  18:05 <glozow> ok so how important is error detection here, on the scale of meh to we-could-lose-coins?
  18:05 <cguida> and simplifying display in qr codes!
  18:05 <jnewbery> right, for sending to an address we shouldn't do error correction
  18:05 <nehan> we-could-lose-coins
  18:05 <pinheadmz> glozow you could be sending bitcoin to the wrong person or to an unrecoverable key if you mess up!
  18:05 <eoin> I'm a newb and don't know C++ or Python, how should I proceed?
  18:05 <glozow> pinheadmz: yeah! so the error detection is key here :)
  18:05 <schmidty> hi
  18:06 <pinheadmz> eoin start in english? https://github.com/bitcoin/bips/blob/master/bip-0350.mediawiki
  18:06 <michaelfolkson> glozow nehan: Depending on whether it is a character or two correction chances of losing coins is veeeery low
  18:06 <AnthonyRonning> human readability is another aspect of bech32 as well, right?
  18:06 <cguida> we-could-lose-coins, because the error could be a valid address
  18:06 <cguida> with low probability
  18:06 <jnewbery> eoin: welcome! Follow along as best you can. There are some good resources for newcomers here: https://bitcoincore.reviews/#other-resources-for-new-contributors
  18:06 <jonatack> good resources also at: https://bitcoinops.org/en/topics/bech32/
  18:06 <michaelfolkson> cguida: Depending on how many characters are being corrected
  18:06 <nehan> glozow: i thought we were talking about error correction generally? or do you mean specifically in bech32 addresses?
  18:06 <pinheadmz> the fun parts of bech32 to me are how characters are arranged by possible visual mistake i.e. v and w
  18:07 <jonatack> optech topics are a "good first stop for info"
  18:07 <cguida> michaelfolkson: yes
  18:07 <pinheadmz> as if someone was reading a bitcoin address and typing it in manually
  18:07 <nehan> *error detection
  18:07 <sipa> michaelfolkson: if you do error correction the probability of sending to the wrong address goes up spectacularly; correction only works if you make up to 2 errors (with restrictions on what those errors are); if you make more, it is very likely that error correction will "correct" to the wrong thing
  18:07 <cguida> eoin: python is easy to get started with, send me a message if you'd like some resources
  18:07 <glozow> nehan: error detection generally, yes, I want to make sure we're all clear that it's a key goal here
  18:07 <jnewbery> I think we should all just pretend that error *correction* is not a thing for the purposes of this conversation
  18:07 <jnewbery> and just focus on error *detection*
  18:08 <pinheadmz> sure but it is cool :-)
  18:08 <pinheadmz> bech32 can fix up to 3 (?) mistakes
  18:08 <sipa> 2
  18:08 <pinheadmz> ty
  18:08 <nehan> pinheadmz: i don't think that's true!
  18:08 <glozow> Ok I think we're on the same page :) Next question is a little harder: Can you describe the length extension mutation issue found in Bech32?
  18:09 <michaelfolkson> The probabilities are listed somewhere I think... maybe in sipa's SF Bitcoin Devs slides
  18:09 <amiti> if the address ends with a p, you can insert or delete q characters right before & it won't invalidate the checksum
  18:09 <nehan> checksum for <addr>p = <addr>qqqqqp
  18:09 <cguida> sipa: right, a perhaps overengineered approach would be to present the 2 or 3 closest correction strings to the user? haha
  18:09 <glozow> amiti: nehan: correct!
  18:10 <pinheadmz> my understanding is that the bech32 data represents a polynomial, and since x^0 = 1, you can add a bunch of extra 0's at the end of a bech32 address and its just like (checksum * 1 * 1 * 1...) so it remains valid
  18:10 <glozow> can anyone tell us why this is the case?
  18:10 <sipa> cguida: the BIP says you cannot do more than point out likely positions of errors
  18:10 <pinheadmz> or rather data * 1 * 1 * 1... so the checksum doesnt change
  18:11 <tkc> cguida: I would be interested in those beginner resources also. This is not the topic for today obviously, but how to connect with you outside this?
  18:11 <cguida> tkc eoin just send me a dm here on irc
  18:11 <glozow> pinheadmz: nice! could you tell us how we get from a string to a polynomial?
  18:12 <cguida> pinheadz: ohhh
  18:12 <cguida> pinheadmz*
  18:12 <pinheadmz> not... really..... but theres this chart: https://github.com/bitcoin/bips/blob/master/bip-0173.mediawiki#bech32
  18:12 <pinheadmz> that maps characters to numbers
  18:12 <sipa> not sure i follow about the * 1 * 1 * 1
  18:12 <pinheadmz> sipa my understanding is pretty abstract i just barely kinda get it
  18:13 <pinheadmz> that since x^0 = 1, a bunch of 0s at the end ends up just multiplying something by 1
  18:13 <pinheadmz> which doesnt change the value
  18:13 <sipa> hmm, no
  18:13 <sipa> glozow: i can explain if you want
  18:13 <michaelfolkson> +1 :)
  18:14 <pinheadmz> +p
  18:14 <pinheadmz> (anyone get it?)
  18:14 <glozow> heh ok so, "z"=2, "p"=1 and "q"=0, so what polynomial do we get from "zqzp?"
  18:14 <cguida> the checksum for bech32 has a 1 multiplied in, bech32m uses something else
  18:14 <glozow> sipa: go for it :P
  18:14 <cguida> or xored in
  18:14 <nehan> glozow: i think you should do it and sipa can chime in :)
  18:14 <michaelfolkson> nehan: +1
  18:15 <sipa> if you translate the characters to polynomials, bech32 is essentially the equation code(x) mod g(x) = 1
  18:15 <glozow> jnewbery shared this earlier https://bitcoin.stackexchange.com/questions/91602/how-does-the-bech32-length-extension-mutation-weakness-work which has a good explanation
  18:15 <sipa> where code(x) is the polynomial corresponding to the data (incl checksum) of the bech32 string
  18:15 <sipa> and g(x) is a specific 6th degree constant
  18:16 <glozow> `g(x) = x^6 + 29x^5 + 22x^4 + 20x^3 + 21x^2 + 29x + 18`
  18:16 <pinheadmz> sipa what does 6th degree constant mean ?
  18:16 <sipa> pinheadmz: the exact polynomial glozow just gave
  18:16 <felixweis> polynomial of degree 6
  18:16 <glozow> degree 6 polynomial, same one used for every encoding
  18:16 <pinheadmz> sipa is that the value gmax crunched for a week on a super computer ?
  18:16 <sipa> it's constant, not as in 0th degree, but as in: it is a constant, everyone uses the same
  18:17 <nehan> pinheadmz: a "constant" polynomial means its coefficients are fixed, I think
  18:17 <glozow> constant as in `const` :P
  18:17 <sipa> pinheadmz: that one took way longer; we're talking bech32 here, not bech32m
  18:17 <pinheadmz> right i was referring to bech32
  18:17 <sipa> so, we can write that as code(x) = f(x) * g(x) + 1
  18:17 <pinheadmz> i understand bech32m also has a bruteforced constant
  18:17 <sipa> that's the definition of modulus
  18:18 <sipa> or: code(x) - 1 = f(x)*g(x)
  18:18 <glozow> so to answer my earlier question "z"=2, "p"=1 and "q"=0, so what polynomial do we get from "zqzp?"
  18:18 <glozow> it's `2x^3 + 0x^2 + 2x + 1` i.e. `2x^3 + 2x + 1`
  18:18 <sipa> indeed!
  18:19 <pinheadmz> sipa how is that a modulus? like, does it "wrap around"? bc its two polynomials being multiplied?
  18:19 <glozow> does everyone see how we got that?
  18:19 <sipa> pinheadmz: it's just like numbers
  18:19 <glozow> let me know if it's unclear and we can slow down
  18:19 <sipa> yes, it wraps around
  18:19 <pinheadmz> but number * number approaches infinity without wrapping
  18:19 <glozow> so that modulus 1 is there so that we can't trivially create a new valid string from an old one
  18:19 <sipa> it's in the degree instead of in number of digits here
  18:20 <sipa> once you go over 6th degree, it wraps around
  18:20 <nehan> pinheadmz: you might want to study group theory a little (abstract algebra). numbers are just examples; you can apply the concepts to sets of "things" as well
  18:20 <sipa> because you can subtract a bigger multiple of the modulus
  18:20 <nehan> pinheadmz: in this case, the set of things is a set of polynomials, and you can operate on them
  18:20 <cguida> glozow: I see how you got a polynomial from those inputs, but what's x in this case?
  18:20 <sipa> cguida: x is just a variable name
  18:21 <sipa> we never actually evaluate it in a specific value of x
  18:21 <sipa> we need one to write polynomials, that's it
  18:21 <glozow> cguida: you can think of polynomials as basically a vector of coefficients
  18:21 <cguida> ok so the x doesn't matter, just the coefficients?
  18:21 <glozow> helps to distinguish polynomials from polynomial functions
  18:22 <sipa> cguida: yeah, you can say zqzp is just [1,2,0,2] (we tend to write low powers first when representing as lists)
  18:22 <cguida> i'll need to play with it more i think
  18:23 <cguida> sipa: ok cool
  18:23 <sipa> but remember that when multiplying you need to think of them as polynomials
  18:23 <michaelfolkson> cguida: You might do algebra with say x, y, z without ever ascribing values to them. This way you are playing around with specific polynomials instead of x, y and z
  18:23 <b10c> Zx^3 + Qx^2 + Zx + P with Z=2, P=1 and Q=0 ==> `2x^3 + 0x^2 + 2x + 1`, right?
  18:23 <sipa> so!
  18:23 <glozow> ok so we have the condition for valid Bech32 being: if your string is represented as `p(x)`, you need `p(x) = f(x)*g(x) + 1` aka `p(x) mod g(x) = 1` to be true
  18:23 <nehan> how did you pick g(x)?
  18:23 <glozow> b10c: yep! exactly :)
  18:23 <sipa> nehan: many years of CPU time
  18:23 <felixweis> pinheadmz: can confirm what nehan said, I watched a few lectures on group theory & number theory in the past couple weeks. helped also with the understanding of last week's topic w.r.t. the magic behind minisketch
  18:24 <sipa> nehan: in 2017
  18:24 <nehan> sipa: what were you looking for?
  18:24 <sipa> nehan: read BIP173 :)
  18:24 <nehan> sipa: ok!
  18:24 <glozow> so what happens if your string ends with a "p," what's the constant term in your polynomial?
  18:25 <b10c> +0
  18:25 <pinheadmz> felixweis thanks i watched a few as well, can recco the Christof Paar series on youtube. still hard to grok that multiplying two things is the "definition of a modulus" :-)
  18:25 <sipa> pinheadmz: no
  18:25 <felixweis> also playing around and exploring stuff with sagemath
  18:25 <sipa> multiplication is multiplication
  18:25 <sipa> modulo is modulo
  18:25 <glozow> b10c: not quite, see the example you worked out?
  18:25 <cguida> It's 1?
  18:26 <glozow> cguida: bingo!
  18:26 <b10c> oh yeah, +1
  18:26 <glozow> b10c: :)
  18:26 <b10c> mixed up q and p
  18:26 <nehan> oh. for anyone else who was wondering, g(x) is GEN in bip173, i think, and is the basis of the code. I watched the talk so I recall what properties you were looking for from that.
  18:27 <sipa> pinheadmz: does this help? a polynomial mod 1 is always 0; a polynomial mod x is just its constant term; a polynomial mod x^2 is its bottom 2 terms (i.e. a*x + b)
  18:27 <glozow> okay so, if your polynomial `p(x)` ends with +1, `x⋅(p(x) - 1) + 1` also works
  18:27 <sipa> pinheadmz: for other examples, a polynomial mod m(x) is subtracting as many times m(x) from it as you can, until you end up with something of degree less than
  18:27 <sipa> m
  18:28 <pinheadmz> sipa that does help
  18:28 <sipa> what is 2x^2 + 3x + 2 mod x+1?
  18:28 <pinheadmz> but "code(x) = f(x) * g(x) + 1 --- that's the definition of modulus" ?
  18:28 <pinheadmz> sipa 3x+2 ?
  18:29 <glozow> so then, let's say your polynomial `p(x)` corresponds to string "zzp", what does `x*p(x)` correspond to?
  18:29 <sipa> pinheadmz: no, you subtracted x^2, that's not a multiple of x+1
  18:29 <cguida> glozow: by "works", you mean, solves the equation p(x)*g(x) = 1?
  18:29 <glozow> cguida: yes
  18:29 <pinheadmz> oh its just x+1 ?
  18:30 <sipa> pinheadmz: no
  18:30 <glozow> er, it solves `p(x) = f(x)*g(x) + 1` for some `f(x)`
  18:30 <glozow> but yes same idea
  18:30 <pinheadmz> sorry i can work it out later, math on IRC is making me sweat
  18:30 <sipa> first subtract 2x*(x+1), you get what?
  18:30 <michaelfolkson> x+2
  18:30 <sipa> indeed
  18:30 <cguida> whoops, yeah, i missed an f(x) haha
  18:30 <sipa> what is x+2 mod x+1?
  18:30 <pinheadmz> ok i see that michaelfolkson
  18:31 <michaelfolkson> 1
  18:31 <sipa> bingo
  18:31 <sipa> so 2x^2 + 3x + 2 mod x+1 = 1
  18:31 <michaelfolkson> Math is horrible until it clicks pinheadmz. Then it is beautiful ;)
  18:31 <glozow> okie we probably should move on, heh
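
The polynomial arithmetic from this part of the discussion can be tried out with a small Python sketch. The function names are made up for illustration, and the division uses ordinary integer coefficients like sipa's worked example; real Bech32 arithmetic happens over GF(32), where addition is XOR.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"


def to_coefficients(string):
    """Map Bech32 characters to coefficients, highest power first."""
    return [CHARSET.index(c) for c in string]


def poly_mod(p, m):
    """Remainder of p divided by a monic polynomial m.

    Both are lists of integer coefficients, highest power first.
    Repeatedly subtracts a multiple of m that cancels the leading term.
    """
    assert m[0] == 1, "sketch only handles monic divisors"
    remainder = list(p)
    while len(remainder) >= len(m):
        lead = remainder[0]
        for i in range(len(m)):
            remainder[i] -= lead * m[i]
        remainder.pop(0)  # leading coefficient is now zero
    return remainder


print(to_coefficients("zqzp"))        # coefficients of 2x^3 + 0x^2 + 2x + 1
print(poly_mod([2, 3, 2], [1, 1]))    # (2x^2 + 3x + 2) mod (x + 1)
```

The first step of `poly_mod` on 2x^2 + 3x + 2 subtracts 2x·(x+1) and leaves x + 2, and the second subtracts 1·(x+1) and leaves the constant 1, matching the steps worked through in the log.
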
  18:32 <glozow> How does Bech32m solve this length extension mutation issue?
  18:32 <cguida> new checksum constant!
  18:33 <glozow> cguida: yep!
  18:33 <sipa> nehan: indeed g(x) is the generator
  18:33 <cguida> i'm not sure why that fixes it, other than to guess that it's because it's much larger than 1, so it doesn't correspond to any of the letters
  18:34 <glozow> Imma just keep chugging along with the review club questions. Moving forward, which addresses will be encoded using Bech32, and which ones with Bech32m?
  18:35 <cguida> segwit v0 with bech32, subsequent versions bech32m
  18:35 <pinheadmz> segwit v0 keeps bech32, everything from here on out (starting with taproot, witness v1) will get bech32m
  18:35 <glozow> cguida: pinheadmz: correct!
  18:35 <sipa> cguida: the specific change doesn't work anymore, because to do the same, you'd need to (a) subtract the new constant (b) multiply by a power of x (c) add the constant again... if you work that out, you'll see that it requires changing many more characters, due to the new constant having many more nonzero coefficients
  18:35 <glozow> How does this affect the compatibility of existing software clients?
  18:35 <cguida> glozow: it doesn't!
  18:36 <cguida> hopefully haha
  18:36 <b10c> Does not affect it: v0 does not change and v1 likely doesn't exist yet
  18:36 <pinheadmz> existing, assuming no one has implemented taproot wallets yet using bech32 ...?
  18:36 <b10c> v1 clients*
  18:36 <sipa> pinheadmz: if they did, not for mainnet i hope!
  18:36 <michaelfolkson> pinheadmz: Assuming there are no problems with bech32m (which hopefully and most likely will be the case)
  18:36 <AnthonyRonning> so anyone that can send to a native segwit address can send to bech32m by default?
  18:36 <cguida> sipa: ahh cool, so it's sort of unpredictable what letters would need to change in order to keep the same checksum
  18:37 <pinheadmz> although sipa if i gave you a witness v1 bech32 address an old wallet would still be able to send to that address right?
  18:37 <cguida> sipa: and it would be multiple letters rather than just a single q
  18:37 <sipa> pinheadmz: yes, but also any miner could steal it
  18:37 <glozow> AnthonyRonning: they must, if it's v1+
  18:37 <pinheadmz> before activation yah
  18:38 <AnthonyRonning> glozow: cool, good to know!
  18:38 <pinheadmz> but after lockin, a wallet that doesnt know about bech32m would still work?
  18:38 <pinheadmz> just a version byte and data, assuming there were no actual length attacks against you
  18:38 <sipa> pinheadmz: yes, but nobody will be creating bech32 v1+ addresses
  18:38 <sipa> so that's not a concern
  18:38 <pinheadmz> ok
  18:39 <michaelfolkson> A wallet either recognizes SegWit v1 or it doesn't. bech32m is just encoding for SegWit v1 addresses
  18:39 <pinheadmz> well, i did send this one a few months ago https://blockstream.info/address/bc1pqyqszqgpqyqszqgpqyqszqgpqyqszqgpqyqszqgpqyqszqgpqyqs3wf0qm
  18:39 <pinheadmz> ;-)
  18:39 <glozow> Let's dive into code :) How does `Decode` check whether an address is encoded as Bech32 or Bech32m? Can a string be valid in both formats?
  18:39 <glozow> link to code: https://github.com/bitcoin/bitcoin/blob/835ff6b8568291870652ca0d33d934039e7b84a8/src/bech32.cpp#L168
  18:40 <sipa> cguida: yeah... though there could be many more or less similar types of mutations with different constants; the bech32m constant was chosen by searching through many patterns of classes of mutations, and picking one that prevents most
  18:40 <b10c> so would the current signet explorer does encode v1 addresses as v0: https://explorer.bc-2.jp/address/tb1p85lx6qpdvs4vlpjnhnexhqwmuetd7klc3dk4ggsmycrtc78n6nnqg2u5a8 would that break?
  18:40 <cguida> glozow: i wasn't clear on this. it appears to be something in "polymod"
  18:41 <b10c> - would*
  18:41 <michaelfolkson> glozow: A string cannot be valid in both formats. Just looking at the code
  18:41 <cguida> and i hear that it's impossible to have an address be both valid bech32 and bech32m
  18:41 <amiti> it can't be valid in both formats, you can xor with 1 / the new constant (`0x2bc830a3`) to see if you get the checksum
  18:42 <AnthonyRonning> wait so wallets/clients that do checksum checks before sending won't be able to send to a bech32m check until they update their encoding methods?
  18:42 <cguida> michaelfolkson: where do you see that in the code?
  18:42 <glozow> amiti: winner! yep, you basically check the mod and see which encoding it matches
  18:42 <AnthonyRonning> s/encoding/decoding
  18:43 <lightlike> it's in VerifyChecksum() - looks like you get the constant back
  18:43 <michaelfolkson> cguida: I just know that from other reading (BIP etc)
  18:43 <glozow> michaelfolkson: cguida: amiti: yes, the mod can't be both 1 and 0x2bc830a3
  18:43 <sipa> b10c: indeed, existing explorers show bech32 instead of bech32m for v1+... one reason why it'd be nice to get bip350 implemented and adopted soon *hint* *hint*
  18:44 <glozow> Next question, when I was reviewing the PR I found it peculiar that there was a space in this test: https://github.com/bitcoin/bitcoin/blob/835ff6b8568291870652ca0d33d934039e7b84a8/src/test/bech32_tests.cpp#L80
  18:44 <glozow> and then I realized it's not on accident ;)
  18:45 <glozow> so what's the space for?
  18:45 <michaelfolkson> sipa: In the case they didn't.... and Taproot was to activate... I guess just temporarily it would suck for SegWit v1 lookups. But they would probably implement it without any need for hints?!
  18:45 <cguida> glozow: would love to see what the proof of that is
  18:45 <lightlike> would it be possible (with a near-zero probability) that we want to decode a BECH32M, have a wrong checksum, but get back a valid BECH32 encoding instead of Encoding::INVALID?
  18:46 <michaelfolkson> sipa: I get it makes sense to be merged into Core soon though
  18:46 <nehan> glozow: space is not a valid character, right? but someone may copy/paste an address and get spaces
  18:46 <glozow> lightlike: I wondered this too :O maybe sipa has an answer?
  18:46 <sipa> lightlike: yes, but if that mismatches the expected code for the version number, it'll still be rejected
  18:46 <glozow> nehan: yep!
  18:47 <glozow> space is 0x20 in US-ASCII
  18:47 <glozow> which is not a valid character in the HRP
  18:47 <sipa> if you get a v0 with BECH32M: bad
  18:47 <michaelfolkson> I don't know what a space represents in base32 or bech32. Invalid character, so it gets ignored? Or causes an error?
  18:47 <sipa> michaelfolkson: invalid
  18:47 <glozow> michaelfolkson: it's invalid
  18:47 <nehan> michaelfolkson: error
  18:47 <sipa> but why is the test there then?
  18:48 <sipa> if you get v1+ with BECH32: bad
  18:48 <jnewbery> I was expecting to see a test for the same string without the space being valid
  18:48 <sipa> jnewbery: that'd be a good testcase too
  18:48 <nehan> jnewbery: that seems better!
  18:48 <sipa> why not both?
  18:49 <sipa> this test also does something useful :)
  18:49 <nehan> sipa: if the data-space is not a valid address, then it might be failing because of that, and not because of the space
  18:49 <cguida> to make sure an error is thrown when a space is included
  18:49 <nehan> but sure both!
  18:49 <sipa> nehan: yes, but this test tests something similar
  18:50 <sipa> both are trying to anticipate a particular mistake an implementer might make
  18:50 <sipa> yours is: implementer accepts the space but ignores it
  18:50 <glozow> it's particularly testing that the HRP can't have a space?
  18:50 <cguida> in case the address is sent in parts, or with newlines or something
  18:50 <b10c> why does L81 and L82 in the tests contain strings with "" in the middle? i.e. "\x7f""1g6xzxy" and "\x80""1vctc34",
  18:50 <sipa> glozow: yes, but in combination with something else
  18:51 <sipa> b10c: that's just how you add unprintable characters inside a string
  18:51 <sipa> glozow: i think here it's assuming the implementer treats the space as a valid HRP
  18:51 <b10c> sipa: ty!
  18:51 <sipa> (with value 32)
  18:51 <cguida> what's hrp? sorry
  18:52 <glozow> cguida: human readable part
  18:52 <cguida> human readable part?
  18:52 <glozow> yeah
  18:52 <cguida> cool
  18:52 <glozow> like "bc" or "bcrt" or "tb"
  18:52 <glozow> = bitcoin, bitcoin regtest, testnet bitcoin
  18:52 <glozow> (i assume)
  18:53 <nehan> sipa: (this is super pedantic sorry) i think space+valid is better because it reduces the reasons why the test might fail to the one you're checking for. ok, space+invalid might happen too (you copied off by 1) but the reader of the test might not realize that space+invalid might fail even if space+valid passes, and maybe in the future someone redoes the tests and misses that.
  18:53 <glozow> also cguida: note that newline is a different character from space, although also invalid
  18:53 <michaelfolkson> glozow: Right https://bitcoin.stackexchange.com/questions/100508/can-you-break-down-what-data-is-encoded-into-a-bech32-address
  18:54 <michaelfolkson> Signet is tb as well
  18:54 <cguida> glozow: true, i was picturing a scenario in which the address is sent with newlines, and the user replaces them with a space thinking they need to be separate? really stretching here haha
  18:54 <jnewbery> maybe it'd be good to add a test that a valid test vector with a trailing space fails
  18:55 <MarcoFalke> jnewbery: I think we have that one already
  18:55 <nehan> also my concern above could easily be fixed with comments.
  18:55 <MarcoFalke> (oh, maybe we don't)
  18:55 <sipa> nehan: i don't understand why one is better than the other?
  18:56 <sipa> they both test distinct failures
  18:56 <b10c> MarcoFalke: don't see one
  18:56 <sipa> non-overlapping ones
  18:56 <glozow> Last question before we wrap up: Is Bech32 case-sensitive?
  18:56 <nehan> since we're close to the end i have a question: addresses are 10 characters longer now, meaning there's more chance for a user to make a mistake. did anyone think about how to balance the # of errors detected vs. likelihood of mistake?
  18:56 <glozow> (and Bech32m)
  18:56 <michaelfolkson> Anyone want to add a PR to add a test for trailing space? If not I'm happy to do it
  18:56 <eoin> no
  18:56 <maqusat> no, but mixed case is not accepted
  18:56 <emzy> ni
  18:56 <emzy> no
  18:57 <cguida> sipa: ahh, i see it. it's to test that p2pkh addresses with a leading space are invalid
  18:57 <pinheadmz> it is kinda, mixed case is not allowed
  18:57 <emzy> No, because it ends up in smaller QR codes.
  18:57 <michaelfolkson> Oh it isn't merged yet, so it would be a PR to sipa's branch
  18:57 <pinheadmz> and sadly many exchanges dont accept ALL CAPS bech32 addresses
  18:57 <glozow> I suppose it depends on what you mean by case-sensitive, but I like maqusat's and pinheadmz's answer
  18:57 <pinheadmz> even though qr codes are better
  18:57 <glozow> you can't have mixed case
  18:57 <glozow> but both uppercase and lowercase versions are acceptable
  18:58 <pinheadmz> (btw did u know Ethereum uses MiXeD cAsE as its checksum? yeesh)
  18:58 <glozow> hahahahaha
  18:59 <pinheadmz> clever for backwards compatibility but o_O ?!
  18:59 <nehan> sipa: i am predicting a future reader of the tests might miss that they test different things and think the two tests are redundant
  18:59 <glozow> Alrighty that wraps up our Bech32m program for today, I hope everybody learned something! ^_^
  18:59 <glozow> #endmeeting
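
The rule discussed in the meeting (witness v0 must use Bech32, v1 and higher must use Bech32m, and anything else is rejected even when a checksum matches) can be sketched in Python as follows. This is a simplified illustration, not the PR's C++ code: `check_address` and its return strings are hypothetical, the payload symbols are arbitrary 5-bit values rather than a real witness program, and the length and bit-conversion checks from the BIPs are left out.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
BECH32_CONST = 1
BECH32M_CONST = 0x2bc830a3
GENERATOR = [0x3b6a57b2, 0x26508e6d, 0x1ea119fa, 0x3d4233dd, 0x2a1462b3]


def polymod(values):
    chk = 1
    for value in values:
        top = chk >> 25
        chk = ((chk & 0x1ffffff) << 5) ^ value
        for i in range(5):
            if (top >> i) & 1:
                chk ^= GENERATOR[i]
    return chk


def hrp_expand(hrp):
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]


def encode(hrp, data, const):
    """Build a string whose checksum uses the given final constant."""
    mod = polymod(hrp_expand(hrp) + data + [0] * 6) ^ const
    checksum = [(mod >> 5 * (5 - i)) & 31 for i in range(6)]
    return hrp + "1" + "".join(CHARSET[d] for d in data + checksum)


def check_address(addr):
    """Return 'ok' or a short reason why the address is rejected."""
    if addr.lower() != addr and addr.upper() != addr:
        return "mixed case"
    addr = addr.lower()
    pos = addr.rfind("1")
    if pos < 1 or pos + 7 > len(addr):
        return "malformed"
    hrp, rest = addr[:pos], addr[pos + 1:]
    if any(c not in CHARSET for c in rest):
        return "invalid character"
    data = [CHARSET.index(c) for c in rest]
    const = polymod(hrp_expand(hrp) + data)
    if const not in (BECH32_CONST, BECH32M_CONST):
        return "bad checksum"
    payload = data[:-6]
    if not payload or payload[0] > 16:
        return "bad witness version"
    # The first data symbol is the witness version; it fixes the encoding.
    expected = BECH32_CONST if payload[0] == 0 else BECH32M_CONST
    return "ok" if const == expected else "wrong encoding for version"


payload = [3, 7, 11]  # arbitrary 5-bit symbols, not a real witness program
print(check_address(encode("bc", [0] + payload, BECH32_CONST)))
print(check_address(encode("bc", [1] + payload, BECH32M_CONST)))
print(check_address(encode("bc", [1] + payload, BECH32_CONST)))
```

In this sketch a v1 string carrying a Bech32 checksum is reported as "wrong encoding for version", which is also how a corrupted Bech32m string that happened to match the Bech32 constant would be caught.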