libs)

Mar 10, 2021

https://github.com/bitcoin/bitcoin/pull/20861

Host: glozow - PR author: sipa

The PR branch HEAD was 835ff6b at the time of this review club meeting.

In this PR Review Club meeting, we’ll discuss BIP350 and Bech32m.

Notes

An invoice address (aka output address, public address, or just address), not to be confused with public key, IP address, or P2P Addr message, is a string of characters that represents the destination for a Bitcoin transaction. Since users generate these addresses to send bitcoins and incorrect addresses can result in unspendable coins, addresses include checksums to help detect human errors such as missing characters, swapping characters, mistaking a q for a 9, etc.
Bech32 was introduced in BIP173 as a new standard for native segwit output addresses. For more background on Bech32, this video describes Bech32 checksums and their error correction properties.
Bech32 had an unexpected weakness, leading to the development of Bech32m, described in BIP350.
PR #20861 implements BIP350 Bech32m addresses for all segwit outputs with version 1 or higher. Note that such outputs are not currently supported by mainnet so this does not pose a compatibility problem for current users. It intentionally breaks forward compatibility for future software to prevent accidentally sending to an unspendable v1 output.

Questions

Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK? What was your review approach?
Can you describe the length extension mutation issue found in Bech32? Does it affect Bitcoin addresses? Why or why not?
How does Bech32m solve this length extension mutation issue?
Which addresses will be encoded using Bech32, and which ones with Bech32m? How does this effect the compatibility of existing software clients?
What are the three components of a Bech32m address encoding?
How does Decode check whether an address is encoded as Bech32 or Bech32m? Can a string be valid in both formats?
The space in this test string is not an accident. What does it test?
For fun: Is Bech32 case-sensitive? (Hint: Why is “A12UEL5L” valid but “A12uEL5L” not?)

Meeting Log

18:00

<glozow> #startmeeting

18:00

<jnewbery> hi

18:00

<glozow> Welcome to PR Review Club everyone!!!

18:00

<amiti> hi!

18:00

<maqusat> hi

18:00

<AnthonyRonning> hi

18:00

<glozow> Anyone here for the first time?

18:00

<michaelfolkson> hi

18:00

<pinheadmz> wuddup

18:00

<willcl_ark_> hi

18:00

<lightlike> hi

18:00

<AsILayHodling> hi

18:00

<b10c> hi

18:00

<glozow> Today, we're looking at #20861 BIP 350: Implement Bech32m and use it for v1+ segwit addresses

18:00

<glozow> Notes: https://bitcoincore.reviews/20861

18:00

<glozow> PR: https://github.com/bitcoin/bitcoin/pull/20861

18:00

<cguida> hi

18:00

<cguida> my first time

18:01

<glozow> Welcome cguida! :)

18:01

<AnthonyRonning> cguida: welcome!

18:01

<cguida> thanks! :)

18:01

<glozow> Did y'all get a chance to review the PR and/or BIPs? What was your review approach?

18:01

<cguida> Didn't get to running the code yet, but did some reading

18:02

<jnewbery> 0.2y

18:02

<glozow> link to BIP350: https://github.com/bitcoin/bips/blob/master/bip-0350.mediawiki

18:02

<AnthonyRonning> browsed a bit, not familiar with the concept at all yet

18:02

<amiti> mostly just looked through the review club notes & relevant sections in bips / code, didn't do a proper review.

18:02

<nehan> hi

18:02

<michaelfolkson> I'd Concept ACK, Approach ACKed a while ago. So looking at code, running tests etc

18:03

<maqusat> just had time to glance over

18:03

<sipa> hi

18:03

<pinheadmz> read bip and ML posts, havent tried code yet

18:03

<b10c> looked over the BIP and the reviews page

18:03

<emzy> hi

18:03

<glozow> Alrighty, maybe we could start with a light conceptual question: what is Bech32 used for exactly?

18:03

<pinheadmz> encoding data with error correction!

18:03

<pinheadmz> using a set of 32 characters

18:04

<glozow> pinheadmz: yes! what are we encoding, in the context of Bitcoin?

18:04

<pinheadmz> ok, segwit addresses

18:04

<pinheadmz> a segwit version followed by some amount of data

18:04

<jnewbery> *error detection and correction

18:04

<pinheadmz> could be a publichey hash, script hash or in the case of taproot, a bare public key

18:04

<b10c> addresses, but 'invoice' addresses and not IP addresses etc

18:04

<cguida> with a focus on character transcription errors

18:04

<sipa> (but despite supporting error correction, you should absolutely nevwr do that - if you detect errors, you should the user to go ask the real address again)

18:04

<pinheadmz> jnewbery thank u

18:05

<jonatack> hi

18:05

<glozow> ok so how important is error detection here, on the scale of meh to we-could-lose-coins?

18:05

<cguida> and simplifying display in qr codes!

18:05

<jnewbery> right, for sending to an address we shouldn't do error correction

18:05

<nehan> we-could-lose-coins

18:05

<pinheadmz> glozow youcould be sending bitcoin to the wrong person or to an unrecovaerbale key if you mess up!

18:05

<eoin> I'm a newb and don't know C++ or Python, how should I proceed?

18:05

<glozow> pinheadmz: yeah! so the error detection is key here :)

18:05

<schmidty> hi

18:06

<pinheadmz> eoin start in english? https://github.com/bitcoin/bips/blob/master/bip-0350.mediawiki

18:06

<michaelfolkson> glozow nehan: Depending on whether it is a character or two correction chances of losing coins is veeeery low

18:06

<AnthonyRonning> human readability is another aspect of bech32 as well, right?

18:06

<cguida> we-could-lose-coins, because the error could be a valid address

18:06

<cguida> with low probability

18:06

<jnewbery> eoin: welcome! Follow along as best you can. There are some good resources for newcomers here: https://bitcoincore.reviews/#other-resources-for-new-contributors

18:06

<jonatack> good resources also at: https://bitcoinops.org/en/topics/bech32/

18:06

<michaelfolkson> cguida: Depending on how many characters are being corrected

18:06

<nehan> glozow: i thought we were talking about error correction generally? or do you mean specifically in bech32 addresses?

18:06

<pinheadmz> the fun parts of bech32 to me are how characters are arranged by possible visual mistake i.e. v and w

18:07

<jonatack> optech topics are a "good first stop for info"

18:07

<cguida> michaelfolkson: yes

18:07

<pinheadmz> as if someone was reading a bitcoin address and typing it in manually

18:07

<nehan> *error detection

18:07

<sipa> michaelfolkson: if you do error correction the probability of sending to the wrong address goes up spectacularly; correction only works if you make up to 2 errors (with restrictions on what those errors are); if yiu make more, it is very likely that error correction will "correct" to the wrong thing

18:07

<cguida> eoin: python is easy to get started with, send me a message if you'd like some resources

18:07

<glozow> nehan: error detection generally, yes, I want to make sure we're all clear that it's a key goal here

18:07

<jnewbery> I think we should all just pretend that error *correction* is not a thing for the purposes of this conversation

18:07

<jnewbery> and just focus on error *detection*

18:08

<pinheadmz> sure but it is cool :-)

18:08

<pinheadmz> bech32 can fix up to 3 (?) mistakes

18:08

<sipa> 2

18:08

<pinheadmz> ty

18:08

<nehan> pinheadmz: id on't think that's true!

18:08

<glozow> Ok I think we're on the same page :) Next question is a little harder: Can you describe the length extension mutation issue found in Bech32?

18:09

<michaelfolkson> The probabilities are listed somewhere I think... maybe in sipa's SF Bitcoin Devs slides

18:09

<amiti> if the address ends with a p, you can insert or delete q characters right before & it won't invalidate the checksum

18:09

<nehan> checksum for <addr>p = <addr>qqqqqp

18:09

<cguida> sipa: right, a perhaps overengineered approach would be to present the 2 or 3 closest correction strings to the user? haha

18:09

<glozow> amiti: nehan: correct!

18:10

<pinheadmz> my understanding is that the bech32 data represents a polynomial, and since x^0 = 1, you can add a bunch of extra 0's at the end of a bech32 address and its just like (checksum * 1 * 1 * 1...) so it remains valid

18:10

<glozow> can anyone tells us why this is the case?

18:10

<sipa> cguida: the BIP says you cannot do more than point out likely positions of errors

18:10

<pinheadmz> or rather data * 1 * 1 * 1... so the checksum doesnt change

18:11

<tkc> cguida: I would be interested in those beginner resources also. This is not the topic for today obviously, but how to connect with you outside this?

18:11

<cguida> tkc eoin just send me a dm here on irc

18:11

<glozow> pinheadmz: nice! could you tell us how we get from a string to a polynomial?

18:12

<cguida> pinheadz: ohhh

18:12

<cguida> pinheadmz*

18:12

<pinheadmz> not... really..... but theres this chart: https://github.com/bitcoin/bips/blob/master/bip-0173.mediawiki#bech32

18:12

<pinheadmz> that maps charachters to numbers

18:12

<sipa> not sure i follow about the * 1 * 1 * 1

18:12

<pinheadmz> sipa my understand is pretty abstract i just barely kinda get it

18:13

<pinheadmz> that since x^0 = 1, a bunch of 0s at the end ends up just multiplying something by 1

100

18:13

<pinheadmz> which doesnt change the value

101

18:13

<sipa> hmm, no

102

18:13

<sipa> glozow: i can explain if you want

103

18:13

<michaelfolkson> +1 :)

104

18:14

<pinheadmz> +p

105

18:14

<pinheadmz> (anyone get it?)

106

18:14

<glozow> heh ok so, "z"=2, "p"=1 and "q"=0, so what polynomial do we get from "zqzp?"

107

18:14

<cguida> the checksum for bech32 has a 1 multiplied in, bech32m uses something else

108

18:14

<glozow> sipa: go for it :P

109

18:14

<cguida> or xored in

110

18:14

<nehan> glozow: i think you should do it and sipa can chime in :)

111

18:14

<michaelfolkson> nehan: +1

112

18:15

<sipa> if you translate the characters to poiynomials, bech32 is essentially the equation code(x) mod g(x) = 1

113

18:15

<glozow> jnewbery shared this earlier https://bitcoin.stackexchange.com/questions/91602/how-does-the-bech32-length-extension-mutation-weakness-work which has a good explanation

114

18:15

<sipa> where code(x) is the polynomial corresponding to the data (incl checksum) of the bech32 string

115

18:15

<sipa> and g(x) is a specific 6th degree constant

116

18:16

<glozow> `g(x) = x^6 + 29x^5 + 22x^4 + 20x^3 + 21x^2 + 29x + 18`

117

18:16

<pinheadmz> sipa what does 6th degree constant mean ?

118

18:16

<sipa> pinheadmz: the exact polymonial glozow just gave

119

18:16

<felixweis> polynomial of degree 6

120

18:16

<glozow> degree 6 polynomial, same one used for every encoding

121

18:16

<pinheadmz> sipa is that the value gmax crunched for a week on a super computer ?

122

18:16

<sipa> it"s constant, not as in 0th degree, but as in: it is a constant, everyone uses tbe same

123

18:17

<nehan> pinheadmz: a "constant" polynomial means its coefficients are fixed, I think

124

18:17

<glozow> constant as in `const` :P

125

18:17

<sipa> pinheadmz: that one took way longer; we're talking bech32 here, not bech32m

126

18:17

<pinheadmz> right i was refrring to bech32

127

18:17

<sipa> so, we can write that as code(x) = f(x) * g(x) + 1

128

18:17

<pinheadmz> i understand bech32m also has a bruteforced constant

129

18:17

<sipa> that's the definition of modulus

130

18:18

<sipa> or: code(x) - 1 = f(x)*g(x)

131

18:18

<glozow> so to answer my earlier question "z"=2, "p"=1 and "q"=0, so what polynomial do we get from "zqzp?"

132

18:18

<glozow> it's `2x^3 + 0x^2 + 2x + 1` i.e. `2x^3 + 2x + 1`

133

18:18

<sipa> indeed!

134

18:19

<pinheadmz> sipa how is that a modulus? like, does it "wrap around"? bc its two polynomials being multilied?

135

18:19

<glozow> does everyone see how we got that?

136

18:19

<sipa> pinheadmz: it's just like numbers

137

18:19

<glozow> let me know if it's unclear and we can slow down

138

18:19

<sipa> yes, it wraps around

139

18:19

<pinheadmz> but number * number approaches infinity without wrapping

140

18:19

<glozow> so that modulus 1 is there so that we can't trivially create a new valid string from an old one

141

18:19

<sipa> it"s in the degrew instead in number of digits here

142

18:20

<sipa> once you go over 6th ddgree, it wraps around

143

18:20

<nehan> pinheadmz: you might want to study group theory a little (abstract algebra). numbers are just examples; you can apply the concepts to sets of "things" as well

144

18:20

<sipa> because you can subtract a bigger multiple of the modulus

145

18:20

<nehan> pinheadmz: in this case, the set of things is a set of polynomials, and you can operate on them

146

18:20

<cguida> glozow: I see how you got a polynomial from those inputs, but what's x in this case?

147

18:20

<sipa> cguida: x is just a variable name

148

18:21

<sipa> we nwver actually evaluate it in a specific value of x

149

18:21

<sipa> we need one to write polynomials, that's it

150

18:21

<glozow> cguida: you can think of polynomials as basically a vector of coefficients

151

18:21

<cguida> ok so the x doesn't matter, just the coefficients?

152

18:21

<glozow> helps to distinguish polynomials from polynomial functions

153

18:22

<sipa> cguida: yeah, you can say zqzp is just [1,2,0,2] (we tend to write low powers first when representing as lists)

154

18:22

<cguida> i'll need to play with it more i think

155

18:23

<cguida> sipa: ok cool

156

18:23

<sipa> but remember that when multiplying you need to think of them as popynomials

157

18:23

<michaelfolkson> cguida: You might do algebra with say x, y, z without ever ascribing values to them. This way you are playing around with specific polynomials instead of x, y and z

158

18:23

<b10c> Zx^3 + Qx^2 + Zx + P with Z=2, P=1 and Q=0 ==> `2x^3 + 0x^2 + 2x + 1`, right?

159

18:23

<sipa> so!

160

18:23

<glozow> ok so we have the condition for valid Bech32 being: if your string is represented as `p(x)`, you need `p(x) = f(x)*g(x) + 1` aka `p(x) mod g(x) = 1` to be true

161

18:23

<nehan> how did you pick g(x)?

162

18:23

<glozow> b10c: yep! exactly :)

163

18:23

<sipa> nehan: many years of CPU time

164

18:23

<felixweis> pinheadpmz: can confirm what nehan said, I watched a few lectures on group theory & number theory in the past couple weeks. helped also with the understanding of last weeks topic w.r.t. the magic behind minisketch

165

18:24

<sipa> nehan: in 2017

166

18:24

<nehan> sipa: what were you looking for?

167

18:24

<sipa> nehan: read BIP173 :)

168

18:24

<nehan> sipa: ok!

169

18:24

<glozow> so what happens if your string ends with a "p," what's the constant term in your polyonimal?

170

18:25

<b10c> +0

171

18:25

<pinheadmz> felixweis thanks i watched a few as well, can recco the Christoph Parr series on youtube. still hard to grok that multiplying to things is the "definition of a modulus" :-)

172

18:25

<sipa> pinheadmz: no

173

18:25

<felixweis> also playing around and exploring stuff with sagemath

174

18:25

<sipa> multiplication is multiplication

175

18:25

<sipa> modulo is modulo

176

18:25

<glozow> b10c: not quite, see the example you worked out?

177

18:25

<cguida> It's 1?

178

18:26

<glozow> cguida: bingo!

179

18:26

<b10c> oh yeah, +1

180

18:26

<glozow> b10c: :)

181

18:26

<b10c> mixed up q and p

182

18:26

<nehan> oh. for anyone else who was wondering, g(x) is GEN in bip173, i think, and is the basis of the code. I watched the talk so I recall what properties you were looking for from that.

183

18:27

<sipa> pinheadmz: does this help? a polynomial mod 1 is always 0; a polynomial mod x is just its constant term; a polynomial mod x^2 is iets bottom 2 terms (i.e. a*x + b)

184

18:27

<glozow> okay so, if your polynomial `p(x)` ends with +1, `x⋅(p(x) - 1) + 1` also works

185

18:27

<sipa> pinheadmz: for other examples, a polynomial mod m(x) is subtracting as many times m(x) from it as yoh can, until you end up with something of degree less than

186

18:27

<sipa> m

187

18:28

<pinheadmz> sipa that does help

188

18:28

<sipa> what is 2x^2 + 3x + 2 mod x+1?

189

18:28

<pinheadmz> but "code(x) = f(x) * g(x) + 1 --- that's the definition of modulus" ?

190

18:28

<pinheadmz> sipa 3x+2 ?

191

18:29

<glozow> so then, let's say your polyonimal `p(x)` corresponds to string "zzp", what does `x*p(x)` correspond to?

192

18:29

<sipa> pinheadmz: no, you subtracted x^2, that's not a multiple of x+1

193

18:29

<cguida> glozow: by "works", you mean, solves the equation p(x)*g(x) = 1?

194

18:29

<glozow> cguida: yes

195

18:29

<pinheadmz> oh its just x+1 ?

196

18:30

<sipa> pinheadmz: no

197

18:30

<glozow> er, it solves `p(x) = f(x)*g(x) + 1` for some `f(x)`

198

18:30

<glozow> but yes same idea

199

18:30

<pinheadmz> sorry i can work it out later, math on IRC is making me sweat

200

18:30

<sipa> first subtract 2x*(x+1), you get what?

201

18:30

<michaelfolkson> x+2

202

18:30

<sipa> indeed

203

18:30

<cguida> whoops, yeah, i missed an f(x) haha

204

18:30

<sipa> what is x+2 mod x+1?

205

18:30

<pinheadmz> ok i see that michaelfolkson

206

18:31

<michaelfolkson> 1

207

18:31

<sipa> bingo

208

18:31

<sipa> so x^2 + 3x + 2 mod x+1 = 1

209

18:31

<michaelfolkson> Math is horrible until it clicks pinheadmz. Then it is beautiful ;)

210

18:31

<glozow> okie we probably should move on, heh

211

18:32

<glozow> How does Bech32m solve this length extension mutation issue?

212

18:32

<cguida> new checksum constant!

213

18:33

<glozow> cguida: yep!

214

18:33

<sipa> nehan: indeed g(x) is the generator

215

18:33

<cguida> i'm not sure why that fixes, other than to guess that it's because it's much larger than 1, so it doens't correspond to any of the letters

216

18:34

<glozow> Imma just keep chugging along with the review club questions. Moving forward, which addresses will be encoded using Bech32, and which ones with Bech32m?

217

18:35

<cguida> segwit v0 with bech32, subsequent versions bech32m

218

18:35

<pinheadmz> segwit v0 keeps bech32, everything from here on out (starting with taproot, witness v1) will get bech32m

219

18:35

<glozow> cguida: pinheadmz: correct!

220

18:35

<sipa> cguida: the specific change doesn't work anymore, because to do the same, you'd need to (a) subtract the new constant (b) multiply by a power of x (c) add the constant again... if you work that out, you'll see that it requires changing many more characters changed, due to the new constant having many more nonzero coefficients

221

18:35

<glozow> How does this affect the compatibility of existing software clients?

222

18:35

<cguida> glozow: it doesn't!

223

18:36

<cguida> hopefully haha

224

18:36

<b10c> Does not affect it: v0 does not change and v1 likely doesn't exist yet

225

18:36

<pinheadmz> existing, assuming no one has implemented taproot wallets yet using bech32 ...?

226

18:36

<b10c> v1 clients*

227

18:36

<sipa> pinheadmz: if they did, not for mainnet i hope!

228

18:36

<michaelfolkson> pinheadmz: Assuming there are no problems with bech32m (which hopefully and most likely will be the case)

229

18:36

<AnthonyRonning> so anyone that can send to a native segwit address can send to bech32m by default?

230

18:36

<cguida> sipa: ahh cool, so it's sort of unpredictable what letters would need to change in order to keep the same checksum

231

18:37

<pinheadmz> although sipa if i gave you a witness v1 bech32 address an old wallet would still be able to send to that address right?

232

18:37

<cguida> sipa: and it would be multiple letters rather than just a single q

233

18:37

<sipa> pinheadmz: yes, but also any miner could steal it

234

18:37

<glozow> AnthonyRonning: they must, if it's v1+

235

18:37

<pinheadmz> before activation yah

236

18:38

<AnthonyRonning> glozow: cool, good to know!

237

18:38

<pinheadmz> but after lockin, a wallet that doesnt know about bech32m would still work?

238

18:38

<pinheadmz> just a version byte and data, assuming there were no actual length attacks against you

239

18:38

<sipa> pinheadmz: yes, but nobody will be creating bech32 v1+ addresses

240

18:38

<sipa> so that's not a concern

241

18:38

<pinheadmz> ok

242

18:39

<michaelfolkson> A wallet either recognizes SegWit v1 or it doesn't. bech32m is just encoding for SegWit v1 addresses

243

18:39

<pinheadmz> well, i did send this one a few months ago https://blockstream.info/address/bc1pqyqszqgpqyqszqgpqyqszqgpqyqszqgpqyqszqgpqyqszqgpqyqs3wf0qm

244

18:39

<pinheadmz> ;-)

245

18:39

<glozow> Let's dive into code :) How does `Decode` check whether an address is encoded as Bech32 or Bech32m? Can a string be valid in both formats?

246

18:39

<glozow> link to code: https://github.com/bitcoin/bitcoin/blob/835ff6b8568291870652ca0d33d934039e7b84a8/src/bech32.cpp#L168

247

18:40

<sipa> cguida: yeah... though there could be many more or less similar types of mutations with different constants; the bech32m constant was chosen by searching through many patterns of classes of mutations, and picking one that prevents most

248

18:40

<b10c> so would the current signet explorer does encode v1 addresses as v0: https://explorer.bc-2.jp/address/tb1p85lx6qpdvs4vlpjnhnexhqwmuetd7klc3dk4ggsmycrtc78n6nnqg2u5a8 would that break?

249

18:40

<cguida> glozow: i wasn't clear on this. it appears to be something in "polymod"

250

18:41

<b10c> - would*

251

18:41

<michaelfolkson> glozow: A string cannot be valid in both formats. Just looking at the code

252

18:41

<cguida> and i hear that it's impossible to have an address be both valid bech32 and bech32m

253

18:41

<amiti> it can't be valid in both formats, you can xor with 1 / the new constant (`0x2bc830a3`) to see if you get the checksum

254

18:42

<AnthonyRonning> wait so wallets/clients that do checksum checks before sending won't be able to send to a bech32m check until they update their encoding methods?

255

18:42

<cguida> michaelfolkson: where do you see that in the code?

256

18:42

<glozow> amiti: winner! yep, you basically check the mod and see which encoding it matches

257

18:42

<AnthonyRonning> s/encoding/decoding

258

18:43

<lightlike> it's in VerifyChecksum() - looks like you get the constant back

259

18:43

<michaelfolkson> cguida: I just know that from other reading (BIP etc)

260

18:43

<glozow> michaelfolkson: cguida: amiti: yes, the mod can't be both 1 and 0x2bc830a3

261

18:43

<sipa> b10c: indeed, existing explorers show bech32 instead of bech32m for v1+... one reason why it'd nice to get bip350 implemented and adopted soon *hint* *hint*

262

18:44

<glozow> Next question, when I was reviewing the PR I found it peculiar that there was a space in this test: https://github.com/bitcoin/bitcoin/blob/835ff6b8568291870652ca0d33d934039e7b84a8/src/test/bech32_tests.cpp#L80

263

18:44

<glozow> and then I realized it's not on accident ;)

264

18:45

<glozow> so what's the space for?

265

18:45

<michaelfolkson> sipa: In the case they didn't.... and Taproot was to activate... I guess just temporarily it would suck for SegWit v1 lookups. But they would probably implement it without any need for hints?!

266

18:45

<cguida> glozow: would love to see what the proof of that is

267

18:45

<lightlike> would it be possible (with a near-zero probability) that we want to decode a BECH32M, have a wrong checksum, but get back a valid BECH32 encoding instead of Encoding::INVALID?

268

18:46

<michaelfolkson> sipa: I get it makes sense to be merged into Core soon though

269

18:46

<nehan> glowzow: space is not a valid character, right? but someone may copy/paste an address and get spaces

270

18:46

<glozow> lightlike: I wondered this too :O maybe sipa has an answer?

271

18:46

<sipa> lightlike: yes, but if that mismatches the expected code for the version number, it'll still be rejected

272

18:46

<glozow> nehan: yep!

273

18:47

<glozow> space is 0x20 in US-ASCII

274

18:47

<glozow> which is not a valid character in the HRP

275

18:47

<sipa> if you get a v0 with BECH32M: bad

276

18:47

<michaelfolkson> I don't know what a space represents in base32 or bech32. Invalid character, so it gets ignored? Or causes an error?

277

18:47

<sipa> michaelfolkson: invalid

278

18:47

<glozow> michaelfolkson: it's invalid

279

18:47

<nehan> michaelfolkson: error

280

18:47

<sipa> but why is the test there then?

281

18:48

<sipa> if you get v1+ with BECH32: bad

282

18:48

<jnewbery> I was expecting to see a test for the same string without the space being valid

283

18:48

<sipa> jnewbery: that'd be a good testcase too

284

18:48

<nehan> jnewbery: that seems better!

285

18:48

<sipa> why not both?

286

18:49

<sipa> this test also does something useful :)

287

18:49

<nehan> sipa: if the data-space is not a valid address, then it might be failing because of that, and not because of the space

288

18:49

<cguida> to make sure an error is thrown when a space is included

289

18:49

<nehan> but sure both!

290

18:49

<sipa> nehan: yes, but this test tests something similar

291

18:50

<sipa> both are trying to anticipate a particular mistake an implementer might make

292

18:50

<sipa> yours is: implementer accepts the space but ignores it

293

18:50

<glozow> it's particularly testing that the HRP can't have a space?

294

18:50

<cguida> in case the address is sent in parts, or with newlines or something

295

18:50

<b10c> why does L81 and L82 in the tests contain strings with "" in the middle? i.e. "\x7f""1g6xzxy" and "\x80""1vctc34",

296

18:50

<sipa> glozow: yes, but in combimation with something else

297

18:51

<sipa> b10c: that's just how you add unprintable characters inside a string

298

18:51

<sipa> glozow: i think here it's assuming the implementer treats the space as a valid HRP

299

18:51

<b10c> sipa: ty!

300

18:51

<sipa> (with value 32)

301

18:51

<cguida> what's hrp? sorry

302

18:52

<glozow> cguida: human readable part

303

18:52

<cguida> human readable part?

304

18:52

<glozow> yeah

305

18:52

<cguida> cool

306

18:52

<glozow> like "bc" or "bcrt" or "tb"

307

18:52

<glozow> = bitcoin, bitcoin regtest, testnet bitcoin

308

18:52

<glozow> (i assume)

309

18:53

<nehan> sipa: (this is super pedantic sorry) i think space+valid is better because it reduces the reasons why the test might fail to the one you're checking for. ok, space+invalid might happen too (you copied off by 1) but the reader of the test might not realize that space+invalid might fail even if space+valid passes, and maybe in the future someone redoes the tests and misses that.

310

18:53

<glozow> also cguida: note that newline is a different character from space, although also invalid

311

18:53

<michaelfolkson> glozow: Right https://bitcoin.stackexchange.com/questions/100508/can-you-break-down-what-data-is-encoded-into-a-bech32-address

312

18:54

<michaelfolkson> Signet is tb as well

313

18:54

<cguida> glozow: true, i was picturing a scenario in which the address is sent with newlines, and the user replaces them with a space thinking they need to be separate? really stretching here haha

314

18:54

<jnewbery> maybe it'd be good to add a test that a valid test vector with a trailing space fails

315

18:55

<MarcoFalke> jnewbery: I think we have that one already

316

18:55

<nehan> also my concern above could easily be fixed with comments.

317

18:55

<MarcoFalke> (oh, maybe we don't)

318

18:55

<sipa> nehan: i don"t understand why one is better than the other?

319

18:56

<sipa> they both test distinct failures

320

18:56

<b10c> MarcoFalke: don't see one

321

18:56

<sipa> non-overlapping ones

322

18:56

<glozow> Last question before we wrap up: Is Bech32 case-sensitive?

323

18:56

<nehan> since we're close to the end i have a question: addresses are 10 characters longer now, meaning there's more chance for a user to make a mistake. did anyone think about how to balance the # of errors detected vs. likelihood of mistake?

324

18:56

<glozow> (and Bech32m)

325

18:56

<michaelfolkson> Anyone want to add a PR to add a test for trailing space? If not I'm happy to do it

326

18:56

<eoin> no

327

18:56

<maqusat> no, but mixed case is not accepted