#16512 Shuffle inputs and outputs after joining psbts (rpc)
Today’s PR is very small and should be a quick review. We’ll spend half the time talking about the PR and half covering general questions about the Bitcoin Core review process.
Please come prepared with your own questions about review in Bitcoin Core!
- Partially Signed Bitcoin Transactions (PSBTs) are a way for Bitcoin wallets and applications to pass unsigned or partially-signed transactions around.
- This is useful needed when a transaction requires multiple signatures (eg for a multisig output or an off-chain contract).
- The PSBT format is defined in BIP 174. The BIP also defines several responsibilities in the PSBT workflow: creator, updater, signer, combiner, input finalizer, and transaction extractor.
joinpsbtsRPC utility method takes multiple PSBTs and joins them into a single PSBT.
- Currently, the inputs and outputs are joined in order. This PR changes the method to shuffle both inputs and outputs.
- What is the justification for this PR? Why would users want inputs/outputs to be shuffled?
13:00 < jnewbery> hi 13:00 < jonatack> hi! 13:00 < sosthene> hello there! 13:00 < lightlike> hi 13:00 < fjahr> hi 13:01 < ariard> hi 13:01 < michaelfolkson> Hey 13:01 < emzy> hi 13:01 < jnewbery> first of all, sorry about the late notice for this week's PR, and for the date typo on the website. 13:01 < michaelfolkson> I'll ask you afterwards how I/we can help? 13:01 < jonatack> sosthene: i don't remember, but the discussions have been excellent 13:02 < jnewbery> It was a very busy week at Chaincode last week, and unfortunately PR review club suffered 13:02 < jnewbery> should be back to normal service next week! 13:02 < nehan> hi 13:02 < jnewbery> It was quite a small, straightforward PR this week. It shouldn't take too long to discuss 13:03 < jnewbery> we can use any extra time to just talk about the review process and ask/answer more general questions 13:03 < jnewbery> I only had one question: What is the justification for this PR? Why would users want inputs/outputs to be shuffled? 13:04 < dergigi> My guess would be privacy 13:04 < fjahr> As achow mentions in the message: inputs and outputs could be correlated 13:04 < sosthene> If I got it right in the PR, it is because not shuffling the order of the inputs and outputs in a psbt transaction makes it trivial to map input and ouputs together 13:04 < sosthene> yeah privacy I mean :) 13:04 < jnewbery> yeah, that's it. This is fixing a privacy hole 13:05 < jonatack> https://github.com/bitcoin/bips/blob/master/bip-0069.mediawiki BIP 69 discusses it in more detail 13:05 < jnewbery> thanks jonatack. I was just looking for that link 13:06 < jnewbery> There's some discussion around whether BIP 69 (Lexicographical Indexing of Transaction Inputs and Outputs) is good for privacy. Any thoughts about that? 13:07 < sosthene> maybe it's not very efficient against chain analysis? 13:07 < jonatack> IIRC the deterministic aspect it proposes is debatable or undesirable for privacy 13:08 < jnewbery> jonatack: why? 13:08 < ariard> e.g in multi-party protocols, if a transaction is leaked on chain, some party outputs may not be distinguished until redeem as they are P2WSH following bip69 13:09 < jonatack> jnewbery: a deterministic algo would enable 13:09 < jonatack> distinguishing joinpsbts 13:09 < jnewbery> ariard: I'm not sure I understand 13:09 < nehan> achow101's comment: "Not at all. That gives a different privacy leak as it will indicate that joinpsbts has been used on that transaction." 13:10 < jonatack> thus a different privacy leak iirc 13:10 < nehan> I don't quite understand 13:10 < jnewbery> jonatack: how? How would they look different from other transactions? 13:10 < nehan> why not enable it by default. then is it still a privacy leak? 13:10 < ariard> jnewbery: in Lightning, commitment transaction outputs are ordered following BIP69 13:11 < ariard> so you shouldn't be able to distinguish between incoming and outgoing HTLC output 13:11 < ariard> and know party total balance until redeem 13:11 < jnewbery> Why can BIP 69 be a privacy leak? 13:12 < ariard> well in this case, it's good for privacy leak 13:12 < jnewbery> The reason is that if _every_ wallet uses BIP 69, then it's good for privacy, but if only a subset of wallets use BIP 69, then they're identifiable 13:13 < jonatack> because very few txns follow bip69 IIUC 13:13 < jnewbery> jonatack: correct 13:13 < nehan> jnewbery: but the trend is (hopefully) more and more wallets adopting it, increasing the anonymity set for good behavior and decreasing it for bad (not adopting it) 13:13 < jnewbery> so if this PR used bip 69 instead of random shuffle, these transactions would be identifiable by chain analysis 13:13 < jonatack> from nopara here: https://github.com/bitcoin/bitcoin/issues/12457#issuecomment-383400504 13:13 < ariard> at least in Lightning it's mandatory bip69, and funding_output is already a fingreprint 13:14 < jnewbery> nehan: perhaps, if wallets do adopt it 13:14 < lightlike> but are there any major advantages for lexicographic ordering as opposed to random? 13:15 < nehan> lightlike: determinism, not having to shuffle a bunch 13:15 < jnewbery> There's some discussion around bip 69 here: https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-October/016457.html 13:16 < jnewbery> lightlike: another reason might be that two parties in a multi-party contract can agree what the order should be if it's deterministic (eg the Lightning case that ariard mentions) 13:17 < jnewbery> apart from the privacy aspect, did anyone have other questions or observations about the PR? 13:17 < michaelfolkson> Why are so many (up to 10 attempts) required to get a shuffle join? Just being extra safe? 13:18 < jonatack> michaelfolkson: shuffling doesn't ensure they aren't equal every time 13:18 < michaelfolkson> Not guaranteed, but highly likely to be? 13:18 < jnewbery> michaelfolkson: to minimize false positives I suppose 13:19 < jnewbery> (we're talking about the new test here: https://github.com/bitcoin/bitcoin/pull/16512/files#diff-9cd872c8f61017722f3accc95f7874d4R376) 13:19 < jonatack> the test cannot be flakey, thus instagibbs proposing 128 times 13:19 < michaelfolkson> Ok 13:20 < michaelfolkson> Have to do the math to work out probability of staying same after 10 shuffles 13:20 < jnewbery> if it's 2 inputs it's 2^-9 13:21 < jnewbery> any other observations on the PR? 13:21 < jnewbery> any thoughts about this: https://github.com/bitcoin/bitcoin/pull/16512/files#r318133389 13:21 < michaelfolkson> But makes sense you don't even want a tiny probability of a test failing for an invalid reason 13:21 < jonatack> Does everything think the proposed test coverage is enough? 13:22 < lightlike> could also fake the randomness by providing the seed to the RNG 13:22 < jnewbery> michaelfolkson: yeah, false positives are really bad because they reduce people's trust in the testing (and so they stop caring if tests fail) 13:23 < sosthene> I don't quite get this thing about having 2 vectors of inputs and outputs 13:23 < jnewbery> lightlike: I don't think we have the ability to do that in bitcoind. And to do that I think we'd need to do a unit test instead of a functional test 13:23 < jonatack> If this was spaceflight or aeronautical piloting software, for example, or security-critical I'd want to see more coverage behind changes, to be sure they remain valid over time. 13:24 < jnewbery> sosthene: achow is talking about the internal structure of the PSBT 13:24 < jnewbery> basically saying that we can't reach in and just change the ordering of the inputs or outputs because it'd leave the PSBT in an inconsistent state 13:25 < sosthene> jnewbery: yes, I think I got it now 13:25 < sosthene> so it means you have to make the same shuffle operation on both vectors and make sure that the state remains consistent? 13:26 < lightlike> jnewbery: i have encountered this in other places, I think we can do it for some RNG functions 13:26 < jnewbery> My concern about the code here is that it's manually copying the PSBT. If the PSBT object is updated in future to contain new fields, then this code would need to be updated too 13:26 < michaelfolkson> So you'd write a test for this? 13:26 < jonatack> jnewbery: bingo 13:26 < lightlike> there is a flag g_mock_deterministic_tests=true in the functional framework, which applies to GetRand() for example 13:27 < jnewbery> I can easily imagine PSBT being updated and someone forgetting to update the code here, so joinpsbt accidentally drops some of the PSBT details 13:28 < sosthene> how would you test for this? 13:28 < michaelfolkson> Just test that they are equivalent? 13:28 < jonatack> I would want to extract the copy and shuffle functions out and unit test them. 13:28 < jnewbery> jonatack: I agree 13:28 < jonatack> As standalone fns. 13:29 < jnewbery> Make shuffle a member function of PartiallySignedTransaction 13:29 < jonatack> Yes. 13:30 < jnewbery> Let's open up to more general questions about the review process (but feel free to continue discussing 16512) 13:30 < michaelfolkson> Interesting 13:31 < michaelfolkson> I have some basic ones that I'm happy to end with if someone else wants to go first 13:32 < jnewbery> just ask now 13:32 < jonatack> I find it a challenge to decide how much time to spend on a review. 13:32 < jonatack> Even trivial ones can lead down a deep rabbit hole (for me) 13:33 < jonatack> of reading BIPs, history, gathering context, writing tests, etc. It's never enough. 13:33 < michaelfolkson> What can one do to help the author of a PR? Is one able to rebase, address nits etc for the author? 13:33 < jnewbery> jonatack: I agree, it's easy for that to happen! That's the best way to learn about the project I think - when you're motivated to understand something because you need it 13:34 < jonatack> So how do you decide which ones to PR and how deep to go? 13:34 < jonatack> Also: Hugh-priority vs easy ones 13:34 < jonatack> Time allocation is a challenge! 13:34 < jnewbery> jonatack: I have a horrible guilt-driven-devlopment model, where I spend my time looking that the PR I feel most guilty about neglecting for too long 13:35 < jnewbery> Time allocation _is_ a challenge! 13:35 < jonatack> Could spend literally days on some of them. 13:35 < michaelfolkson> (Cont) Or is the help limited to review, testing, providing feedback? (by no means trivial) 13:35 < jonatack> Ha! I feel guilty about dozens of PRs... I empathize. 13:37 < lightlike> So far I just choose the ones that seem really interesting to me. 13:37 < jnewbery> I suspect there are diminishing returns on how much benefit you can provide in your review. Like your 4th hour of reviewing a PR is likely to lead to less interesting observations than your 3rd hour of reviewing it 13:37 < jnewbery> (perhaps) 13:38 < jnewbery> michaelfolkson: rebasing/addressing nits/adding tests can be very useful. I recommend you check with the author before you spend a bunch of time on it though 13:38 < jnewbery> either leave a comment on the PR or message them and say "I want to help by doing this. Would that be helpful for you?" 13:39 < jonatack> TBH there a few PRs I spend more than an hour on, before forcing myself to move on. But for the deeper ones that's not enough. 13:39 < michaelfolkson> Also signing commits. Is it important that a regular contributor signs the commits or is it fine for a new contributor to sign the commit? I don't know whether it is checking the contributor is who they say they are if it is checking someone experienced has looked at it 13:39 < jnewbery> The only commits that need to be signed are the merge commits when a maintainer merges a PR 13:40 < michaelfolkson> Ok so it isn't standard practice for a contributor to sign their commits? 13:40 < jnewbery> feel free to sign commits on your own PRs, but it's not necessary 13:40 < michaelfolkson> Got it 13:40 < jonatack> lightlike: how long do you typically spend reviewing them? 13:42 < jkczyz> Hello, joining late. If you could change something fundamental about Bitcoin Core to improve the PR review process, what would it be? :) 13:42 < michaelfolkson> Another question. I opened a PR to do something that you John had already opened a PR for. Bar going through all the PRs how would I have known that? Eventually DrahtBot told me. Is it fine to use DrahtBot for this purpose? I'm assuming you'd prefer someone checked all the open PRs first... 13:43 < lightlike> jonatack: varies a lot, but usually more than an hour. For example I used a long train ride to understand amiti's WIP PR. 13:43 < michaelfolkson> I'm monitoring a lot of GitHub activity via email notifications but it is a lot. It is swamping my inbox 13:44 < jnewbery> michaelfolkson: I think just search for function names and any key words in the open PRs first. I don't think there's any way to run drahtbot offline (but I do think that would be useful) 13:44 < jonatack> lightlike: yes. for the more involved PRs I would need more as well. I would like to dedicate more time to the deeper and high priority ones, but it takes waaay more time. 13:44 < jnewbery> jkczyz: I'd add more great reviewers :) 13:45 < jnewbery> michaelfolkson: I gave up using github email notifications long ago 13:45 < jkczyz> jnewbery: haha, Fair enough. But anything about the code itself? 13:46 < michaelfolkson> Carl said on an interview he used IFTTT for specific GitHub alerts. Might be worth playing around with 13:46 < jnewbery> jkczyz: better modularization so people aren't stepping on each other's toes as much. This has already improved a lot in the last few months/years 13:46 < jkczyz> michaelfolkson: Filter your mail by email@example.com may help a bit 13:47 < jnewbery> I think splitting wallet from node and having a well-determined interface makes it way easier to review and think about wallet changes 13:47 < michaelfolkson> <jkczyz>: That is filtering by the author of the PR/issue/comment? 13:48 < michaelfolkson> <jkczyz>: So get notifications on the activity of specific people? 13:49 < jonatack> michaelfolkson: for reviewing PRs at least collisions aren't a worry :) 13:50 < jkczyz> michaelfolkson: Actually, nevermind. My filter is actually anything to me with list:bitcoin.bitcoin.github.com 13:51 < michaelfolkson> <jonatack>: Haha yeah that's right. I'm finding doing some very basic PRs is useful too though. Can't really review well if you haven't done a couple smoothly yourself 13:51 < michaelfolkson> But yeah definitely agree once one is comfortable the 10:1 ratio or whatever it was needs to be upheld 13:52 < jnewbery> I still think that reviewing is the most effective way you can contribute as a new contributor (and also will teach you much more about the code than opening PRs) 13:53 < michaelfolkson> Agreed 13:53 < jonatack> Yes -- much more value-add 13:54 < jnewbery> last 5 minutes. Any final questions? 13:54 < jonatack> and reviewing is also a great way to see things that can be done if you are looking for that 13:54 < jnewbery> jonatack: definitely 13:54 < michaelfolkson> If no one has any other questions shall I ask if you need any help with this club <jnewbery>? 13:54 < jonatack> e.g. in rpc_psbt.py... there is no logging, nada 13:54 < michaelfolkson> How should the PRs be chosen each week? Continue with John choosing them? 13:55 < jnewbery> Thanks Michael. I'm always looking for people to suggest PRs or host meetings 13:55 < jnewbery> You can suggest PRs here: https://github.com/bitcoin-core-review-club/bitcoin-core-review-club.github.io/issues/14 or by messaging me 13:57 < jnewbery> Best PRs are small(ish) and don't require too much contextual knowledge 13:57 < jnewbery> (so it doesn't take you hours and hours to review) 13:58 < michaelfolkson> The larger complicated ones are great for learning but for me at least won't get me in a position to give adequate review 13:58 < jnewbery> ok, if there are no more questions, let's wrap it up there. 13:58 < jnewbery> I'll post next week's PR later today 13:58 < jnewbery> Thanks everyone! 13:59 < jonatack> Thanks, John and all! 13:59 < michaelfolkson> Cool, thanks 13:59 < lightlike> thanks! 14:00 < jonatack> nehan: p.s. if you are still here, great job with the a16z presentation https://www.youtube.com/watch?v=5oQ0djZYXNI 14:03 < achow101> re bip69: according to p2sh.info, somewhere between 40% and 60% of txs use bip69. also core doesn't use bip69 at all 14:08 < jonatack> achow101: thanks 14:09 < milochen0418> thanks