LLVM 14 & LLD based macOS toolchain (build system) Mar 23, 2022
Bitcoin Core releases binaries for multiple operating systems (Linux, Windows & macOS),
all built using cross-compilation (on Linux).
Historically, the project has used a very custom toolchain for producing
its macOS release binaries. This is primarily due to there not being any “official” way to build
macOS binaries, using completely open source software, on a Linux system. The only Apple-sanctioned
way to build macOS binaries is using the Xcode toolchain, running on macOS hardware.
Using a non-standard toolchain is not ideal for a number of reasons:
We are the only user. No one else is testing things, finding bugs, or making improvements.
Creating our toolchain is reliant on Apple (infrequently) releasing source code for the tools we need.
It’s also reliant on 3rd-parties patching those sources, i.e. ld64 port for Apple
The most promising alternative toolchain for producing macOS binaries on Linux has been LLVM's
lld. However, until recently, lld was unable to link large / complex programs. Much work has been
done over the past few years to rewrite the Mach-O backend for lld, to the point where it is now
able to link large, complex programs, e.g. Chromium, as well as self-host. The new backend has also
become the default, as of LLVM 13.
PR #21778 is a work in progress attempt to transition our macOS toolchain to
lld, as opposed to our current, custom-built toolchain.
1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK?
2. Can you think of a reason why we might want to continue using our current macOS toolchain, rather than switch?
3. Did you try performing a macOS cross compile? Did it work? If not, what problems did you run into?
4. Do these changes affect our Guix (release) build process?
   If so, how? (hint: look for usage of FORCE_USE_SYSTEM_CLANG)
   Is there a Guix build change you’d expect to see, which is missing from the PR?
5. In native_llvm’s preprocessing step we rm -rf lib/libc++abi.so*:
   Why do/did we do this? (remember we target a macOS SDK when building)
   Is it actually necessary? (double-check what is ultimately copied from the tarball).
6. In native_llvm.mk, we copy a number of tools (i.e. llvm-*, not clang or lld) out of the tarball:
   What is one of them used for?
   bonus: If we rename that tool when copying it, why might we do that?
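As a rough illustration of the renaming the last question asks about, the sketch below copies a few tools out of a stand-in "tarball" directory under the names a build system would typically look for, prefixed with the host triplet. It is not the contents of native_llvm.mk: the directory layout, the `stage` helper, the host-prefixing of every tool, and the llvm-strip to strip mapping are assumptions made for the example. Dummy scripts stand in for the real LLVM binaries so that it runs without LLVM installed.

```shell
#!/bin/sh
# Hypothetical sketch: stage renamed copies of LLVM tools for a cross build.
# Paths, the stage() helper and the exact tool list are assumptions,
# not taken from native_llvm.mk.
set -eu

host="x86_64-apple-darwin"   # host triplet, as used in the depends build
src="$(mktemp -d)"           # stand-in for the extracted LLVM tarball
dst="$(mktemp -d)"           # stand-in for the directory the build will search
mkdir -p "$src/bin"

# Create dummy executables that just print their original name.
for tool in lld llvm-install-name-tool llvm-libtool-darwin llvm-strip; do
    printf '#!/bin/sh\necho %s\n' "$tool" > "$src/bin/$tool"
    chmod +x "$src/bin/$tool"
done

# Copy a tool under the conventional name, prefixed with the host triplet.
stage() {
    cp "$src/bin/$1" "$dst/$host-$2"
}

stage lld                    ld
stage llvm-install-name-tool install_name_tool
stage llvm-libtool-darwin    libtool
stage llvm-strip             strip

# List the staged tools under their new names.
ls "$dst"
```

With a directory like `$dst` on the PATH, a configure script that searches for host-prefixed tools would find names such as `x86_64-apple-darwin-strip` without knowing that LLVM tools sit behind them.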
1 17:00 <fanquake> #startmeeting
3 17:00 <fanquake> Welcome to PR review club. Today we're looking at #21778 LLVM 14 & LLD based macOS toolchain.
5 17:00 <danielabrozzoni> hi 🙂
8 17:00 <michaelfolkson> hi
13 17:00 <fanquake> Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK?
15 17:01 <fanquake> Also, a reminder: if you have any questions of comments, you don't need to ask to say anything, just go right ahead! :)
16 17:01 <hebasto> approach ACK
17 17:01 <danielabrozzoni> concept ACK
18 17:01 <larryruane> concept ACK
19 17:01 <fanquake> This PR is a bit of special case, given it doesn't fully work yet, so we aren't expecting tested-ACKs.
20 17:02 <emzy> concept ACK
21 17:02 <fanquake> It'd be more interesting if someone was a concept/approach NACK. However we'll also cover that in a later question
22 17:02 <michaelfolkson> Haven't made my mind up. Presumably it is a Concept ACK as you are working on it but don't know enough to say it is a Concept ACK (yet)
23 17:02 <fanquake> Ok. Would someone like to give a one sentence summary of what the PR is trying to achieve?
25 17:03 <hebasto> replace very specific toolchain with well-known one
26 17:04 <fanquake> Exactly
27 17:04 <svav> The PR is trying to use a more standard toolchain to produce the MacOS binaries, as the current one is rather complicated
28 17:04 <michaelfolkson> ^ and the more standard toolchain has only become possible recently
29 17:04 <fanquake> Currently our macOS toolchain is pretty homebrew. It's constructed from pre-compiled binaries, 3rd-party sources, we compile our own llinker (ld64), and mush it all together todo macOS builds.
30 17:04 <glozow> replace our current method of cross compiling for macOS with... llvm and lld based toolchain
32 17:05 <glozow> except i don't actually know what llvm and lld are
33 17:05 <glozow> a compiler and a linker, i guess
34 17:05 <fanquake> LLVM ( https://llvm.org/) is " a collection of modular and reusable compiler and toolchain technologies."
35 17:06 <hebasto> a GCC killer?
36 17:06 <fanquake> It's the umbrella project for clangg (compiler) and lld (linker) as well as a number of other tools and libraries.
37 17:06 <fanquake> *clang
38 17:06 <fanquake> Including the *SAN libraries that we use in our sanitizer CIs etc
39 17:06 <jamesob> llvm is sort of like what java byte code is to the jvm, but it's supposed to be a general purpose target for many different languages
40 17:07 <glozow> Mr. Quake, what specific requirements does Bitcoin Core have that not every compiler/linker might support?
41 17:07 <sipa> i think that's where the name comes from, but now it's more used for the name of the project built around it, than the internal language itself
42 17:07 <fanquake> The reason LLVM is most interesting for a macOS toolchain is that Apple has also upstream a lot of work (given Clang & LLVM) is the native Apple toolchain, meaning it is "better" in a number of ways, at producing macOS binaries than GCC + friends may be.
43 17:09 <fanquake> In regards to macOS, GNU ld does not support producing macOS binaries, or maybe in a limited fasion?, as far as I'm aware. So a specific requirement we have is a working linker.
44 17:09 <glozow> I see okay
45 17:10 <fanquake> Speaking more generally, off the top of my head, I don't think there is anything particularly fancy we require in regards to compiler / linker requirements. Other than say, things more general, like c++17 support etc.
46 17:10 <fanquake> I guess we can move onto:
47 17:10 <fanquake> 2. Can you think of a reason why we might want to continue using our current macOS toolchain, rather than switch?
48 17:10 <hebasto> it just works now
49 17:10 <glozow> our current toolchain works, presumably?
50 17:10 <fanquake> that is certainly a good reason not to switch
51 17:10 <larryruane> Ignore if off-topic, but are there plans to move default linux builds from gcc to clang?
52 17:11 <jamesob> The only thing I can think of (and this is slightly fuzzy) is that MarcoFalke has highlighted some concerning issues with the llvm/clang development process. IIRC, basically anyone can commit to the tree and they rely on some kind of revert process if something goes afoul. But this is slightly off-topic and unsubstantiated (at least by me)
53 17:11 <fanquake> larryruane: that is a bit of a can of worms, probably best discussed later. However I'd say we'd be much more likely to move to clang for windows before, before thinking about moving for Linux.
54 17:12 <fanquake> jamesob: that' a fair point. Obviously we care about who is maintaining the tools we are (sometimes blindly) using
55 17:12 <jamesob> This is to say nothing of course about the possibly dubious infrastructure maintaining the gcc toolchain...
58 17:13 <fanquake> i.e has anyone ever review all the patches in that tree which are applied pre compiling ld64 and co
59 17:13 <jamesob> I certainly haven't lol
60 17:13 <fanquake> Cool. So two good reasons. Any others anyone can think of?
61 17:14 <glozow> Basic question, do we already use a llvm toolchian for something else?
62 17:14 <jamesob> glozow: afaik only in development
63 17:15 <fanquake> We do somewhat, given that we currently used a prebuilt clang (downloaded from LLVM) for the macOS builds, but don't use any of the other tools.
64 17:15 <lightlike> maybe if there was a slight difference in features, some non-essential but nice things the new toolchain wouldn't support?
65 17:15 <fanquake> lightlike: good point, and that is actually currently the case!
66 17:15 <fanquake> In the PR description I have a note about -bind_at_load, which is a linker flag we currently use.
67 17:16 <fanquake> It is not yet supported by lld. Although it's unclear if that will actually be an issue, as the reason for setting the flag may no-longer be relevant when building "modern" macOS binaries.
68 17:16 <fanquake> Needs following up on.
69 17:16 <hebasto> what does mean "modern"?
70 17:18 <fanquake> hebasto: as in targetting a more recent minimum version of macOS. As if you know that your binaries are only running on more recent versions, you can assumed certain behaviour out of the dynamic linker, which might obsolete the thing that passing the flag would achieve.
71 17:18 <emzy> iirc Macos switched to LLVM in xcode many years ago.
73 17:19 <fanquake> and we can enforce that the binaries are only run on recent versions at compile time, by passing minimum version flags to the linker (and sanity check those versions in our symbol check scripts)
74 17:19 <fanquake> Ok. Let's move into 3
76 17:19 <fanquake> Did you try performing a macOS cross compile? Did it work? If not, what problems did you run into?
77 17:19 <hebasto> yeap, partially
78 17:19 <emzy> only the guix build.
79 17:20 <hebasto> doesn't guix use another clang?
80 17:20 <fanquake> Currently yes, Guix uses it's "own" clang, from the clang-toolchain package. We do that by setting FORCE_USE_SYSTEM_CLANG during the Guix depends build.
81 17:21 <fanquake> That would continue to be the case going forward, even after these changes, however as I allude to in the next Q, there is a Guix related change missing from this PR.
82 17:22 <fanquake> If no-one was building, or didn't run into issue, we could move on to the next question.
83 17:22 <hebasto> facing with failure to configure qt on cross-compiling
84 17:22 <danielabrozzoni> I tried running the command in the pr (`make -C depends/ HOST=x86_64-apple-darwin -j9`), but it says `error adding symbols: DSO missing from command line`. I guess I'm doing something wrong ahah
85 17:22 <fanquake> Although also happy to just answer generally macOS cross-compilation questions at this point as well.
86 17:22 <glozow> when you say guix's "own" clang = comes with the guix package?
88 17:23 <glozow> ok understood
89 17:23 <fanquake> danielabrozzoni: Interesting, where about's during the depends build did that happen?
90 17:23 <fanquake> Happy to follow up and debug
91 17:24 <fanquake> Currently the most painful part of building is probably aquiring the macOS SDK. Did anyone have issues / attempt doing that?
92 17:24 <danielabrozzoni> I think it's something about qt, as it doesn't happen if I set NO_QT=1
93 17:24 <glozow> wgot the tarball
94 17:25 <emzy> fanquake: no issue, if you have an Apple account.
95 17:25 <michaelfolkson> [16:49:46] <larryruane> The PR review club notes ask, "Did you try performing a macOS cross compile?" -- is the intention to do this on the PR branch, or just on master? (I thought I'd give it a try)
96 17:25 <fanquake> danielabrozzoni: If you've got logs / info, feel free to dump it on the PR. What OS / hardware are you building on?
97 17:25 <michaelfolkson> Was the intention master or the PR branch?
98 17:26 <danielabrozzoni> Yeah, I'll look more into it and dump logs in the PR 🙂 I'm on NixOS
99 17:26 <danielabrozzoni> On x86
100 17:26 <fanquake> The PR branch. It's not fully working. However a depends build should work ok (including Qt), however building bitcoin-qt or libbitcoinconsensus will not at this stage.
101 17:27 <fanquake> So it's possible to run through building depends, pass a CONFIG_SITE to ./configure and build a working bitcoind, then run it on macOS
102 17:27 <fanquake> If there's no other build related questions, we could move onto #4
103 17:27 <fanquake> which is a 2-parter. I hope that is allowed
104 17:27 <fanquake> Do these changes effect our Guix (release) build process?
105 17:28 <fanquake> -> If so, how? (hint: look for usage of FORCE_USE_SYSTEM_CLANG)
106 17:28 <fanquake> -> Is there a Guix build change you’d expect to see, which is missing from the PR?
107 17:28 <fanquake> Feel free to answer/discuss either of the Qs
108 17:28 <hebasto> I didn't expect guix changes in this pr, rather in follow ups
109 17:30 <fanquake> hebasto: if we are migrating to LLVM/clang 14 in depends, would you expect a similar migration to clang 14 in guix to be a part of this PR?
110 17:30 <fanquake> i.e installing clang-toolchain-14 over clang-toolchain-10 as we currently do
113 17:31 <michaelfolkson> Clang is no longer a dependency now we aren't using native_cctools
114 17:31 <fanquake> I would think we'd want to keep the clang versions in sync, so that guix builds would be using clang 14, similar to users cross-compilling on linux would be.
115 17:32 <sipa> with this, do we still need the macos SDK etc?
116 17:32 <fanquake> michaelfolkson: I would still consider clang a dependency, given for a macOS cross-compile, we are downloading and then compiling using clang
117 17:33 <fanquake> sipa: yea we will still need it to build
118 17:33 <hebasto> the intention is good, but I see no reasons to combine two parts into one pr. let's keep it focused
119 17:34 <fanquake> One way to make obtaining it slightly less painful could be for someone to write a tool to extract it from the macOS command line tools .pkg, which is a few hundred mb, and save having to download 12 GB of Xcode.xip.
120 17:35 <fanquake> hebasto: I'm not sure. I think I would like to see the version change happen together, otherwise in the interim, you'd be Guix building with Clang 10, and the CI, or developers doing cross-compiles would be using 14.
121 17:35 <fanquake> However we can discuss further in the PR
122 17:36 <fanquake> Any other Guix related Qs? Or we could move along to #5
123 17:36 <michaelfolkson> What was the answer to #4?
124 17:37 <michaelfolkson> Sorry, struggling to follow... Do these changes effect our Guix (release) build process?
125 17:37 <fanquake> The discussion around whether we'd want to migrate to using Clang 14 for the Guix build, at the same time we swapped over to using Clang 14 in depends.
126 17:37 <michaelfolkson> Ok thanks
127 17:37 <fanquake> Yes, they do. Even though we use a Guix installed Clang for the Guix build, we still use the rest of the macOS toolchain ld64, cctools etc from depends.
128 17:37 <jamesob> Seems reasonable... moving both in lockstep reduces variability in builds
129 17:38 <fanquake> So if we migrate to using the llvm binutils and lld in depends, that would also then be used by the Guix build, but it would remain using it's own installed Clang.
130 17:39 <fanquake> One other point
131 17:39 <fanquake> You might be wondering why we don't just install and use everything that Guix provides when performing the Guix build.
132 17:40 <jamesob> Does the Guix-provided llvm toolchain not support e.g. linking for macOS?
133 17:40 <fanquake> The reason is that we need to maintain a macOS toolchain, that works outside of Guix, as Guix does not run everywhere, and there shouldn't be an expectation that you would need to use it to cross-compile.
134 17:40 <jamesob> Oh interesting... so the only point is to avoid a total reliance on Guix for the cross-compilation process?
135 17:41 <fanquake> jamesob: I'd be surprised if it did currently, as the features making this possible are basically only emerging in LLVM 14, which was officially relased last night
136 17:41 <jamesob> Ah okay, so it sounds like it's both
137 17:41 <fanquake> jamesob: I think not relying solely on Guix to be able to compile bitcoind for certain OS's is a good thing
138 17:42 <jamesob> Yep, agreed
139 17:42 <fanquake> Especially given that Guix doesn't work "everywhere". either hardware, or OS wise
140 17:42 <jamesob> Didn't know if that was unique to cross-compilation for macOS
141 17:42 <hebasto> also using guix in Ci is questionable
142 17:42 <jamesob> yeah guix is very resource intensive
143 17:42 <fanquake> Cool. Let's move onto #5
144 17:42 <michaelfolkson> Relying solely means only supporting Guix builds?
145 17:43 <fanquake> In native_llvm’s preprocessing step we rm -rf lib/libc++abi.so*:
146 17:43 <fanquake> -> Why do/did we do this? (remember we target a macOS SDK when building)
147 17:43 <fanquake> -> Is it actually necessary? (double-check what is ultimately copied from the tarball).
148 17:43 <fanquake> michaelfolkson: yes. We cannot tell people that they need to install and use Guix if they want to compile Bitcoin Core.
149 17:44 <michaelfolkson> Gotcha, yeah agreed
150 17:44 <fanquake> Which means the depends system must remain generic, and useable as widely as possible.
151 17:44 <fanquake> and be able to compile all binaries, hosts etc that we produce in release builds.
152 17:44 <hebasto> if one makes build for `native_llvm_fetched` target, it becomes obvious that `rm` do nothing
153 17:45 <fanquake> hebasto: correct. The preprocessing block as it stands is a no-op.
154 17:45 <jamesob> > why `rm`: is it because we'd be linking against a linux binary interface for a macos build?
155 17:46 <michaelfolkson> The arguments for not forcing Guix use are it is new(ish), users might not understand it, it is not industry standard outside of Bitcoin?
156 17:46 <fanquake> jamesob: I'm going to lazily dump a few lines I prepared that may explain further
157 17:46 <michaelfolkson> Or maybe it is just it has only been added to Core recently(ish)
158 17:46 <fanquake> The rm was originally added in https://github.com/bitcoin/bitcoin/pull/8210, to remove any LLVM C++ ABI related objects, given at that point we were copying more files out of the lib dirs in the clang tarball. Possibly just a belt-and-suspenders thing.
160 17:47 <fanquake> The code in its current state is pointless / broken for 2 reasons:
161 17:47 <fanquake> The only things we copy out of lib/ are `libLTO.so` and headers from lib/clang/clang-version/include, so deleting .so files from /lib in advance of that, doesn't achieve anything.
162 17:47 <fanquake> Besides that, the `libc++abi.so*` objects have actually changed location inside the lib/ dir, so even if we kept the current code, it wouldn't actually remove anything anyways.
163 17:48 <fanquake> michaelfolkson: there are a number of arguments for not forcing anyone that wants to compile your software to use a single package manager to do so. Happy to discuss further later.
164 17:48 <hebasto> I did not find `libc++abi.so*` anywhere
165 17:49 <hebasto> in fetched archive
166 17:49 <fanquake> hebasto: I no-longer have the tarball handy, but iirc, the .so* had moved down a directory or two. I'll double check.
167 17:50 <fanquake> We can probably move onto #6. I am hoping everyone hasn't quite fallen asleep yet.
168 17:50 <fanquake> In native_llvm.mk, we copy a number of tools (i.e llvm-*, not clang or lld) out of the tarball:
169 17:50 <fanquake> -> What is one of them used for?
170 17:50 <fanquake> -> bonus: If we rename that tool when copying it, why might we do that?
171 17:52 <hebasto> talking about tools, why we do not need llvm-as, i.e. assemble?
172 17:52 <jamesob> Do we rename to avoid PATH conflicts with system tooling? e.g. `which ld` vs. `$(host)-ld`?
173 17:53 <fanquake> hebasto: I'm going to say because whatever it would be doing, is being handled by something else, I'll have to get you an answer, sorry.
174 17:53 <michaelfolkson> You mean pick a particular tool and say what it does?
175 17:54 <michaelfolkson> (out of those tools copied)
176 17:54 <fanquake> jamesob: somewhat. The reason we rename lld to ld, is to make things "simpler" for the build system, as most build systems, especially those using autotools, look for, and expect to use a linker called ld.
177 17:54 <fanquake> Given we are in full control of our build system here, we can just rename lld, and have it "pretend" to be ld, for the sake of making everything work, and the build systems expecting GNU ld should mostly be none-the-wiser.
178 17:55 <jamesob> right - `ld` is sort of a generic name for a linker whereas `lld` is LLVM's specific ld-compatible linker
179 17:55 <fanquake> michaelfolkson: yes basically
180 17:55 <michaelfolkson> Search engine llvm-install-name-tool
181 17:55 <michaelfolkson> :)
184 17:55 <michaelfolkson> "tool to manipulate dynamic shared library install names and rpaths listed in a Mach-O binary"
185 17:56 <fanquake> When cross-compiling, autotools generally looks for native (build) tools that have the target arch in the name.
186 17:56 <fanquake> i.e x86_64-apple-darwin-strip
187 17:56 <jamesob> ah okay, so this is sort of an autotools convention?
188 17:56 <fanquake> So renaming some of the tools is also a bit of a convenience for autotools, and can prevent warning output like: "configure: WARNING: using cross tools not prefixed with host triplet".
189 17:56 <michaelfolkson> What is Mach-O, gosh search engine again
190 17:57 <sipa> Mac's executable file format.
191 17:57 <fanquake> MACHO is the macOS exeutable file format
192 17:57 <hebasto> it's macos's ELF :)
193 17:57 <fanquake> to cover a couple other tools, and why we might rename them:
194 17:57 <jamesob> interesting that we don't do the host-prefixing with clang/clang++, but I guess those are not platform specific?
195 17:57 — michaelfolkson sweats
196 17:57 <fanquake> llvm-install-name-tool -> install_name_tool as that is its "usual" name, and what other tools / build systems will look for / expect.
197 17:57 <fanquake> Same for llvm-libtool-darwin -> libtool as build systems / autotools expect libtool, not libtool-darwin.
198 17:57 <sipa> Windows' executable format is called PE, IIRC.
199 17:58 <fanquake> correct
200 17:58 <sipa> elves > machos > physical education
201 17:58 <fanquake> I think we are running out of time, but if anyone has any other related thoughts / questions, feel free to throw them out.
202 17:59 <fanquake> I realise this has been a bit of a whirlwind tour of macOS related things
203 17:59 <michaelfolkson> Ok slightly less uninformed Concept ACK from me
204 17:59 <jamesob> thanks fanquake
205 17:59 <glozow> thank you fanquake!
206 17:59 <emzy> thanks fanquake!
207 17:59 <danielabrozzoni> thank you!
208 17:59 <hebasto> fanquake: thank you
209 18:00 <kouloumos_> thank you!
210 18:00 <michaelfolkson> Thanks fanquake, that was great
211 18:00 <sipa> thanks for hosting!
212 18:00 <sipa> (and working on all this crazy stuff...)
213 18:00 <fanquake> Cool, thanks everyone. I think that means we can
214 18:00 <fanquake> #endmeeting
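
The point made in the meeting that `rm -rf lib/libc++abi.so*` is a no-op can be reproduced with a stand-in directory tree. The sketch below is only an illustration: the `nested-dir` name and the file names are made up and do not reflect the real tarball layout. It relies on the fact that the pattern only matches entries directly inside `lib/`, so a file that sits one directory lower is never touched, and `rm -f` stays silent when nothing matches.

```shell
#!/bin/sh
# Stand-in tree; "nested-dir" and the file names are hypothetical,
# not the real layout of the LLVM tarball.
set -eu

tarball="$(mktemp -d)"
mkdir -p "$tarball/lib/nested-dir"
touch "$tarball/lib/libLTO.so" "$tarball/lib/nested-dir/libc++abi.so.1"

# Record the file list before running the removal.
before="$(find "$tarball" -type f | sort)"

# Same pattern as the preprocessing step; it does not descend into subdirectories.
(cd "$tarball" && rm -rf lib/libc++abi.so*)

# Record the file list again and compare.
after="$(find "$tarball" -type f | sort)"

if [ "$before" = "$after" ]; then
    echo "rm removed nothing"
fi
```

Comparing the file list before and after is also a cheap way to check whether any cleanup step in a build recipe still does something after an upstream archive changes its layout.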