LLVM 14 & LLD based macOS toolchain (build system)

https://github.com/bitcoin/bitcoin/pull/21778

Host: fanquake  -  PR author: fanquake

Notes

  • Bitcoin Core releases binaries for multiple different operating systems (Linux, Windows & macOS), produced using cross-compilation (on Linux).

  • Historically, the project has used a very “non-standard” toolchain for producing its macOS release binaries. This is primarily due to there not being any “official” way to build macOS binaries, using completely open source software, on a Linux system. The only Apple-sanctioned way to build macOS binaries is using the Xcode toolchain, running on macOS hardware.

  • Using a non-standard toolchain is not ideal for a number of reasons:

    • We are the only user. No one else is testing things, finding bugs, or making improvements.

    • Creating our toolchain is reliant on Apple (infrequently) releasing source code for the tools we need.

    • It’s also reliant on 3rd-parties patching those sources. i.e ld64 port for Apple cctools.

    • It’s complicated.

  • The most promising alternative toolchain for producing macOS binaries on Linux, has been LLVMs clang + lld. However until recently, lld was unable to link large / complex programs. Much work has been done over the past few years to re-write the Mach-O backend for lld, to where it is now able to link large, complex programs, i.e Chromium, as well as self-host. The new backend has now also become the default, as of LLVM 13.

  • PR #21778 is a work in progress attempt to transition our macOS toolchain to clang + lld, as opposed to our current, custom-built toolchain.

Questions

  1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK?

  2. Can you think of a reason why we might want to continue using our current macOS toolchain, rather than switch?

  3. Did you try performing a macOS cross compile? Did it work? If not, what problems did you run into?

  4. Do these changes effect our Guix (release) build process?
    • If so, how? (hint: look for usage of FORCE_USE_SYSTEM_CLANG)
    • Is there a Guix build change you’d expect to see, which is missing from the PR?
  5. In native_llvm’s preprocessing step we rm -rf lib/libc++abi.so*:
    • Why do/did we do this? (remember we target a macOS SDK when building)
    • Is it actually necessary? (double-check what is ultimately copied from the tarball).
  6. In native_llvm.mk, we copy a number of tools (i.e llvm-*, not clang or lld) out of the tarball:
    • What is one of them used for?
    • bonus: If we rename that tool when copying it, why might we do that?

Meeting Log

  117:00 <fanquake> #startmeeting
  217:00 <glozow> hi!
  317:00 <fanquake> Welcome to PR review club. Today we're looking at #21778 LLVM 14 & LLD based macOS toolchain.
  417:00 <emzy> hi
  517:00 <danielabrozzoni> hi 🙂
  617:00 <larryruane> hi
  717:00 <sipa> hi
  817:00 <michaelfolkson> hi
  917:00 <B_1o1> hi
 1017:00 <svav> Hi
 1117:00 <hebasto> hi
 1217:00 <kouloumos_> hi
 1317:00 <fanquake> Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK?
 1417:00 <ayush933> hi
 1517:01 <fanquake> Also, a reminder: if you have any questions of comments, you don't need to ask to say anything, just go right ahead! :)
 1617:01 <hebasto> approach ACK
 1717:01 <danielabrozzoni> concept ACK
 1817:01 <larryruane> concept ACK
 1917:01 <fanquake> This PR is a bit of special case, given it doesn't fully work yet, so we aren't expecting tested-ACKs.
 2017:02 <emzy> concept ACK
 2117:02 <fanquake> It'd be more interesting if someone was a concept/approach NACK. However we'll also cover that in a later question
 2217:02 <michaelfolkson> Haven't made my mind up. Presumably it is a Concept ACK as you are working on it but don't know enough to say it is a Concept ACK (yet)
 2317:02 <fanquake> Ok. Would someone like to give a one sentence summary of what the PR is trying to achieve?
 2417:02 <jamesob> hi
 2517:03 <hebasto> replace very specific toolchain with well-known one
 2617:04 <fanquake> Exactly
 2717:04 <svav> The PR is trying to use a more standard toolchain to produce the MacOS binaries, as the current one is rather complicated
 2817:04 <michaelfolkson> ^ and the more standard toolchain has only become possible recently
 2917:04 <fanquake> Currently our macOS toolchain is pretty homebrew. It's constructed from pre-compiled binaries, 3rd-party sources, we compile our own llinker (ld64), and mush it all together todo macOS builds.
 3017:04 <glozow> replace our current method of cross compiling for macOS with... llvm and lld based toolchain
 3117:05 <lightlike> hi
 3217:05 <glozow> except i don't actually know what llvm and lld are
 3317:05 <glozow> a compiler and a linker, i guess
 3417:05 <fanquake> LLVM (https://llvm.org/) is " a collection of modular and reusable compiler and toolchain technologies."
 3517:06 <hebasto> a GCC killer?
 3617:06 <fanquake> It's the umbrella project for clangg (compiler) and lld (linker) as well as a number of other tools and libraries.
 3717:06 <fanquake> *clang
 3817:06 <fanquake> Including the *SAN libraries that we use in our sanitizer CIs etc
 3917:06 <jamesob> llvm is sort of like what java byte code is to the jvm, but it's supposed to be a general purpose target for many different languages
 4017:07 <glozow> Mr. Quake, what specific requirements does Bitcoin Core have that not every compiler/linker might support?
 4117:07 <sipa> i think that's where the name comes from, but now it's more used for the name of the project built around it, than the internal language itself
 4217:07 <fanquake> The reason LLVM is most interesting for a macOS toolchain is that Apple has also upstream a lot of work (given Clang & LLVM) is the native Apple toolchain, meaning it is "better" in a number of ways, at producing macOS binaries than GCC + friends may be.
 4317:09 <fanquake> In regards to macOS, GNU ld does not support producing macOS binaries, or maybe in a limited fasion?, as far as I'm aware. So a specific requirement we have is a working linker.
 4417:09 <glozow> I see okay
 4517:10 <fanquake> Speaking more generally, off the top of my head, I don't think there is anything particularly fancy we require in regards to compiler / linker requirements. Other than say, things more general, like c++17 support etc.
 4617:10 <fanquake> I guess we can move onto:
 4717:10 <fanquake> 2. Can you think of a reason why we might want to continue using our current macOS toolchain, rather than switch?
 4817:10 <hebasto> it just works now
 4917:10 <glozow> our current toolchain works, presumably?
 5017:10 <fanquake> that is certainly a good reason not to switch
 5117:10 <larryruane> Ignore if off-topic, but are there plans to move default linux builds from gcc to clang?
 5217:11 <jamesob> The only thing I can think of (and this is slightly fuzzy) is that MarcoFalke has highlighted some concerning issues with the llvm/clang development process. IIRC, basically anyone can commit to the tree and they rely on some kind of revert process if something goes afoul. But this is slightly off-topic and unsubstantiated (at least by me)
 5317:11 <fanquake> larryruane: that is a bit of a can of worms, probably best discussed later. However I'd say we'd be much more likely to move to clang for windows before, before thinking about moving for Linux.
 5417:12 <fanquake> jamesob: that' a fair point. Obviously we care about who is maintaining the tools we are (sometimes blindly) using
 5517:12 <jamesob> This is to say nothing of course about the possibly dubious infrastructure maintaining the gcc toolchain...
 5617:12 <fanquake> It's also the same reason I'd like to move away from our current setup. If anything, I think there are far more eyes over the LLVM repos, as opposed to: https://github.com/tpoechtrager/cctools-port
 5717:13 <jamesob> Yep
 5817:13 <fanquake> i.e has anyone ever review all the patches in that tree which are applied pre compiling ld64 and co
 5917:13 <jamesob> I certainly haven't lol
 6017:13 <fanquake> Cool. So two good reasons. Any others anyone can think of?
 6117:14 <glozow> Basic question, do we already use a llvm toolchian for something else?
 6217:14 <jamesob> glozow: afaik only in development
 6317:15 <fanquake> We do somewhat, given that we currently used a prebuilt clang (downloaded from LLVM) for the macOS builds, but don't use any of the other tools.
 6417:15 <lightlike> maybe if there was a slight difference in features, some non-essential but nice things the new toolchain wouldn't support?
 6517:15 <fanquake> lightlike: good point, and that is actually currently the case!
 6617:15 <fanquake> In the PR description I have a note about -bind_at_load, which is a linker flag we currently use.
 6717:16 <fanquake> It is not yet supported by lld. Although it's unclear if that will actually be an issue, as the reason for setting the flag may no-longer be relevant when building "modern" macOS binaries.
 6817:16 <fanquake> Needs following up on.
 6917:16 <hebasto> what does mean "modern"?
 7017:18 <fanquake> hebasto: as in targetting a more recent minimum version of macOS. As if you know that your binaries are only running on more recent versions, you can assumed certain behaviour out of the dynamic linker, which might obsolete the thing that passing the flag would achieve.
 7117:18 <emzy> iirc Macos switched to LLVM in xcode many years ago.
 7217:18 <hebasto> thanks
 7317:19 <fanquake> and we can enforce that the binaries are only run on recent versions at compile time, by passing minimum version flags to the linker (and sanity check those versions in our symbol check scripts)
 7417:19 <fanquake> Ok. Let's move into 3
 7517:19 <fanquake> *onto
 7617:19 <fanquake> Did you try performing a macOS cross compile? Did it work? If not, what problems did you run into?
 7717:19 <hebasto> yeap, partially
 7817:19 <emzy> only the guix build.
 7917:20 <hebasto> doesn't guix use another clang?
 8017:20 <fanquake> Currently yes, Guix uses it's "own" clang, from the clang-toolchain package. We do that by setting FORCE_USE_SYSTEM_CLANG during the Guix depends build.
 8117:21 <fanquake> That would continue to be the case going forward, even after these changes, however as I allude to in the next Q, there is a Guix related change missing from this PR.
 8217:22 <fanquake> If no-one was building, or didn't run into issue, we could move on to the next question.
 8317:22 <hebasto> facing with failure to configure qt on cross-compiling
 8417:22 <danielabrozzoni> I tried running the command in the pr (`make -C depends/ HOST=x86_64-apple-darwin -j9`), but it says `error adding symbols: DSO missing from command line`. I guess I'm doing something wrong ahah
 8517:22 <fanquake> Although also happy to just answer generally macOS cross-compilation questions at this point as well.
 8617:22 <glozow> when you say guix's "own" clang = comes with the guix package?
 8717:23 <fanquake> glozow: yes, we install a clang-toolchain, and then use that clang for the build. See here: https://github.com/bitcoin/bitcoin/blob/bc562b9ef81d4e37c20c4fdbd73be44d6f12efa0/contrib/guix/manifest.scm#L616
 8817:23 <glozow> ok understood
 8917:23 <fanquake> danielabrozzoni: Interesting, where about's during the depends build did that happen?
 9017:23 <fanquake> Happy to follow up and debug
 9117:24 <fanquake> Currently the most painful part of building is probably aquiring the macOS SDK. Did anyone have issues / attempt doing that?
 9217:24 <danielabrozzoni> I think it's something about qt, as it doesn't happen if I set NO_QT=1
 9317:24 <glozow> wgot the tarball
 9417:25 <emzy> fanquake: no issue, if you have an Apple account.
 9517:25 <michaelfolkson> [16:49:46] <larryruane> The PR review club notes ask, "Did you try performing a macOS cross compile?" -- is the intention to do this on the PR branch, or just on master? (I thought I'd give it a try)
 9617:25 <fanquake> danielabrozzoni: If you've got logs / info, feel free to dump it on the PR. What OS / hardware are you building on?
 9717:25 <michaelfolkson> Was the intention master or the PR branch?
 9817:26 <danielabrozzoni> Yeah, I'll look more into it and dump logs in the PR 🙂 I'm on NixOS
 9917:26 <danielabrozzoni> On x86
10017:26 <fanquake> The PR branch. It's not fully working. However a depends build should work ok (including Qt), however building bitcoin-qt or libbitcoinconsensus will not at this stage.
10117:27 <fanquake> So it's possible to run through building depends, pass a CONFIG_SITE to ./configure and build a working bitcoind, then run it on macOS
10217:27 <fanquake> If there's no other build related questions, we could move onto #4
10317:27 <fanquake> which is a 2-parter. I hope that is allowed
10417:27 <fanquake> Do these changes effect our Guix (release) build process?
10517:28 <fanquake> -> If so, how? (hint: look for usage of FORCE_USE_SYSTEM_CLANG)
10617:28 <fanquake> -> Is there a Guix build change you’d expect to see, which is missing from the PR?
10717:28 <fanquake> Feel free to answer/discuss either of the Qs
10817:28 <hebasto> I didn't expect guix changes in this pr, rather in follow ups
10917:30 <fanquake> hebasto: if we are migrating to LLVM/clang 14 in depends, would you expect a similar migration to clang 14 in guix to be a part of this PR?
11017:30 <fanquake> i.e installing clang-toolchain-14 over clang-toolchain-10 as we currently do
11117:30 <hebasto> as we have https://github.com/bitcoin/bitcoin/blob/master/contrib/guix/libexec/build.sh#L221 -- no behavior change in guix
11217:30 <fanquake> for reference: https://github.com/bitcoin/bitcoin/blob/bc562b9ef81d4e37c20c4fdbd73be44d6f12efa0/contrib/guix/manifest.scm#L616
11317:31 <michaelfolkson> Clang is no longer a dependency now we aren't using native_cctools
11417:31 <fanquake> I would think we'd want to keep the clang versions in sync, so that guix builds would be using clang 14, similar to users cross-compilling on linux would be.
11517:32 <sipa> with this, do we still need the macos SDK etc?
11617:32 <fanquake> michaelfolkson: I would still consider clang a dependency, given for a macOS cross-compile, we are downloading and then compiling using clang
11717:33 <fanquake> sipa: yea we will still need it to build
11817:33 <hebasto> the intention is good, but I see no reasons to combine two parts into one pr. let's keep it focused
11917:34 <fanquake> One way to make obtaining it slightly less painful could be for someone to write a tool to extract it from the macOS command line tools .pkg, which is a few hundred mb, and save having to download 12 GB of Xcode.xip.
12017:35 <fanquake> hebasto: I'm not sure. I think I would like to see the version change happen together, otherwise in the interim, you'd be Guix building with Clang 10, and the CI, or developers doing cross-compiles would be using 14.
12117:35 <fanquake> However we can discuss further in the PR
12217:36 <fanquake> Any other Guix related Qs? Or we could move along to #5
12317:36 <michaelfolkson> What was the answer to #4?
12417:37 <michaelfolkson> Sorry, struggling to follow... Do these changes effect our Guix (release) build process?
12517:37 <fanquake> The discussion around whether we'd want to migrate to using Clang 14 for the Guix build, at the same time we swapped over to using Clang 14 in depends.
12617:37 <michaelfolkson> Ok thanks
12717:37 <fanquake> Yes, they do. Even though we use a Guix installed Clang for the Guix build, we still use the rest of the macOS toolchain ld64, cctools etc from depends.
12817:37 <jamesob> Seems reasonable... moving both in lockstep reduces variability in builds
12917:38 <fanquake> So if we migrate to using the llvm binutils and lld in depends, that would also then be used by the Guix build, but it would remain using it's own installed Clang.
13017:39 <fanquake> One other point
13117:39 <fanquake> You might be wondering why we don't just install and use everything that Guix provides when performing the Guix build.
13217:40 <jamesob> Does the Guix-provided llvm toolchain not support e.g. linking for macOS?
13317:40 <fanquake> The reason is that we need to maintain a macOS toolchain, that works outside of Guix, as Guix does not run everywhere, and there shouldn't be an expectation that you would need to use it to cross-compile.
13417:40 <jamesob> Oh interesting... so the only point is to avoid a total reliance on Guix for the cross-compilation process?
13517:41 <fanquake> jamesob: I'd be surprised if it did currently, as the features making this possible are basically only emerging in LLVM 14, which was officially relased last night
13617:41 <jamesob> Ah okay, so it sounds like it's both
13717:41 <fanquake> jamesob: I think not relying solely on Guix to be able to compile bitcoind for certain OS's is a good thing
13817:42 <jamesob> Yep, agreed
13917:42 <fanquake> Especially given that Guix doesn't work "everywhere". either hardware, or OS wise
14017:42 <jamesob> Didn't know if that was unique to cross-compilation for macOS
14117:42 <hebasto> also using guix in Ci is questionable
14217:42 <jamesob> yeah guix is very resource intensive
14317:42 <fanquake> Cool. Let's move onto #5
14417:42 <michaelfolkson> Relying solely means only supporting Guix builds?
14517:43 <fanquake> In native_llvm’s preprocessing step we rm -rf lib/libc++abi.so*:
14617:43 <fanquake> -> Why do/did we do this? (remember we target a macOS SDK when building)
14717:43 <fanquake> -> Is it actually necessary? (double-check what is ultimately copied from the tarball).
14817:43 <fanquake> michaelfolkson: yes. We cannot tell people that they need to install and use Guix if they want to compile Bitcoin Core.
14917:44 <michaelfolkson> Gotcha, yeah agreed
15017:44 <fanquake> Which means the depends system must remain generic, and useable as widely as possible.
15117:44 <fanquake> and be able to compile all binaries, hosts etc that we produce in release builds.
15217:44 <hebasto> if one makes build for `native_llvm_fetched` target, it becomes obvious that `rm` do nothing
15317:45 <fanquake> hebasto: correct. The preprocessing block as it stands is a no-op.
15417:45 <jamesob> > why `rm`: is it because we'd be linking against a linux binary interface for a macos build?
15517:46 <michaelfolkson> The arguments for not forcing Guix use are it is new(ish), users might not understand it, it is not industry standard outside of Bitcoin?
15617:46 <fanquake> jamesob: I'm going to lazily dump a few lines I prepared that may explain further
15717:46 <michaelfolkson> Or maybe it is just it has only been added to Core recently(ish)
15817:46 <fanquake> The rm was originally added in https://github.com/bitcoin/bitcoin/pull/8210, to remove any LLVM C++ ABI related objects, given at that point we were copying more files out of the lib dirs in the clang tarball. Possibly just a belt-and-suspenders thing.
15917:47 <fanquake> Likely irrelevant since https://github.com/bitcoin/bitcoin/pull/19240, fbcfcf695435c9587e9f9fd2809c4d5350b2558e, where we stopped copying any c++ libs from the clang tarball.
16017:47 <fanquake> The code in its current state is pointless / broken for 2 reasons:
16117:47 <fanquake> The only things we copy out of lib/ are `libLTO.so` and headers from lib/clang/clang-version/include, so deleting .so files from /lib in advance of that, doesn't achieve anything.
16217:47 <fanquake> Besides that, the `libc++abi.so*` objects have actually changed location inside the lib/ dir, so even if we kept the current code, it wouldn't actually remove anything anyways.
16317:48 <fanquake> michaelfolkson: there are a number of arguments for not forcing anyone that wants to compile your software to use a single package manager to do so. Happy to discuss further later.
16417:48 <hebasto> I did not find `libc++abi.so*` anywhere
16517:49 <hebasto> in fetched archive
16617:49 <fanquake> hebasto: I no-longer have the tarball handy, but iirc, the .so* had moved down a directory or two. I'll double check.
16717:50 <fanquake> We can probably move onto #6. I am hoping everyone hasn't quite fallen asleep yet.
16817:50 <fanquake> In native_llvm.mk, we copy a number of tools (i.e llvm-*, not clang or lld) out of the tarball:
16917:50 <fanquake> -> What is one of them used for?
17017:50 <fanquake> -> bonus: If we rename that tool when copying it, why might we do that?
17117:52 <hebasto> talking about tools, why we do not need llvm-as, i.e. assemble?
17217:52 <jamesob> Do we rename to avoid PATH conflicts with system tooling? e.g. `which ld` vs. `$(host)-ld`?
17317:53 <fanquake> hebasto: I'm going to say because whatever it would be doing, is being handled by something else, I'll have to get you an answer, sorry.
17417:53 <michaelfolkson> You mean pick a particular tool and say what it does?
17517:54 <michaelfolkson> (out of those tools copied)
17617:54 <fanquake> jamesob: somewhat. The reason we rename lld to ld, is to make things "simpler" for the build system, as most build systems, especially those using autotools, look for, and expect to use a linker called ld.
17717:54 <fanquake> Given we are in full control of our build system here, we can just rename lld, and have it "pretend" to be ld, for the sake of making everything work, and the build systems expecting GNU ld should mostly be none-the-wiser.
17817:55 <jamesob> right - `ld` is sort of a generic name for a linker whereas `lld` is LLVM's specific ld-compatible linker
17917:55 <fanquake> michaelfolkson: yes basically
18017:55 <michaelfolkson> Search engine llvm-install-name-tool
18117:55 <michaelfolkson> :)
18217:55 <glozow> when you say copy stuff out of the tarball, u mean like, these? https://github.com/bitcoin/bitcoin/pull/21778/files#diff-374d342fe41e2c3754a305bb1db9ba2c56f519fcd09c24cb26abba3ca64690feR19-R32
18317:55 <fanquake> The other reason we might rename tools to have the $(host)- is discussed here somewhat: https://www.gnu.org/software/automake/manual/html_node/Cross_002dCompilation.html
18417:55 <michaelfolkson> "tool to manipulate dynamic shared library install names and rpaths listed in a Mach-O binary"
18517:56 <fanquake> When cross-compiling, autotools generally looks for native (build) tools that have the target arch in the name.
18617:56 <fanquake> i.e x86_64-apple-darwin-strip
18717:56 <jamesob> ah okay, so this is sort of an autotools convention?
18817:56 <fanquake> So renaming some of the tools is also a bit of a convenience for autotools, and can prevent warning output like: "configure: WARNING: using cross tools not prefixed with host triplet".
18917:56 <michaelfolkson> What is Mach-O, gosh search engine again
19017:57 <sipa> Mac's executable file format.
19117:57 <fanquake> MACHO is the macOS exeutable file format
19217:57 <hebasto> it's macos's ELF :)
19317:57 <fanquake> to cover a couple other tools, and why we might rename them:
19417:57 <jamesob> interesting that we don't do the host-prefixing with clang/clang++, but I guess those are not platform specific?
19517:57 — michaelfolkson sweats
19617:57 <fanquake> llvm-install-name-tool -> install_name_tool as that is its "usual" name, and what other tools / build systems will look for / expect.
19717:57 <fanquake> Same for llvm-libtool-darwin -> libtool as build systems / autotools expect libtool, not libtool-darwin.
19817:57 <sipa> Windows' executable format is called PE, IIRC.
19917:58 <fanquake> correct
20017:58 <sipa> elves > machos > physical education
20117:58 <fanquake> I think we are running out of time, but if anyone has any other related thoughts / questions, feel free to throw them out.
20217:59 <fanquake> I realise this has been a bit of a whirlwind tour of macOS related things
20317:59 <michaelfolkson> Ok slightly less uninformed Concept ACK from me
20417:59 <jamesob> thanks fanquake
20517:59 <glozow> thank you fanquake!
20617:59 <emzy> thanks fanquake!
20717:59 <danielabrozzoni> thank you!
20817:59 <hebasto> fanquake: thank you
20918:00 <kouloumos_> thank you!
21018:00 <michaelfolkson> Thanks fanquake, that was great
21118:00 <sipa> thanks for hosting!
21218:00 <sipa> (and working on all this crazy stuff...)
21318:00 <fanquake> Cool, thanks everyone. I think that means we can
21418:00 <fanquake> #endmeeting