Similar to the previous commit, this commit untangles LLD integration
from the self-hosted linkers. Despite the big network of functions which
were involved, it turns out what was going on here is quite simple. The
LLD linking logic is actually very self-contained; it requires a few
flags from the `link.File.OpenOptions`, but that's really about it. We
don't need any of the mutable state on `Elf`/`Coff`/`Wasm`, for
instance. There was some legacy code trying to handle support for using
self-hosted codegen with LLD, but that's not a supported use case, so
I've just stripped it out.
For now, I've just pasted the logic for linking the 3 targets we
currently support using LLD for into this new linker implementation,
`link.Lld`; however, it's almost certainly possible to combine some of
the logic and simplify this file a bit. But to be honest, it's not
actually that bad right now.
This commit ends up eliminating the distinction between `flush` and
`flushZcu` (formerly `flushModule`) in linkers, where the latter
previously meant something along the lines of "flush, but if you're
going to be linking with LLD, just flush the ZCU object file, don't
actually link"?. The distinction here doesn't seem like it was properly
defined, and most linkers seem to treat them as essentially identical
anyway. Regardless, all calls to `flushZcu` are gone now, so it's
deleted -- one `flush` to rule them all!
The end result of this commit and the preceding one is that LLVM and LLD
fit into the pipeline much more sanely:
* If we're using LLVM for the ZCU, that state is on `zcu.llvm_object`
* If we're using LLD to link, then the `link.File` is a `link.Lld`
* Calls to "ZCU link functions" (e.g. `updateNav`) lower to calls to the
LLVM object if it's available, or otherwise to the `link.File` if it's
available (neither is available under `-fno-emit-bin`)
* After everything is done, linking is finalized by calling `flush` on
the `link.File`; for `link.Lld` this invokes LLD, for other linkers it
flushes self-hosted linker state
There's one messy thing remaining, and that's how self-hosted function
codegen in a ZCU works; right now, we process AIR with a call sequence
something like this:
* `link.doTask`
* `Zcu.PerThread.linkerUpdateFunc`
* `link.File.updateFunc`
* `link.Elf.updateFunc`
* `link.Elf.ZigObject.updateFunc`
* `codegen.generateFunction`
* `arch.x86_64.CodeGen.generate`
So, we start in the linker, take a scenic detour through `Zcu`, go back
to the linker, into its implementation, and then... right back out, into
code which is generic over the linker implementation, and then dispatch
on the *backend* instead! Of course, within `arch.x86_64.CodeGen`, there
are some more places which switch on the `link` implementation being
used. This is all pretty silly... so it shall be my next target.
The main goal of this commit is to make it easier to decouple codegen
from the linkers by being able to do LLVM codegen without going through
the `link.File`; however, this ended up being a nice refactor anyway.
Previously, every linker stored an optional `llvm.Object`, which was
populated when using LLVM for the ZCU *and* linking an output binary;
and `Zcu` also stored an optional `llvm.Object`, which was used only
when we needed LLVM for the ZCU (e.g. for `-femit-llvm-bc`) but were not
emitting a binary.
This situation was incredibly silly. It meant there were N+1 places the
LLVM object might be instead of just 1, and it meant that every linker
had to start a bunch of methods by checking for an LLVM object, and just
dispatching to the corresponding method on *it* instead if it was not
`null`.
Instead, we now always store the LLVM object on the `Zcu` -- which makes
sense, because it corresponds to the object emitted by, well, the Zig
Compilation Unit! The linkers now mostly don't make reference to LLVM.
`Compilation` makes sure to emit the LLVM object if necessary before
calling `flush`, so it is ready for the linker. Also, all of the
`link.File` methods which act on the ZCU -- like `updateNav` -- now
check for the LLVM object in `link.zig` instead of in every single
individual linker implementation. Notably, the change to LLVM emit
improves this rather ludicrous call chain in the `-fllvm -flld` case:
* Compilation.flush
* link.File.flush
* link.Elf.flush
* link.Elf.linkWithLLD
* link.Elf.flushModule
* link.emitLlvmObject
* Compilation.emitLlvmObject
* llvm.Object.emit
Replacing it with this one:
* Compilation.flush
* llvm.Object.emit
...although we do currently still end up in `link.Elf.linkWithLLD` to do
the actual linking. The logic for invoking LLD should probably also be
unified at least somewhat; I haven't done that in this commit.
* The `codegen_nav`, `codegen_func`, `codegen_type` tasks are renamed to
`link_nav`, `link_func`, and `link_type`, to more accurately reflect
their purpose of sending data to the *linker*. Currently, `link_func`
remains responsible for codegen; this will change in an upcoming
commit.
* Don't go on a pointless detour through `PerThread` when linking ZCU
functions/`Nav`s; so, the `linkerUpdateNav` etc logic now lives in
`link.zig`. Currently, `linkerUpdateFunc` is an exception, because it
has broader responsibilities including codegen, but this will be
solved in an upcoming commit.
For directory arguments that need both prefix and suffix strings
appended.
Needed to unbreak ffmpeg package after fe855691f6f742a14678cb617422977c2a55be39
I'm not convinced that some of the possibilities that these regexes allowed are real. I've literally never seen or heard of "armhfel", nor of "thumb" ever showing up in `uname -m`, etc.
* trailing whitespace
* langref: undefined _is_ materialized in all safe modes
I am also not super happy about the clause that immediately follows. I
_believe_ what we want to say here is that, simultaneously:
* undefined is guaranteed to be matrerialized in in all safe modes.
A Zig implementation that elides `ptr.* = undefined` in ReleaseSafe
mode would be a non-conforming implementation.
* A Zig program that relies on undefined being materialized is buggy.
But I don't think it's the time to engage this level of language-lawering!
Currently, Zig semantically loads an array as a temporary when indexing
it. This means it cannot be guaranteed that only the requested element
is loaded; in particular, our self-hosted backends do not elide the load
of the full array, so this test case was crashing on self-hosted.
Representing this with a `GenZir` field is incredibly bug-prone.
Instead, just pass this data directly to the relevant expression in the
very few places which actually provide a name strategy.
Resolves: #22798
Note that `openLoadArchive` already has linker script support.
With this change I get a failure parsing a real archive in the self
hosted elf linker, rather than the previous behavior of getting an error
while trying to parse a pseudo archive that is actually a load script.