minFunctionAlignment() is something we can know ahead of time for any given
target because it's a matter of ABI. However, defaultFunctionAlignment() is a
matter of optimization and every backend can do it differently depending on any
number of factors. For example, LLVM will base the choice on the CPU model in
its aarch64 backend. So just don't use this value in the frontend.
defaultFunctionAlignment() can be made more sophisticated over time based on the
CPU model and/or features. For now, I've picked some reasonable values for the
CPUs that are most commonly used in practice. (Values are sourced from LLVM.)
Ideally we'd like to use whatever alignment glibc actually ends up using in the
real libc.so.6. But we don't really have a way of getting at that information at
the moment, and it's not present in the abilist files. I haven't yet seen a
symbol that wasn't word-aligned, though, so I think this should be good enough
for 99% of symbols, if not actually 100%.
This prevents LLVM from...cleverly...merging all of the global variable stub
symbols that we emit under certain circumstances. This was observed in practice
when using zig-bootstrap for arm-linux-gnueabi(hf).
Make shared_objects a StringArrayHashMap so that deduping does not
need to happen in flush. That deduping code also was using an O(N^2)
algorithm, which is not allowed in this codebase. There is another
violation of this rule in resolveSymbols but this commit does not
address it.
This required reworking shared object parsing, breaking it into
independent components so that we could access soname earlier.
Shared object parsing had a few problems that I noticed and fixed in
this commit:
* Many instances of incorrect use of align(1).
* `shnum * @sizeOf(elf.Elf64_Shdr)` can overflow based on user data.
* `@divExact` can cause illegal behavior based on user data.
* Strange versyms logic that wasn't present in mold nor lld. The logic
was not commented and there is no git blame information in ziglang/zig
nor kubkon/zld. I changed it to match mold and lld instead.
* Use of ArrayList for slices of memory that are never resized.
* finding DT_VERDEFNUM in a different loop than finding DT_SONAME.
Ultimately I think we should follow mold's lead and ignore this
integer, relying on null termination instead.
* Doing logic based on VER_FLG_BASE rather than ignoring it like mold
and LLD do. No comment explaining why the behavior is different.
* Mutating the original ELF symbols rather than only storing the mangled
name on the new Symbol struct.
I noticed something that I didn't try to address in this commit: Symbol
stores a lot of redundant information that is already present in the ELF
symbols. I suspect that the codebase could benefit from reworking Symbol
to not store redundant information.
Additionally:
* Add some type safety to std.elf.
* Eliminate 1-3 file system reads for determining the kind of input
files, by taking advantage of file name extension and handling error
codes properly.
* Move more error handling methods to link.Diags and make them
infallible and thread-safe
* Make the data dependencies obvious in the parameters of
parseSharedObject. It's now clear that the first two steps (Header and
Parsed) can be done during the main Compilation pipeline, rather than
waiting for flush().
Some compilers such as Go reference the end of a section (addr + size)
which cannot be contained in any non-zero atom (since then this atom
would exceed section boundaries). In order to facilitate this behaviour,
we create a dummy zero-sized atom at section end (addr + size).
By organizing linker diagnostics into this struct, it becomes possible
to share more code between linker backends, and more importantly it
becomes possible to pass only the Diag struct to some functions, rather
than passing the entire linker state object in. This makes data
dependencies more obvious, making it easier to rearrange code and to
multithread.
Also fix MachO code abusing an atomic variable. Not only was it using
the wrong atomic operation, it is unnecessary additional state since
the state is already being protected by a mutex.
In order to reduce the logic that happens in flush() we need to see
which data is being accessed by all this logic, so we can see which
operations depend on each other.
`check_pie_supported` only uses the `OUTPUT_VARIABLE` to to signify errors
if PIE is actually supported is signaled by `CMAKE_<lang>_LINK_PIE_SUPPORTED`.
Checking if `OUTPUT_VARIABLE` is empty is not enough either since the check
is bypassed if its results are cached but the output variable is not cached.