* Compilation.objects changes to Compilation.link_inputs, which stores
objects, archives, windows resources, shared objects, and strings
intended to be put directly into the dynamic section. Order is now
preserved between all of these kinds of linker inputs. If it is
determined the order does not matter for a particular kind of linker
input, that item should be moved to a different array.
* rename system_libs to windows_libs
* untangle library lookup from CLI types
* when doing library lookup, instead of using access syscalls, go ahead
and open the files and keep the handles around for passing to the
cache system and the linker.
* during library lookup and cache file hashing, use positioned reads to
avoid affecting the file seek position.
* library directories are opened in the CLI and converted to Directory
objects, with warnings emitted for those that cannot be opened.
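
A minimal sketch of the positioned-read idea, in Python rather than Zig and assuming a POSIX system where `os.pread` is available. The helper name `read_at` and the fake file contents are made up for illustration; the point is only that the seek position of the shared handle is untouched after the read.

```python
import os
import tempfile


def read_at(fd: int, length: int, offset: int) -> bytes:
    """Read up to `length` bytes at `offset` without moving the seek position."""
    return os.pread(fd, length, offset)


def demo():
    # Hypothetical input: a file that merely starts with the ELF magic bytes.
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"\x7fELF" + b"\x00" * 60)
        tmp.flush()
        fd = os.open(tmp.name, os.O_RDONLY)
        try:
            # Pretend another consumer of the handle left the cursor at byte 10.
            os.lseek(fd, 10, os.SEEK_SET)
            before = os.lseek(fd, 0, os.SEEK_CUR)
            magic = read_at(fd, 4, 0)
            after = os.lseek(fd, 0, os.SEEK_CUR)
        finally:
            os.close(fd)
    return magic, before, after
```

Because the read carries its own offset, the same open handle can be passed on to the cache system and the linker without any of them having to restore the cursor.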
The compiler defaults this value to off so that users whose system
shared libraries are all ELF files don't have to pay the cost of
checking every file to find out if it is a text file instead.
When a GNU ld script is encountered, the error message instructs users
about the CLI flag that will immediately solve their problem.
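
To illustrate the cost being avoided, here is a rough Python sketch of telling an ELF file apart from a text file by its first bytes. The function name and the "decodes as ASCII" heuristic are assumptions made for this example, not the compiler's actual check.

```python
ELF_MAGIC = b"\x7fELF"


def classify_library_header(header: bytes) -> str:
    """Classify the first bytes of a candidate library file.

    Returns "elf", "text" (possibly a linker script), or "unknown".
    """
    if not header:
        return "unknown"
    if header.startswith(ELF_MAGIC):
        return "elf"
    try:
        # Assumption for this sketch: anything that decodes as ASCII is text.
        header.decode("ascii")
    except UnicodeDecodeError:
        return "unknown"
    return "text"
```

Even a check this small requires opening and reading every candidate file, which is the cost the default-off setting spares users whose system libraries are all ELF.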
along with the relevant logic, making the libraries within subject to
the same search criteria as all the other libraries.
This unfortunately means doing file system access on all .so files when
targeting ELF to determine whether they are linker scripts. However, I
have a plan to address this.
Thread sanitizer reports data races here when running test-link. I tried
removing only the ones that triggered races, but after 10 rounds of back
and forth with the compiler and tsan, I got impatient and removed all of
them.
Next time, let's be sure the test suite runs tsan-clean before merging
any changes that add parallelism.
After this commit, `zig build test-link` completes without any tsan
warnings.
closes #21778
This was the cause of aarch64-windows shared libraries causing "bad image" errors
during load-time linking. I also re-enabled the tests that were surfacing this bug.
The old `CallingConvention` type is replaced with the new
`NewCallingConvention`. References to `NewCallingConvention` in the
compiler are updated accordingly. In addition, a few parts of the
standard library are updated to use the new type correctly.
This commit begins implementing accepted proposal #21209 by making
`std.builtin.CallingConvention` a tagged union.
The stage1 dance here is a little convoluted. This commit introduces the
new type as `NewCallingConvention`, keeping the old `CallingConvention`
around. The compiler uses `std.builtin.NewCallingConvention`
exclusively, but when fetching the type from `std` when running the
compiler (e.g. with `getBuiltinType`), the name `CallingConvention` is
used. This allows a prior build of Zig to be used to build this commit.
The next commit will update `zig1.wasm`, and then the compiler and
standard library can be updated to completely replace
`CallingConvention` with `NewCallingConvention`.
The second half of #21209 is to remove `@setAlignStack`, which will be
implemented in another commit after updating `zig1.wasm`.
Make shared_objects a StringArrayHashMap so that deduping does not
need to happen in flush. That deduping code was also using an O(N^2)
algorithm, which is not allowed in this codebase. There is another
violation of this rule in resolveSymbols but this commit does not
address it.
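
The idea of deduplicating at insertion time can be sketched in Python, whose `dict` preserves insertion order much like an array hash map. The function name and the (soname, path) tuples are hypothetical; this is not the linker's data layout.

```python
def dedupe_by_soname(inputs):
    """Keep the first shared object seen for each soname, preserving order.

    `inputs` is an iterable of (soname, path) pairs. Each insertion is an
    expected O(1) hash lookup, so the whole pass is O(N) rather than O(N^2).
    """
    seen = {}
    for soname, path in inputs:
        # setdefault only stores the value if the key is not present yet.
        seen.setdefault(soname, path)
    return list(seen.items())
```

Since duplicates are rejected as they arrive, no separate deduplication pass is needed later.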
This required reworking shared object parsing, breaking it into
independent components so that we could access soname earlier.
Shared object parsing had a few problems that I noticed and fixed in
this commit:
* Many instances of incorrect use of align(1).
* `shnum * @sizeOf(elf.Elf64_Shdr)` can overflow based on user data.
* `@divExact` can cause illegal behavior based on user data.
* Strange versyms logic that wasn't present in mold nor lld. The logic
was not commented and there is no git blame information in ziglang/zig
nor kubkon/zld. I changed it to match mold and lld instead.
* Use of ArrayList for slices of memory that are never resized.
* Finding DT_VERDEFNUM in a different loop than finding DT_SONAME.
Ultimately I think we should follow mold's lead and ignore this
integer, relying on null termination instead.
* Doing logic based on VER_FLG_BASE rather than ignoring it like mold
and LLD do. No comment explaining why the behavior is different.
* Mutating the original ELF symbols rather than only storing the mangled
name on the new Symbol struct.
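
As an illustration of the two arithmetic hazards above, the following Python sketch validates values that come from an untrusted file before using them. Python integers do not overflow, so the check is expressed as a bounds test against the file size; a fixed-width implementation would use checked arithmetic to the same effect. The function names are made up for this example.

```python
ELF64_SHDR_SIZE = 64  # size in bytes of one Elf64_Shdr entry


def section_table_range(shoff: int, shnum: int, file_size: int):
    """Return the (start, end) byte range of the section header table.

    Raises ValueError instead of trusting header fields that point
    outside the file.
    """
    if shoff < 0 or shnum < 0:
        raise ValueError("negative header field")
    end = shoff + shnum * ELF64_SHDR_SIZE
    if end > file_size:
        raise ValueError("section header table extends past end of file")
    return shoff, end


def entry_count(table_size: int, entry_size: int) -> int:
    """Divide only when the division is exact; report malformed input otherwise."""
    if entry_size <= 0:
        raise ValueError("invalid entry size")
    if table_size % entry_size != 0:
        raise ValueError("table size is not a multiple of the entry size")
    return table_size // entry_size
```

In both helpers, malformed input becomes an ordinary error that can be reported as a linker diagnostic rather than undefined or illegal behavior.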
I noticed something that I didn't try to address in this commit: Symbol
stores a lot of redundant information that is already present in the ELF
symbols. I suspect that the codebase could benefit from reworking Symbol
to not store redundant information.
Additionally:
* Add some type safety to std.elf.
* Eliminate 1-3 file system reads for determining the kind of input
files, by taking advantage of file name extension and handling error
codes properly.
* Move more error handling methods to link.Diags and make them
infallible and thread-safe.
* Make the data dependencies obvious in the parameters of
parseSharedObject. It's now clear that the first two steps (Header and
Parsed) can be done during the main Compilation pipeline, rather than
waiting for flush().
Some compilers such as Go reference the end of a section (addr + size)
which cannot be contained in any non-zero-sized atom (since then this atom
would exceed section boundaries). In order to facilitate this behaviour,
we create a dummy zero-sized atom at section end (addr + size).
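
A small Python model of why the dummy atom helps, assuming atoms cover half-open ranges `[addr, addr + size)`. The `Atom` class and helper names are hypothetical and far simpler than the linker's real structures.

```python
from dataclasses import dataclass


@dataclass
class Atom:
    addr: int
    size: int


def build_atoms(section_addr: int, sizes):
    """Lay out atoms back to back, then append a zero-sized atom at section end."""
    atoms = []
    addr = section_addr
    for size in sizes:
        atoms.append(Atom(addr, size))
        addr += size
    atoms.append(Atom(addr, 0))  # dummy atom at addr + size of the section
    return atoms


def find_atom(atoms, addr: int):
    """Return the atom that a reference to `addr` resolves to, or None."""
    for atom in atoms:
        if atom.addr <= addr < atom.addr + atom.size:
            return atom
        if atom.size == 0 and atom.addr == addr:
            return atom
    return None
```

Without the trailing zero-sized atom, a reference to the section end would match no atom at all, because every sized atom ends strictly before that address.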
By organizing linker diagnostics into this struct, it becomes possible
to share more code between linker backends, and more importantly it
becomes possible to pass only the Diag struct to some functions, rather
than passing the entire linker state object in. This makes data
dependencies more obvious, making it easier to rearrange code and to
multithread.
Also fix MachO code abusing an atomic variable. Not only was it using
the wrong atomic operation, it is unnecessary additional state since
the state is already being protected by a mutex.
In order to reduce the logic that happens in flush() we need to see
which data is being accessed by all this logic, so we can see which
operations depend on each other.
Embrace the Path abstraction, doing more operations based on directory
handles rather than absolute file paths. Most of the diff noise here
comes from this one.
Fix sorting of crtbegin/crtend atoms. Previously it would look at all
path components for those strings.
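
A sketch of the difference, in Python and with made-up paths: matching on the whole path misclassifies any object that happens to live under a directory whose name contains "crtbegin" or "crtend", whereas matching on the final component does not. The prefix test and the ranking values are assumptions for illustration only.

```python
import posixpath


def crt_rank(path: str) -> int:
    """Sort key: crtbegin objects first, crtend objects last, others in between.

    Only the final path component is inspected.
    """
    name = posixpath.basename(path)
    if name.startswith("crtbegin"):
        return -1
    if name.startswith("crtend"):
        return 1
    return 0


def sort_objects(paths):
    # sorted() is stable, so objects with equal rank keep their input order.
    return sorted(paths, key=crt_rank)
```

For example, `/home/user/crtbegin-project/main.o` is ranked as an ordinary object here, even though one of its directory components contains the string.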
Make the C runtime path detection partially a pure function, and move
some logic to glibc.zig where it belongs.
The initAtoms function now uses the `elf_file` parameter only for
reporting linker error messages. This shows that the function has no
data dependencies other than the Object struct itself, which makes it
easier to parallelize or otherwise move that logic around.
Also removed an indirect call via `addExtra` since we already know the
atom's file is the current Object instance. All calls to `Atom.addExtra`
should be audited for similar reasons.
Also removed unjustified use of `inline fn`.
Special symbols include symbols explicitly forced undefined via the -u
flag, a missing entry point symbol, a missing 'dyld_stub_binder' symbol,
or a missing '_objc_msgSend' symbol.