This was done by regex substitution with `sed`. I then manually went
over the entire diff and fixed any incorrect changes.
This diff also changes a lot of `callconv(.C)` to `callconv(.c)`, since
my regex happened to also trigger here. I opted to leave these changes
in, since they *are* a correct migration, even if they're not the one I
was trying to do!
Whatever was in the frame pointer register prior to clone() will no longer be
valid in the child process, so zero it to protect FP-based unwinders. Similarly,
mark the link register as undefined to protect DWARF-based unwinders.
This is only zeroing the frame pointer(s) on Arm/Thumb because of an LLVM
assembler bug: https://github.com/llvm/llvm-project/issues/115891
Make shared_objects a StringArrayHashMap so that deduping does not
need to happen in flush. That deduping code also was using an O(N^2)
algorithm, which is not allowed in this codebase. There is another
violation of this rule in resolveSymbols but this commit does not
address it.
This required reworking shared object parsing, breaking it into
independent components so that we could access soname earlier.
Shared object parsing had a few problems that I noticed and fixed in
this commit:
* Many instances of incorrect use of align(1).
* `shnum * @sizeOf(elf.Elf64_Shdr)` can overflow based on user data.
* `@divExact` can cause illegal behavior based on user data.
* Strange versyms logic that wasn't present in mold nor lld. The logic
was not commented and there is no git blame information in ziglang/zig
nor kubkon/zld. I changed it to match mold and lld instead.
* Use of ArrayList for slices of memory that are never resized.
* finding DT_VERDEFNUM in a different loop than finding DT_SONAME.
Ultimately I think we should follow mold's lead and ignore this
integer, relying on null termination instead.
* Doing logic based on VER_FLG_BASE rather than ignoring it like mold
and LLD do. No comment explaining why the behavior is different.
* Mutating the original ELF symbols rather than only storing the mangled
name on the new Symbol struct.
I noticed something that I didn't try to address in this commit: Symbol
stores a lot of redundant information that is already present in the ELF
symbols. I suspect that the codebase could benefit from reworking Symbol
to not store redundant information.
Additionally:
* Add some type safety to std.elf.
* Eliminate 1-3 file system reads for determining the kind of input
files, by taking advantage of file name extension and handling error
codes properly.
* Move more error handling methods to link.Diags and make them
infallible and thread-safe
* Make the data dependencies obvious in the parameters of
parseSharedObject. It's now clear that the first two steps (Header and
Parsed) can be done during the main Compilation pipeline, rather than
waiting for flush().
The kernel does define the struct, it just doesn't use it. Yet both glibc and
musl expose it directly as their public stat struct, and std.c takes it from
std.os.linux. So just define it after all.
The compiler actually doesn't need any functional changes for this: Sema
does reification based on the tag indices of `std.builtin.Type` already!
So, no zig1.wasm update is necessary.
This change is necessary to disallow name clashes between fields and
decls on a type, which is a prerequisite of #9938.
The kernel sets r7 to 0 (success) or -1 (error), and stores the result in r2.
When r7 is -1 and the result is positive, it needs to be negated to get the
errno value that higher-level code, such as errnoFromSyscall(), expects to see.
The old code was missing the check that r2 is positive, but was also doing the
r7 check incorrectly; since it can only be set to 0 or -1, the blez instruction
would always branch.
In practice, this fix is necessary for e.g. the ENOSYS error to be interpreted
correctly. This manifested as hitting an unreachable branch when calling
process_vm_readv() in std.debug.MemoryAccessor.
What is `sparcel`, you might ask? Good question!
If you take a peek in the SPARC v8 manual, §2.2, it is quite explicit that SPARC
v8 is a big-endian architecture. No little-endian or mixed-endian support to be
found here.
On the other hand, the SPARC v9 manual, in §3.2.1.2, states that it has support
for mixed-endian operation, with big-endian mode being the default.
Ok, so `sparcel` must just be referring to SPARC v9 running in little-endian
mode, surely?
Nope:
* 40b4fd7a3e/llvm/lib/Target/Sparc/SparcTargetMachine.cpp (L226)
* 40b4fd7a3e/llvm/lib/Target/Sparc/SparcTargetMachine.cpp (L104)
So, `sparcel` in LLVM is referring to some sort of fantastical little-endian
SPARC v8 architecture. I've scoured the internet and I can find absolutely no
evidence that such a thing exists or has ever existed. In fact, I can find no
evidence that a little-endian implementation of SPARC v9 ever existed, either.
Or any SPARC version, actually!
The support was added here: https://reviews.llvm.org/D8741
Notably, there is no mention whatsoever of what CPU this might be referring to,
and no justification given for the "but some are little" comment added in the
patch.
My best guess is that this might have been some private exercise in creating a
little-endian version of SPARC that never saw the light of day. Given that SPARC
v8 explicitly doesn't support little-endian operation (let alone little-endian
instruction encoding!), and no CPU is known to be implemented as such, I think
it's very reasonable for us to just remove this support.