Future improvement: make plain error notes actually render as notes
rather than errors, but keep them as errors for the case of
sub-compilation errors, e.g. when compiler-rt has compilation errors.
This is a prelude to a more elaborate work which will implement
`-dead_strip` flag - garbage collection of unreachable atoms. Here,
when sorting sections, we also check that the section is actually
populated with some atoms, and if not, we exclude it from the final
linked image. This can happen when we do not import any symbols
from dynamic libraries in which case we will not be populating
the stubs sections or the GOT table, implying we can skip allocating
those sections. Furthermore, we also make a check that a segment
is actually occupied too, with the exception of `__TEXT` segment
which is non-optional given that it wraps the header and load commands
and thus is required by the `dyld` to perform dynamic linking, and
`__PAGEZERO` which is generally non-optional when the linked image
is an executable. For any other segment, if its section count is
zero, we mark it as dead and skip allocating it and generating
a load command for it.
This commit also includes some minor improvements to the linker such
as refactoring of the segment allocating codepaths, skipping
`__PAGEZERO` generation for dylibs, and skipping generation of zero-sized
atoms for special symbols such as `__mh_execute_header` and `___dso_handle`.
These special symbols are only allocated local and global symbol pair
and their VM addresses is set to the start of the `__TEXT` segment,
but no `Atom` is created, as it's not necessary given that they never
carry any machine code.
Finally, we now always force-link against `libSystem` which turns out
to be required for `dyld` to properly handle `LC_MAIN` load command
on older macOS versions such as 10.15.7.
MachO linker now handles `-needed-l<name>`, `-needed_library=<name>`
and `-needed_framework=<name>`. While on macOS `-l` is equivalent
to `-needed-l`, and `-framework` to `-needed_framework`, it can be
used to the same effect as on Linux if combined with `-dead_strip_dylibs`.
This commit also adds handling for `-needed_library` which is macOS
specific flag only (in addition to `-needed-l`).
Finally, in order to leverage new linker testing harness, this commit
added ability to specify lowering to those flags via `build.zig`:
`linkSystemLibraryNeeded` (and related), and `linkFrameworkNeeded`.
We now ensure the "bss" section is last, which allows us to not
emit this section and let the runtime initialize the memory with 0's instead.
This allows for smaller binaries.
The order of the other segments is arbitrary and does not matter, this may
change in the future.
Includes both traditiona and incremental codepaths with one caveat that
in incremental case, the requested size cannot be smaller than the
default padding size due to prealloc required due to incremental nature
of linking.
Also parse `-headerpad_max_install_names`, however, not actionable just yet -
missing implementation.
Decls will now be put into their respective segment.
e.g. a constant decl will be inserted into the "rodata" segment,
whereas an uninitialized decl will be put in the "bss" segment instead.
This adds clarification to the getGlobalSymbol doc comments,
as well as renames the `addExternFn` function for MachO to `getGlobalSymbol`.
This function will now be called from 'src/link.zig' as well.
Finally, this also enables compiling zig's libc using LLVM even though
the `fno-LLVM` flag is given.
`genFunctype` now accepts calling convention, param types, and return type
as part of its function signature rather than `fnData`. This means
we no longer have to create a dummy for our intrinsic call abstraction.
This also adds support for f16 division and builtins such as `@ceil` & more.
Rather than checking if the user wants to use LLVM for the current compilation,
check for the existance of LLVM as part of the compiler. This is temporarily,
until other backends gain the ability to compiler LLVM themselves.
This means that when a user passed `-fno-LLVM` we will use the native
backend for the user's code, but use LLVM for compiler-rt.
This also fixes emitting names for symbols in the Wasm linker,
by deduplicating symbol names when multiple symbols point the same object.
Rather than finding the original object file, we seekTo to the
object file's position within the archive file, and from there open
a new file handle. This file handle is passed to the `Object` parser
which will create the object.
When a new symbol is resolved to an existing symbol where
it doesn't overwrite the existing symbol, we now add this symbol
to the discarded list. This is required so when any relocation points
to the symbol, we can retrieve the correct symbol it's resolved by instead.
When performing relocations for a type index,
we first check if the target symbol is undefined. In which case,
we will obtain the type from the `import` rather than look into the
`functions` table.
This implements a very basic archive file parser that validates
the magic bytes, and then parses the symbol table and stores
the symbol and their position.
Multiple symbols can point to the same function, this means that when we loop over
the symbol list, we must deduplicate those functions being added twice.
Additionaly, we must also ensure that when we append a new type and set the type
index on a function, we must not do this again for the same function.
This commit also implements sorting of code atoms to ensure their order matches
the order of the function section to ensure the function signature matches
that of the function body.
Implements the creation of an undefined symbol for a compiler-rt intrinsic.
Also implements the building of the function call to said compiler-rt intrinsic.
If page aligned requested pagezero size is 0, skip generating
__PAGEZERO segment.
Add misc improvements to the pipeline, and correctly transfer the
requested __PAGEZERO size to the linker.
Pass `-pagezero_size` to the MachO linker. This is the final
"unsupported linker arg" that I could chase that CGo uses. After this
and #11874 we may be able to fail on an "unsupported linker arg" instead
of emiting a warning.
Test case:
zig=/code/zig/build/zig
CGO_ENABLED=1 GOOS=darwin GOARCH=amd64 CC="$zig cc -target x86_64-macos" CXX="$zig c++ -target x86_64-macos" go build -a -ldflags "-s -w" cgo.go
I compiled a trivial CGo program and executed it on an amd64 Darwin
host.
To be honest, I am not entirely sure what this is doing. This feels
right after reading what this argument does in LLVM sources, but I am by
no means qualified to make MachO pull requests. Will take feedback.
The function `writeDbgInfoNopsBuffered` was based on the function
`pwriteDbgInfoNops`, originally written by me, and then modified to
write to a memory buffer instead of an open file. When writing to a
file, any extra bytes beyond the end of the file extend the size of
the file, and the function body of `pwriteDbgInfoNops` takes advantage
of this when `next_padding_bytes` causes the write to go beyond the
end of the file. However, when writing to a memory buffer, the
underlying array list must be expanded if the write would cause the
buffer to expand.
Note that the current documentation for the `-z noexecstack` is
incorrect. This indicates that an object *does not* require an
executable stack.
This is actually the default of LLD, and there has never been a way to
override this default by passing `-z execstack` to LLD.
This commit removes the redundant `-z noexecstack` option from
zig build-exe/build-lib/build-obj and ignores the option if passed
to zig cc for compatibility.
As far as I can tell, there is no reason for code to require an
executable stack. This option only exists because the stack was
originally executable by default and some programs came to depend
on that behavior. Instead, mprotect(2) may be used to make memory
pages executable.
Full RELRO is a hardening feature that makes it impossible to perform
certian attacks involving overwriting parts of the Global Offset Table
to invoke arbitrary code.
It requires all symbols to be resolved before execution of the program
starts which may have an impact on startup time. However most if
not all popular Linux distributions enable full RELRO by default for
all binaries and this does not seem to make a noticeable difference
in practice.
"Partial RELRO" is equivalent to `-z relro -z lazy`.
"Full RELRO" is equivalent to `-z relro -z now`.
LLD defaults to `-z relro -z lazy`, which means Zig's current `-z relro`
option has no effect on LLD's behavior.
The changes made by this commit are as follows:
- Document that `-z relro` is the default and add `-z norelro`.
- Pass `-z now` to LLD by default to enable full RELRO by default.
- Add `-z lazy` to disable passing `-z now`.
Split type relocs into two kinds: local and global. Global relocs
use a global type resolver and calculate offset to the existing
definition of a type abbreviation.
Local relocs use offset in the abbrev section of the containing
atom plus addend to generate a local relocation.
gitrev kubkon/zig-yaml 8cf8dc3bb901fac8189f441392fc0989ad14cf71
Calculate line and col info indexed by token index. We can then
re-use this info to track current column number (aka indentation
level) of each "key:value" pair (map) or "- element" (list).
This significantly cleans up the code, and leads naturally to
handling of unindented lists in tbd files.