In `getDeclVAddr`, it may happen that the target `Decl` has not yet
been allocated space in virtual memory. In this case, we store a
relocation in the linker-global table, which we iterate over when
flushing the module to fill in any missing addresses in the final
binary. As an optimisation, if the address is already resolved at the
time `getDeclVAddr` is called, we skip recording a relocation for this
atom.
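A minimal, self-contained sketch of that fast path; the names (`VAddrTable`, `Reloc`, the simplified `getDeclVAddr` signature) are illustrative assumptions, not the real linker code:
```zig
const std = @import("std");

/// Illustrative model: a pending fixup to be patched when the module is flushed.
const Reloc = struct { target_decl: u32, offset: u64 };

const VAddrTable = struct {
    /// Decl index -> resolved virtual address.
    resolved: std.AutoHashMapUnmanaged(u32, u64) = .{},
    /// Fixups to iterate over when flushing the module.
    relocs: std.ArrayListUnmanaged(Reloc) = .{},

    /// If the target Decl already has an address, return it and record nothing;
    /// otherwise queue a relocation to be filled in in the final binary.
    fn getDeclVAddr(self: *VAddrTable, gpa: std.mem.Allocator, target_decl: u32, offset: u64) !u64 {
        if (self.resolved.get(target_decl)) |addr| return addr;
        try self.relocs.append(gpa, .{ .target_decl = target_decl, .offset = offset });
        return 0;
    }
};
```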
This commit also adds the glue code for lowering const slices in
the ARM backend.
We need to pad out the file to the required maximum size, equal to the
final section's offset plus that section's size. We only need to do
this when populating the initial metadata and only when the section
header was updated.
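A hedged sketch of that size computation; `padToFinalSection` and its parameters are illustrative, not the actual linker code:
```zig
const std = @import("std");

/// Illustrative only: grow the output file so it covers the final section.
fn padToFinalSection(file: std.fs.File, final_sect_offset: u64, final_sect_size: u64) !void {
    const required_size = final_sect_offset + final_sect_size;
    try file.setEndPos(required_size);
}
```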
* link: add a virtual function `lowerUnnamedConsts`, similar to
`updateFunc` or `updateDecl`, which needs to be implemented by each
linker backend in order to be usable from the `CodeGen` code
* elf: implement `lowerUnnamedConsts` specialization where we
lower unnamed constants to `.rodata` section. We keep track of the
atoms encompassing the lowered unnamed consts in a global table
indexed by parent `Decl`. When the `Decl` is updated or destroyed,
we clear the unnamed consts referenced within the `Decl` (see the
sketch after this list).
* macho: implement `lowerUnnamedConsts` specialization where we
lower unnamed constants to `__TEXT,__const` section. We keep track of the
atoms encompassing the lowered unnamed consts in a global table
indexed by parent `Decl`. When the `Decl` is updated or destroyed,
we clear the unnamed consts referenced within the `Decl`.
* x64: split `MCValue.linker_sym_index` into two `MCValue`s: `.got_load` and
`.direct_load`. The former signifies to the emitter that it should
emit a GOT load relocation, while the latter signifies that it should
emit a direct load (`SIGNED`) relocation.
* x64: lower `struct` instantiations
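Since the ELF and Mach-O specializations share the same bookkeeping, here is a minimal, self-contained sketch of the per-`Decl` table of unnamed-const atoms; all names (`UnnamedConstTable`, `track`, `freeUnnamedConsts`, the index types) are illustrative assumptions, not the real linker API:
```zig
const std = @import("std");

/// Illustrative model of the global table described above: atoms created for
/// unnamed consts, keyed by the index of the parent `Decl`.
const UnnamedConstTable = struct {
    map: std.AutoHashMapUnmanaged(u32, std.ArrayListUnmanaged(u32)) = .{},

    /// Record an atom lowered on behalf of `decl_index`.
    fn track(self: *UnnamedConstTable, gpa: std.mem.Allocator, decl_index: u32, atom_index: u32) !void {
        const gop = try self.map.getOrPut(gpa, decl_index);
        if (!gop.found_existing) gop.value_ptr.* = .{};
        try gop.value_ptr.append(gpa, atom_index);
    }

    /// Called when the parent `Decl` is updated or destroyed: drop every
    /// unnamed-const atom that was lowered on its behalf.
    fn freeUnnamedConsts(self: *UnnamedConstTable, gpa: std.mem.Allocator, decl_index: u32) void {
        if (self.map.getPtr(decl_index)) |atoms| {
            atoms.deinit(gpa);
            _ = self.map.remove(decl_index);
        }
    }
};
```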
In accordance with the requesting issue (#10750):
- `zig test` skips any tests that it cannot spawn, returning success
- `zig run` and `zig build` exit with failure, reporting the command that cannot be run
- `zig clang`, `zig ar`, etc. already punt directly to the appropriate clang/lld main(), even before this change
- Native `libc` detection is not supported
Additionally, `exec()` and related Builder functions error at run-time, reporting the command that cannot be run
Clarify that `astgen.advanceSourceCursor` already increments the absolute
line and column numbers; `GenZir.calcLine` is thus not only obsolete but
wrong by design.
Incidentally, this cleanup allows the `FnDecl` line numbers used for DWARF
to be specified correctly as values relative to the start of the parent
`Decl`. That `Decl` in turn has its line number information specified
relative to its parent `Decl`, and so on, until we reach the global scope.
This is only relevant for ELF files.
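A hypothetical illustration of the relative numbering (the declarations are made up; only the numbering scheme comes from this change):
```zig
const S = struct {         // the line of `S` is stored relative to its parent scope
    fn inner() void {}     // the line of `inner` is stored as +1, relative to the start of `S`
};
```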
I also fixed a bug where passing a zig source file to `zig cc` would
incorrectly punt to clang because it thought there were no positional
arguments.
Implements slice types, including `[]const u8`, passed as formal
parameters in DWARF. Breaking on a function accepting a slice in `gdb`
will now yield the same behavior as stage1 and/or the LLVM backend:
```zig
fn sumArrayLens(a: []const u32, b: []const u8) usize {
    return a.len + b.len;
}
```
Both `a` and `b` can now be inspected in the debugger:
```
Breakpoint 1, sumArrayLens (a=..., b=...) at arr.zig:59
(gdb) p a
$1 = {ptr = 0x7fffffff685c, len = 5}
(gdb) p b
$2 = {ptr = 0x7fffffff683d "\252\252\252\\h\377\377\377\177", len = 3}
(gdb)
```
Augment the relocation tracking mechanism to de-duplicate the creation
of base as well as composite types while unrolling composite types in
the linker. There is still potential for further space optimisation by
moving all type information into a separate `.debug_types` section and
providing references to entries within that section whenever required
(e.g., via the `ref4` form). Currently, we duplicate type definitions
on a per-decl basis.
Anyhow, with this patch, an example function with the following
signature:
```zig
fn byPtrPtr(ptr_ptr_x: **u32, ptr_x: *u32) void {
    ptr_ptr_x.* = ptr_x;
}
```
will generate the following `.debug_info` for formal parameters:
```
<1><1aa>: Abbrev Number: 3 (DW_TAG_subprogram)
<1ab> DW_AT_low_pc : 0x8000197
<1b3> DW_AT_high_pc : 0x2c
<1b7> DW_AT_name : byPtrPtr
<2><1c0>: Abbrev Number: 7 (DW_TAG_formal_parameter)
<1c1> DW_AT_location : 1 byte block: 55 (DW_OP_reg5 (rdi))
<1c3> DW_AT_type : <0x1df>
<1c7> DW_AT_name : ptr_ptr_x
<2><1d1>: Abbrev Number: 7 (DW_TAG_formal_parameter)
<1d2> DW_AT_location : 1 byte block: 54 (DW_OP_reg4 (rsi))
<1d4> DW_AT_type : <0x1e4>
<1d8> DW_AT_name : ptr_x
<2><1de>: Abbrev Number: 0
<1><1df>: Abbrev Number: 5 (DW_TAG_pointer_type)
<1e0> DW_AT_type : <0x1e4>
<1><1e4>: Abbrev Number: 5 (DW_TAG_pointer_type)
<1e5> DW_AT_type : <0x1e9>
<1><1e9>: Abbrev Number: 4 (DW_TAG_base_type)
<1ea> DW_AT_encoding : 7 (unsigned)
<1eb> DW_AT_byte_size : 4
<1ec> DW_AT_name : u32
```
AstGen:
* rename the known_has_bits flag to known_non_opv so that it better
reflects what it actually means.
* add a known_comptime_only flag.
* make the flags take advantage of the identifiers of primitives and the
fact that Zig has no shadowing.
* correct the known_non_opv flag for function bodies.
Sema:
* Rename `hasCodeGenBits` to `hasRuntimeBits` to better reflect what it
does.
- This function got a bit more complicated in this commit because of
the duality of function bodies: on one hand they have runtime bits,
but on the other hand they are required to be comptime-known.
* WipAnonDecl now takes a LazySrcLoc parameter and performs the type
resolutions that it needs during finish().
* Implement comptime `@ptrToInt` (see the example after this list).
Codegen:
* Improved handling of lowering decl_ref; make it work for
comptime-known ptr-to-int values.
- This same change had to be made many different times; perhaps we
should look into merging the implementations of `genTypedValue`
across x86, arm, aarch64, and riscv.
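As a usage example for the comptime `@ptrToInt` item above (the test itself is illustrative and not part of the change):
```zig
const std = @import("std");

test "comptime @ptrToInt" {
    // The operand is a comptime-known pointer, so @ptrToInt also folds at comptime.
    const addr = comptime @ptrToInt(@intToPtr(*u8, 0x1000));
    try std.testing.expect(addr == 0x1000);
}
```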
If there is a big atom available for re-use in the free list, and it is
the last atom in the section, its ideal capacity might span the entire
section, in which case we do not want to calculate the actual end VM
addr of the symbol since it may overflow. Instead, we just take the
maximum available capacity as the end VM addr estimate, which in this
case equals `std.math.maxInt(u64)`.
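A minimal sketch of that estimate; the function and parameter names are illustrative:
```zig
const std = @import("std");

/// Illustrative only: when the free-list atom is the last one in its section,
/// `start + capacity` could overflow, so clamp the end estimate to maxInt(u64).
fn atomEndVAddrEstimate(start_vaddr: u64, capacity: u64, is_last_in_section: bool) u64 {
    if (is_last_in_section) return std.math.maxInt(u64);
    return start_vaddr + capacity;
}
```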
* rename `entry` to `entry_symbol_name` for the zig build API
* integrate with `zig cc` command line options
* integrate with COFF linking with LLD
* integrate with self-hosted ELF linker
* don't put it in the hash for MachO since it is ignored
This commit fixes two problems:
* `zig build-obj` regressed from the cache-mode branch. It would crash
because it assumed that dirname on the emit bin path would not be
null. This assumption was invalid when outputting to the current
working directory - a pretty common use case for `zig build-obj`.
* When using the LLVM backend, `-fno-emit-bin` combined with any other
kind of emitting, such as `-femit-asm`, emitted nothing.
Both issues are now fixed.
Doc comments reproduced here:
This function is called by the frontend before flush(). It communicates that
`options.bin_file.emit` directory needs to be renamed from
`[zig-cache]/tmp/[random]` to `[zig-cache]/o/[digest]`.
The frontend would like to simply perform a file system rename; however,
some linker backends care about the file paths of the objects they are linking.
So this function call tells linker backends to rename the paths of object files
to observe the new directory path.
Linker backends which do not have this requirement can fall back to the simple
implementation at the bottom of this function.
This function is only called when CacheMode is `whole`.
This solves stack trace regressions on Windows and macOS because the
linker backends do not observe object file paths until flush().
The two CacheMode values are `whole` and `incremental`.
`incremental` is what we had before; `whole` is new.
Whole cache mode uses everything as input to the cache hash, and when a
hit occurs it skips everything, including linking.
This is ideal for when source files change rarely and for backends that
do not have good incremental compilation support, for example
compiler-rt or libc compiled with LLVM with optimizations on.
This is the main motivation for the additional mode, so that we can have
LLVM-optimized compiler-rt/libc builds, without waiting for the LLVM
backend every single time Zig is invoked.
Incremental cache mode hashes only the input file path and a few target
options, intentionally relying on collisions to locate already-existing
build artifacts which can then be incrementally updated.
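Sketched as a bare enum for reference (the real declaration lives in the compiler; the doc comments only restate the behavior described above):
```zig
const CacheMode = enum {
    /// Hash only the root source path and a few target options, intentionally
    /// relying on collisions to find an artifact that can be updated in place.
    incremental,
    /// Hash every input; on a hit, skip everything, including linking.
    whole,
};
```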
The bespoke logic for caching stage1 backend build artifacts
is removed since we now have a global caching mechanism for
when we want to cache the entire compilation, *including* linking.
Previously we had to get "creative" with libs.txt and a special
byte in the hash id to communicate flags, so that when the cached
artifacts were re-linked, we had this information from stage1
even though we didn't actually run it. Now that `CacheMode.whole`
includes linking, this extra information does not need to be
preserved for cache hits. So although this changeset introduces
complexity, it also removes complexity.
The main trickiness here comes from the inherent differences between the
two modes: `incremental` wants a directory immediately to operate on,
while `whole` doesn't know the output directory until the compilation is
complete. This commit deals with this problem mostly inside `update()`,
where, on a cache miss, it replaces `zig_cache_artifact_directory` with a
temporary directory, and then renames it into place once the compilation is
complete.
Items remaining before this branch can be merged:
* [ ] make sure these things make it into the cache manifest:
- @import files
- @embedFile files
- we already add dep files from c but make sure the main .c files make
it in there too, not just the included files
* [ ] double check that the emit paths of other things besides the binary
are working correctly.
* [ ] test `-fno-emit-bin` + `-fstage1`
* [ ] test `-femit-bin=foo` + `-fstage1`
* [ ] implib emit directory copies bin_file_emit directory in create() and needs
to be adjusted to be overridden as well.
* [ ] make sure emit-h is handled correctly in the cache hash
* [ ] Cache: detect duplicate files added to the manifest
Some preliminary performance measurements of wall clock time and
peak RSS used:
stage1 behavior (1077 tests), llvm backend, release build:
* cold global cache: 4.6s, 1.1 GiB
* warm global cache: 3.4s, 980 MiB
stage2 master branch behavior (575 tests), llvm backend, release build:
* cold global cache: 0.62s, 191 MiB
* warm global cache: 0.40s, 128 MiB
stage2 this branch behavior (575 tests), llvm backend, release build:
* cold global cache: 0.62s, 179 MiB
* warm global cache: 0.27s, 90 MiB
Allocate a new program header and a new section to accommodate the read-only data
section ".rodata".
Split the single TextBlock list into multiple TextBlockLists, to place decls in
different sections.
If a Decl is not a function, it is added to the .rodata section.
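A hedged sketch of the routing rule; the helper is illustrative, not the actual linker code:
```zig
/// Illustrative only: functions keep going to ".text"; any other Decl is
/// placed in the new read-only data section.
fn sectionNameForDecl(is_function: bool) []const u8 {
    return if (is_function) ".text" else ".rodata";
}
```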
Due to a deficiency in LLD, we need to special-case BPF to a simple
file copy when generating relocatables. Normally, we would expect
`lld -r` to work. However, because LLD wants to resolve BPF relocations
which it shouldn't, it fails before even generating the relocatable.
Co-authored-by: Matthew Knight <mattnite@protonmail.com>
In 7e23b3245a9bf6e002009e6c18c10a9995671afa I made -O flags to the
linker emit a warning that the argument does nothing. That was not
correct, however; LLD does have some logic that does different things
depending on -O0, -O1, and -O2. It defaults to -O1, does fewer
optimizations with -O0, and more with -O2.
With this commit, e.g. `-Wl,-O1` is supported by the `zig cc` frontend,
and by default we pass `-O0` to LLD in debug mode, and `-O3` in release
modes.
I also fixed a bug in the LLD ELF linker line which was incorrectly
passing `-O` flags instead of `--lto-O` flags for LTO.
This mechanism for sending arbitrary linker args to LLD has no place in
the Zig frontend, because our goal is for the frontend to understand all
the arguments and not treat linker args like a black box.
For example, we have self-hosted linking in addition to LLD, so we want
the options to make sense for both linking codepaths, not just the LLD one.
Passing -O linker args will now result in a warning that the arg does
nothing.
After this change, the default for dynamic libraries (`-l` or
`--library`) is to only link them if they end up being actually used.
With the Zig CLI, the new options `-needed-l` or `--needed-library` can
be used to force link against a dynamic library.
With `zig cc`, this behavior can be overridden with `-Wl,--no-as-needed`
(and restored with `-Wl,--as-needed`).
Closes #10164
Previously, we confused callee-saved with caller-saved registers
(the actual register sets were swapped). This commit fixes that
for both the `.x86` and `.x86_64` native backends.
This commit also fixes the register allocation logic in `genBinMathOp`
for the `.x86_64` native backend: in a situation where we need to spill
a register, we would end up spilling the register that is already
involved in the instruction as the other operand. In such a case, we now
make a note of this and spill the next candidate register instead.
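A minimal sketch of the fixed selection rule; names and the register representation are illustrative:
```zig
/// Illustrative only: never spill the register that already holds the other
/// operand of the instruction currently being lowered.
fn pickSpillVictim(candidates: []const u8, other_operand_reg: u8) ?u8 {
    for (candidates) |reg| {
        if (reg != other_operand_reg) return reg;
    }
    return null;
}
```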
Because ArrayList.initCapacity uses 'precise' capacity allocation, this should save memory on average, and will definitely save memory in cases where an ArrayList is used where a regular allocated slice could also have been used.
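For reference, a small illustrative use of the API in question:
```zig
const std = @import("std");

test "initCapacity reserves exactly what is asked for" {
    // initCapacity requests space for exactly 16 items up front, instead of
    // growing geometrically through repeated append() calls.
    var list = try std.ArrayList(u32).initCapacity(std.testing.allocator, 16);
    defer list.deinit();
    try std.testing.expect(list.capacity >= 16);
}
```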
Add an option that allows the '-z notext' flag to be passed to the linker
via the compiler frontend; the flag tells the linker that relocations in
read-only sections are permitted. Certain targets, such as Solana BPF,
rely on this flag.
Expose all of the '-z' linker options, i.e. '-z nodelete', '-z now', and
'-z relro', in the compiler frontend. Usage documentation has been updated
accordingly.
Expose the '-z notext' flag in the standard library build runner.
There was duplicated logic for whether to include compiler_rt in the
linker line both in the frontend and in the linker backends. Now the
logic is only in the frontend; the linker puts it on the linker line if
the frontend provides it.
Fixes the CI failures.
* AstGen: fix emitting `store_to_inferred_ptr` when it should be emitting
`store` for a variable that has an explicit alignment.
* Compilation: fix a couple of memory leaks
* Sema: implement support for locals that have specified alignment.
* Sema: implement `@intCast` when it needs to emit an AIR instruction.
* Sema: implement `@alignOf`
* Implement debug printing for extended alloc ZIR instructions.
* LLVM backend: respect `sub_path` just like the other stage2 backends
do.
* Compilation has some new logic to only emit work queue jobs for
building stuff when it believes itself to be capable. The linker
backends no longer have duplicate logic; instead they respect the
optional bit on the respective asset.
Introduce an explicit decl_map for *Decl to LLVMValueRef. Doc comment
reproduced here:
Ideally we would use `llvm_module.getNamedFunction` to go from *Decl to
LLVM function, but that has some downsides:
* we have to compute the fully qualified name every time we want to do the lookup
* for externally linked functions, the name is not fully qualified, but when
a Decl goes from exported to not exported and vice versa, we would use the wrong
version of the name and incorrectly get "function not found" in the LLVM module.
* it only works for functions, not for all globals.
Therefore, this table keeps track of the mapping.
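A self-contained model of that table; the stand-in types below are assumptions for illustration, while the real map lives in the self-hosted LLVM backend:
```zig
const std = @import("std");

// Stand-in types, for illustration only.
const Decl = opaque {};
const Value = opaque {}; // plays the role of an LLVM value handle

/// One entry per Decl, looked up directly instead of going through
/// getNamedFunction by a possibly stale name.
const DeclMap = std.AutoHashMapUnmanaged(*const Decl, *const Value);
```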
Non-exported functions now use fully-qualified symbol names.
`Module.Decl.getFullyQualifiedName` now returns a sentinel-terminated
slice which is useful to pass to LLVMAddFunction.
Instead of using aliases for all external symbols, now the LLVM backend
takes advantage of LLVMSetValueName to rename functions that become
exported. Aliases are still used for the second and remaining exports.
freeDecl is now handled properly in the LLVM backend, deleting the
LLVMValueRef corresponding to the Decl being deleted. The linker
backends for ELF, COFF, Mach-O, and Wasm had to be updated to forward
the freeDecl call to the LLVM backend.