161 Commits

Author SHA1 Message Date
Jakub Konka
e8eb9778cc codegen: lower field_ptr to memory across linking backends
This requires generating an addend for the target relocation as
the field pointer might point at a field inner to the container.
2022-03-01 22:03:18 +01:00
Jakub Konka
ffb7ac6755 elf: use fully qualified decl names in the linker 2022-02-24 00:01:11 +01:00
Jakub Konka
f4f23e307c codegen: lower error_set and error_union 2022-02-22 21:56:34 +01:00
Andrew Kelley
2b3df5c81d link: avoid double close on openPath error 2022-02-16 14:12:15 -07:00
Jakub Konka
b9b1ab0240 elf: store pointer relocations indexed by containing atom
In `getDeclVAddr`, it may happen that the target `Decl` has not
been allocated space in virtual memory. In this case, we store a
relocation in the linker-global table which we will iterate over
when flushing the module, and fill in any missing address in the
final binary. Note that for optimisation, if the address was resolved
at the time of a call to `getDeclVAddr`, we skip relocating this
atom.

This commit also adds the glue code for lowering const slices in
the ARM backend.
2022-02-11 10:52:13 +01:00
Jakub Konka
57357c43e3 elf: pad out file to the required size when init data
We need to pad out the file to the required maximum size equal the
final section's offset plus the section's size. We only need to
this when populating initial metadata and only when section header
was updated.
2022-02-10 08:12:02 +01:00
Jakub Konka
ec3e638b97 elf: fix unaligned file offset of moved phdr containing GOT section 2022-02-09 13:22:50 +01:00
Jakub Konka
e42b5e76ba stage2: handle void type in Elf DWARF gen
Enable more behavior tests on both x64 and arm
2022-02-08 23:43:25 +01:00
Jakub Konka
5944e89016 stage2: lower unnamed constants in Elf and MachO
* link: add a virtual function `lowerUnnamedConsts`, similar to
  `updateFunc` or `updateDecl` which needs to be implemented by the
  linker backend in order to be used with the `CodeGen` code
* elf: implement `lowerUnnamedConsts` specialization where we
  lower unnamed constants to `.rodata` section. We keep track of the
  atoms encompassing the lowered unnamed consts in a global table
  indexed by parent `Decl`. When the `Decl` is updated or destroyed,
  we clear the unnamed consts referenced within the `Decl`.
* macho: implement `lowerUnnamedConsts` specialization where we
  lower unnamed constants to `__TEXT,__const` section. We keep track of the
  atoms encompassing the lowered unnamed consts in a global table
  indexed by parent `Decl`. When the `Decl` is updated or destroyed,
  we clear the unnamed consts referenced within the `Decl`.
* x64: change `MCValue.linker_sym_index` into two `MCValue`s: `.got_load` and
  `.direct_load`. The former signifies to the emitter that it should
  emit a GOT load relocation, while the latter that it should emit
  a direct load (`SIGNED`) relocation.
* x64: lower `struct` instantiations
2022-02-07 08:39:00 +01:00
Andrew Kelley
33fa296019 stage2: pass proper can_exit_early value to LLD
and adjust the warning message for invoking LLD twice in the same
process.
2022-02-06 22:29:40 -07:00
Cody Tapscott
5065830aa0 Avoid depending on child process execution when not supported by host OS
In accordance with the requesting issue (#10750):
- `zig test` skips any tests that it cannot spawn, returning success
- `zig run` and `zig build` exit with failure, reporting the command the cannot be run
- `zig clang`, `zig ar`, etc. already punt directly to the appropriate clang/lld main(), even before this change
- Native `libc` Detection is not supported

Additionally, `exec()` and related Builder functions error at run-time, reporting the command that cannot be run
2022-02-06 22:21:46 -07:00
Jakub Konka
228b798af5 elf: generated DWARF debug info for named structs 2022-02-03 18:47:36 +01:00
Jakub Konka
b77757fe39 elf: add basic handling of .data section 2022-02-03 08:47:06 +01:00
Jakub Konka
9de30bb065 x86_64: handle struct_field_ptr for register mcv 2022-02-02 10:48:21 +01:00
Jakub Konka
627cf6ce48 astgen: clean up source line calculation and management
Clarify that `astgen.advanceSourceCursor` already increments absolute
values of the line and columns numbers; i.e., `GenZir.calcLine` is thus
not only obsolete but wrong by design.

Incidentally, this clean up allows for specifying the `FnDecl` line
numbers for DWARF use correctly as relative values with respect to
the start of the parent `Decl`. This `Decl` in turn has its line number
information specified relatively to its parent `Decl`, and so on, until
we reach the global scope.
2022-01-31 22:29:29 -05:00
John Schmidt
63ee6e6625 Rename mem.bswapAllFields to byteSwapAllFields
To match the renaming of `@bswap` to `@byteSwap` in
1fdb24827f.
2022-01-28 21:03:21 -05:00
Andrew Kelley
d7deffee8d link: ELF, COFF, WASM: honor the "must_link" flag of positionals
Previously only the MachO linker was honoring the flag.
2022-01-28 12:18:53 -07:00
Andrew Kelley
40c9ce2caf zig cc: add --hash-style linker parameter
This is only relevant for ELF files.

I also fixed a bug where passing a zig source file to `zig cc` would
incorrectly punt to clang because it thought there were no positional
arguments.
2022-01-26 15:01:59 -07:00
Jakub Konka
4192be8403 elf: implement slice types in debug info
Implements slice types including `[]const u8` for passing as
formal parameters in DWARF. Breaking on a function accepting
a slice in `gdb` will now yield the same behavior as stage1 and/or
LLVM backend:

```zig
fn sumArrayLens(a: []const u32, b: []const u8) usize {
  return a.len + b.len;
}
```

Both `a` and `b` can now be inspected in the debugger:

```
Breakpoint 1, sumArrayLens (a=..., b=...) at arr.zig:59
(gdb) p a
$1 = {ptr = 0x7fffffff685c, len = 5}
(gdb) p b
$2 = {ptr = 0x7fffffff683d "\252\252\252\\h\377\377\377\177", len = 3}
(gdb)
```
2022-01-26 17:28:58 +01:00
Jakub Konka
53c668d3a9 stage2: add naive impl of pointer type in ELF
Augment relocation tracking mechanism to de-duplicate potential
creation of base as well as composite types while unrolling
composite types in the linker - there is still potential for
further space optimisation by moving all type information into
a separate section `.debug_types` and providing references to
entries within that section whenever required (e.g., `ref4` form).
Currently, we duplicate type definitions on a per-decl basis.

Anyhow, with this patch, an example function signature of the following
type:

```zig
fn byPtrPtr(ptr_ptr_x: **u32, ptr_x: *u32) void {
    ptr_ptr_x.* = ptr_x;
}
```

will generate the following `.debug_info` for formal parameters:

```
 <1><1aa>: Abbrev Number: 3 (DW_TAG_subprogram)
    <1ab>   DW_AT_low_pc      : 0x8000197
    <1b3>   DW_AT_high_pc     : 0x2c
    <1b7>   DW_AT_name        : byPtrPtr
 <2><1c0>: Abbrev Number: 7 (DW_TAG_formal_parameter)
    <1c1>   DW_AT_location    : 1 byte block: 55        (DW_OP_reg5 (rdi))
    <1c3>   DW_AT_type        : <0x1df>
    <1c7>   DW_AT_name        : ptr_ptr_x
 <2><1d1>: Abbrev Number: 7 (DW_TAG_formal_parameter)
    <1d2>   DW_AT_location    : 1 byte block: 54        (DW_OP_reg4 (rsi))
    <1d4>   DW_AT_type        : <0x1e4>
    <1d8>   DW_AT_name        : ptr_x
 <2><1de>: Abbrev Number: 0
 <1><1df>: Abbrev Number: 5 (DW_TAG_pointer_type)
    <1e0>   DW_AT_type        : <0x1e4>
 <1><1e4>: Abbrev Number: 5 (DW_TAG_pointer_type)
    <1e5>   DW_AT_type        : <0x1e9>
 <1><1e9>: Abbrev Number: 4 (DW_TAG_base_type)
    <1ea>   DW_AT_encoding    : 7       (unsigned)
    <1eb>   DW_AT_byte_size   : 4
    <1ec>   DW_AT_name        : u32
```
2022-01-25 23:51:19 +01:00
Andrew Kelley
366c767444 link: Elf, Wasm: forward strip flag when linking with LLD 2022-01-25 11:52:48 -07:00
Andrew Kelley
a2abbeef90 stage2: rework a lot of stuff
AstGen:
 * rename the known_has_bits flag to known_non_opv to make it better
   reflect what it actually means.
 * add a known_comptime_only flag.
 * make the flags take advantage of identifiers of primitives and the
   fact that zig has no shadowing.
 * correct the known_non_opv flag for function bodies.

Sema:
 * Rename `hasCodeGenBits` to `hasRuntimeBits` to better reflect what it
   does.
   - This function got a bit more complicated in this commit because of
     the duality of function bodies: on one hand they have runtime bits,
     but on the other hand they require being comptime known.
 * WipAnonDecl now takes a LazySrcDecl parameter and performs the type
   resolutions that it needs during finish().
 * Implement comptime `@ptrToInt`.

Codegen:
 * Improved handling of lowering decl_ref; make it work for
   comptime-known ptr-to-int values.
   - This same change had to be made many different times; perhaps we
     should look into merging the implementations of `genTypedValue`
     across x86, arm, aarch64, and riscv.
2022-01-24 21:53:57 -07:00
Jakub Konka
406c85f9ba macho+elf: fix integer overflow in allocateAtom
If there is a big atom available for re-use in the free list, and
it's the last atom in section, it's ideal capacity might span the
entire section in which case we do not want to calculate the actual
end VM addr of the symbol since it may overflow. Instead, we just take
the max capacity available as end VM addr estimate. In this case,
the max capacity equals `std.math.maxInt(u64)`.
2022-01-22 08:50:01 +01:00
Andrew Kelley
fd6d1fe015 stage2: improvements to entry point handling
* rename `entry` to `entry_symbol_name` for the zig build API
 * integrate with `zig cc` command line options
 * integrate with COFF linking with LLD
 * integrate with self-hosted ELF linker
 * don't put it in the hash for MachO since it is ignored
2022-01-19 11:41:08 -07:00
Kenta Iwasaki
5ae3e4e9bd lld: allow for entrypoint symbol name to be set
This commit enables for the entrypoint symbol to be set when linking ELF
or WebAssembly modules with lld using the Zig compiler.
2022-01-19 11:22:10 -07:00
Jakub Konka
5cde5f947f Introduce LinkObject with must_link field 2022-01-13 20:02:11 +01:00
Andrew Kelley
b6d6152e65 link: avoid creating stage2 llvm module when using stage1 2022-01-04 00:11:45 -07:00
Andrew Kelley
ff66a18555 linker: fix build-obj and -fno-emit-bin
This commit fixes two problems:

* `zig build-obj` regressed from the cache-mode branch. It would crash
  because it assumed that dirname on the emit bin path would not be
  null. This assumption was invalid when outputting to the current
  working directory - a pretty common use case for `zig build-obj`.

* When using the LLVM backend, `-fno-emit-bin` combined with any other
  kind of emitting, such as `-femit-asm`, emitted nothing.

Both issues are now fixed.
2022-01-03 20:03:22 -07:00
Andrew Kelley
d94303be2b stage2: introduce renameTmpIntoCache into the linker API
Doc comments reproduced here:

This function is called by the frontend before flush(). It communicates that
`options.bin_file.emit` directory needs to be renamed from
`[zig-cache]/tmp/[random]` to `[zig-cache]/o/[digest]`.
The frontend would like to simply perform a file system rename, however,
some linker backends care about the file paths of the objects they are linking.
So this function call tells linker backends to rename the paths of object files
to observe the new directory path.
Linker backends which do not have this requirement can fall back to the simple
implementation at the bottom of this function.
This function is only called when CacheMode is `whole`.

This solves stack trace regressions on Windows and macOS because the
linker backends do not observe object file paths until flush().
2022-01-03 14:49:35 -07:00
Andrew Kelley
e3bed8d81d stage2: introduce CacheMode
The two CacheMode values are `whole` and `incremental`.
`incremental` is what we had before; `whole` is new.
Whole cache mode uses everything as inputs to the cache hash;
and when a hit occurs it skips everything including linking.
This is ideal for when source files change rarely and for backends that
do not have good incremental compilation support, for example
compiler-rt or libc compiled with LLVM with optimizations on.
This is the main motivation for the additional mode, so that we can have
LLVM-optimized compiler-rt/libc builds, without waiting for the LLVM
backend every single time Zig is invoked.

Incremental cache mode hashes only the input file path and a few target
options, intentionally relying on collisions to locate already-existing
build artifacts which can then be incrementally updated.

The bespoke logic for caching stage1 backend build artifacts
is removed since we now have a global caching mechanism for
when we want to cache the entire compilation, *including* linking.
Previously we had to get "creative" with libs.txt and a special
byte in the hash id to communicate flags, so that when the cached
artifacts were re-linked, we had this information from stage1
even though we didn't actually run it. Now that `CacheMode.whole`
includes linking, this extra information does not need to be
preserved for cache hits. So although this changeset introduces
complexity, it also removes complexity.

The main trickiness here comes from the inherent differences between the
two modes: `incremental` wants a directory immediately to operate on,
while `whole` doesn't know the output directory until the compilation is
complete. This commit deals with this problem mostly inside `update()`,
where, on a cache miss, it replaces `zig_cache_artifact_directory` with a
temporary directory, and then renames it into place once the compilation is
complete.

Items remaining before this branch can be merged:

* [ ] make sure these things make it into the cache manifest:
  - @import files
  - @embedFile files
  - we already add dep files from c but make sure the main .c files make
    it in there too, not just the included files

* [ ] double check that the emit paths of other things besides the binary
  are working correctly.

* [ ] test `-fno-emit-bin` + `-fstage1`
* [ ] test `-femit-bin=foo` + `-fstage1`

* [ ] implib emit directory copies bin_file_emit directory in create() and needs
  to be adjusted to be overridden as well.

* [ ] make sure emit-h is handled correctly in the cache hash
* [ ] Cache: detect duplicate files added to the manifest

Some preliminary performance measurements of wall clock time and
peak RSS used:

stage1 behavior (1077 tests), llvm backend, release build:
 * cold global cache: 4.6s, 1.1 GiB
 * warm global cache: 3.4s, 980 MiB

stage2 master branch behavior (575 tests), llvm backend, release build:
 * cold global cache: 0.62s, 191 MiB
 * warm global cache: 0.40s, 128 MiB

stage2 this branch behavior (575 tests), llvm backend, release build:
 * cold global cache: 0.62s, 179 MiB
 * warm global cache: 0.27s, 90 MiB
2022-01-02 13:16:17 -07:00
Ersikan
e15a267668 elf: Put constant data in the .rodata section
Allocate a new program header and a new section to accomodate the read-only data
section ".rodata".

Separate TextBlock into multiple TextBlockList, to separate decl in different
sections.

If a Decl is not a function, it is added to the .rodata section.
2021-12-21 11:33:12 -08:00
Lee Cannon
1093b09a98
allocgate: renamed getAllocator function to allocator 2021-11-30 23:32:47 +00:00
Lee Cannon
75548b50ff
allocgate: stage 1 and 2 building 2021-11-30 23:32:47 +00:00
Lee Cannon
85de022c56
allocgate: std Allocator interface refactor 2021-11-30 23:32:47 +00:00
Andrew Kelley
902df103c6 std lib API deprecations for the upcoming 0.9.0 release
See #3811
2021-11-30 00:13:07 -07:00
Andrew Kelley
eec825cea2 zig cc: support -Bdynamic and -Bstatic parameters
Related: #10050
2021-11-26 16:26:19 -07:00
Jakub Konka
d6f43a1eac bpf: do not invoke lld when linking eBPF relocatables
Due to a deficiency in LLD, we need to special-case BPF to a simple
file copy when generating relocatables. Normally, we would expect
`lld -r` to work. However, because LLD wants to resolve BPF relocations
which it shouldn't, it fails before even generating the relocatable.

Co-authored-by: Matthew Knight <mattnite@protonmail.com>
2021-11-26 10:53:30 +01:00
Andrew Kelley
20cc7af8e6 stage2: support LLD -O flags on ELF
In 7e23b3245a9bf6e002009e6c18c10a9995671afa I made -O flags to the
linker emit a warning that the argument does nothing. That was not
correct however; LLD does have some logic that does different things
depending on -O0, -O1, and -O2. It defaults to -O1, and it does less
optimizations with -O0 and more with -O2.

With this commit, e.g. `-Wl,-O1` is supported by the `zig cc` frontend,
and by default we pass `-O0` to LLD in debug mode, and `-O3` in release
modes.

I also fixed a bug in the LLD ELF linker line which was incorrectly
passing `-O` flags instead of `--lto-O` flags for LTO.
2021-11-24 18:46:32 -07:00
Andrew Kelley
7e23b3245a stage2: remove extra_lld_args
This mechanism for sending arbitrary linker args to LLD has no place in
the Zig frontend, because our goal is for the frontend to understand all
the arguments and not treat linker args like a black box.

For example we have self-hosted linking in addition to LLD, so we want to
have the options make sense to both linking codepaths, not just the LLD one.

Passing -O linker args will now result in a warning that the arg does
nothing.
2021-11-24 17:14:20 -07:00
Andrew Kelley
4e5a88b288 stage2: default dynamic libraries to be linked as needed
After this change, the default for dynamic libraries (`-l` or
`--library`) is to only link them if they end up being actually used.

With the Zig CLI, the new options `-needed-l` or `--needed-library` can
be used to force link against a dynamic library.

With `zig cc`, this behavior can be overridden with `-Wl,--no-as-needed`
(and restored with `-Wl,--as-needed`).

Closes #10164
2021-11-20 17:23:44 -07:00
Jakub Konka
bc59a630ab stage2,x86_64: fix genBinMathOp and clarify callee-saved regs
Previously, we have confused callee-saved with caller-saved registers
(the actual register sets were swapped). This commit fixes that
for both `.x86` and `.x86_64` native backends.

This commit also fixes the register allocation logic in `genBinMathOp`
for `.x86_64` native backend where in a situation such that we require
to spill a register, we would end up spilling the register that is
already involved in the instruction as the other operand. In such a
case, we make a note of this and spill a subsequent register instead.
2021-11-19 20:01:35 +01:00
Jacob G-W
149bc79486 add tests for previous commit 2021-11-19 09:38:36 +01:00
Ryan Liptak
e97feb96e4 Replace ArrayList.init/ensureTotalCapacity pairs with initCapacity
Because ArrayList.initCapacity uses 'precise' capacity allocation, this should save memory on average, and definitely will save memory in cases where ArrayList is used where a regular allocated slice could have also be used.
2021-11-04 14:54:25 -04:00
Kenta Iwasaki
2cdffc97f0 zig: expose linker options and include '-z notext'
Add an option to allow the '-z notext' option to be passed to the linker
via. the compiler frontend, which is a flag that tells the linker that
relocations in read-only sections are permitted. Certain targets such as
Solana BPF rely on this flag.

Expose all linker options i.e. '-z nodelete', '-z now', '-z relro' in
the compiler frontend. Usage documentation has been updated accordingly.

Expose the '-z notext' flag in the standard library build runner.
2021-10-29 19:18:44 -04:00
Andrew Kelley
6115cf2240 migrate from std.Target.current to @import("builtin").target
closes #9388
closes #9321
2021-10-04 23:48:55 -07:00
Andrew Kelley
ba7f40c430 stage2: fix ELF linking to include compiler_rt
There was duplicated logic for whether to include compiler_rt in the
linker line both in the frontend and in the linker backends. Now the
logic is only in the frontend; the linker puts it on the linker line if
the frontend provides it.

Fixes the CI failures.
2021-09-29 15:37:34 -07:00
Andrew Kelley
99961f22dc stage2: enable building compiler_rt when using LLVM backend
* AstGen: fix emitting `store_to_inferred_ptr` when it should be emitting
   `store` for a variable that has an explicit alignment.
 * Compilation: fix a couple memory leaks
 * Sema: implement support for locals that have specified alignment.
 * Sema: implement `@intCast` when it needs to emit an AIR instruction.
 * Sema: implement `@alignOf`
 * Implement debug printing for extended alloc ZIR instructions.
2021-09-29 00:13:21 -07:00
Stephen Gregoratto
87fd502fb6 Initial bringup of the Solaris/Illumos port 2021-09-24 14:06:16 -04:00
Andrew Kelley
ef7fa76001 stage2: enable building freestanding libc with LLVM backend
* LLVM backend: respect `sub_path` just like the other stage2 backends
   do.
 * Compilation has some new logic to only emit work queue jobs for
   building stuff when it believes itself to be capable. The linker
   backends no longer have duplicate logic; instead they respect the
   optional bit on the respective asset.
2021-09-24 01:00:15 -07:00
Andrew Kelley
f215d98043 stage2: LLVM backend: improved naming and exporting
Introduce an explicit decl_map for *Decl to LLVMValueRef. Doc comment
reproduced here:

Ideally we would use `llvm_module.getNamedFunction` to go from *Decl to
LLVM function, but that has some downsides:
* we have to compute the fully qualified name every time we want to do the lookup
* for externally linked functions, the name is not fully qualified, but when
  a Decl goes from exported to not exported and vice-versa, we would use the wrong
  version of the name and incorrectly get function not found in the llvm module.
* it works for functions not all globals.
Therefore, this table keeps track of the mapping.

Non-exported functions now use fully-qualified symbol names.
`Module.Decl.getFullyQualifiedName` now returns a sentinel-terminated
slice which is useful to pass to LLVMAddFunction.

Instead of using aliases for all external symbols, now the LLVM backend
takes advantage of LLVMSetValueName to rename functions that become
exported. Aliases are still used for the second and remaining exports.

freeDecl is now handled properly in the LLVM backend, deleting the
LLVMValueRef corresponding to the Decl being deleted. The linker
backends for ELF, COFF, Mach-O, and Wasm had to be updated to forward
the freeDecl call to the LLVM backend.
2021-09-23 23:46:45 -07:00