142 Commits

Author SHA1 Message Date
Luuk de Gram
8e1c220be2 wasm: Add basic debug info references 2022-05-09 18:51:46 +02:00
Andrew Kelley
f7596ae942 stage2: use indexes for Decl objects
Rather than allocating Decl objects with an Allocator, we instead allocate
them with a SegmentedList. This provides four advantages:
 * Stable memory so that one thread can access a Decl object while another
   thread allocates additional Decl objects from this list.
 * It allows us to use u32 indexes to reference Decl objects rather than
   pointers, saving memory in Type, Value, and dependency sets.
 * Using integers to reference Decl objects rather than pointers makes
   serialization trivial.
 * It provides a unique integer to be used for anonymous symbol names,
   avoiding multi-threaded contention on an atomic counter.
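
A minimal sketch of the idea, not the compiler's actual code: `Decl`, `DeclIndex`, and `DeclStore` below are hypothetical stand-ins, and the fixed-size segments only imitate what a SegmentedList provides. Items never move once stored, so a `u32` handle (or a pointer obtained from it) stays valid while more items are added.

```zig
const std = @import("std");

// Hypothetical stand-in for the compiler's Decl; the field is illustrative.
const Decl = struct { name: []const u8 };

// A plain u32 handle used in place of a *Decl pointer.
const DeclIndex = u32;

// Toy segmented store: segments are fixed arrays that are never moved,
// so earlier items keep their addresses as the store grows.
const DeclStore = struct {
    const segment_len = 4;
    const max_segments = 8;

    segments: [max_segments][segment_len]Decl = undefined,
    len: u32 = 0,

    // Store a Decl and return its index; the index doubles as a unique id.
    fn add(self: *DeclStore, decl: Decl) DeclIndex {
        std.debug.assert(self.len < segment_len * max_segments);
        const index = self.len;
        self.segments[index / segment_len][index % segment_len] = decl;
        self.len += 1;
        return index;
    }

    // Resolve an index back to a pointer.
    fn get(self: *DeclStore, index: DeclIndex) *Decl {
        std.debug.assert(index < self.len);
        return &self.segments[index / segment_len][index % segment_len];
    }
};

test "indexes and pointers stay valid as the store grows" {
    var store = DeclStore{};
    const first = store.add(.{ .name = "main" });
    const first_ptr = store.get(first);
    var i: u32 = 0;
    while (i < 10) : (i += 1) {
        _ = store.add(.{ .name = "anon" });
    }
    try std.testing.expect(store.get(first) == first_ptr);
    try std.testing.expectEqualStrings("main", store.get(first).name);
}
```

Because the handle is an integer, it could be written to disk as-is or used to build an anonymous symbol name, which is the property the bullets above rely on.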
2022-04-20 17:37:35 -07:00
Andrew Kelley
a7c05c06be stage2: expose progress bar API to linker backends
This gives us insight as to what is happening when we are waiting for
things such as LLVM emit object and LLD linking.
2022-04-17 04:09:35 -07:00
Jakub Konka
88d87d6506 stage2,macho: swap out inodes before checking for intermediary basename
This way we avoid the infamous SIGKILL on arm64 macos.
2022-04-16 12:06:58 +02:00
Andrew Kelley
2587474717 stage2: progress towards stage3
 * The `@bitCast` workaround is removed in favor of `@ptrCast` properly
   doing element casting for slice element types. This required an
   enhancement both to stage1 and stage2.
 * stage1 incorrectly accepts `.{}` instead of `{}`. stage2 code that
   abused this is fixed.
 * Make some parameters comptime to support functions in switch
   expressions (as opposed to making them function pointers).
 * Avoid relying on local temporaries being mutable.
 * Workarounds for when stage1 and stage2 disagree on function pointer
   types.
 * Workaround recursive formatting bug with a `@panic("TODO")`.
 * Remove unreachable `else` prongs for some inferred error sets.

All in effort towards #89.
2022-04-14 10:12:45 -07:00
jagt
b9d86c6bc8 fix link.renameTmpIntoCache on windows when dest dir exists.
Previously it would fail because `renameW` never fails with
`PathAlreadyExists`.

As a workaround we check for dest dir existence before rename
on Windows.
2022-04-12 06:01:25 -04:00
Jakub Konka
4ca9b4c44a dwarf: move DbgInfoTypeRelocsTable into Dwarf module 2022-03-27 20:53:06 +02:00
Andrew Kelley
593130ce0a stage2: lazy @alignOf
Add a `target` parameter to every function that deals with Type and
Value.
2022-03-22 15:45:58 -07:00
Jakub Konka
0376fd09bc macho: extend CodeSignature to accept entitlements
With this change, we can now bake entitlements into the binary.
Additionally, I see this as the first step towards full code signature
support which includes baking in Apple issued certificates for
redistribution, etc.
2022-03-22 07:06:39 +01:00
Jakub Konka
68c224d6ec macho: simplify writing atoms for stage2
Also, fix premature exit in `link.File.makeWritable` in case we
are running M1 but executing binaries using Rosetta2.
2022-03-13 14:15:26 +01:00
Andrew Kelley
c6160fa3a5 LLVM: add compile unit to debug info
This commit also adds a bunch of bindings for debug info.
2022-03-08 14:58:53 -07:00
Jakub Konka
ba17552b4e dwarf: move all dwarf into standalone module
Hook up Elf and MachO linkers to the new solution.
2022-03-08 09:46:27 +01:00
Luuk de Gram
5a45fe2dba wasm: Call generateSymbol for updateDecl
To unify the wasm backend with the other backends, we will now call `generateSymbol` to
lower a Decl into bytes. This means we also have to change some function signatures
to comply with the linker interface.

Since the general purpose generateSymbol is less featureful than wasm's, some tests are
temporarily disabled.
2022-03-06 19:38:50 +01:00
Jakub Konka
e8eb9778cc codegen: lower field_ptr to memory across linking backends
This requires generating an addend for the target relocation as
the field pointer might point at a field inner to the container.
2022-03-01 22:03:18 +01:00
Luuk de Gram
acec06cfaf wasm-linker: Implement updateDeclExports
We now correctly implement exporting decls. This means it is possible to export
a decl with a different name than the decl that is doing the export.
This also sets the symbols with the correct flags, so when we emit a relocatable
object file, a linker can correctly resolve symbols and/or export the symbol to the host environment.

This commit also includes fixes to ensure relocations are emitted with the
offset other linkers expect, rather than the one we use internally.
Other linkers expect the offset relative to the section,
whereas internally we use an offset relative to the atom.
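
The conversion this implies can be sketched as follows. `Atom`, `Relocation`, and `sectionOffset` are hypothetical, simplified names chosen for illustration and are not taken from the wasm linker source; the sketch only assumes that each atom knows where it starts within its section.

```zig
const std = @import("std");

// Hypothetical, simplified types; the field names are illustrative only.
const Atom = struct {
    // Where the atom starts within its section, in bytes.
    offset: u32,
};

const Relocation = struct {
    // Internal representation: bytes from the start of the containing atom.
    offset: u32,
};

// Convert the internal atom-relative offset into the section-relative
// offset that would be written to a relocatable object file.
fn sectionOffset(atom: Atom, reloc: Relocation) u32 {
    return atom.offset + reloc.offset;
}

test "atom-relative to section-relative" {
    const atom = Atom{ .offset = 0x40 };
    const reloc = Relocation{ .offset = 0x8 };
    try std.testing.expect(sectionOffset(atom, reloc) == 0x48);
}
```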
2022-02-23 16:07:36 +01:00
xReveres
b2805666a7 stage1-wasm: implement shared memory 2022-02-23 08:57:20 +01:00
Jakub Konka
b9b1ab0240 elf: store pointer relocations indexed by containing atom
In `getDeclVAddr`, it may happen that the target `Decl` has not
been allocated space in virtual memory. In this case, we store a
relocation in the linker-global table which we will iterate over
when flushing the module, and fill in any missing address in the
final binary. Note that for optimisation, if the address was resolved
at the time of a call to `getDeclVAddr`, we skip relocating this
atom.
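
A rough sketch of the deferred-relocation pattern described above, under simplifying assumptions: `ToyLinker`, `PendingReloc`, and `recordPointer` are hypothetical names, the fixed-size arrays stand in for the linker-global table, and `slots` stands in for pointer-sized locations in the output binary. It is not the ELF linker's implementation.

```zig
const std = @import("std");

// Hypothetical record: `slot` must eventually hold the address of decl `target`.
const PendingReloc = struct { target: u32, slot: u32 };

const ToyLinker = struct {
    // null means the decl has not been allocated a virtual address yet.
    vaddrs: [4]?u64 = [_]?u64{null} ** 4,
    // Stand-in for pointer-sized locations in the final binary.
    slots: [4]u64 = [_]u64{0} ** 4,
    pending: [4]PendingReloc = undefined,
    pending_len: u32 = 0,

    // Write the address right away if it is known; otherwise defer to flush().
    fn recordPointer(self: *ToyLinker, target: u32, slot: u32) void {
        if (self.vaddrs[target]) |addr| {
            self.slots[slot] = addr; // resolved now, no relocation needed
            return;
        }
        std.debug.assert(self.pending_len < self.pending.len);
        self.pending[self.pending_len] = PendingReloc{ .target = target, .slot = slot };
        self.pending_len += 1;
    }

    // Fill in every address that was missing when it was first requested.
    fn flush(self: *ToyLinker) void {
        for (self.pending[0..self.pending_len]) |reloc| {
            // Assumes every target has been given an address by flush time.
            self.slots[reloc.slot] = self.vaddrs[reloc.target].?;
        }
        self.pending_len = 0;
    }
};

test "unresolved targets are patched at flush" {
    var linker = ToyLinker{};
    linker.vaddrs[0] = 0x1000;
    linker.recordPointer(0, 0); // already resolved
    linker.recordPointer(1, 1); // deferred
    try std.testing.expect(linker.pending_len == 1);
    linker.vaddrs[1] = 0x2000;
    linker.flush();
    try std.testing.expect(linker.slots[1] == 0x2000);
}
```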

This commit also adds the glue code for lowering const slices in
the ARM backend.
2022-02-11 10:52:13 +01:00
Jakub Konka
5944e89016 stage2: lower unnamed constants in Elf and MachO
* link: add a virtual function `lowerUnnamedConsts`, similar to
  `updateFunc` or `updateDecl` which needs to be implemented by the
  linker backend in order to be used with the `CodeGen` code
* elf: implement `lowerUnnamedConsts` specialization where we
  lower unnamed constants to `.rodata` section. We keep track of the
  atoms encompassing the lowered unnamed consts in a global table
  indexed by parent `Decl`. When the `Decl` is updated or destroyed,
  we clear the unnamed consts referenced within the `Decl`.
* macho: implement `lowerUnnamedConsts` specialization where we
  lower unnamed constants to `__TEXT,__const` section. We keep track of the
  atoms encompassing the lowered unnamed consts in a global table
  indexed by parent `Decl`. When the `Decl` is updated or destroyed,
  we clear the unnamed consts referenced within the `Decl`.
* x64: change `MCValue.linker_sym_index` into two `MCValue`s: `.got_load` and
  `.direct_load`. The former signifies to the emitter that it should
  emit a GOT load relocation, while the latter that it should emit
  a direct load (`SIGNED`) relocation.
* x64: lower `struct` instantiations
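
To illustrate the `.got_load` / `.direct_load` split from the x64 bullet above, here is a deliberately reduced sketch. The two-variant union, the `RelocKind` enum, and `relocKindFor` are assumptions made for illustration; the real `MCValue` has many more variants and the emitter does considerably more work.

```zig
const std = @import("std");

// Hypothetical two-variant subset; the payload is a linker symbol index.
const MCValue = union(enum) {
    got_load: u32,
    direct_load: u32,
};

// Hypothetical names for the two relocation flavours the emitter picks from.
const RelocKind = enum { got, signed };

// The tag alone tells the emitter which relocation to produce.
fn relocKindFor(value: MCValue) RelocKind {
    return switch (value) {
        .got_load => RelocKind.got,
        .direct_load => RelocKind.signed,
    };
}

test "tag selects the relocation kind" {
    try std.testing.expect(relocKindFor(MCValue{ .got_load = 3 }) == RelocKind.got);
    try std.testing.expect(relocKindFor(MCValue{ .direct_load = 3 }) == RelocKind.signed);
}
```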
2022-02-07 08:39:00 +01:00
gwenzek
0e1afb4d98 stage2: add support for Nvptx target
sample command:

/home/guw/github/zig/stage2/bin/zig build-obj cuda_kernel.zig -target nvptx64-cuda -O ReleaseSafe
this will create a kernel.ptx

expose PtxKernel call convention from LLVM
kernels are `export fn f() callconv(.PtxKernel)`
2022-02-05 16:33:00 +02:00
Andrew Kelley
40c9ce2caf zig cc: add --hash-style linker parameter
This is only relevant for ELF files.

I also fixed a bug where passing a zig source file to `zig cc` would
incorrectly punt to clang because it thought there were no positional
arguments.
2022-01-26 15:01:59 -07:00
Jakub Konka
53c668d3a9 stage2: add naive impl of pointer type in ELF
Augment relocation tracking mechanism to de-duplicate potential
creation of base as well as composite types while unrolling
composite types in the linker - there is still potential for
further space optimisation by moving all type information into
a separate section `.debug_types` and providing references to
entries within that section whenever required (e.g., `ref4` form).
Currently, we duplicate type definitions on a per-decl basis.

Anyhow, with this patch, an example function signature of the following
type:

```zig
fn byPtrPtr(ptr_ptr_x: **u32, ptr_x: *u32) void {
    ptr_ptr_x.* = ptr_x;
}
```

will generate the following `.debug_info` for formal parameters:

```
 <1><1aa>: Abbrev Number: 3 (DW_TAG_subprogram)
    <1ab>   DW_AT_low_pc      : 0x8000197
    <1b3>   DW_AT_high_pc     : 0x2c
    <1b7>   DW_AT_name        : byPtrPtr
 <2><1c0>: Abbrev Number: 7 (DW_TAG_formal_parameter)
    <1c1>   DW_AT_location    : 1 byte block: 55        (DW_OP_reg5 (rdi))
    <1c3>   DW_AT_type        : <0x1df>
    <1c7>   DW_AT_name        : ptr_ptr_x
 <2><1d1>: Abbrev Number: 7 (DW_TAG_formal_parameter)
    <1d2>   DW_AT_location    : 1 byte block: 54        (DW_OP_reg4 (rsi))
    <1d4>   DW_AT_type        : <0x1e4>
    <1d8>   DW_AT_name        : ptr_x
 <2><1de>: Abbrev Number: 0
 <1><1df>: Abbrev Number: 5 (DW_TAG_pointer_type)
    <1e0>   DW_AT_type        : <0x1e4>
 <1><1e4>: Abbrev Number: 5 (DW_TAG_pointer_type)
    <1e5>   DW_AT_type        : <0x1e9>
 <1><1e9>: Abbrev Number: 4 (DW_TAG_base_type)
    <1ea>   DW_AT_encoding    : 7       (unsigned)
    <1eb>   DW_AT_byte_size   : 4
    <1ec>   DW_AT_name        : u32
```
2022-01-25 23:51:19 +01:00
Kenta Iwasaki
5ae3e4e9bd lld: allow for entrypoint symbol name to be set
This commit enables the entrypoint symbol to be set when linking ELF
or WebAssembly modules with lld using the Zig compiler.
2022-01-19 11:22:10 -07:00
Jakub Konka
5cde5f947f Introduce LinkObject with must_link field 2022-01-13 20:02:11 +01:00
Jakub Konka
16c55b15cb zld: support -Wl,-force_load=archive_path flag
This actually enables using `zig cc` as a linker for `cargo test`
with `serde_derive`.
2022-01-13 20:02:11 +01:00
Jakub Konka
d66c97d0ef Merge pull request #10525 from g-w1/plan9-zig-test
Plan9 zig test
2022-01-09 13:27:56 +01:00
Lee Cannon
1cdc51ec10 handle error.PathAlreadyExists in renameTmpIntoCache 2022-01-07 16:50:37 -05:00
Jacob G-W
ab400ad624 Plan9: implement getDeclVAddr 2022-01-06 22:47:27 -05:00
Andrew Kelley
ff66a18555 linker: fix build-obj and -fno-emit-bin
This commit fixes two problems:

* `zig build-obj` regressed from the cache-mode branch. It would crash
  because it assumed that dirname on the emit bin path would not be
  null. This assumption was invalid when outputting to the current
  working directory - a pretty common use case for `zig build-obj`.

* When using the LLVM backend, `-fno-emit-bin` combined with any other
  kind of emitting, such as `-femit-asm`, emitted nothing.

Both issues are now fixed.
2022-01-03 20:03:22 -07:00
Andrew Kelley
d94303be2b stage2: introduce renameTmpIntoCache into the linker API
Doc comments reproduced here:

This function is called by the frontend before flush(). It communicates that
`options.bin_file.emit` directory needs to be renamed from
`[zig-cache]/tmp/[random]` to `[zig-cache]/o/[digest]`.
The frontend would like to simply perform a file system rename, however,
some linker backends care about the file paths of the objects they are linking.
So this function call tells linker backends to rename the paths of object files
to observe the new directory path.
Linker backends which do not have this requirement can fall back to the simple
implementation at the bottom of this function.
This function is only called when CacheMode is `whole`.

This solves stack trace regressions on Windows and macOS because the
linker backends do not observe object file paths until flush().
2022-01-03 14:49:35 -07:00
Andrew Kelley
9dc25dd0b6 stage2: fix UAF of system_libs 2022-01-02 13:16:17 -07:00
Andrew Kelley
e3bed8d81d stage2: introduce CacheMode
The two CacheMode values are `whole` and `incremental`.
`incremental` is what we had before; `whole` is new.
Whole cache mode uses everything as inputs to the cache hash;
and when a hit occurs it skips everything including linking.
This is ideal for when source files change rarely and for backends that
do not have good incremental compilation support, for example
compiler-rt or libc compiled with LLVM with optimizations on.
This is the main motivation for the additional mode, so that we can have
LLVM-optimized compiler-rt/libc builds, without waiting for the LLVM
backend every single time Zig is invoked.

Incremental cache mode hashes only the input file path and a few target
options, intentionally relying on collisions to locate already-existing
build artifacts which can then be incrementally updated.
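
The difference between the two modes can be sketched roughly like this. `cacheKey` and its parameters are hypothetical and heavily simplified: the real incremental mode also hashes a few target options, and the real whole mode hashes far more than a single source buffer. The use of `std.hash.Wyhash` is an arbitrary choice for the sketch.

```zig
const std = @import("std");

const CacheMode = enum { whole, incremental };

// Hypothetical helper: decide what feeds the cache hash for each mode.
fn cacheKey(mode: CacheMode, root_src_path: []const u8, source_bytes: []const u8) u64 {
    var hasher = std.hash.Wyhash.init(0);
    hasher.update(root_src_path);
    switch (mode) {
        // Only the path is hashed, so edits to the file map to the same
        // artifact directory, which is then updated incrementally.
        .incremental => {},
        // Everything is an input, so any edit yields a new key.
        .whole => hasher.update(source_bytes),
    }
    return hasher.final();
}

test "incremental collides on purpose, whole does not" {
    const inc_a = cacheKey(.incremental, "main.zig", "v1");
    const inc_b = cacheKey(.incremental, "main.zig", "v2");
    try std.testing.expect(inc_a == inc_b);

    const whole_a = cacheKey(.whole, "main.zig", "v1");
    const whole_b = cacheKey(.whole, "main.zig", "v2");
    try std.testing.expect(whole_a != whole_b);
}
```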

The bespoke logic for caching stage1 backend build artifacts
is removed since we now have a global caching mechanism for
when we want to cache the entire compilation, *including* linking.
Previously we had to get "creative" with libs.txt and a special
byte in the hash id to communicate flags, so that when the cached
artifacts were re-linked, we had this information from stage1
even though we didn't actually run it. Now that `CacheMode.whole`
includes linking, this extra information does not need to be
preserved for cache hits. So although this changeset introduces
complexity, it also removes complexity.

The main trickiness here comes from the inherent differences between the
two modes: `incremental` wants a directory immediately to operate on,
while `whole` doesn't know the output directory until the compilation is
complete. This commit deals with this problem mostly inside `update()`,
where, on a cache miss, it replaces `zig_cache_artifact_directory` with a
temporary directory, and then renames it into place once the compilation is
complete.

Items remaining before this branch can be merged:

* [ ] make sure these things make it into the cache manifest:
  - @import files
  - @embedFile files
  - we already add dep files from c but make sure the main .c files make
    it in there too, not just the included files

* [ ] double check that the emit paths of other things besides the binary
  are working correctly.

* [ ] test `-fno-emit-bin` + `-fstage1`
* [ ] test `-femit-bin=foo` + `-fstage1`

* [ ] implib emit directory copies bin_file_emit directory in create() and needs
  to be adjusted to be overridden as well.

* [ ] make sure emit-h is handled correctly in the cache hash
* [ ] Cache: detect duplicate files added to the manifest

Some preliminary performance measurements of wall clock time and
peak RSS used:

stage1 behavior (1077 tests), llvm backend, release build:
 * cold global cache: 4.6s, 1.1 GiB
 * warm global cache: 3.4s, 980 MiB

stage2 master branch behavior (575 tests), llvm backend, release build:
 * cold global cache: 0.62s, 191 MiB
 * warm global cache: 0.40s, 128 MiB

stage2 this branch behavior (575 tests), llvm backend, release build:
 * cold global cache: 0.62s, 179 MiB
 * warm global cache: 0.27s, 90 MiB
2022-01-02 13:16:17 -07:00
Luuk de Gram
4cb2f11693 wasm-linker: Implement the --export-table and --import-table flags.
This implements the flags for both the linker frontend as well as the self-hosted linker.

Closes #5790
2021-12-21 12:38:50 -08:00
Jakub Konka
a08137330c macho: handle -install_name option for dylibs/MachO
The status quo for the `build.zig` build system is preserved in
the sense that, if the user does not explicitly override
`dylib.setInstallName(...);` in their build script, the default
of `@rpath/libname.dylib` applies. However, should they want to
override the default behaviour, they can either:

1) unset it with

```dylib.setInstallName(null);```

2) set it to an explicit string with

```dylib.setInstallName("somename.dylib");```

When it comes to the command line however, the default is not to
use `@rpath` for the install name when creating a dylib. The user
will now be required to explicitly specify the `@rpath` as part
of the desired install name should they choose so like so:

1) with `build-lib`

```
zig build-lib -dynamic foo.zig -install_name @rpath/libfoo.dylib
```

2) with `cc`

```
zig cc -shared foo.c -o libfoo.dylib -Wl,"-install_name=@rpath/libfoo.dylib"
```
2021-12-18 17:55:53 -08:00
Luuk de Gram
50201e1c30 wasm-linker: Allow specifying symbols to be exported
Marking a symbol as exported in code only tells the linker
where to find this symbol, so other object files can find it. However, this does not mean
said symbol will also be exported to the host environment. Currently, we 'fix' this by force
exporting every single symbol that is visible. This creates bigger binaries and means host environments
have access to symbols that they perhaps shouldn't have. Now, users can tell Zig which symbols
are to be exported, meaning all other symbols that are not specified will not be exported.

Another change is we now support `-rdynamic` in the wasm linker as well, meaning all symbols will
be put in the dynamic symbol table. This is the same behavior as with ELF. It also gives users a third
strategy for building their wasm binary.
2021-12-14 14:02:23 -08:00
Lee Cannon
1093b09a98 allocgate: renamed getAllocator function to allocator 2021-11-30 23:32:47 +00:00
Lee Cannon
75548b50ff allocgate: stage 1 and 2 building 2021-11-30 23:32:47 +00:00
Lee Cannon
85de022c56 allocgate: std Allocator interface refactor 2021-11-30 23:32:47 +00:00
Jakub Konka
8317dbd1cb macos: detect SDK path and version, then pass to the linker
Since we are already detecting the path to the native SDK,
if available, also fetch SDK's version and route that to the linker.
The linker can then use it to correctly populate LC_BUILD_VERSION
load command.
2021-11-26 16:26:44 +01:00
Jakub Konka
d6f43a1eac bpf: do not invoke lld when linking eBPF relocatables
Due to a deficiency in LLD, we need to special-case BPF to a simple
file copy when generating relocatables. Normally, we would expect
`lld -r` to work. However, because LLD wants to resolve BPF relocations
which it shouldn't, it fails before even generating the relocatable.

Co-authored-by: Matthew Knight <mattnite@protonmail.com>
2021-11-26 10:53:30 +01:00
Jakub Konka
a96b6ad83f Merge branch 'build-obj-no-link' of git://github.com/mattnite/zig into mattnite-build-obj-no-link 2021-11-26 10:45:16 +01:00
Andrew Kelley
20cc7af8e6 stage2: support LLD -O flags on ELF
In 7e23b3245a9bf6e002009e6c18c10a9995671afa I made -O flags to the
linker emit a warning that the argument does nothing. That was not
correct however; LLD does have some logic that does different things
depending on -O0, -O1, and -O2. It defaults to -O1, and it performs fewer
optimizations with -O0 and more with -O2.

With this commit, e.g. `-Wl,-O1` is supported by the `zig cc` frontend,
and by default we pass `-O0` to LLD in debug mode, and `-O3` in release
modes.

I also fixed a bug in the LLD ELF linker line which was incorrectly
passing `-O` flags instead of `--lto-O` flags for LTO.
2021-11-24 18:46:32 -07:00
Andrew Kelley
27c5c7fb23 stage2: proper -femit-implib frontend support
 * Improve the logic for determining whether emitting an import lib is
   eligible, and improve the error message when the user provides
   contradictory arguments.
 * Integrate with the EmitLoc / Emit system that already exists, and use
   the `-femit-implib[=path]`/`-fno-emit-implib` convention that already
   exists.
 * Proper integration with the caching system.
 * CLI: fix bug in error reporting for resolving EmitLoc values for
   other parameters.
2021-11-24 18:12:56 -07:00
Andrew Kelley
7e23b3245a stage2: remove extra_lld_args
This mechanism for sending arbitrary linker args to LLD has no place in
the Zig frontend, because our goal is for the frontend to understand all
the arguments and not treat linker args like a black box.

For example we have self-hosted linking in addition to LLD, so we want to
have the options make sense to both linking codepaths, not just the LLD one.

Passing -O linker args will now result in a warning that the arg does
nothing.
2021-11-24 17:14:20 -07:00
Kurt Kartaltepe
a950cb42bd Coff linker: Add IMPLIB support
Allow --out-implib and -implib as passed by cmake and meson to be
correctly passed through to the linker to generate import libraries.
2021-11-24 17:14:20 -07:00
Jakub Konka
0c1d610015 zld: handle -current_version and -compatibility_version
and transfer them correctly to the generated dylib as part of the dylib
id load command.
2021-11-23 15:59:49 +01:00
Andrew Kelley
6afcaf4a08 stage2: fix the build for 32-bit architectures
 * Introduce a mechanism into Sema for emitting a compile error when an
   integer is too big and we need it to fit into a usize.
 * Add `@intCast` where necessary
 * link/MachO: fix an unnecessary allocation when all that was happening
   was appending zeroes to an ArrayList.
 * Add `error.Overflow` as a possible error to some codepaths, allowing
   usage of `math.intCast`.

closes #9710
2021-11-21 19:43:08 -07:00
Andrew Kelley
4e5a88b288 stage2: default dynamic libraries to be linked as needed
After this change, the default for dynamic libraries (`-l` or
`--library`) is to only link them if they end up being actually used.

With the Zig CLI, the new options `-needed-l` or `--needed-library` can
be used to force link against a dynamic library.

With `zig cc`, this behavior can be overridden with `-Wl,--no-as-needed`
(and restored with `-Wl,--as-needed`).

Closes #10164
2021-11-20 17:23:44 -07:00
Andrew Kelley
d2cdfb9490 stage2: add 4 new linker flags for WebAssembly
--import-memory          import memory from the environment
--initial-memory=[bytes] initial size of the linear memory
--max-memory=[bytes]     maximum size of the linear memory
--global-base=[addr]     where to start to place global data

See #8633
2021-11-09 14:29:20 -07:00
Matt Knight
a53becc034 don't invoke linker when just building an object 2021-11-07 10:28:25 -08:00
Ryan Liptak
e97feb96e4 Replace ArrayList.init/ensureTotalCapacity pairs with initCapacity
Because ArrayList.initCapacity uses 'precise' capacity allocation, this should save memory on average, and definitely will save memory in cases where ArrayList is used where a regular allocated slice could also have been used.
2021-11-04 14:54:25 -04:00