8856 Commits

Author SHA1 Message Date
Andrew Kelley
6b14c58f63 compiler-rt: simplify implementations
This improves readability as well as compatibility with stage2. Most of
compiler-rt is now enabled for stage2 with just a few functions disabled
(until stage2 passes more behavior tests).
2022-01-02 13:16:17 -07:00
Andrew Kelley
9dc25dd0b6 stage2: fix UAF of system_libs 2022-01-02 13:16:17 -07:00
Andrew Kelley
55243cfb31 stage2: fix implibs
Broken by the introduction of `CacheMode.whole`, they are now working
again by using the same mechanism as emitted binary files.
2022-01-02 13:16:17 -07:00
Andrew Kelley
013c286e43 stage2: fix memory leak in addNonIncrementalStuffToCacheManifest
I accidentally used errdefer instead of defer for the function-local
arena.
2022-01-02 13:16:17 -07:00
Andrew Kelley
73ba6bf30b stage2: fix memory leak of emit_bin.sub_path
Instead of juggling GPA-allocated sub_path (and ultimately dropping the
ball, in this analogy), `Compilation.create` allocates an
already-exactly-correct size `sub_path` that has the digest unpopulated.

This is then overwritten in place as necessary and used as the
`emit_bin.sub_path` value, and no allocations/frees are performed for
this file path.
2022-01-02 13:16:17 -07:00
Andrew Kelley
208a6c7d6a stage2: fix not calling deinit() on whole_cache_manifest
Need to mark it as "needing cleanup" a bit earlier in the function.
2022-01-02 13:16:17 -07:00
Andrew Kelley
0ad2a99675 stage2: CacheMode.whole: trigger loading zig source files
Previously the code asserted source files were already loaded, but this
is not the case when cached ZIR is loaded. Now it will trigger .zig
source code to be loaded for the purposes of hashing the source for
`CacheMode.whole`.

This additionally refactors stat_size, stat_inode, and stat_mtime fields
into using the `Cache.File.Stat` struct.
2022-01-02 13:16:17 -07:00
Andrew Kelley
dbd0a2c35d stage2: fix path to cache artifacts in libcxx,
libtsan, libunwind, and libcxxabi.
2022-01-02 13:16:17 -07:00
Andrew Kelley
d6c5602d46 stage2: fix CLI not populating output binary files
This fixes a regression in this branch that can be reproduced with the
following steps:

1. `zig build-exe hello.zig`
2. delete the "hello" binary
3. `zig build-exe hello.zig`
4. observe that the "hello" binary is missing

This happened because it was a cache hit, but nothing got copied to the
output directory.

This commit sets CacheMode to incremental - even for stage1 - when the
CLI requests `disable_lld_caching` (this option should be renamed),
resulting in the main Compilation to be repeated (uncached) for stage1,
populating the binary into the cwd as expected.

For stage2 the result is even better: the incremental compilation system
will look for build artifacts to incrementally compile, and start fresh
if not found.
2022-01-02 13:16:17 -07:00
Andrew Kelley
e36718165c stage2: add @import and @embedFile to CacheHash
when using `CacheMode.whole`. Also, I verified that `addDepFilePost` is
in fact including the original C source file in addition to the files it
depends on.
2022-01-02 13:16:17 -07:00
Andrew Kelley
67e31807df stage2: CacheMode.whole fixes
* Logic to check whether a bin file is not emitted is more complicated
   in between `Compilation.create` and `Compilation.update`. Fixed the
   logic that decides whether to build compiler-rt and other support
   artifacts.
 * Basically, one cannot inspect the value of `comp.bin_file.emit` until
   after update() is called - fixed another instance of this happening
   in the CLI.
 * In the CLI, `runOrTest` is updated to properly use the result value
   of `comp.bin_file.options.emit` rather than guessing whether the
   output binary is.
 * Don't assume that the emit output has no directory components in
   sub_path. In other words, don't assume that the emit directory is the
   final directory; there may be sub-directories.
2022-01-02 13:16:17 -07:00
Andrew Kelley
e3bed8d81d stage2: introduce CacheMode
The two CacheMode values are `whole` and `incremental`.
`incremental` is what we had before; `whole` is new.
Whole cache mode uses everything as inputs to the cache hash;
and when a hit occurs it skips everything including linking.
This is ideal for when source files change rarely and for backends that
do not have good incremental compilation support, for example
compiler-rt or libc compiled with LLVM with optimizations on.
This is the main motivation for the additional mode, so that we can have
LLVM-optimized compiler-rt/libc builds, without waiting for the LLVM
backend every single time Zig is invoked.

Incremental cache mode hashes only the input file path and a few target
options, intentionally relying on collisions to locate already-existing
build artifacts which can then be incrementally updated.

The bespoke logic for caching stage1 backend build artifacts
is removed since we now have a global caching mechanism for
when we want to cache the entire compilation, *including* linking.
Previously we had to get "creative" with libs.txt and a special
byte in the hash id to communicate flags, so that when the cached
artifacts were re-linked, we had this information from stage1
even though we didn't actually run it. Now that `CacheMode.whole`
includes linking, this extra information does not need to be
preserved for cache hits. So although this changeset introduces
complexity, it also removes complexity.

The main trickiness here comes from the inherent differences between the
two modes: `incremental` wants a directory immediately to operate on,
while `whole` doesn't know the output directory until the compilation is
complete. This commit deals with this problem mostly inside `update()`,
where, on a cache miss, it replaces `zig_cache_artifact_directory` with a
temporary directory, and then renames it into place once the compilation is
complete.

Items remaining before this branch can be merged:

* [ ] make sure these things make it into the cache manifest:
  - @import files
  - @embedFile files
  - we already add dep files from c but make sure the main .c files make
    it in there too, not just the included files

* [ ] double check that the emit paths of other things besides the binary
  are working correctly.

* [ ] test `-fno-emit-bin` + `-fstage1`
* [ ] test `-femit-bin=foo` + `-fstage1`

* [ ] implib emit directory copies bin_file_emit directory in create() and needs
  to be adjusted to be overridden as well.

* [ ] make sure emit-h is handled correctly in the cache hash
* [ ] Cache: detect duplicate files added to the manifest

Some preliminary performance measurements of wall clock time and
peak RSS used:

stage1 behavior (1077 tests), llvm backend, release build:
 * cold global cache: 4.6s, 1.1 GiB
 * warm global cache: 3.4s, 980 MiB

stage2 master branch behavior (575 tests), llvm backend, release build:
 * cold global cache: 0.62s, 191 MiB
 * warm global cache: 0.40s, 128 MiB

stage2 this branch behavior (575 tests), llvm backend, release build:
 * cold global cache: 0.62s, 179 MiB
 * warm global cache: 0.27s, 90 MiB
2022-01-02 13:16:17 -07:00
joachimschmidt557
c710d5eefe stage2 ARM: implement wrap_errunion_err for empty payloads 2022-01-02 15:15:59 -05:00
Andrew Kelley
98e84152bd
Merge pull request #10481 from Luukdegram/wasm-behavior-tests
Stage2: wasm - Implement more behavior tests
2022-01-01 15:41:26 -05:00
Jimmi Holst Christensen
a41ad639a8 fmt: Refactor parsing of placeholders into its own function
This saves on comptime format string parsing, as the compiler caches
comptime calls. The catch here, is that parsePlaceHolder cannot take the
placeholder string as a slice. It must take it as an array by value for
the caching to occure.

There is also some logic in here that ensures that the specifier_arg is
always them same slice when the items they contain are the same. This
makes the compiler stamp out less copies of formatType.
2022-01-01 15:40:24 -05:00
Jakub Konka
885d96735d
Merge pull request #10480 from joachimschmidt557/stage2-arm
stage2 ARM: zig test working
2022-01-01 17:02:31 +01:00
Luuk de Gram
3de111d993
wasm: Fix loading from pointers to support defer
Previously, the `load` instruction would just pass the pointer to the next instruction
for types that comply to `isByRef`. However, this meant that a defer would directly write
to the reference, rather than a copy. After this commit, we always copy the value.
2022-01-01 16:23:21 +01:00
Jakub Konka
0e3cd5bb1e stage2: remove safety check for optional payload in codegen
This will be enforced by Sema.
2022-01-01 14:27:06 +01:00
Luuk de Gram
ad1b040996
wasm: Implement pointer arithmetic and refactoring:
- This implements all pointer arithmetic related instructions such as ptr_add, ptr_sub, ptr_elem_val
- We refactored the code, to use `isByRef` to ensure consistancy.
- Pointers will now be loaded correctly, rather then being passed around.
- The behaviour test for pointers is now passing.
2022-01-01 12:59:43 +01:00
Luuk de Gram
28cfc49c3e
wasm: Implement memCpy and get if behavior tests passing 2022-01-01 12:59:43 +01:00
Luuk de Gram
b9a0401e23
wasm: Implement @ptrToInt and fix indirect function call
- Previously the table index and function type index were switched.
This commit swaps them.
- This also emits the correct indirect function calls count when importing the function table
2022-01-01 12:59:43 +01:00
Luuk de Gram
f644c8b047
wasm: Implement array_to_slice and bug fixes:
- Add method to easily create local for virtual stack
- Ensure function pointers are passed correctly
- Correctly handle slices as return types and values
- Fix wrapping error sets/payloads.
- Handle ptr-like optionals correctly, by using address '0' as null.
- Implement `array_to_slice`
- linker: Always emit a table, so call_indirect inside bodies do not fail if there's no table.
TODO: Only do this when we emit a call_indirect but the relocation cannot be resolved.
2022-01-01 12:59:18 +01:00
Luuk de Gram
29164a31cc
wasm: Pass 'bugs' behavior tests 2022-01-01 12:58:59 +01:00
Luuk de Gram
726ce85c10
wasm: Fix storing error. Pass bool.zig behavior tests 2022-01-01 12:58:12 +01:00
joachimschmidt557
a722e1fc0b
stage2 codegen: Add generateSymbol for optional stub 2022-01-01 12:51:29 +01:00
joachimschmidt557
845531dde1
stage2 ARM: implement airUnwrapErrErr + airCmp for error sets 2022-01-01 11:16:38 +01:00
joachimschmidt557
f8163f7eaf
stage2 ARM: implement airCall for function pointers 2022-01-01 11:16:34 +01:00
Marian Beermann
55709de185 stage1: fix @errorName null termination 2022-01-01 10:28:47 +02:00
Jakub Konka
c7f774803a stage2: implement loading-storing via pointer (in register)
* load address (pointer) to a stack variable in a register via
  `lea` instruction
* store value on the stack via a pointer stored in a register via
  `mov [reg], imm` instruction
* the lowerings naturally are handled automatically by Mir -> Isel
  layer
* add initial (without safety) implementation of `.optional_payload`
* add matching stage2 test cases
2021-12-31 18:10:28 +01:00
Jakub Konka
bc12d50170 stage2: implement genSetReg for ptr_stack_offset 2021-12-31 15:13:03 +01:00
Jakub Konka
5ec9f39dfe stage2: implement isNull() and isNonNull() 2021-12-31 15:07:48 +01:00
Jakub Konka
e7ac05e882 stage2: rename Emit to Isel for x86_64 2021-12-31 11:18:23 +01:00
Jarred Sumner
2d9508780a For unused references & redundant keywords, append the compiler error but continue running AstGen 2021-12-30 22:45:43 -05:00
Andrew Kelley
4645ec89f7
Merge pull request #10455 from joachimschmidt557/stage2-arm
stage2 ARM: basic slice + basic struct support
2021-12-30 15:25:36 -05:00
drew
2f53406ad8
CBE; implement airLoad and airStore for arrays (#10452)
Effectively a small continuation of #10152

This allows the for.zig behavior tests to pass. Unfortunately to fully test everything I had to move a lot of behavior tests from array.zig; most of them now pass (sorry @rainbowbismuth!)

I'm also conflicted on how I store constants into arrays because it's kind of stupid; array's can't be re-initialized using the same syntax, so instead of initializing each element, a new array is made which is copied into the destination. This also required that renderValue can't emit string literals for byte arrays given that they need to always have an extra byte for the NULL terminator, meaning that strings are no longer grep-able in the output.
2021-12-30 15:19:12 -05:00
joachimschmidt557
69d03d3a29
stage2 ARM: implement struct_field_ptr and struct_field_val 2021-12-30 14:39:06 +01:00
joachimschmidt557
ac7fa95af4
stage2 ARM: add genArmInlineMemcpy for copying types with size > 4 2021-12-30 14:24:03 +01:00
Jakub Konka
4ecc5956f6 stage2: update PrintMir with latest instructions and Isel changes 2021-12-29 22:06:38 +01:00
Jakub Konka
b7e2235973 stage2: lower 1-byte and 2-byte values saved to stack
* fix handling of `ah`, `bh`, `ch`, and `dh` registers (which are
  actually used as aliases to `dil`, etc. registers). Currenly, we
  treat them as aliases only meaning when we encounter `ah` we make
  sure to set the REX.W to promote the instruction to 64bits and use
  `dil` register instead - otherwise we might have mismatch between
  registers used in different parts of the codegen. In the future,
  we can and should use `ah`, etc. as upper 8bit halves of 16bit
  registers `ax`, etc.
* fix bug in `airCmp` where `.cmp` MIR instruction shouldn't force
  type `Bool` but let the type of the original type propagate downwards
  - we need this to make an informed choice of the target register
  size and hence choose the right encoding down the line.
* implement lowering of 1-byte and 2-byte values to stack and add
  matching stage2 tests for x86_64 codegen
2021-12-29 22:06:38 +01:00
Jakub Konka
08ea1a2eab stage2: add separate tag for MI encoding
To request memory-immediate encoding at the MIR side, we should now
use a new tag such as `mov_mem_imm` where the size of the memory
pointer is encoded as the flags:

```
0b00 => .byte_ptr,
0b01 => .word_ptr,
0b10 => .dword_ptr,
0b11 => .qword_ptr,
```
2021-12-29 22:06:38 +01:00
joachimschmidt557
baec07cfcd
stage2 ARM: change MCValue.immediate to u32 2021-12-29 11:27:37 +01:00
joachimschmidt557
96e59fd1c2
stage2 ARM: implement slice_elem_val for sizes > 4 2021-12-29 11:08:48 +01:00
Andrew Kelley
efb7148a45 Sema: more union fixes
* `Module.Union.getLayout`: fixes to support components of the union
   being 0 bits.
 * Implement `@typeInfo` for unions.
 * Add missing calls to `resolveTypeFields`.
 * Fix explicitly-provided union tag types passing a `Zir.Inst.Ref`
   where an `Air.Inst.Ref` was expected. We don't have any type safety
   for this; these typess are aliases.
 * Fix explicitly-provided `union(enum)` tag Values allocated to the
   wrong arena.
2021-12-28 23:22:09 -07:00
Andrew Kelley
91619cdf57 Sema: implement calling a fn ptr via a union field
Also, ignore `packed` on unions because that will be removed from the
language.
2021-12-28 23:22:09 -07:00
Andrew Kelley
81a3910e44 Sema: improve union support
* reduce number of branches in zirCmpEq
 * implement equality comparison for enums and unions
 * fix coercion from union to its tag type resulting in the wrong type
 * fix method calls of unions
 * implement peer type resolution for unions, enums, and enum literals
 * fix union tag type memory in the wrong arena
2021-12-28 20:20:30 -07:00
Andrew Kelley
6229d37dcf stage2: handle function dependency failures without crashing 2021-12-28 20:20:30 -07:00
joachimschmidt557
c0ae9647f9 stage2 ARM: implement slice_elem_val for types with size <= 4 2021-12-28 20:38:37 -05:00
Veikka Tuominen
4f4f0bc6f0 stage1: fix access of slice sentinel at comptime 2021-12-28 14:44:46 +02:00
Andrew Kelley
4b9b9e7257 stage2: LLVM backend: fix lowering of union constants
Comment from this commit reproduced here:

LLVM does not allow us to change the type of globals. So we must
create a new global with the correct type, copy all its attributes,
and then update all references to point to the new global,
delete the original, and rename the new one to the old one's name.
This is necessary because LLVM does not support const bitcasting
a struct with padding bytes, which is needed to lower a const union value
to LLVM, when a field other than the most-aligned is active. Instead,
we must lower to an unnamed struct, and pointer cast at usage sites
of the global. Such an unnamed struct is the cause of the global type
mismatch, because we don't have the LLVM type until the *value* is created,
whereas the global needs to be created based on the type alone, because
lowering the value may reference the global as a pointer.
2021-12-28 01:53:58 -07:00
Andrew Kelley
232f8a291d stage2: fix build on 32-bit targets
Regressed in 85d4c8620f602726b159efe1fe2ea0e07e3c5b59.
2021-12-28 01:15:36 -07:00