16769 Commits

Author SHA1 Message Date
boofexxx
435beb4e1d std: fix doc comment typo in os.zig 2022-02-07 09:55:47 +01:00
Jakub Konka
5944e89016 stage2: lower unnamed constants in Elf and MachO
* link: add a virtual function `lowerUnnamedConsts`, similar to
  `updateFunc` or `updateDecl` which needs to be implemented by the
  linker backend in order to be used with the `CodeGen` code
* elf: implement `lowerUnnamedConsts` specialization where we
  lower unnamed constants to `.rodata` section. We keep track of the
  atoms encompassing the lowered unnamed consts in a global table
  indexed by parent `Decl`. When the `Decl` is updated or destroyed,
  we clear the unnamed consts referenced within the `Decl`.
* macho: implement `lowerUnnamedConsts` specialization where we
  lower unnamed constants to `__TEXT,__const` section. We keep track of the
  atoms encompassing the lowered unnamed consts in a global table
  indexed by parent `Decl`. When the `Decl` is updated or destroyed,
  we clear the unnamed consts referenced within the `Decl`.
* x64: change `MCValue.linker_sym_index` into two `MCValue`s: `.got_load` and
  `.direct_load`. The former signifies to the emitter that it should
  emit a GOT load relocation, while the latter that it should emit
  a direct load (`SIGNED`) relocation.
* x64: lower `struct` instantiations
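
As an illustration of the `.got_load` / `.direct_load` split described above, here is a minimal sketch. It assumes a heavily simplified stand-in for `MCValue` with `u32` symbol indices; `RelocKind` and `relocFor` are hypothetical names that are not taken from the compiler source.

```zig
const std = @import("std");

// Simplified stand-in for the backend's MCValue (assumption: the real union
// has many more variants and different payloads).
const MCValue = union(enum) {
    immediate: u64,
    got_load: u32, // symbol index, to be loaded through the GOT
    direct_load: u32, // symbol index, to be loaded directly
};

// Hypothetical relocation kinds the emitter could choose between.
const RelocKind = enum { got_load, signed };

// Maps a value to the relocation the emitter would need, if any.
fn relocFor(mcv: MCValue) ?RelocKind {
    return switch (mcv) {
        .got_load => RelocKind.got_load,
        .direct_load => RelocKind.signed,
        .immediate => null,
    };
}

test "relocation kind follows the MCValue tag" {
    try std.testing.expect(relocFor(MCValue{ .got_load = 1 }).? == RelocKind.got_load);
    try std.testing.expect(relocFor(MCValue{ .direct_load = 1 }).? == RelocKind.signed);
    try std.testing.expect(relocFor(MCValue{ .immediate = 0 }) == null);
}
```

Encoding the choice in the tag means the emitter does not have to re-derive which relocation applies from the symbol index alone.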
2022-02-07 08:39:00 +01:00
Andrew Kelley
21135387fb
Merge pull request #10782 from topolarity/gate-child-processes
Avoid depending on child process execution when not supported by host OS
2022-02-07 00:31:17 -05:00
Andrew Kelley
33fa296019 stage2: pass proper can_exit_early value to LLD
and adjust the warning message for invoking LLD twice in the same
process.
2022-02-06 22:29:40 -07:00
Cody Tapscott
c1cf158729 Replace argvCmd with std.mem.join 2022-02-06 22:21:46 -07:00
Cody Tapscott
5065830aa0 Avoid depending on child process execution when not supported by host OS
In accordance with the requesting issue (#10750):
- `zig test` skips any tests that it cannot spawn, returning success
- `zig run` and `zig build` exit with failure, reporting the command that cannot be run
- `zig clang`, `zig ar`, etc. already punt directly to the appropriate clang/lld main(), even before this change
- Native `libc` Detection is not supported

Additionally, `exec()` and related Builder functions error at run-time, reporting the command that cannot be run
2022-02-06 22:21:46 -07:00
Andrew Kelley
069dd01ce4
Merge pull request #10740 from ziglang/stage2-float-arithmetic
stage2: add more float arithmetic and f80 support
2022-02-06 22:41:43 -05:00
Andrew Kelley
eb82fdf96c Sema: panic instead of lowering to unavailable compiler-rt functions
Once the relevant compiler_rt functions are implemented, these panics
can be removed.
2022-02-06 20:38:57 -07:00
Andrew Kelley
65b6faa048 Sema: avoid @intToFloat for f80 which breaks on non-x86 targets
Currently Zig lowers `@intToFloat` for f80 incorrectly on non-x86
targets:

```
broken LLVM module found:
UIToFP result must be FP or FP vector
  %62 = uitofp i64 %61 to i128
SIToFP result must be FP or FP vector
  %66 = sitofp i64 %65 to i128
```

This happens because on such targets, we use i128 instead of x86_fp80 in
order to avoid "LLVM ERROR: Cannot select". `@intToFloat` must be
lowered differently to account for this difference as well.
2022-02-06 20:23:40 -07:00
Andrew Kelley
3bcce5f6d1 Sema: implement writing structs to memory at comptime 2022-02-06 20:07:43 -07:00
Andrew Kelley
d4805472c3 compiler_rt: addXf3: add coercion to @clz
We're going to remove the first parameter from this function in the
future. Stage2 already ignores the first parameter. So we put an `@as`
in here to make it work for both.
2022-02-06 20:06:00 -07:00
Andrew Kelley
495fd4ee3e AstGen: refactor redundant expressions
This is a non-functional change.
2022-02-06 19:45:49 -07:00
Marc Tiehuis
53e6c719ef std/math: optimize division with divisors less than a half-limb
This adds a new path which avoids using compiler_rt generated div
udivmod instructions in the case that a divisor is less than half the
max usize value. Two half-limb divisions are performed instead which
ensures that non-emulated division instructions are actually used. This
does not improve the udivmod code which should still be reviewed
independently of this issue.
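
The two half-limb divisions can be sketched roughly as follows. This is an illustrative sketch only, assuming 64-bit limbs stored least significant first; `divByHalfLimb` is a hypothetical name and the code is not copied from `std.math.big.int`.

```zig
const std = @import("std");

const half_bits = 32;
const half_mask: u64 = (1 << half_bits) - 1;

/// Divides the little-endian limb array `a` by `b`, writes the quotient to
/// `quo` and returns the remainder. Requires 0 < b < 2^32.
fn divByHalfLimb(quo: []u64, a: []const u64, b: u64) u64 {
    std.debug.assert(b != 0 and b <= half_mask);
    std.debug.assert(quo.len == a.len);

    var rem: u64 = 0;
    var i: usize = a.len;
    while (i > 0) {
        i -= 1;
        // rem < b < 2^32, so (rem << 32) | half always fits in 64 bits and
        // each step is a plain 64-bit hardware division.
        const hi = (rem << half_bits) | (a[i] >> half_bits);
        const hi_quo = hi / b;
        rem = hi % b;

        const lo = (rem << half_bits) | (a[i] & half_mask);
        const lo_quo = lo / b;
        rem = lo % b;

        quo[i] = (hi_quo << half_bits) | lo_quo;
    }
    return rem;
}

test "2^64 divided by 3" {
    const a = [_]u64{ 0, 1 }; // value 2^64
    var quo = [_]u64{ 0, 0 };
    const rem = divByHalfLimb(&quo, &a, 3);
    try std.testing.expect(rem == 1);
    try std.testing.expect(quo[0] == 0x5555555555555555);
    try std.testing.expect(quo[1] == 0);
}
```

Because the running remainder stays below the divisor, no intermediate value needs more than one limb, which is why no double-width (emulated) division is required.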

Notably this improves the performance of the toString implementation of
non-power-of-two bases considerably.

Division performance is improved ~1000% based on some coarse testing.

The following test code is used to provide a rough comparison between
the old vs. new method.

```
const std = @import("std");
const Managed = std.math.big.int.Managed;

const allocator = std.heap.c_allocator;

fn fib(a: *Managed, n: usize) !void {
    var b = try Managed.initSet(allocator, 1);
    defer b.deinit();
    var c = try Managed.init(allocator);
    defer c.deinit();

    var i: usize = 0;
    while (i < n) : (i += 1) {
        try c.add(a.toConst(), b.toConst());

        a.swap(&b);
        b.swap(&c);
    }
}

pub fn main() !void {
    var a = try Managed.initSet(allocator, 0);
    defer a.deinit();

    try fib(&a, 1_000_000);

    // Note: Next two lines (and printed digit count) omitted on no-print version.
    const as = try a.toString(allocator, 10, .lower);
    defer allocator.free(as);

    std.debug.print("fib: digit count: {}, limb count: {}\n", .{ as.len, a.limbs.len });
}
```

```
==> time.no-print <==
limb count: 10849

________________________________________________________
Executed in   10.60 secs    fish           external
   usr time   10.44 secs    0.00 millis   10.44 secs
   sys time    0.02 secs    1.12 millis    0.02 secs

==> time.old <==
fib: digit count: 208988, limb count: 10849

________________________________________________________
Executed in   22.78 secs    fish           external
   usr time   22.43 secs    1.01 millis   22.43 secs
   sys time    0.03 secs    0.13 millis    0.03 secs

==> time.optimized <==
fib: digit count: 208988, limb count: 10849

________________________________________________________
Executed in   11.59 secs    fish           external
   usr time   11.56 secs    1.03 millis   11.56 secs
   sys time    0.03 secs    0.12 millis    0.03 secs
```

Perf data for non-optimized and optimized, verifying no udivmod is
generated by the new code.

```
$ perf report -i perf.data.old --stdio
- Total Lost Samples: 0
-
- Samples: 90K of event 'cycles:u'
- Event count (approx.): 71603695208
-
- Overhead  Command  Shared Object     Symbol
- ........  .......  ................  ...........................................
-
    52.97%  t        t                 [.] compiler_rt.udivmod.udivmod
    45.97%  t        t                 [.] std.math.big.int.Mutable.addCarry
     0.83%  t        t                 [.] main
     0.08%  t        libc-2.33.so      [.] __memmove_avx_unaligned_erms
     0.08%  t        t                 [.] __udivti3
     0.03%  t        [unknown]         [k] 0xffffffff9a0010a7
     0.02%  t        t                 [.] std.math.big.int.Managed.ensureCapacity
     0.01%  t        libc-2.33.so      [.] _int_malloc
     0.00%  t        libc-2.33.so      [.] __malloc_usable_size
     0.00%  t        libc-2.33.so      [.] _int_free
     0.00%  t        t                 [.] 0x0000000000004a80
     0.00%  t        t                 [.] std.heap.CAllocator.resize
     0.00%  t        libc-2.33.so      [.] _mid_memalign
     0.00%  t        libc-2.33.so      [.] sysmalloc
     0.00%  t        libc-2.33.so      [.] __posix_memalign
     0.00%  t        t                 [.] std.heap.CAllocator.alloc
     0.00%  t        ld-2.33.so        [.] do_lookup_x

$ perf report -i perf.data.optimized --stdio
- Total Lost Samples: 0
-
- Samples: 46K of event 'cycles:u'
- Event count (approx.): 36790112336
-
- Overhead  Command  Shared Object     Symbol
- ........  .......  ................  ...........................................
-
    79.98%  t        t                 [.] std.math.big.int.Mutable.addCarry
    15.14%  t        t                 [.] main
     4.58%  t        t                 [.] std.math.big.int.Managed.ensureCapacity
     0.21%  t        libc-2.33.so      [.] __memmove_avx_unaligned_erms
     0.05%  t        [unknown]         [k] 0xffffffff9a0010a7
     0.02%  t        libc-2.33.so      [.] _int_malloc
     0.01%  t        t                 [.] std.heap.CAllocator.alloc
     0.01%  t        libc-2.33.so      [.] __malloc_usable_size
     0.00%  t        libc-2.33.so      [.] systrim.constprop.0
     0.00%  t        libc-2.33.so      [.] _mid_memalign
     0.00%  t        t                 [.] 0x0000000000000c7d
     0.00%  t        libc-2.33.so      [.] malloc
     0.00%  t        ld-2.33.so        [.] check_match
```

Closes #10630.
2022-02-06 21:39:34 -05:00
Luuk de Gram
545aa790a4 Sema: Fix memory leak 2022-02-06 21:37:45 -05:00
Andrew Kelley
287ff4ab58 stage2: add more float arithmetic and f80 support
AstGen: Fixed bug where f80 types in source were triggering illegal
behavior.

Value: handle f80 in floating point arithmetic functions.

Value: implement floatRem and floatMod

This commit introduces dependencies on compiler-rt that are not
implemented. Those are a prerequisite to merging this branch.
2022-02-06 19:27:54 -07:00
John Schmidt
fd1284ebd0 stage2: apply type coercion in if expressions
When setting the break value in an if expression we must explicitly
check whether a result location type coercion needs to happen. This was
already done for switch expressions, so let's just imitate that check
and fix for if expressions. To make this possible, we now also propagate
`rl_ty_inst` to sub scopes.
2022-02-06 21:26:26 -05:00
joachimschmidt557
adc9a282d8 stage2 ARM: fix load and store for abi_size < 4
Previously, in these cases, we would emit the ldr instruction even
though ldrb or ldrh are the correct instructions.
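
A minimal sketch of the size-based selection this fix implies, assuming only ABI sizes of 1, 2 and 4 bytes are handled; `LoadOp` and `loadOpForSize` are hypothetical names, not the backend's actual API.

```zig
const std = @import("std");

const LoadOp = enum { ldrb, ldrh, ldr };

/// Picks the load instruction matching the ABI size in bytes, or null when
/// a single load instruction does not cover that size.
fn loadOpForSize(abi_size: u32) ?LoadOp {
    return switch (abi_size) {
        1 => LoadOp.ldrb, // byte load
        2 => LoadOp.ldrh, // halfword load
        4 => LoadOp.ldr, // word load
        else => null,
    };
}

test "sub-word sizes do not use ldr" {
    try std.testing.expect(loadOpForSize(1).? == LoadOp.ldrb);
    try std.testing.expect(loadOpForSize(2).? == LoadOp.ldrh);
    try std.testing.expect(loadOpForSize(4).? == LoadOp.ldr);
    try std.testing.expect(loadOpForSize(8) == null);
}
```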
2022-02-06 23:29:36 +01:00
joachimschmidt557
4468abfc42 stage2 ARM: enable a handful of passing behavior tests 2022-02-06 02:23:31 -05:00
Johannes Löthberg
6f87f49f3d CLI: remove remainders of --verbose-ast and --verbose-tokenize
These options were removed in 5e63baae8 (CLI: remove --verbose-ast and
--verbose-tokenize, 2021-06-09) but some remainders were left in.

Signed-off-by: Johannes Löthberg <johannes@kyriasis.com>
2022-02-06 01:57:04 -05:00
Andrew Kelley
8dcb1eba60
Merge pull request #10738 from Vexu/f80
Add compiler-rt functions for f80
2022-02-05 20:57:32 -05:00
Jakub Konka
f132f426b9 x86_64: add distinct MCValue representing symbol index in the linker
For PIE targets, we defer getting an address of value until the linker
has allocated all atoms and performed the relocations. In codegen,
we represent this via `MCValue.linker_sym_index` value.
2022-02-06 00:34:24 +01:00
joachimschmidt557
4b3b487627 stage2 regalloc: Introduce error.OutOfRegisters 2022-02-06 00:14:48 +01:00
joachimschmidt557
d4c3475f3d stage2 ARM: clarify usage of unfreezeRegs in airSliceElemVal 2022-02-05 15:58:46 +01:00
praschke
f2a82bafae std: allow tests to use cache and setOutputDir 2022-02-05 16:33:57 +02:00
gwenzek
0e1afb4d98
stage2: add support for Nvptx target
sample command:

/home/guw/github/zig/stage2/bin/zig build-obj cuda_kernel.zig -target nvptx64-cuda -O ReleaseSafe
this will create a kernel.ptx

expose PtxKernel call convention from LLVM
kernels are `export fn f() callconv(.PtxKernel)`
2022-02-05 16:33:00 +02:00
rohlem
fbc06f9c91 std.build.TranslateCStep: add C macro support
The string construction code is moved out of std.build.LibExeObjStep
into std.build.constructCMacroArg, to allow reusing it elsewhere.
2022-02-05 03:17:07 -05:00
Veikka Tuominen
7d04ab1f14 std.process: add option to support single quotes to ArgIteratorGeneral 2022-02-05 02:59:13 -05:00
Jan Philipp Hafer
01d48e55a5 compiler_rt: optimize mulo
- use usize to decide if register size is big enough to store
  multiplication result or if division is necessary
- multiplication routine with check of integer bounds
- wrapping multiplication and division routine from Hacker's Delight
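
The two strategies listed above can be sketched for `i32` as follows. This is an illustration under stated assumptions, not the compiler_rt source: `mulOverflowsWide` and `mulOverflowsDiv` are hypothetical names, and the sketch only reports whether overflow occurs rather than returning the wrapped product.

```zig
const std = @import("std");

/// Bounds check: multiply in a wider type that is known to hold the result.
fn mulOverflowsWide(a: i32, b: i32) bool {
    const wide = @as(i64, a) * @as(i64, b);
    return wide < std.math.minInt(i32) or wide > std.math.maxInt(i32);
}

/// Division check for when no wider type is available: multiply with
/// wrapping, then divide back and compare.
fn mulOverflowsDiv(a: i32, b: i32) bool {
    const min = std.math.minInt(i32);
    const res = a *% b;
    // Any negative `a` times the minimum value overflows. Handling it here
    // also avoids evaluating min / -1 below.
    if (a < 0 and b == min) return true;
    return a != 0 and @divTrunc(res, a) != b;
}

test "both checks agree" {
    const min = std.math.minInt(i32);
    try std.testing.expect(!mulOverflowsWide(46340, 46340));
    try std.testing.expect(!mulOverflowsDiv(46340, 46340));
    try std.testing.expect(mulOverflowsWide(46341, 46341));
    try std.testing.expect(mulOverflowsDiv(46341, 46341));
    try std.testing.expect(mulOverflowsWide(-1, min));
    try std.testing.expect(mulOverflowsDiv(-1, min));
    try std.testing.expect(mulOverflowsDiv(min, -1));
    try std.testing.expect(!mulOverflowsDiv(min, 1));
}
```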
2022-02-05 01:35:46 -05:00
Veikka Tuominen
5a7d43df23 stage1: make f80 always size 16, align 16 2022-02-04 22:44:56 +02:00
Veikka Tuominen
b2f84c6714 stage1: implement f80 negation on non native targets 2022-02-04 22:38:14 +02:00
Veikka Tuominen
6a736f0c8c compiler-rt: add add/sub for f80 2022-02-04 22:38:13 +02:00
Veikka Tuominen
9bbd3ab257 compiler-rt: add comparison functions for f80 2022-02-04 22:22:43 +02:00
Veikka Tuominen
72cef17b1a compiler-rt: add trunc functions for f80 2022-02-04 22:18:44 +02:00
Veikka Tuominen
5c4ef1a64c compiler-rt: add extend functions for f80 2022-02-04 22:16:07 +02:00
joachimschmidt557
04f379dd41 stage2 ARM: optimize airSliceElemVal for elem_size 1 or 4
In these cases, the AIR inst can be lowered to only one ldr
instruction.

Also fixes shifts in arm.bits.Offset
2022-02-04 21:07:10 +01:00
Kirk Scheibelhut
71321b6941
Various documentation fixes
Co-authored-by: Kirk Scheibelhut <kjs@scheibo.com>
Co-authored-by: extrasharp <genericpb@gmail.com>
2022-02-04 21:27:50 +02:00
Andrew Kelley
95fbce2b95 Sema: fixes to fieldVal, resolveStructFully, Type.eql
fieldVal handles pointer to pointer to array. This can happen for
example, if a pointer to an array is used as the condition expression of
a for loop.

resolveStructFully handles tuples (by doing nothing).

fixed Type comparison for tuples to handle comptime fields properly.
2022-02-03 23:59:32 -07:00
Kazuki Sakamoto
64f7231f86 stage1: Fix missing LLD library 2022-02-04 01:45:44 -05:00
Mateusz Radomski
1b6a1e691f
Sema: check for NaNs in cmp (#10760) 2022-02-04 00:58:27 -05:00
Andrew Kelley
0893326e0e Sema: slice improvements
* resolve_inferred_alloc now gives a proper mutability attribute to the
  corresponding alloc instruction. Previously, it would fail to mark
  things const.
* slicing: fix the detection for when the end index equals the length
  of the underlying object. Previously it was using `end - start` but
  it should just use the end index directly. It also takes into account
  when slicing a comptime-known slice.
* `Type.sentinel`: fix not handling all slice tags
2022-02-03 21:05:10 -07:00
Andrew Kelley
71e0cca7a7
Merge pull request #10780 from Luukdegram/wasm-behavior-tests
stage2: Wasm - Account for stack alignment
2022-02-03 20:23:46 -05:00
Jakub Konka
4ca9a8d192 x64: implement storing to MCValue.memory for PIE targets 2022-02-04 00:37:43 +01:00
Luuk de Gram
588b88b987
Move passing behavior tests
Singular tests (such as in the bug ones) are moved to top level with exclusions for non-passing backends.
The big behavior tests such as array_llvm and slice are moved to the inner scope with the C backend disabled.
They all pass for the wasm backend now
2022-02-03 22:31:29 +01:00
Luuk de Gram
e35414bf5c
wasm: Refactor stack to account for alignment
We now calculate the total stack size required for the current frame.
The default alignment of the stack is 16 bytes, and will be overwritten when the alignment
of a given type is larger than that.

After we have generated all instructions for the body, we calculate the total stack size
by forward aligning the stack size while accounting for the max alignment.
We then insert a prologue into the body, where we subtract this size from the stack pointer
and save it inside a bottom stackframe local. We use this local then, to calculate
the stack pointer locals of all variables we allocate into the stack.

In a future iteration we can improve this further by storing the offsets as a new `stack_offset` `WValue`.
This has the benefit of not having to spend runtime cost of storing those offsets, but instead we append
those offsets whenever we need the value that lives in the stack.
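
The forward-alignment arithmetic described above can be sketched like this. It is a simplified illustration, assuming power-of-two alignments and a default stack alignment of 16 bytes as stated in the message; `Frame`, `alloc` and `totalSize` are hypothetical names, not the wasm backend's actual data structures.

```zig
const std = @import("std");

/// Rounds `size` up to the next multiple of `alignment` (a power of two).
fn alignForward(size: u32, alignment: u32) u32 {
    std.debug.assert(alignment != 0 and (alignment & (alignment - 1)) == 0);
    return (size + alignment - 1) & ~(alignment - 1);
}

/// Hypothetical per-function stack frame bookkeeping.
const Frame = struct {
    size: u32 = 0,
    max_align: u32 = 16, // default stack alignment

    /// Reserves space for one value and returns its offset in the frame.
    fn alloc(self: *Frame, size: u32, alignment: u32) u32 {
        if (alignment > self.max_align) self.max_align = alignment;
        const offset = alignForward(self.size, alignment);
        self.size = offset + size;
        return offset;
    }

    /// Total size to subtract from the stack pointer in the prologue.
    fn totalSize(self: Frame) u32 {
        return alignForward(self.size, self.max_align);
    }
};

test "frame size is forward aligned to the max alignment" {
    var frame = Frame{};
    try std.testing.expect(frame.alloc(1, 1) == 0);
    try std.testing.expect(frame.alloc(8, 8) == 8);
    try std.testing.expect(frame.alloc(4, 4) == 16);
    try std.testing.expect(frame.totalSize() == 32);

    // A type with a larger alignment overrides the default of 16.
    try std.testing.expect(frame.alloc(32, 32) == 32);
    try std.testing.expect(frame.totalSize() == 64);
}
```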
2022-02-03 21:53:48 +01:00
Luuk de Gram
ae1e3c8f9b
wasm: Implement vector_init for array & structs
Implements the instruction `vector_init` for structs and arrays.
For arrays, it checks if the element must be passed by reference or not.
When not, it can simply use the `offset` field of a store instruction to copy the values
into the array. When it is byref, it will move the pointer by the element size, and then perform
a store operation. This ensures types like structs will be moved into the right position.
For structs we will always move the pointer, as we currently cannot verify if all fields are
not by ref.
2022-02-03 21:43:25 +01:00
Luuk de Gram
29013220d9
wasm: Implement elem_ptr
This implements lowering elem_ptr for decls and constants.
To generate the correct pointer, we perform a relocation by using the addend
that represents the offset. The offset is calculated by taking the element's size
and multiplying that by the index.

For constants this generates a single immediate instruction, and for decls
this generates a single pointer address.
2022-02-03 21:42:48 +01:00
Jakub Konka
3832b58229
Merge pull request #10775 from ziglang/x64-freeze-api
stage2: migrate x64 to freeze regalloc API, and remove the concept of register exceptions
2022-02-03 20:12:35 +01:00
Jakub Konka
228b798af5 elf: generated DWARF debug info for named structs 2022-02-03 18:47:36 +01:00
Jakub Konka
74a01e3d64 stage2: remove the concept of register exceptions 2022-02-03 18:08:29 +01:00
Jakub Konka
e0b1170b67 x64: swap out register exceptions for freeze/unfreeze api 2022-02-03 17:55:22 +01:00