125 Commits

Jakub Konka
f57b059e58 regalloc: refactor locking multiple registers at once 2022-05-07 13:27:11 +02:00
Jakub Konka
197c2a465f regalloc: rename freeze/unfreeze to lock/unlock registers 2022-05-07 10:46:05 +02:00
Jakub Konka
ac954eb539 regalloc: ensure we only freeze/unfreeze at the outermost scope
This prevents a nasty type of bug where we accidentally unfreeze
a register that was frozen purposely in the outer scope, risking
accidental realloc of a taken register.

Fix CF flags spilling on aarch64 backend.
2022-05-07 00:57:55 +02:00
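A minimal sketch of the invariant described above, using hypothetical names rather than the actual register manager API: lock() hands a token only to the first (outermost) locker, so an inner scope can never unlock a register it did not lock itself.

    const std = @import("std");

    const RegisterLock = struct {
        locked: [16]bool = [_]bool{false} ** 16,

        /// Returns null if the register is already locked by an outer
        /// scope; only a caller holding a token may unlock.
        fn lock(self: *RegisterLock, reg: u4) ?u4 {
            if (self.locked[reg]) return null;
            self.locked[reg] = true;
            return reg;
        }

        fn unlock(self: *RegisterLock, token: u4) void {
            self.locked[token] = false;
        }
    };

    test "only the outermost scope unlocks" {
        var rm = RegisterLock{};
        const outer = rm.lock(3).?;
        {
            // Inner scope: lock() yields no token, so its deferred
            // unlock is a no-op and cannot free the outer lock.
            const inner = rm.lock(3);
            defer if (inner) |t| rm.unlock(t);
            try std.testing.expect(inner == null);
        }
        try std.testing.expect(rm.locked[3]); // still held
        rm.unlock(outer);
    }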
joachimschmidt557
c2d2307d09 stage2 AArch64: initial implementation of {add,sub}_with_overflow 2022-05-05 21:43:35 +02:00
Andrew Kelley
65389dc280 stage2: improve inline asm stage1 compatibility
 * outputs can have names and be referenced with template replacements
   the same as inputs.
 * fix print_air.zig not decoding correctly.
 * LLVM backend: use a table for template names for simplicity
2022-05-02 22:14:17 -07:00
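A small illustration of the feature (x86_64 AT&T syntax; not taken from the commit): the output operand is named [ret] and referenced in the template as %[ret], exactly like an input operand would be.

    fn answer() u64 {
        return asm ("movq $42, %[ret]"
            : [ret] "=r" (-> u64)
        );
    }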
Andrew Kelley
09f1d62bdf add new builtin function @tan
The reason for having `@tan` is that we already have `@sin` and `@cos`
because some targets have machine code instructions for them. When the
implementation instead needs to go into compiler-rt, sin, cos, and tan
all share a common dependency which includes a table of data. To avoid
duplicating this table, we promote tan to become a builtin alongside
sin and cos.

ZIR: The tag enum is at capacity so this commit moves
`field_call_bind_named` to be `extended`. I measured this as one of
the least used tags in the zig codebase.

Fix libc math suffix for `f32` being wrong in both stage1 and stage2.
stage1: add missing libc prefix for float functions.
2022-04-27 16:45:23 -07:00
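A minimal usage sketch (illustrative, matching the other float builtins):

    const std = @import("std");

    test "@tan works like @sin and @cos" {
        const x: f32 = 0.0;
        // Lowered to a machine instruction where available, otherwise
        // to the shared compiler-rt implementation mentioned above.
        try std.testing.expectEqual(@as(f32, 0.0), @tan(x));
    }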
Andrew Kelley
f7596ae942 stage2: use indexes for Decl objects
Rather than allocating Decl objects with an Allocator, we instead allocate
them with a SegmentedList. This provides four advantages:
 * Stable memory so that one thread can access a Decl object while another
   thread allocates additional Decl objects from this list.
 * It allows us to use u32 indexes to reference Decl objects rather than
   pointers, saving memory in Type, Value, and dependency sets.
 * Using integers to reference Decl objects rather than pointers makes
   serialization trivial.
 * It provides a unique integer to be used for anonymous symbol names,
   avoiding multi-threaded contention on an atomic counter.
2022-04-20 17:37:35 -07:00
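A sketch of the idea against today's std.SegmentedList API (simplified Decl type and a hypothetical helper, not the compiler's actual code):

    const std = @import("std");

    const Decl = struct { name: []const u8 };

    /// 4-byte index replacing *Decl: smaller in Type, Value, and
    /// dependency sets, trivially serializable, and a ready-made
    /// unique integer for anonymous symbol names.
    const DeclIndex = u32;

    fn allocateDecl(gpa: std.mem.Allocator, decls: *std.SegmentedList(Decl, 0)) !DeclIndex {
        const index: DeclIndex = @intCast(decls.len);
        // SegmentedList never moves existing items when it grows, so a
        // *Decl obtained via at() stays valid while other threads append.
        try decls.append(gpa, .{ .name = "" });
        return index;
    }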
joachimschmidt557
3bfb1616db stage2 ARM: move genArgDbgInfo back to CodeGen
This removes the questionable Air -> Mir dependency that existed
before. The x86_64 backend also performed this change.
2022-04-16 09:41:27 +02:00
Andrew Kelley
2587474717 stage2: progress towards stage3
 * The `@bitCast` workaround is removed in favor of `@ptrCast` properly
   doing element casting for slice element types. This required an
   enhancement both to stage1 and stage2.
 * stage1 incorrectly accepts `.{}` instead of `{}`. stage2 code that
   abused this is fixed.
 * Make some parameters comptime to support functions in switch
   expressions (as opposed to making them function pointers).
 * Avoid relying on local temporaries being mutable.
 * Workarounds for when stage1 and stage2 disagree on function pointer
   types.
 * Workaround recursive formatting bug with a `@panic("TODO")`.
 * Remove unreachable `else` prongs for some inferred error sets.

All in effort towards #89.
2022-04-14 10:12:45 -07:00
Andrew Kelley
b0edd8752a Liveness: modify encoding to support over 32 operands
Prior to this, Liveness encoded `asm`, `call`, and `aggregate_init` with
a single 32-bit integer, allowing up to 35 operands (3 are provided by
the regular tomb_bits). However, the Zig language allows function calls
with more than 35 arguments, inline assembly with more than 35 inputs,
and anonymous tuples with more than 35 elements.

The new encoding stores an index to the extra array instead of the bits
directly, and then as many extra elements as needed to encode all the
operands. The MSB is used as a flag to tell which element is the last
one, allowing for 31 bits per element.

Prior to this, print_air did not bother correctly printing tombstones
for these instructions; now it does.

In addition to updating the BigTomb iteration logic in the machine code
backends, this commit extracts the common logic into the Liveness namespace.
2022-04-12 11:22:12 -07:00
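A hedged sketch of the encoding (illustrative helpers, not the compiler's actual Liveness code): each extra element contributes 31 death bits, and a set MSB marks the final element.

    /// True if `operand` dies at this instruction, reading the death
    /// bits that start at extra[start]. Bits 0..30 of each element are
    /// death bits; bit 31 is reserved for the "last element" flag.
    fn operandDies(extra: []const u32, start: usize, operand: usize) bool {
        const elem = extra[start + operand / 31];
        const bit: u5 = @intCast(operand % 31);
        return (elem >> bit) & 1 != 0;
    }

    /// Number of extra elements consumed: walk until the MSB flag.
    /// The encoding guarantees a terminating element exists.
    fn bigTombLen(extra: []const u32, start: usize) usize {
        var i = start;
        while (extra[i] & 0x8000_0000 == 0) : (i += 1) {}
        return i - start + 1;
    }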
joachimschmidt557
8c12ad98b8
stage2 ARM: implement mul_with_overflow for ints <= 32 bits 2022-04-01 22:51:18 +02:00
joachimschmidt557
c4778fc029
stage2 ARM: implement mul_with_overflow for ints <= 16 bits 2022-04-01 22:02:56 +02:00
joachimschmidt557
77e70189f4
stage2 ARM: implement shl_with_overflow for ints <= 32 bits 2022-04-01 22:02:56 +02:00
joachimschmidt557
37a8c28802
stage2 ARM: implement add/sub_with_overflow for ints < 32 bits 2022-04-01 22:02:56 +02:00
joachimschmidt557
7285f0557c
stage2 ARM: implement add/sub_with_overflow for u32/i32 2022-04-01 22:02:55 +02:00
joachimschmidt557
e2e69803dc
stage2 ARM: change binOp lowering mechanism to use Mir tags
The Air -> Mir correspondence is not 1:1, so this better represents
what Mir instruction we actually want to generate.
2022-04-01 22:02:51 +02:00
Veikka Tuominen
75c2cff40e stage2: handle assembly input names 2022-03-31 01:33:28 -04:00
Andrew Kelley
05947ea870 stage2: implement @intToError with safety
This commit introduces a new AIR instruction `cmp_lt_errors_len`. It's
specific to this use case for two reasons:

 * The total number of errors is not stable during semantic analysis; it
   can only be reliably checked when flush() is called. So the backend
   that is lowering the instruction must emit a relocation of some kind
   and then populate it during flush().
 * The fewer AIR instructions in memory, the better for compiler
   performance, so we squish complex meanings into AIR tags without
   hesitation.

The instruction is implemented only in the LLVM backend so far. It does
this by creating a simple function which is gutted and re-populated
with each flush().

AstGen now uses ResultLoc.coerced_ty for `@intToError` and Sema does the
coercion.
2022-03-29 22:19:06 -07:00
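A usage sketch under the 2022 builtin names (@intToError / @errorToInt; illustrative, not from the commit):

    const std = @import("std");

    test "@intToError round-trips @errorToInt" {
        const int = @errorToInt(error.Oops);
        // Safety-checked: a valid code must satisfy int < total error
        // count, which is what cmp_lt_errors_len verifies at runtime.
        try std.testing.expect(@intToError(int) == error.Oops);
    }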
John Schmidt
f47db0a0db sema: use pl_op for @select 2022-03-25 16:13:54 +01:00
John Schmidt
12d5efcbe6 stage2: implement @select 2022-03-25 16:13:54 +01:00
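A usage sketch of the builtin (illustrative):

    const std = @import("std");

    test "@select picks lanes by predicate" {
        const pred = @Vector(4, bool){ true, false, true, false };
        const a = @Vector(4, f32){ 1, 2, 3, 4 };
        const b = @Vector(4, f32){ 5, 6, 7, 8 };
        // Lane i comes from a[i] when pred[i] is true, else b[i].
        const c = @select(f32, pred, a, b); // { 1, 6, 3, 8 }
        try std.testing.expectEqual(@as(f32, 6), c[1]);
    }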
Andrew Kelley
98b932cfab fix merge conflicts 2022-03-22 20:17:43 -07:00
joachimschmidt557
be1cca3416 stage2 ARM: implement comparison of optional pointers 2022-03-22 20:16:05 -07:00
joachimschmidt557
95e166b2e1 stage2 ARM: implement min, max for integers <= 32 bits 2022-03-22 20:16:05 -07:00
joachimschmidt557
62529a291b stage2 ARM: More support for error unions 2022-03-22 20:16:05 -07:00
joachimschmidt557
a4e8294c91 stage2 ARM: change semantics of MCValue.stack_argument_offset
MCValue.stack_argument_offset now has the same semantics as
MCValue.stack_offset
2022-03-22 20:16:05 -07:00
joachimschmidt557
6ac04d8fd7 stage2 ARM: change semantics of MCValue.stack_offset
A stack_offset will now denote the exact offset applied to the start
of the stack frame (= fp when a frame pointer is emitted)
2022-03-22 20:16:05 -07:00
Andrew Kelley
593130ce0a stage2: lazy @alignOf
Add a `target` parameter to every function that deals with Type and
Value.
2022-03-22 15:45:58 -07:00
William Sengir
0f48307041 stage2: add AIR instruction cmp_vector
The existing `cmp_*` instructions get their result type from `lhs`, but
vector comparison will always return a vector of bools with only the
length derived from its operands. This necessitates the creation of a
new AIR instruction.
2022-03-21 16:54:19 -07:00
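For illustration, the language-level behavior that motivates the new instruction:

    const std = @import("std");

    test "vector comparison yields a vector of bools" {
        const a = @Vector(4, i32){ 1, 2, 3, 4 };
        const b = @Vector(4, i32){ 4, 3, 2, 1 };
        // The result type is @Vector(4, bool) no matter the operand
        // element type; only the length comes from the operands.
        const lt = a < b;
        try std.testing.expect(lt[0] and !lt[3]);
    }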
Veikka Tuominen
a8520fbd0f stage2: add dbg_block_{begin,end} instruction 2022-03-19 11:20:38 +02:00
joachimschmidt557
c32e2c4d3c
stage2 ARM: remove MCValue.embedded_in_code 2022-03-18 12:19:22 +01:00
joachimschmidt557
3ecba7d7a2
stage2 ARM: implement slice_elem_ptr, ptr_elem_ptr 2022-03-18 12:12:14 +01:00
Andrew Kelley
7233a3324a stage2: implement @reduce
Notably, Value.eql and Value.hash are improved to treat NaN as equal to
itself, so that Type/Value can be hash map keys. Likewise float hashing
normalizes the float value before computing the hash.
2022-03-17 17:24:35 -07:00
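A usage sketch (illustrative):

    const std = @import("std");

    test "@reduce folds a vector into a scalar" {
        const v = @Vector(4, i32){ 1, 2, 3, 4 };
        try std.testing.expectEqual(@as(i32, 10), @reduce(.Add, v));
        try std.testing.expectEqual(@as(i32, 4), @reduce(.Max, v));
    }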
joachimschmidt557
dcc1de12b0
stage2 ARM: implement addwrap, subwrap, mulwrap 2022-03-16 20:20:07 +01:00
joachimschmidt557
2412ac2c5f
stage2 ARM: fix shl for ints with bits < 32 2022-03-16 20:20:07 +01:00
joachimschmidt557
0eebdfcad3
stage2 ARM: fix bitwise negation of ints with bits < 32 2022-03-16 20:20:07 +01:00
joachimschmidt557
ca1ffb0951
stage2 ARM: genSetStack for stack_argument_offset 2022-03-16 20:19:58 +01:00
Veikka Tuominen
d83a26f068 stage2 llvm: keep track of inlined functions 2022-03-16 10:53:41 +02:00
Veikka Tuominen
0343811836 Sema: emit dbg_func around inline calls 2022-03-16 09:34:26 +02:00
Andrew Kelley
0bc9635490 stage2: add debug info for locals in the LLVM backend
Adds 2 new AIR instructions:
 * dbg_var_ptr
 * dbg_var_val

Sema no longer emits dbg_stmt AIR instructions when strip=true.

LLVM backend: fixed lowerPtrToVoid for the case where calling
ptrAlignment on the element type is problematic.

LLVM backend: fixed alloca instructions improperly getting debug
location annotated, causing chaotic debug info behavior.

zig_llvm.cpp: fixed incorrect bindings for a function that should use
unsigned integers for line and column.

A bunch of C test cases regressed because the new dbg_var AIR
instructions caused their operands to be kept alive, exposing latent bugs.
Mostly it's just a problem that the C backend lowers mutable
and const slices to the same C type, so we need to represent that in the
C backend instead of printing two duplicate typedefs.
2022-03-13 03:41:31 -04:00
Andrew Kelley
4c1cc4d8d9
Merge pull request #11120 from Vexu/stage2
Stage2: make std.rand tests pass
2022-03-11 13:48:28 -05:00
joachimschmidt557
4590e980f7
stage2 ARM: implement caller-saved registers 2022-03-11 14:12:11 +01:00
joachimschmidt557
06058ed6f3
stage2 regalloc: replace Register.allocIndex with generic indexOfReg
* callee_preserved_regs and other ABI-specific information have been
moved to the respective abi.zig files
2022-03-11 13:29:16 +01:00
Veikka Tuominen
cba68090a6 stage2: implement @shuffle at runtime 2022-03-11 13:12:32 +02:00
Andrew Kelley
078037ab9b stage2: passing threadlocal tests for x86_64-linux
 * use the real start code for LLVM backend with x86_64-linux
   - there is still a check for zig_backend after initializing the TLS
     area to skip some stuff.
 * introduce new AIR instructions and implement them for the LLVM
   backend. They are the same as `call` except with a modifier.
   - call_always_tail
   - call_never_tail
   - call_never_inline
 * LLVM backend calls hasRuntimeBitsIgnoringComptime in more places to
   avoid unnecessarily depending on comptimeOnly being resolved for some
   types.
 * LLVM backend: remove duplicate code for setting linkage and value
   name. The canonical place for this is in `updateDeclExports`.
 * LLVM backend: do some assembly template massaging to make `%%`
   rendered as `%`. More hacks will be needed to make inline assembly
   catch up with stage1.
2022-03-11 00:04:42 -07:00
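For reference, the language feature these AIR variants carry, shown with today's @call syntax (the 2022 syntax wrapped the modifier in a CallOptions struct):

    const std = @import("std");

    fn add(a: i32, b: i32) i32 {
        return a + b;
    }

    test "call modifiers select the AIR call variant" {
        // .never_inline maps to call_never_inline; .always_tail and
        // .never_tail map to the other two new instructions.
        const sum = @call(.never_inline, add, .{ 1, 2 });
        try std.testing.expectEqual(@as(i32, 3), sum);
    }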
joachimschmidt557
95fc41b2b4 stage2 ARM: implement ret_load 2022-03-08 21:10:04 +01:00
joachimschmidt557
3ea603c82a stage2 ARM: implement ptr_add, ptr_sub for all element sizes
Also reduces slice_elem_val to ptr_add, simplifying the implementation
2022-03-08 10:54:08 +01:00
Andrew Kelley
71b8760d3b stage2: rework @mulAdd
 * mul_add AIR instruction: use `pl_op` instead of `ty_pl`. The type is
   always the same as the operand; no need to waste bytes redundantly
   storing the type.
 * AstGen: use coerced_ty for all the operands except for one which we
   use to communicate the type.
 * Sema: use the correct source location for requireRuntimeBlock in
   handling of `@mulAdd`.
 * native backends: handle liveness even for the functions that are
   TODO.
 * C backend: implement `@mulAdd`. It lowers to libc calls.
 * LLVM backend: make `@mulAdd` handle all float types.
   - improved fptrunc and fpext to handle f80 with compiler-rt calls.
 * Value.mulAdd: handle all float types and use the `@mulAdd` builtin.
 * behavior tests: revert the changes to testing `@mulAdd`. These
   changes broke the test coverage, making it only tested at
   compile-time.

Improved f80 support:
 * std.math.fma handles f80
 * move fma functions from freestanding libc to compiler-rt
   - add __fmax and fmal
   - make __fmax and fmaq only exported when they don't alias fmal.
   - make their linkage weak just like the rest of compiler-rt symbols.
 * removed `longDoubleIsF128` and replaced it with `longDoubleIs` which
   takes a type as a parameter. The implementation is now more accurate
   and handles more targets. Similarly, in stage2 the function
   CTypes.sizeInBits is more accurate for long double for more targets.
2022-03-06 16:11:39 -07:00
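A usage sketch (illustrative):

    const std = @import("std");

    test "@mulAdd computes a*b + c with a single rounding" {
        const x = @mulAdd(f64, 2.0, 3.0, 1.0);
        try std.testing.expectEqual(@as(f64, 7.0), x);
    }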
John Schmidt
6637335981 stage2: implement @mulAdd for scalar floats 2022-03-06 15:36:56 -07:00
joachimschmidt557
d486a7b811 stage2 ARM: generate less no-op branches
The checks detecting such no-op branches (essentially instructions
that branch to the instruction immediately following the branch) were
tightened to catch more of these occurrences.
2022-03-04 23:28:14 +01:00
Luuk de Gram
43cb19ea4d wasm: Implement @wasmMemoryGrow builtin
Similar to the other wasm builtin, this implements the grow variation where the memory
index is a comptime-known value. Both the operand and the result are runtime values.
This also verifies during semantic analysis that the target we're building for is wasm,
and emits a compilation error otherwise. As a result, no backends other than the wasm
and LLVM backends have to handle this AIR instruction.
2022-03-03 16:33:46 -07:00
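A hedged usage sketch (compiles only when targeting wasm, per the semantic check above; signature per current documentation, where a page is 64 KiB):

    fn growHeap(delta_pages: usize) ?usize {
        // Memory index 0 must be comptime-known; the delta and the
        // result are runtime values. Returns the prior size in pages,
        // or -1 on failure.
        const prev = @wasmMemoryGrow(0, delta_pages);
        if (prev == -1) return null;
        return @intCast(prev);
    }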