277 Commits

Author SHA1 Message Date
Andrew Kelley
ede76f4fe3 stage2: fix generics with non-comptime anytype parameters
The `comptime_args` field of Fn has a clarified purpose:
For generic function instantiations, there is a `TypedValue` here
for each parameter of the function:
 * Non-comptime parameters are marked with a `generic_poison` for the value.
 * Non-anytype parameters are marked with a `generic_poison` for the type.

Sema now has a `fn_ret_ty` field. Doc comments reproduced here:
> When semantic analysis needs to know the return type of the function whose body
> is being analyzed, this `Type` should be used instead of going through `func`.
> This will correctly handle the case of a comptime/inline function call of a
> generic function which uses a type expression for the return type.
> The type will be `void` in the case that `func` is `null`.
Various places in Sema are modified in accordance with this guidance.

Fixed `resolveMaybeUndefVal` not returning `error.GenericPoison` when
Value Tag of `generic_poison` is encountered.

Fixed generic function memoization incorrect equality checking. The
logic now clearly deals properly with any combination of anytype and
comptime parameters.

Fixed not removing generic function instantiation from the table in case
a compile errors in the rest of `call` semantic analysis. This required
introduction of yet another adapter which I have called
`GenericRemoveAdapter`. This one is nice and simple - it's the same hash
function (the same precomputed hash is passed in) but the equality
function checks pointers rather than doing any logic.

Inline/comptime function calls coerce each argument in accordance with
the function parameter type expressions. Likewise the return type
expression is evaluated and provided (see `fn_ret_ty` above).

There's a new compile error "unable to monomorphize function". It's
pretty unhelpful and will need to get improved in the future. It happens
when a type expression in a generic function did not end up getting
resolved at a callsite. This can happen, for example, if a runtime
parameter is attempted to be used where it needed to be comptime known:

```zig
fn foo(x: anytype) [x]u8 { _ = x; }
```

In this example, even if we pass a number such as `10` for `x`, it is
not marked `comptime`, so `x` will have a runtime known value, making
the return type unable to resolve.

In the LLVM backend I implement cmp instructions for float types to pass
some behavior tests that used floats.
2021-08-06 16:24:39 -07:00
Andrew Kelley
c03a04a589 stage2: return type expressions of generic functions
* ZIR encoding for function instructions have a body for the return
   type. This lets Sema for generic functions do the same thing it does
   for parameters, handling `error.GenericPoison` in the evaluation of
   the return type by marking the function as generic.

 * Sema: fix missing block around the new Decl arena finalization. This
   led to a memory corruption.

 * Added some floating point support to the LLVM backend but didn't get
   far enough to pass any new tests.
2021-08-05 19:19:19 -07:00
Andrew Kelley
f58cbef165 stage2: std.mem.eql works now
* The `indexable_ptr_len` ZIR instruction now uses a `none_or_ref`
   ResultLoc. This prevents an unnecessary `ref` instruction from being
   emitted.
 * Sema: Fix `analyzeCall` using the incorrect ZIR object for the
   generic function callee.
 * LLVM backend: `genTypedValue` supports a `Slice` type encoded with
   the `decl_ref` `Value`.
2021-08-04 23:02:13 -07:00
Andrew Kelley
d4468affb7 stage2 generics improvements: anytype and param type exprs
AstGen result locations now have a `coerced_ty` tag which is the same as
`ty` except it assumes that Sema will do a coercion, so it does not
redundantly add an `as` instruction into the ZIR code. This results in
cleaner ZIR and about a 14% reduction of ZIR bytes.

param and param_comptime ZIR instructions now have a block body for
their type expressions. This allows Sema to skip evaluation of the
block in the case that the parameter is comptime-provided. It also
allows a new mechanism to function: when evaluating type expressions of
generic functions, if it would depend on another parameter, it returns
`error.GenericPoison` which bubbles up and then is caught by the
param/param_comptime instruction and then handled.

This allows parameters to be evaluated independently so that the type
info for functions which have comptime or anytype parameters will still
have types populated for parameters that do not depend on values of
previous parameters (because evaluation of their param blocks will return
successfully instead of `error.GenericPoison`).

It also makes iteration over the block that contains function parameters
slightly more efficient since it now only contains the param
instructions.

Finally, it fixes the case where a generic function type expression contains
a function prototype. Formerly, this situation would cause shared state
to clobber each other; now it is in a proper tree structure so that
can't happen. This fix also required adding a field to Sema
`comptime_args_fn_inst` to make sure that the `comptime_args` field
passed into Sema is applied to the correct `func` instruction.

Source location for `node_offset_asm_ret_ty` is fixed; it was pointing at
the asm output name rather than the return type as intended.

Generic function instantiation is fixed, notably with respect to
parameter type expressions that depend on previous parameters, and with
respect to types which must be always comptime-known. This involves
passing all the comptime arguments at a callsite of a generic function,
and allowing the generic function semantic analysis to coerce the values
to the proper types (since it has access to the evaluated parameter type
expressions) and then decide based on the type whether the parameter is
runtime known or not. In the case of explicitly marked `comptime`
parameters, there is a check at the semantic analysis of the `call`
instruction.

Semantic analysis of `call` instructions does type coercion on the
arguments, which is needed both for generic functions and to make up for
using `coerced_ty` result locations (mentioned above).

Tasks left in this branch:
 * Implement the memoization table.
 * Add test coverage.
 * Improve error reporting and source locations for compile errors.
2021-08-04 21:11:31 -07:00
Andrew Kelley
d5f173d28f
Merge pull request #9496 from Luukdegram/stage2-wasm
stage2: wasm - Wrapping, intcast and optionals
2021-08-01 20:33:55 -04:00
Andrew Kelley
ddf14323ea stage2: implement @truncate 2021-08-01 16:13:58 -07:00
Luuk de Gram
6e139d124b
wasm: Resolve feedback (wrapping arbitrary int sizes)
- This ensures we honor the user's integer size when performing wrapping operations.
- Also, instead of using ensureCapacity, we now use ensureUnusedCapacity.
2021-08-01 21:30:06 +02:00
Luuk de Gram
61de59e121
wasm: Implement optionals
This uses the same approach as error unions,
meaning it's a `WValue` with its tag set to `multi_value`.
The initial index of the multi_value will contain the null-tag, used to check if the value
is null or not. The other values will be the payload.

To support the `.?` shorthand syntax, we save the result from checking the null-tag
into a new local, which can then be loaded later in the block to either hit `unreachable` or
set the actual payload value.

Currently, it seems `.?` and `orelse unreachable` results in different AIR structure.
TODO: Is this expected?
2021-08-01 11:07:23 +02:00
Luuk de Gram
5667ab7dcd
wasm: Implement wrapping operands, add opcodes to wasm.zig
- Some opcodes have the incorrect value set in std.
- Some opcodes were missing and have now been added to std.
- Adding wrapping operands for add,sub and mul.
- Implement intCast which either extends or shortens the type.
2021-08-01 11:07:23 +02:00
Andrew Kelley
040c6eaaa0 stage2: garbage collect unused anon decls
After this change, the frontend and backend cooperate to keep track of
which Decls are actually emitted into the machine code. When any backend
sees a `decl_ref` Value, it must mark the corresponding Decl `alive`
field to true.

This prevents unused comptime data from spilling into the output object
files. For example, if you do an `inline for` loop, previously, any
intermediate value calculations would have gone into the object file.
Now they are garbage collected immediately after the owner Decl has its
machine code generated.

In the frontend, when it is time to send a Decl to the linker, if it has
not been marked "alive" then it is deleted instead.

Additional improvements:
 * Resolve type ABI layouts after successful semantic analysis of a
   Decl. This is needed so that the backend has access to struct fields.
 * Sema: fix incorrect logic in resolveMaybeUndefVal. It should return
   "not comptime known" instead of a compile error for global variables.
 * `Value.pointerDeref` now returns `null` in the case that the pointer
   deref cannot happen at compile-time. This is true for global
   variables, for example. Another example is if a comptime known
   pointer has a hard coded address value.
 * Binary arithmetic sets the requireRuntimeBlock source location to the
   lhs_src or rhs_src as appropriate instead of on the operator node.
 * Fix LLVM codegen for slice_elem_val which had the wrong logic for
   when the operand was not a pointer.

As noted in the comment in the implementation of deleteUnusedDecl, a
future improvement will be to rework the frontend/linker interface to
remove the frontend's responsibility of calling allocateDeclIndexes.

I discovered some issues with the plan9 linker backend that are related
to this, and worked around them for now.
2021-07-29 19:30:37 -07:00
Andrew Kelley
a5c6e51f03 stage2: more principled approach to comptime references
* AIR no longer has a `variables` array. Instead of the `varptr`
   instruction, Sema emits a constant with a `decl_ref`.
 * AIR no longer has a `ref` instruction. There is no longer any
   instruction that takes a value and returns a pointer to it. If this
   is desired, Sema must either create an anynomous Decl and return a
   constant `decl_ref`, or in the case of a runtime value, emit an
   `alloc` instruction, `store` the value to it, and then return the
   `alloc`.
 * The `ref_val` Value Tag is eliminated. `decl_ref` should be used
   instead. Also added is `eu_payload_ptr` which points to the payload
   of an error union, given an error union pointer.

In general, Sema should avoid calling `analyzeRef` if it can be helped.
For example in the case of field_val and elem_val, there should never be
a reason to create a temporary (alloc or decl). Recent previous commits
made progress along that front.

There is a new abstraction in Sema, which looks like this:

    var anon_decl = try block.startAnonDecl();
    defer anon_decl.deinit();
    // here 'anon_decl.arena()` may be used
    const decl = try anon_decl.finish(ty, val);
    // decl is typically now used with `decl_ref`.

This pattern is used to upgrade `ref_val` usages to `decl_ref` usages.

Additional improvements:

 * Sema: fix source location resolution for calling convention
   expression.
 * Sema: properly report "unable to resolve comptime value" for loads of
   global variables. There is now a set of functions which can be
   called if the callee wants to obtain the Value even if the tag is
   `variable` (indicating comptime-known address but runtime-known value).
 * Sema: `coerce` resolves builtin types before checking equality.
 * Sema: fix `u1_type` missing from `addType`, making this type have a
   slightly more efficient representation in AIR.
 * LLVM backend: fix `genTypedValue` for tags `decl_ref` and `variable`
   to properly do an LLVMConstBitCast.
 * Remove unused parameter from `Value.toEnum`.

After this commit, some test cases are no longer passing. This is due to
the more principled approach to comptime references causing more
anonymous decls to get sent to the linker for codegen. However, in all
these cases the decls are not actually referenced by the runtime machine
code. A future commit in this branch will implement garbage collection
of decls so that unused decls do not get sent to the linker for codegen.
This will make the tests go back to passing.
2021-07-29 15:59:51 -07:00
Andrew Kelley
dc88864c97 stage2: implement @boolToInt
This is the first commit in which some behavior tests are passing for
both stage1 and stage2.
2021-07-27 17:08:37 -07:00
Andrew Kelley
66e5920dc3 llvm backend: LLVMGetNamedGlobalAlias requires a null terminated string 2021-07-27 15:47:25 -07:00
Andrew Kelley
a8e964eadd stage2: zig test now works with the LLVM backend
Frontend improvements:

 * When compiling in `zig test` mode, put a task on the work queue to
   analyze the main package root file. Normally, start code does
   `_ = import("root");` to make Zig analyze the user's code, however in
   the case of `zig test`, the root source file is the test runner.
   Without this change, no tests are picked up.
 * In the main pipeline, once semantic analysis is finished, if there
   are no compile errors, populate the `test_functions` Decl with the
   set of test functions picked up from semantic analysis.
 * Value: add `array` and `slice` Tags.

LLVM backend improvements:

 * Fix incremental updates of globals. Previously the
   value of a global would not get replaced with a new value.
 * Fix LLVM type of arrays. They were incorrectly sending
   the ABI size as the element count.
 * Remove the FuncGen parameter from genTypedValue. This function is for
   generating global constants and there is no function available when
   it is being called.
   - The `ref_val` case is now commented out. I'd like to eliminate
     `ref_val` as one of the possible Value Tags. Instead it should
     always be done via `decl_ref`.
 * Implement constant value generation for slices, arrays, and structs.
 * Constant value generation for functions supports the `decl_ref` tag.
2021-07-27 14:19:53 -07:00
Andrew Kelley
31a59c229c stage2: improvements towards zig test
* Add AIR instruction: struct_field_val
   - This is part of an effort to eliminate the AIR instruction `ref`.
   - It's implemented for C backend and LLVM backend so far.
 * Rename `resolvePossiblyUndefinedValue` to `resolveMaybeUndefVal` just
   to save some columns on long lines.
 * Sema: add `fieldVal` alongside `fieldPtr` (renamed from
   `namedFieldPtr`). This is part of an effort to eliminate the AIR
   instruction `ref`. The idea is to avoid unnecessary loads, stores,
   stack usage, and IR instructions, by paying a DRY cost.

LLVM backend improvements:

 * internal linkage vs exported linkage is implemented, along with
   aliases. There is an issue with incremental updates due to missing
   LLVM API for deleting aliases; see the relevant comment in this commit.
   - `updateDeclExports` is hooked up to the LLVM backend now.
 * Fix usage of `Type.tag() == .noreturn` rather than calling `isNoReturn()`.
 * Properly mark global variables as mutable/constant.
 * Fix llvm type generation of function pointers
 * Fix codegen for calls of function pointers
 * Implement llvm type generation of error unions and error sets.
 * Implement AIR instructions: addwrap, subwrap, mul, mulwrap, div,
   bit_and, bool_and, bit_or, bool_or, xor, struct_field_ptr,
   struct_field_val, unwrap_errunion_err, add for floats, sub for
   floats.

After this commit, `zig test` on a file with `test "example" {}`
correctly generates and executes a test binary. However the
`test_functions` slice is undefined and just happens to be going into
the .bss section, causing the length to be 0. The next step towards
`zig test` will be replacing the `test_functions` Decl Value with the
set of test function pointers, before it is sent to linker/codegen.
2021-07-26 19:27:49 -07:00
Andrew Kelley
14d8a1c10d stage2 llvm backend improvements working towards zig test
* properly set global variables to const if they are not a global
   variable.
 * implement global variable initializations.
 * initial implementation of llvmType() for structs and functions.
 * implement genTypedValue for variable tags
 * implement more AIR instructions: varptr, slice_ptr, slice_len,
   slice_elem_val, ptr_slice_elem_val, unwrap_errunion_payload,
   unwrap_errunion_payload_ptr, unwrap_errunion_err,
   unwrap_errunion_err_ptr.
2021-07-25 22:38:50 -07:00
Andrew Kelley
c3d10dbda1 stage2 llvm backend: implement llvmType for error union and slices 2021-07-25 19:06:48 -07:00
Andrew Kelley
5b4885fe89 stage2 llvm backend: DeclGen and DeclFn have context field
instead of a context() accessor method.
2021-07-25 18:49:25 -07:00
Andrew Kelley
66986dd248 stage2 llvm backend: rename getLLVMType to llvmType 2021-07-25 18:46:19 -07:00
Andrew Kelley
b87105c921 stage2 llvm backend: implement assembly and ptrtoint
These AIR instructions are the next blockers for `zig test` to work for
this backend.

After this commit, the "hello world" x86_64 test case passes for the
LLVM backend as well.
2021-07-24 23:54:20 -07:00
Luuk de Gram
30376a82b2
Re-enable switch test cases and fix regressions 2021-07-24 20:05:41 +02:00
Luuk de Gram
5d98abd570
Support multi-value prongs 2021-07-24 19:49:25 +02:00
Luuk de Gram
72149ae7e4
Allow negative values 2021-07-24 19:49:25 +02:00
Luuk de Gram
cb41f0e58d
switchbr: When prongs are sparse values, use if/else-chain 2021-07-24 19:49:25 +02:00
Luuk de Gram
ad38fc1147
wasm: Rewrite switch_br to use br_table instead
This is an initial version, todo:
- Also make this work for u64 values, as the table must be indexed by u32.
- Add support for signed integers.
- Add support for enums.
2021-07-24 19:49:24 +02:00
Andrew Kelley
7b8cb881df stage2: improvements towards zig test
* There is now a main_pkg in addition to root_pkg. They are usually the
   same. When using `zig test`, main_pkg is the user's source file and
   root_pkg has the test runner.
 * scanDecl no longer looks for test decls outside the package being
   tested. honoring `--test-filter` is still TODO.
 * test runner main function has a void return value rather than
   `anyerror!void`
 * Sema is improved to generate better AIR for for loops on slices.
 * Sema: fix incorrect capacity calculation in zirBoolBr
 * Sema: add compile errors for trying to use slice fields as an lvalue.
 * Sema: fix type coercion for error unions
 * Sema: fix analyzeVarRef generating garbage AIR
 * C codegen: fix renderValue for error unions with 0 bit payload
 * C codegen: implement function pointer calls
 * CLI: fix usage text

 Adds 4 new AIR instructions:

  * slice_len, slice_ptr: to get the ptr and len fields of a slice.
  * slice_elem_val, ptr_slice_elem_val: to get the element value of
    a slice, and a pointer to a slice.

AstGen gains a new functionality:

 * One of the unused flags of struct decls is now used to indicate
   structs that are known to have non-zero size based on the AST alone.
2021-07-23 22:42:31 -07:00
Andrew Kelley
a5fb28070f add -femit-llvm-bc CLI option and implement it
* Added doc comments for `std.Target.ObjectFormat` enum
 * `std.Target.oFileExt` is removed because it is incorrect for Plan-9
   targets. Instead, use `std.Target.ObjectFormat.fileExt` and pass a
   CPU architecture.
 * Added `Compilation.Directory.joinZ` for when a null byte is desired.
 * Improvements to `Compilation.create` logic for computing `use_llvm`
   and reporting errors in contradictory flags. `-femit-llvm-ir` and
   `-femit-llvm-bc` will now imply `-fLLVM`.
 * Fix compilation when passing `.bc` files on the command line.
 * Improvements to the stage2 LLVM backend:
   - cleaned up error messages and error reporting. Properly bubble up
     some errors rather than dumping to stderr; others turn into panics.
   - properly call ZigLLVMCreateTargetMachine and
     ZigLLVMTargetMachineEmitToFile and implement calculation of the
     respective parameters (cpu features, code model, abi name, lto,
     tsan, etc).
   - LLVM module verification only runs in debug builds of the compiler
   - use LLVMDumpModule rather than printToString because in the case
     that we incorrectly pass a null pointer to LLVM it may crash during
     dumping the module and having it partially printed is helpful in
     this case.
   - support -femit-asm, -fno-emit-bin, -femit-llvm-ir, -femit-llvm-bc
   - Support LLVM backend when used with Mach-O and WASM linkers.
2021-07-22 19:51:32 -07:00
Andrew Kelley
f47cf93b47 stage2: C backend: fix ret AIR instruction
when operand has 0 runtime bits
2021-07-20 15:56:42 -07:00
Andrew Kelley
9c652cc650 stage2: C backend: implement support for switch_br AIR 2021-07-20 15:52:58 -07:00
Andrew Kelley
ea902ffe8f Sema: reimplement runtime switch
Now supports multiple items pointing to the same body. This is a common
pattern even when using a jump table, with multiple cases pointing to
the same block of code.

In the case of a range specified, the items are moved to branches in the
else body. A future improvement may make it possible to have jump table
items as well as ranges pointing to the same block of code.
2021-07-20 12:19:17 -07:00
Luuk de Gram
caa0de545e Resolve regressions
- Get correct types in wasm backend.
- `arg` is already a `Ref`, therefore simply use `@intToEnum`.
- Fix regression in `zirBoolBr, where the order of insertion was incorrect.
2021-07-20 12:19:17 -07:00
Luuk de Gram
1150fc13dc wasm: Resolve regressions, add intcast support 2021-07-20 12:19:16 -07:00
Andrew Kelley
95756299af stage2: fix compile errors in LLVM backend 2021-07-20 12:19:16 -07:00
Andrew Kelley
44fe9c52e1 stage2: wasm backend: update to latest naming convention 2021-07-20 12:19:16 -07:00
Andrew Kelley
934ebbe900 stage2: fix AIR not instruction (see prev commit) 2021-07-20 12:19:16 -07:00
Jacob G-W
414b144257 cbe: fix not (it is a ty_op, not un_op) 2021-07-20 12:19:16 -07:00
Andrew Kelley
4a0f38bb76 stage2: update LLVM backend to new AIR memory layout
Also fix compile errors when not using -Dskip-non-native
2021-07-20 12:19:16 -07:00
Andrew Kelley
761f36ff93 stage2: rework C backend for new AIR memory layout 2021-07-20 12:19:16 -07:00
Luuk de Gram
2438f61f1c Refactor entire wasm-backend to use new AIR memory layout 2021-07-20 12:19:16 -07:00
Luuk de Gram
424f260f85 Fix wasm-related compile errors:
- Update `fail()` to not require a `srcLoc`.
This brings it in line with other backends, and we were always passing 'node_offset = 0', anyway.
- Fix unused local due to change of architecture wrt function/decl generation.
- Replace all old instructions to indexes within the function signatures.
2021-07-20 12:19:16 -07:00
Andrew Kelley
c09b973ec2 stage2: compile error fixes for AIR memory layout branch
Now the branch is compiling again, provided that one uses
`-Dskip-non-native`, but many code paths are disabled. The code paths
can now be re-enabled one at a time and updated to conform to the new
AIR memory layout.
2021-07-20 12:19:16 -07:00
Andrew Kelley
0f38f68696 stage2: Air and Liveness are passed ephemerally
to the link infrastructure, instead of being stored with Module.Fn. This
moves towards a strategy to make more efficient use of memory by not
storing Air or Liveness data in the Fn struct, but computing it on
demand, immediately sending it to the backend, and then immediately
freeing it.

Backends which want to defer codegen until flush() such as SPIR-V
must move the Air/Liveness data upon `updateFunc` being called and keep
track of that data in the backend implementation itself.
2021-07-20 12:19:16 -07:00
Andrew Kelley
913393fd3b stage2: first pass over Module.zig for AIR memory layout 2021-07-20 12:19:16 -07:00
Andrew Kelley
ef7080aed1 stage2: update Liveness, SPIR-V for new AIR memory layout
also do the inline assembly instruction
2021-07-20 12:19:16 -07:00
Andrew Kelley
5d6f7b44c1 stage2: rework AIR memory layout
This commit changes the AIR file and the documentation of the memory
layout. The actual work of modifying the surrounding code (in Sema and
codegen) is not yet done.
2021-07-20 12:18:14 -07:00
Andrew Kelley
28dd9d478d C backend: TypedefMap is now ArrayHashMap
The C backend depends on insertion order into this map so that type
definitions will be declared before they are used.
2021-07-12 12:40:32 -07:00
jacob gw
34c21affa2 initial plan9 boilerplate
The code now compiles and fails with Plan9ObjectFormatUnimplemented
2021-07-08 14:10:49 -07:00
Andrew Kelley
9dbe684854 C backend: cleanups to wrapping int operations
* less branching by passing parameters in the main op code switch.
 * properly pass the target when asking the type system for int info.
 * handle u8, i16, etc when it is represented using
   int_unsigned/int_signed tag.
 * compile error instead of assertion failure for unimplemented cases
   (greater than 64 bits integer).
 * control flow cleanups
 * zig.h: expand macros into inline functions
 * reduce the complexity of the test case by making it one test case
   that calls multiple functions. Also fix the problem of c_int max
   value mismatch between host and target.
2021-07-08 11:21:06 -07:00
Matt Knight
fb16633ecb C backend: add/sub/mul wrapping for the C backend 2021-07-08 09:56:40 -07:00
Andrew Kelley
c2e66d9bab stage2: basic inferred error set support
* Inferred error sets are stored in the return Type of the function,
   owned by the Module.Fn. So it cleans up that memory in deinit().
 * Sema: update the inferred error set in zirRetErrValue
   - Update relevant code in wrapErrorUnion
 * C backend: improve some some instructions to take advantage of
   liveness analysis to avoid being emitted when unused.
 * C backend: when an error union has a payload type with no runtime
   bits, emit the error union as the same type as the error set.
2021-07-07 20:47:21 -07:00