130 Commits

Author SHA1 Message Date
jacob gw
0005b34637 stage2: implement sema for @errorToInt and @intToError 2021-03-28 18:22:01 -07:00
Isaac Freund
f80f8a7a78
AstGen: pass *GenZir as the first arg, not *Module
This avoids the unnecessary scope.getGenZir() virtual call for both
convenience and performance.
2021-03-28 22:42:17 +02:00
Isaac Freund
d123a5ec67
AstGen: scope result location related functions 2021-03-28 19:53:38 +02:00
Isaac Freund
402f87a213
stage2: rename WipZirCode => AstGen, astgen.zig => AstGen.zig 2021-03-28 19:10:10 +02:00
Andrew Kelley
1f5617ac07 stage2: implement bitwise expr and error literals 2021-03-26 23:46:37 -07:00
Andrew Kelley
da731e18c9 stage2: implement source location: .node_offset_var_decl_ty 2021-03-26 18:35:15 -07:00
Andrew Kelley
b2deaf8027 stage2: improve source locations of Decl access
* zir.Code: introduce a decls array. This is so that `decl_val` and
   `decl_ref` instructions can refer to a Decl with a u32 and therefore
   they can also store a source location. This is needed for proper
   compile error reporting.
 * astgen uses a hash map to avoid redundantly adding a Decl to the
   decls array.
 * fixed reporting "instruction illegal outside function body" instead
   of the desired message "unable to resolve comptime value".
 * astgen skips emitting dbg_stmt instructions in comptime scopes.
 * astgen has some logic to avoid adding unnecessary type coercion
   instructions for common values.
2021-03-25 23:45:17 -07:00
Andrew Kelley
31023de6c4 stage2: implement inline while
Introduce "inline" variants of ZIR tags:
 * block => block_inline
 * repeat => repeat_inline
 * break => break_inline
 * condbr => condbr_inline

The inline variants perform control flow at compile-time, and they
utilize the return value of `Sema.analyzeBody`.

`analyzeBody` now returns an Index, not a Ref, which is the ZIR index of
a break instruction. This effectively communicates both the intended
break target block as well as the operand, allowing parent blocks to
find out whether they, in turn, should return the break instruction up the
call stack, or accept the operand as the block's result and continue
analyzing instructions in the block.

Additionally:
 * removed the deprecated ZIR tag `block_comptime`.
 * removed `break_void_node` so that all break instructions use the same Data.
 * zir.Code: remove the `root_start` and `root_len` fields. There is now
   implied to be a block at index 0 for the root body. This is so that
   `break_inline` has something to point at and we no longer need the
   special instruction `break_flat`.
 * implement source location byteOffset() for .node_offset_if_cond
   .node_offset_for_cond is probably redundant and can be deleted.

We don't have `comptime var` supported yet, so this commit adds a test
that at least makes sure the condition is required to be comptime known
for `inline while`.
2021-03-25 00:55:36 -07:00
Andrew Kelley
01bfd835bb stage2: clean up break / noreturn astgen
* Module.addBreak and addBreakVoid return zir.Inst.Index not Ref
   because Index is the simpler type and we never need a Ref for these.
 * astgen: make noreturn stuff return the unreachable_value and avoid
   unnecessary calls to rvalue()
 * breakExpr: avoid unnecessary access into the tokens array
 * breakExpr: fix incorrect `@intCast` (previously this unsafely
   casted an Index to a Ref)
2021-03-24 20:45:14 -07:00
Timon Kruiper
522707622e astgen: implement breaking from a block 2021-03-24 19:54:03 -07:00
Andrew Kelley
0c6581e01d stage2: fix memory leak when updating a function 2021-03-24 15:46:06 -07:00
Andrew Kelley
180dae4196 stage2: further cleanups regarding zir.Inst.Ref
* Introduce helper functions on Module.WipZirCode and zir.Code
 * Move some logic around
 * re-introduce ref_start_index
 * prefer usize for local variables + `@intCast` at the end.
   Empirically this is easier to optimize.
 * Avoid using mem.{bytesAsSlice,sliceAsBytes} because it incurs an
   unnecessary multiplication/division which may cause problems for the
   optimizer.
 * Use a regular enum, not packed, for `Ref`. Memory layout is
   guaranteed for enums which specify their tag type. Packed enums have
   ABI alignment of 1 byte which is too small.
2021-03-24 15:36:23 -07:00
Isaac Freund
0c601965ab
stage2: make zir.Inst.Ref a non-exhaustive enum
This provides us greatly increased type safety and prevents the common
mistake of using a zir.Inst.Ref where a zir.Inst.Index was expected or
vice-versa. It also increases the ergonomics of using the typed values
which can be directly referenced with a Ref over the previous zir.Const
approach.

The main pain point is casting between a []Ref and []u32, which could be
alleviated in the future with a new std.mem function.
2021-03-24 19:11:44 +01:00
Andrew Kelley
a1afe69395 stage2: comment out failing test cases; implement more things
* comment out the failing stage2 test cases
   (so that we can uncomment the ones that are newly passing with
   further commits)
 * Sema: implement negate, negatewrap
 * astgen: implement field access, multiline string literals, and
   character literals
 * Module: when resolving an AST node into a byte offset, use the
   main_tokens array, not the firstToken function
2021-03-23 23:13:01 -07:00
Andrew Kelley
13ced07f23 stage2: fix while loops
also start to form a plan for how inline while loops will work
2021-03-23 21:37:10 -07:00
Andrew Kelley
bf7c3e9355 astgen: fixups regarding var decls and rl_ptr 2021-03-23 16:47:41 -07:00
Andrew Kelley
be673e6793 stage2: implement inttype ZIR
also add i128 and u128 to const inst list
2021-03-23 16:12:26 -07:00
Andrew Kelley
af73f79490 stage2: fix comptimeExpr and comptime function calls 2021-03-23 13:25:58 -07:00
Andrew Kelley
866be099f8 stage2: add helper functions to clean up astgen Ref/Index 2021-03-23 12:54:18 -07:00
Isaac Freund
668148549a
stage2: fix two return types to be Ref not Index
We currently have no type safety between zir.Inst.Ref, zir.Inst.Index,
and plain u32s.
2021-03-23 11:58:43 +01:00
Andrew Kelley
d24be85be8 stage2: fix if expressions 2021-03-22 23:47:13 -07:00
Andrew Kelley
568f333681 astgen: improve the ensure_unused_result elision 2021-03-22 18:57:46 -07:00
Andrew Kelley
2f391df2a7 stage2: Sema improvements and boolean logic astgen
* add `Module.setBlockBody` and related functions
 * redo astgen for `and` and `or` to use fewer ZIR instructions and
   require less processing for comptime known values
 * Sema: rework `analyzeBody` function. See the new doc comments in this
   commit. Divides ZIR instructions up into 3 categories:
   - always noreturn
   - never noreturn
   - sometimes noreturn
2021-03-22 17:29:56 -07:00
Isaac Freund
9f0b9b8da1
stage2: remove all async related code
The current plan is to avoid using async and related features in the
stage2 compiler so that we can bootstrap before implementing them.

Having this untested and incomplete code in the codebase increases
friction while working on stage2, in particular when preforming
larger refactors such as the current zir memory layout rework.

Therefore remove all async related code, leaving only error messages
in astgen.
2021-03-23 00:23:41 +01:00
Dimenus
240b15381d fix calculation in ensureCapacity 2021-03-22 11:58:44 -07:00
Isaac Freund
8111453cc1
astgen: implement array types 2021-03-22 14:54:13 +01:00
Andrew Kelley
5769c963e0 Sema: implement arithmetic 2021-03-21 19:23:12 -07:00
Isaac Freund
72bcdb639f
astgen: implement bool_and/bool_or 2021-03-22 00:51:25 +01:00
Isaac Freund
310a44d5be zir: add negate/negate_wrap, implement astgen
These were previously implemented as a sub/sub_wrap instruction with a
lhs of 0. Making this separate instructions however allows us to save
some memory as there is no need to store a lhs.
2021-03-21 20:32:39 +01:00
Andrew Kelley
7598a00f34 stage2: fix memory management of ZIR code
* free Module.Fn ZIR code when destroying the owner Decl
 * unreachable_safe and unreachable_unsafe are collapsed into one ZIR
   instruction with a safety flag.
 * astgen: emit an unreachable instruction for unreachable literals
 * don't forget to call deinit on ZIR code
 * astgen: implement some builtin functions
2021-03-20 22:40:08 -07:00
Andrew Kelley
8bad5dfa72 astgen: implement inline assembly 2021-03-20 21:48:35 -07:00
Andrew Kelley
50010447bd astgen: implement function calls 2021-03-20 17:09:06 -07:00
Andrew Kelley
56677f2f2d astgen: support blocks
We are now passing this test:

```zig
export fn _start() noreturn {}
```

```
test.zig:1:30: error: expected noreturn, found void
```

I ran into an issue where we get an integer overflow trying to compute
node index offsets from the containing Decl. The problem is that the
parser adds the Decl node after adding the child nodes. For some things,
it is easy to reserve the node index and then set it later, however, for
this case, it is not a trivial code change, because depending on tokens
after parsing the decl determines whether we want to add a new node or
not.

Possible strategies here:

1. Rework the parser code to make sure that Decl nodes are before
   children nodes in the AST node array.

2. Use signed integers for Decl node offsets.

3. Just flip the order of subtraction and addition. Expect Decl Node
   index to be greater than children Node indexes.

I opted for (3) because it seems like the simplest thing to do. We'll
want to unify the logic for computing the offsets though because if the
logic gets repeated, it will probably get repeated wrong.
2021-03-19 23:15:18 -07:00
Andrew Kelley
937c43ddf1 stage2: first pass at repairing ZIR printing 2021-03-19 19:33:11 -07:00
Andrew Kelley
0357cd8653 Sema: allocate inst_map with arena where appropriate 2021-03-19 15:31:50 -07:00
Andrew Kelley
81a935aef8 stage2: fix some math oopsies and typos 2021-03-19 15:19:47 -07:00
Andrew Kelley
132df14ee1 stage2: fix export source locations not being relative to Decl 2021-03-19 14:59:46 -07:00
jacob gw
c50397c268 llvm backend: use new srcloc
this allows to compile with ninja
2021-03-19 14:46:37 -07:00
jacob gw
e9810d9e79 zir-memory-layout: astgen: fill in identifier 2021-03-19 14:43:08 -07:00
Andrew Kelley
bd2154da3d stage2: the code is compiling again
(with a lot of things commented out)
2021-03-18 22:48:28 -07:00
Andrew Kelley
b2682237db stage2: get Module and Sema compiling again
There are some `@panic("TODO")` in there but I'm trying to get the
branch to the point where collaborators can jump in.

Next is to repair the seam between LazySrcLoc and codegen's expected
absolute file offsets.
2021-03-18 22:19:28 -07:00
Andrew Kelley
66245ac834 stage2: Module and Sema are compiling again
Next up is reworking the seam between the LazySrcLoc emitted by Sema
and the byte offsets currently expected by codegen.

And then the big one: updating astgen.zig to use the new memory layout.
2021-03-17 22:54:56 -07:00
Andrew Kelley
38b3d4b00a stage2: work through some compile errors in Module and Sema 2021-03-17 00:56:08 -07:00
Andrew Kelley
099af0e008 stage2: rename zir_sema.zig to Sema.zig 2021-03-16 00:04:17 -07:00
Andrew Kelley
aef3e534f5 stage2: *WIP*: rework ZIR memory layout; overhaul source locations
The memory layout for ZIR instructions is completely reworked. See
zir.zig for those changes. Some new types:

 * `zir.Code`: a "finished" set of ZIR instructions. Instead of allocating
   each instruction independently, there is now a Tag and 8 bytes of
   data available for all ZIR instructions. Small instructions fit
   within these 8 bytes; larger ones use 4 bytes for an index into
   `extra`. There is also `string_bytes` so that we can have 4 byte
   references to strings. `zir.Inst.Tag` describes how to interpret
   those 8 bytes of data.
   - This is shared by all `Block` scopes.

 * `Module.WipZirCode`: represents an in-progress `zir.Code`. In this
   structure, the arrays are mutable, and get resized as we add/delete
   things. There is extra state to keep track of things. This struct is
   stored on the stack. Once it is finished, it produces an immutable
   `zir.Code`, which will remain on the heap for the duration of a
   function's existence.
   - This is shared by all `GenZir` scopes.

 * `Sema`: represents in-progress semantic analysis of a `zir.Code`.
   This data is stored on the stack and is shared among all `Block`
   scopes. It is now the main "self" argument to everything in the file
   that was previously named `zir_sema.zig`.
   Additionally, I moved some logic that was in `Module` into here.

`Module.Fn` now stores its parameter names inside the `zir.Code`,
instead of inside ZIR instructions. When the TZIR memory layout
reworking time comes, codegen will be able to reference this data
directly instead of duplicating it.

astgen.zig is (so far) almost entirely untouched, but nearly all of it
will need to be reworked to adhere to this new memory layout structure.

I have no benchmarks to report yet, as I am still working through
compile errors and fixing various things that I broke in this branch.

Overhaul of Source Locations:

Previously we used `usize` everywhere to mean byte offset, but sometimes
also mean other stuff. This was error prone and also made us do
unnecessary work, and store unnecessary bytes in memory.

Now there are more types involved into source locations, and more ways
to describe a source location.

 * AllErrors.Message: embrace the assumption that files always have less
   than 2 << 32 bytes.
 * SrcLoc gets more complicated, to model more complicated source
   locations.
 * Introduce LazySrcLoc, which can model interesting source locations
   with very little stored state. Useful for avoiding doing unnecessary
   work when no compile errors occur.

Also, previously, we had `src: usize` on every ZIR instruction. This is
no longer the case. Each instruction now determines whether it even cares
about source location, and if so, how that source location is stored.
This requires more careful work inside `Sema`, but it results in fewer
bytes stored on the heap, without compromising accuracy and power of
compile error messages.

Miscellaneous:

 * std.zig: string literals have more helpful result values for
   reporting errors. There is now a lower level API and a higher level
   API.
   - side note: I noticed that the string literal logic needs some love.
     There is some unnecessarily hacky code there.
 * cut & pasted some TZIR logic that was in zir.zig to ir.zig. This
   probably broke stuff and needs to get fixed.
 * Removed type/Enum.zig, type/Union.zig, and type/Struct.zig. I don't
   think this quite how this code will be organized. Need some more
   careful planning about how to implement structs, unions, enums. They
   need to be independent Decls, just like a top level function.
2021-03-16 00:03:22 -07:00
Veikka Tuominen
8c6e7fb2c7
stage2: implement var args 2021-03-06 15:55:29 +02:00
Veikka Tuominen
17e6e09285
stage2: astgen async 2021-03-06 15:01:25 +02:00
jacob gw
2ebeb0dbf3 stage2: remove error number from error set map
This saves memory since it is already stored in module
as well as allowing for better threading.
Part 2 of what is outlined in #8079.
2021-03-03 11:49:54 -08:00
jacob gw
58b14d01ae stage2: remove value field from error
This saves memory and from what I have heard allows threading
to be easier.
2021-02-28 22:01:13 +02:00
g-w1
153c97ac9e improve stage2 to allow catch at comptime:
* add error_union value tag.
* add analyzeIsErr
* add Value.isError
* add TZIR wrap_errunion_payload and wrap_errunion_err for
  wrapping from T -> E!T and E -> E!T
* add anlyzeInstUnwrapErrCode and analyzeInstUnwrapErr
* add analyzeInstEnsureErrPayloadVoid:
* add wrapErrorUnion
* add comptime error comparison for tests
* tests!
2021-02-25 16:41:16 -08:00