76 Commits

Author SHA1 Message Date
Andrew Kelley
f75cdd1acd
Merge pull request #8470 from ziglang/stage2-start
stage2: blaze the trail for std lib integration
2021-04-09 10:15:46 -07:00
jacob gw
afe5862111 stage2: add error for private decls accessed from other files 2021-04-09 10:05:15 -07:00
Andrew Kelley
23db96931e AstGen: implement @typeInfo builtin 2021-04-08 22:38:13 -07:00
Andrew Kelley
6e05d53e88 AstGen: implement struct init with ResultLoc.ty 2021-04-08 22:33:40 -07:00
Andrew Kelley
91e1414b50 stage2: implement array access to a global array 2021-04-08 21:56:42 -07:00
Andrew Kelley
482b995a49 stage2: blaze the trail for std lib integration
This branch adds "builtin" and "std" to the import table when using the
self-hosted backend.

"builtin" gains one additional item:

```
pub const zig_is_stage2 = true; // false when using stage1 backend
```

This allows the std lib to do conditional compilation based on detecting
which backend is being used. This will be removed from builtin as soon
as self-hosted catches up to feature parity with stage1.
Keep a sharp eye out - people are going to be tempted to abuse this.
The general rule of thumb is do not use `builtin.zig_is_stage2`. However
this commit breaks the rule so that we can gain limited start.zig support
as we incrementally improve the self-hosted compiler.

This commit also implements `fullyQualifiedNameHash` and related
functionality, which effectively puts all Decls in their proper
namespaces. `fullyQualifiedName` is not yet implemented.

Stop printing "todo" log messages for test decls unless we are in test
mode.

Add "previous definition here" error notes for Decl name collisions.

This commit does not bring us yet to a newly passing test case.

Here's what I'm working towards:

```zig
const std = @import("std");

export fn main() c_int {
    const a = std.fs.base64_alphabet[0];
    return a - 'A';
}
```

Current output:

```
$ ./zig-cache/bin/zig build-exe test.zig
test.zig:3:1: error: TODO implement more analyze elemptr
zig-cache/lib/zig/std/start.zig:38:46: error: TODO implement structInitExpr ty
```

So the next steps are clear:
 * Sema: improve elemptr
 * AstGen: implement structInitExpr
2021-04-08 19:05:05 -07:00
Andrew Kelley
9f744f19e7
Merge pull request #8464 from gracefuu/grace/wasm-ops
stage2 wasm: Add division and bitwise/boolean ops &, |, ^, and, or
2021-04-08 13:41:49 -07:00
Andrew Kelley
b9e508c410 stage2: revert to only has_decl and export ZIR support
Reverting most of the code from the previous commits in this branch.
Will pull in the code with modifications bit by bit.
2021-04-08 11:29:31 -07:00
Timon Kruiper
ac14b52e85 stage2: add support for start.zig
This adds a simplified start2.zig that the current stage2 compiler is
able to generate code for.
2021-04-08 14:23:18 +02:00
Timon Kruiper
e85cd616ef stage2: implement builtin function hasDecl 2021-04-08 14:23:18 +02:00
Timon Kruiper
91e416bbf0 stage2: add a simplified export builtin call
std.builtin.ExportOptions is not yet supported, thus the second argument
of export is now a simple string that specifies the exported symbol
name.
2021-04-08 14:22:56 +02:00
Andrew Kelley
57aa289fde Sema: fix switch validation '_' prong on wrong type 2021-04-07 22:19:17 -07:00
Andrew Kelley
e730172e47 stage2: fix switch validation of handling all enum values
There were several problems, all fixed:
 * AstGen was storing field names as references to the original
   source code bytes. However, that data would be destroyed when the
   source file is updated. Now, it correctly stores the field names in
   the Decl arena for the enum. The same fix applies to error set field
   names.
 * Sema was missing a memset inside `analyzeSwitch`, leaving the "seen
   enum fields" array with undefined memory. Now that they are all
   properly set to null, the validation works.
 * Moved the "enum declared here" note to the end. It looked weird
   interrupting the notes for which enum values were missing.
2021-04-07 22:02:45 -07:00
Andrew Kelley
b67378fb08 Sema: @intToEnum error msg includes a "declared here" hint 2021-04-07 21:04:55 -07:00
Andrew Kelley
8f28e26e7a Sema: implement switch validation for enums 2021-04-07 16:39:10 -07:00
gracefu
6dc35efe49
Sema: fix typo bug for boolean ops (and, or) 2021-04-08 05:27:00 +08:00
gracefu
4c71942f84
stage2: Add .div to ir.zig 2021-04-08 05:26:56 +08:00
Andrew Kelley
18119aae30 Sema: implement comparison analysis for non-numeric types 2021-04-07 12:15:05 -07:00
Andrew Kelley
d9c25ec672 zir: use node union field for alloc_inferred
Previously we used `un_node` and passed `undefined` for the operand, but
this causes illegal behavior when printing ZIR code.
2021-04-07 11:34:23 -07:00
Andrew Kelley
4e8fb9e6a5 Sema: DRY up enum field analysis and add "declared here" notes 2021-04-07 11:26:07 -07:00
jacob gw
01a39fa1d4 stage2: coerce enum_literal -> enum 2021-04-07 07:19:11 -04:00
Andrew Kelley
19cf987198 C backend: implement Enum types and values
They are lowered directly as the integer tag type, with no typedef.
2021-04-06 23:19:46 -07:00
Andrew Kelley
acf9151008 stage2: implement field access for Enum.tag syntax 2021-04-06 22:36:28 -07:00
Andrew Kelley
b40d36c90b stage2: implement simple enums
A simple enum is an enum which has an automatic integer tag type,
all tag values automatically assigned, and no top level declarations.
Such enums are created directly in AstGen and shared by all the
generic/comptime instantiations of the surrounding ZIR code. This
commit implements, but does not yet add any test cases for, simple enums.

A full enum is an enum for which any of the above conditions are not
true. Full enums are created in Sema, and therefore will create a unique
type per generic/comptime instantiation. This commit does not implement
full enums. However the `enum_decl_nonexhaustive` ZIR instruction is
added and the respective Type functions are filled out.

This commit makes an improvement to ZIR code, removing the decls array
and removing the decl_map from AstGen. Instead, decl_ref and
decl_val ZIR instructions index into the `owner_decl.dependencies`
ArrayHashMap. We already need this dependencies array for incremental
compilation purposes, and so repurposing it to also use it for ZIR decl
indexes makes for efficient memory usage.

Similarly, this commit fixes up incorrect memory management by removing
the `const` ZIR instruction. The two places it was used stored memory in
the AstGen arena, which may get freed after Sema. Now it properly sets
up a new anonymous Decl for error sets and uses a normal decl_val
instruction.

The other usage of `const` ZIR instruction was float literals. These are
now changed to use `float` ZIR instruction when the value fits inside
`zir.Inst.Data` and `float128` otherwise.

AstGen + Sema: implement int_to_enum and enum_to_int. No tests yet; I expect to
have to make some fixes before they will pass tests. Will do that in the
branch before merging.

AstGen: fix struct astgen incorrectly counting decls as fields.

Type/Value: give up on trying to exhaustively list every tag all the
time. This makes the file more manageable. Also found a bug with
i128/u128 this way, since the name of the function was more obvious when
looking at the tag values.

Type: implement abiAlignment and abiSize for structs. This will need to
get more sophisticated at some point, but for now it is progress.

Value: add new `enum_field_index` tag.
Value: add hash_u32, needed when using ArrayHashMap.
2021-04-06 18:17:37 -07:00
Andrew Kelley
d47f0abd5b stage2: Sema: implement validate_struct_init_ptr 2021-04-02 21:06:09 -07:00
Andrew Kelley
97d7fddfb7 stage2: progress towards basic structs
Introduce `ResultLoc.none_or_ref` which is used by field access
expressions to avoid unnecessary loads when the field access itself
will do the load. This turns:

```zig
p.y - p.x - p.x
```

from

```zir
  %14 = load(%4) node_offset:8:12
  %15 = field_val(%14, "y") node_offset:8:13
  %16 = load(%4) node_offset:8:18
  %17 = field_val(%16, "x") node_offset:8:19
  %18 = sub(%15, %17) node_offset:8:16
  %19 = load(%4) node_offset:8:24
  %20 = field_val(%19, "x") node_offset:8:25
```

to

```zir
  %14 = field_val(%4, "y") node_offset:8:13
  %15 = field_val(%4, "x") node_offset:8:19
  %16 = sub(%14, %15) node_offset:8:16
  %17 = field_val(%4, "x") node_offset:8:25
```

Much more compact. This requires `Sema.zirFieldVal` to support both
pointers and non-pointers.

C backend: Implement typedefs for struct types, as well as the following
TZIR instructions:
 * mul
 * mulwrap
 * addwrap
 * subwrap
 * ref
 * struct_field_ptr

Note that add, addwrap, sub, subwrap, mul, mulwrap instructions are all
incorrect currently and need to be updated to properly handle wrapping
and non wrapping for signed and unsigned.

C backend: change indentation delta to 1, to make the output smaller and
to process fewer bytes.

I promise I will add a test case as soon as I fix those warnings that
are being printed for my test case.
2021-04-02 19:11:51 -07:00
Andrew Kelley
8ebfdc14f6 stage2: implement structs in the frontend
New ZIR instructions:
 * struct_decl_packed
 * struct_decl_extern

New TZIR instruction: struct_field_ptr

Introduce `Module.Struct`. It uses `Value` to store default values and
abi alignments.

Implemented Sema.analyzeStructFieldPtr and zirStructDecl.

Some stuff I changed from `@panic("TODO")` to `log.warn("TODO")`.
It's becoming more clear that we need the lazy value mechanism soon;
Type is becoming unruly, and some of these functions have too much logic
given that they don't have any context for memory management or error
reporting.
2021-04-01 22:39:09 -07:00
Andrew Kelley
c66b48194f stage2: AstGen and ZIR printing for struct decls 2021-04-01 19:27:17 -07:00
Andrew Kelley
50bcfb8c90 stage2: implement struct init syntax with ptr result loc 2021-04-01 11:58:55 -07:00
Andrew Kelley
c9e31febf8 stage2: finish implementation of LazySrcLoc 2021-03-31 23:00:00 -07:00
Andrew Kelley
b27d052676 stage2: finish source location reworkings in the branch
* remove the LazySrcLoc.todo tag
 * finish updating Sema and AstGen, remove the last of the
   `@panic("TODO")`.
2021-03-31 21:36:32 -07:00
Andrew Kelley
e8143f6cbe stage2: compile error for duplicate switch value on sparse 2021-03-31 18:39:34 -07:00
Andrew Kelley
cec766f73c stage2: compile error for duplicate switch value on boolean 2021-03-31 18:30:23 -07:00
jacob gw
fedc9ebd26 stage2: cbe: restore all previously passing tests! 2021-03-31 18:09:45 -07:00
Andrew Kelley
3cebaaad1c astgen: improved handling of coercion
GenZir struct now has rl_ty_inst field which tracks the result location
type (if any) a block expects all of its results to be coerced to.

Remove a redundant coercion on const local initialization with a
specified type.

Switch expressions, during elision of store_to_block_ptr instructions,
now re-purpose them to be type coercion when the block has a type in the
result location.
2021-03-31 18:05:37 -07:00
Andrew Kelley
08eedc962d Sema: fix else case code generation for switch 2021-03-31 16:17:47 -07:00
Andrew Kelley
abd06d8eab stage2: clean up RangeSet and fix swapped Sema switch logic for lhs/rhs 2021-03-31 15:39:04 -07:00
Andrew Kelley
e272c29c16 Sema: implement switch validation for ranges 2021-03-31 15:06:03 -07:00
Andrew Kelley
549af582e7 AstGen: switch expressions properly handle result locations 2021-03-30 23:57:22 -07:00
Andrew Kelley
2a1dd174cd stage2: rework AstGen for switch expressions
The switch_br ZIR instructions are now switch_block instructions. This
avoids a pointless block always surrounding a switchbr in emitted ZIR
code.

Introduce typeof_elem ZIR instruction for getting the type of the
element of a pointer value in 1 instruction.

Change typeof to be un_node, not un_tok.

Introduce switch_capture ZIR instructions for obtaining the capture
value of switch prongs.

Introduce Sema.resolveBody for when you want to extract a *Inst out of a
block and you know that there is only going to be 1 break from it.

What's not working yet: AstGen does not correctly elide
store instructions when it turns out that the result location does not
need to be used as a pointer.

Also Sema validation code for duplicate switch items is not yet
implemented.
2021-03-30 21:28:36 -07:00
Andrew Kelley
195ddab2be Sema: implement switch expressions
The logic for putting ranges into the else prong is moved from AstGen to
Sema. However, logic to emit multi-items the same as single-items cannot
be done until TZIR supports mapping multiple items to the same block of
code. This will be simple to represent when we do the upcoming TZIR memory
layout changes.

Not yet implemented in this commit is the validation of duplicate
values. The trick is going to be emitting error messages with accurate
source locations, without adding extra source nodes to the ZIR
switch instruction.

This will be done by computing the respective AST node based on the
switch node (which we do have available), only when a compile error
occurs and we need to know the source location to attach the message to.
2021-03-29 21:59:08 -07:00
Andrew Kelley
623d5f442c stage2: guidance on how to implement switch expressions
Here's what I think the ZIR should be. AstGen is not yet implemented to
match this, and the main implementation of analyzeSwitch in Sema is not
yet implemented to match it either.

Here are some example byte size reductions from master branch, with the
ZIR memory layout from this commit:

```
switch (foo) {
  a => 1,
  b => 2,
  c => 3,
  d => 4,
}
```

184 bytes (master) => 40 bytes (this branch)

```
switch (foo) {
  a, b => 1,
  c..d, e, f => 2,
  g => 3,
  else => 4,
}
```

240 bytes (master) => 80 bytes (this branch)
2021-03-28 23:12:26 -07:00
Andrew Kelley
8f469c1127 stage2: fix error sets 2021-03-28 19:40:21 -07:00
jacob gw
0005b34637 stage2: implement sema for @errorToInt and @intToError 2021-03-28 18:22:01 -07:00
Andrew Kelley
1f5617ac07 stage2: implement bitwise expr and error literals 2021-03-26 23:46:37 -07:00
Andrew Kelley
b2deaf8027 stage2: improve source locations of Decl access
* zir.Code: introduce a decls array. This is so that `decl_val` and
   `decl_ref` instructions can refer to a Decl with a u32 and therefore
   they can also store a source location. This is needed for proper
   compile error reporting.
 * astgen uses a hash map to avoid redundantly adding a Decl to the
   decls array.
 * fixed reporting "instruction illegal outside function body" instead
   of the desired message "unable to resolve comptime value".
 * astgen skips emitting dbg_stmt instructions in comptime scopes.
 * astgen has some logic to avoid adding unnecessary type coercion
   instructions for common values.
2021-03-25 23:45:17 -07:00
Andrew Kelley
4bfcd105ef stage2: fix @compileLog. 2021-03-25 20:11:23 -07:00
Andrew Kelley
b9c5a1fdf5 astgen: fix for loop expressions
also rename the ZIR instruction `deref_node` to `load`.
2021-03-25 19:25:26 -07:00
Andrew Kelley
31023de6c4 stage2: implement inline while
Introduce "inline" variants of ZIR tags:
 * block => block_inline
 * repeat => repeat_inline
 * break => break_inline
 * condbr => condbr_inline

The inline variants perform control flow at compile-time, and they
utilize the return value of `Sema.analyzeBody`.

`analyzeBody` now returns an Index, not a Ref, which is the ZIR index of
a break instruction. This effectively communicates both the intended
break target block as well as the operand, allowing parent blocks to
find out whether they, in turn, should return the break instruction up the
call stack, or accept the operand as the block's result and continue
analyzing instructions in the block.

Additionally:
 * removed the deprecated ZIR tag `block_comptime`.
 * removed `break_void_node` so that all break instructions use the same Data.
 * zir.Code: remove the `root_start` and `root_len` fields. There is now
   implied to be a block at index 0 for the root body. This is so that
   `break_inline` has something to point at and we no longer need the
   special instruction `break_flat`.
 * implement source location byteOffset() for .node_offset_if_cond
   .node_offset_for_cond is probably redundant and can be deleted.

We don't have `comptime var` supported yet, so this commit adds a test
that at least makes sure the condition is required to be comptime known
for `inline while`.
2021-03-25 00:55:36 -07:00
Andrew Kelley
d73a4940e0 stage2: cleanups from previous commits
Change some ZIR instructions from un_tok to un_node. Idea here is to
avoid needlessly accessing the tokens array.

Go ahead and finish the catch code in orelseCatchExpr. Otherwise it
would be an easy mistake to get the scopes wrong when updating that
code and introduce a bug.

Delete a function that is now dead code.
2021-03-24 16:14:11 -07:00