651 Commits

Author SHA1 Message Date
mlugg
ae845a33c0
Zir: represent declarations via an instruction
This commit changes how declarations (`const`, `fn`, `usingnamespace`,
etc) are represented in ZIR. Previously, these were represented in the
container type's extra data (e.g. as trailing data on a `struct_decl`).
However, this introduced the complexity of the ZIR mapping logic having
to also correlate some ZIR extra data indices. That isn't really a
problem today, but it's tricky for the introduction of `TrackedInst` in
the commit following this one. Instead, these type declarations now
simply contain a trailing list of ZIR indices to `declaration`
instructions, which directly encode all data related to the declaration
(including containing the declaration's body). Additionally, the ZIR for
`align` etc have been split out into their own bodies. This is not
strictly necessary, but it's much simpler to understand for an
insignificant cost in bytes, and will simplify the resolution of #131
(where we may need to evaluate the pointer type, including align etc,
without immediately evaluating the value body).
2024-01-23 19:16:47 +00:00
David Rubin
1b8f7e46fa
AstGen: detect duplicate field names
This logic was previously in Sema, which was unnecessary complexity, and meant the issue was not detected unless the declaration was semantically analyzed. This commit finishes the work which 941090d started.

Resolves: #17916
2024-01-20 17:23:47 +00:00
dweiller
7ef3d3876a astgen: fix error return trace on error union switch 2024-01-18 04:47:31 +11:00
travisstaloch
f3353708d8
AstGen: use correct token_src for switch, if and while exprs
fixes #18579
2024-01-16 18:22:44 +02:00
Techatrix
06410f58bd AstGen: properly handle ill-formed switch on error 2024-01-16 05:55:26 +01:00
Techatrix
8b9425c248 AstGen: add error message for capture error by ref in switch on error 2024-01-16 05:55:26 +01:00
Bogdan Romanyuk
4a1a5ee47b
AstGen: add error for redundant comptime var in comptime scope (#18242) 2024-01-09 20:09:39 -05:00
dweiller
67d7d7b5a7 fixup! astgen: use switch_block_err_union 2024-01-09 15:31:20 +11:00
dweiller
fc6dc797ce astgen/sema: fix source locations for switch_block_err_union 2024-01-09 14:42:12 +11:00
dweiller
6a18cee3af astgen/sema: use switch_block_err_union for if-else-switch 2024-01-09 14:42:12 +11:00
dweiller
b7eb59fc14 fix x86_64 crashes for switch_block_err_union
This change only emits the unwrap_errunion_err instruction if the error
capture is actually used in a branch.
2024-01-09 14:42:12 +11:00
dweiller
2cf648fba7 astgen: use switch_block_err_union 2024-01-09 14:42:12 +11:00
dweiller
4136097566 zir: add switch_block_err_union 2024-01-09 14:42:11 +11:00
dweiller
063d55c504 zir: remove unused zir as instruction 2024-01-09 14:42:11 +11:00
Ali Chraghi
0e856da224 add type safety to ZIR for null terminated strings 2024-01-08 16:33:33 -08:00
Jacob Young
047d6d996e cbe: fix non-msvc externs and exports
Closes #17817
2024-01-03 02:52:25 -05:00
Veikka Tuominen
69195d0cd4 AstGen: add error for using inline loops in comptime only scopes 2023-12-08 16:54:32 -08:00
Bogdan Romanyuk
2ff707be78
AstGen: check allowed non-function builtins with declarative field (#18120) 2023-11-26 02:21:58 -05:00
Bogdan Romanyuk
2252dcc508
Compiler: move checking function-scope-only builtins to AstGen 2023-11-25 17:29:07 +00:00
Meghan Denny
121d995fcb frontend: move AstRlAnnotate to std.zig namespace 2023-11-24 17:09:08 -08:00
Meghan Denny
84d58aaa1f frontend: move BuiltinFn to std.zig namespace 2023-11-24 17:04:52 -08:00
Meghan Denny
2b2c13926d AstGen: remove calls to tracy 2023-11-24 17:04:03 -08:00
mlugg
3c585730f2
AstGen: preserve result type in comptime block 2023-11-19 11:11:50 +00:00
mlugg
b355893438
compiler: correct unnecessary uses of 'var' 2023-11-19 11:11:49 +00:00
mlugg
baabc6013e
compiler: add error for unnecessary use of 'var'
When a local variable is never used as an lvalue, we can determine that
`const` would be sufficient for this variable, so emit an error in this
case. More sophisticated checking is unfortunately not possible with
Zig's current analysis model, since whether an lvalue is actually
mutated depends on semantic analysis, in which some code paths may not
be analyzed, so attempting to determine this would result in false
positive compile errors.

It's worth noting that an unfortunate consequence of this is that any
field call `a.b()` will allow `a` to be `var`, even if `b` does not take
a pointer as its first parameter - this is again a necessary compromise
because the parameter type is not known until semantic analysis.

Also update `translate-c` to not trigger these errors. This is done by
replacing the `_ = @TypeOf(x)` emitted with `_ = &x` - the reference
there means that the local is permitted to be `var`. A similar strategy
will be used to prevent compile errors in the behavior tests, where we
sometimes want to force a value to be runtime-known.

Resolves: #224
2023-11-19 09:55:07 +00:00
David
941090d94f
Move duplicate field detection for struct init expressions into AstGen
Partially addresses #17916.
2023-11-16 14:38:16 +00:00
mlugg
d99bed1b10 Sema: optimize runtime array_mul
There are two optimizations here, which work together to avoid a
pathological case.

The first optimization is that AstGen now records the result type of an
array multiplication expression where possible. This type is not used
according to the language specification, but instead as an optimization.
In the expression '.{x} ** 1000', if we know that the result must be an
array, then it is much more efficient to coerce the LHS to an array with
length 1 before doing the multiplication. Otherwise, we end up with a
1000-element tuple which we must coerce to an array by individually
extracting each field.

Secondly, the previous logic would repeatedly extract element/field
values from the LHS when initializing the result. This is unnecessary:
each element must only be extracted once, and the result reused.

These changes together give huge improvements to compiler performance on
a pathological case: AIR instructions go from 65551 to 15, and total AIR
bytes go from 1.86MiB to 264.57KiB. Codegen time spent on this function
(in a debug compiler build) goes from minutes to essentially zero.

Resolves: #17586
2023-11-08 23:55:53 -07:00
kcbanner
f10499be0a
sema: analyze field init bodies in a second pass
This change allows struct field inits to use layout information
of their own struct without causing a circular dependency.

`semaStructFields` caches the ranges of the init bodies in the `StructType`
trailing data. The init bodies are then resolved by `resolveStructFieldInits`,
which is called before the inits are actually required.

Within the init bodies, the struct decl's instruction is repurposed to refer
to the field type itself. This is to allow us to easily rebuild the inst_map
mapping required for the init body instructions to refer to the field type.

Thanks to @mlugg for the guidance on this one!
2023-11-07 00:49:35 +00:00
Andrew Kelley
62f45b802c make Zir.Inst.Index typed
This commit starts by making Zir.Inst.Index a nonexhaustive enum rather
than a u32 alias for type safety purposes, and the rest of the changes
are needed to get everything compiling again.
2023-10-28 10:14:15 -07:00
Andrew Kelley
94d61ce964
Merge pull request #17651 from Vexu/error-limit
Make distinct error limit configurable (attempt #2)
2023-10-23 03:19:03 -04:00
Veikka Tuominen
9d9e22e716 remove uses of non-configurable err_int 2023-10-22 14:29:26 +03:00
mlugg
dd402f6d83 AstGen: omit make_ptr_const for resolve_inferred_alloc
After the previous commit, these make_ptr_const ZIR instructions are
redundant.
2023-10-21 21:38:41 -04:00
Andrew Kelley
027aabf497 drop for loop syntax upgrade mechanisms 2023-10-13 03:43:54 -07:00
Veikka Tuominen
63bd2bff12 Sema: add @errorCast which works for both error sets and error unions
Closes #17343
2023-10-01 17:00:01 +03:00
antlilja
6a29646a55 Rename @fabs to @abs and accept integers
Replaces the @fabs builtin with a new @abs builtins which accepts
floats, signed integers and vectors of said types.
2023-09-27 11:15:53 -07:00
mlugg
09a57583a4
compiler: preserve result type information through address-of operator
This commit introduces the new `ref_coerced_ty` result type into AstGen.
This represents a expression which we want to treat as an lvalue, and
the pointer will be coerced to a given type.

This change gives known result types to many expressions, in particular
struct and array initializations. This allows certain casts to work
which previously required explicitly specifying types via `@as`. It also
eliminates our dependence on anonymous struct types for expressions of
the form `&.{ ... }` - this paves the way for #16865, and also results
in less Sema magic happening for such initializations, also leading to
potentially better runtime code.

As part of these changes, this commit also implements #17194 by
disallowing RLS on explicitly-typed struct and array initializations.
Apologies for linking these changes - it seemed rather pointless to try
and separate them, since they both make big changes to struct and array
initializations in AstGen. The rationale for this change can be found in
the proposal - in essence, performing RLS whilst maintaining the
semantics of the intermediary type is a very difficult problem to solve.

This allowed the problematic `coerce_result_ptr` ZIR instruction to be
completely eliminated, which in turn also simplified the logic for
inferred allocations in Sema - thanks to this, we almost break even on
line count!

In doing this, the ZIR instructions surrounding these initializations
have been restructured - some have been added and removed, and others
renamed for clarity (and their semantics changed slightly). In order to
optimize ZIR tag count, the `struct_init_anon_ref` and
`array_init_anon_ref` instructions have been removed in favour of using
`ref` on a standard anonymous value initialization, since these
instructions are now virtually never used.

Lastly, it's worth noting that this commit introduces a slightly strange
source of generic poison types: in the expression `@as(*anyopaque, &x)`,
the sub-expression `x` has a generic poison result type, despite no
generic code being involved. This turns out to be a logical choice,
because we don't know the result type for `x`, and the generic poison
type represents precisely this case, providing the semantics we need.

Resolves: #16512
Resolves: #17194
2023-09-23 22:01:08 +01:00
Wooster
4585cb1e2f AstGen: fix @export with undeclared identifier crashing
This required a third `if (found_already == null)` in another place in
AstGen.zig for this special case of `@export`.

Fixes #17188
2023-09-22 12:23:57 -07:00
Andrew Kelley
fa1beba74f InternPool: implement getStructType
This also modifies AstGen so that struct types use 1 bit each from the
flags to communicate if there are nonzero inits, alignments, or comptime
fields. This allows adding a struct type to the InternPool without
looking ahead in memory to find out the answers to these questions,
which is easier for CPUs as well as for me, coding this logic right now.
2023-09-21 14:48:40 -07:00
mlugg
28caaea093
AstGen: allow closure over known-runtime values within @TypeOf
AstGen emits an error when a closure over a known-runtime value crosses
a namespace boundary. This usually makes sense: however, this usage is
actually valid if the capture is within a `@TypeOf` operand. Sema
already has a special case to allow such closure within `@TypeOf` when
AstGen could not determine a value to be runtime-known. This commit
simply introduces analagous logic to AstGen to allow `var`s to cross
namespace boundaries within `@TypeOf`.
2023-09-17 12:41:11 +01:00
mlugg
f366d9f879 compiler: start using destructure syntax 2023-09-15 11:42:08 -07:00
mlugg
88f5315ddf compiler: implement destructuring syntax
This change implements the following syntax into the compiler:

```zig
const x: u32, var y, foo.bar = .{ 1, 2, 3 };
```

A destructure expression may only appear within a block (i.e. not at
comtainer scope). The LHS consists of a sequence of comma-separated var
decls and/or lvalue expressions. The RHS is a normal expression.

A new result location type, `destructure`, is used, which contains
result pointers for each component of the destructure. This means that
when the RHS is a more complicated expression, peer type resolution is
not used: each result value is individually destructured and written to
the result pointers. RLS is always used for destructure expressions,
meaning every `const` on the LHS of such an expression creates a true
stack allocation.

Aside from anonymous array literals, Sema is capable of destructuring
the following types:
* Tuples
* Arrays
* Vectors

A destructure may be prefixed with the `comptime` keyword, in which case
the entire destructure is evaluated at comptime: this means all `var`s
in the LHS are `comptime var`s, every lvalue expression is evaluated at
comptime, and the RHS is evaluated at comptime. If every LHS is a
`const`, this is not allowed: as with single declarations, the user
should instead mark the RHS as `comptime`.

There are a few subtleties in the grammar changes here. For one thing,
if every LHS is an lvalue expression (rather than a var decl), a
destructure is considered an expression. This makes, for instance,
`if (cond) x, y = .{ 1, 2 };` valid Zig code. A destructure is allowed
in almost every context where a standard assignment expression is
permitted. The exception is `switch` prongs, which cannot be
destructures as the comma is ambiguous with the end of the prong.

A follow-up commit will begin utilizing this syntax in the Zig compiler.

Resolves: #498
2023-09-15 11:33:53 -07:00
mlugg
cba7e8a4e9 AstGen: do not forward result pointers through @as
The `coerce_result_ptr` instruction is highly problematic and leads to
unintentional memory reinterpretation in some cases. It is more correct
to simply not forward result pointers through this builtin.

`coerce_result_ptr` is still used for struct and array initializations,
where it can still cause issues. Eliminating this usage will be a future
change.

Resolves: #16991
2023-09-15 01:05:02 -07:00
Veikka Tuominen
6484e279e5 AstGen: fix missing array type validation
Closes #17084
2023-09-07 16:56:07 +03:00
Veikka Tuominen
d1a14e7b6d AstGen: fix error on missing function prototype name
Closes #17070
2023-09-05 20:00:19 +03:00
mlugg
b8e6c42688 compiler: provide result type for @memset value
Resolves: #16986
2023-08-28 12:33:36 -07:00
mlugg
8d036d1d78 Sema: allow cast builtins on vectors
The following cast builtins did not previously work on vectors, and have
been made to:

* `@floatCast`
* `@ptrFromInt`
* `@intFromPtr`
* `@floatFromInt`
* `@intFromFloat`
* `@intFromBool`

Resolves: #16267
2023-08-28 12:32:02 -07:00
Andrew Kelley
ada0010471 compiler: move unions into InternPool
There are a couple concepts here worth understanding:

Key.UnionType - This type is available *before* resolving the union's
fields. The enum tag type, number of fields, and field names, field
types, and field alignments are not available with this.

InternPool.UnionType - This one can be obtained from the above type with
`InternPool.loadUnionType` which asserts that the union's enum tag type
has been resolved. This one has all the information available.

Additionally:

* ZIR: Turn an unused bit into `any_aligned_fields` flag to help
  semantic analysis know whether a union has explicit alignment on any
  fields (usually not).
* Sema: delete `resolveTypeRequiresComptime` which had the same type
  signature and near-duplicate logic to `typeRequiresComptime`.
  - Make opaque types not report comptime-only (this was inconsistent
    between the two implementations of this function).
* Implement accepted proposal #12556 which is a breaking change.
2023-08-22 13:54:14 -07:00
mlugg
283afb50b5 AstGen: disallow '-0' integer literal
The intent here is ambiguous: this resolves to the comptime_int '0', but
it's likely the user meant to use a floating-point literal.

Resolves: #16890
2023-08-21 11:47:31 +03:00
mlugg
321961d860 AstGen: add result location analysis pass
The main motivation for this change is eliminating the `block_ptr`
result location and corresponding `store_to_block_ptr` ZIR instruction.
This is achieved through a simple pass over the AST before AstGen which
determines, for AST nodes which have a choice on whether to provide a
result location, which choice to make, based on whether the result
pointer is consumed non-trivially.

This eliminates so much logic from AstGen that we almost break even on
line count! AstGen no longer has to worry about instruction rewriting
based on whether or not a result location was consumed: it always knows
what to do ahead of time, which simplifies a *lot* of logic. This also
incidentally fixes a few random AstGen bugs related to result location
handling, leading to the changes in `test/` and `lib/std/`.

This opens the door to future RLS improvements by making them much
easier to implement correctly, and fixes many bugs. Most ZIR is made
more compact after this commit, mainly due to not having redundant
`store_to_block_ptr` instructions lying around, but also due to a few
bugs in the old system which are implicitly fixed here.
2023-08-20 11:58:14 -07:00
mlugg
083ee8e0e2
InternPool: preserve indices of builtin types when resolved
Some builtin types have a special InternPool index (e.g.
`.type_info_type`) so that AstGen can refer to them before semantic
analysis. Unfortunately, this previously led to a second index existing
to refer to the type once it was resolved, complicating Sema by having
the concept of an "unresolved" type index.

This change makes Sema modify these InternPool indices in-place to
contain the expanded representation when resolved. The analysis of the
corresponding decls is caught in `Module.semaDecl`, and a field is set
on Sema telling it which index to place struct/union/enum types at. This
system could break if `std.builtin` contained complex decls which
evaluate multiple struct types, but this will be caught by the
assertions in `InternPool.resolveBuiltinType`.

The AstGen result types which were disabled in 6917a8c have been
re-enabled.

Resolves: #16603
2023-08-15 11:45:23 +01:00