The main motivation for this change is eliminating the `block_ptr`
result location and corresponding `store_to_block_ptr` ZIR instruction.
This is achieved through a simple pass over the AST before AstGen which
determines, for AST nodes which have a choice on whether to provide a
result location, which choice to make, based on whether the result
pointer is consumed non-trivially.
This eliminates so much logic from AstGen that we almost break even on
line count! AstGen no longer has to worry about instruction rewriting
based on whether or not a result location was consumed: it always knows
what to do ahead of time, which simplifies a *lot* of logic. This also
incidentally fixes a few random AstGen bugs related to result location
handling, leading to the changes in `test/` and `lib/std/`.
This opens the door to future RLS improvements by making them much
easier to implement correctly, and fixes many bugs. Most ZIR is made
more compact after this commit, mainly due to not having redundant
`store_to_block_ptr` instructions lying around, but also due to a few
bugs in the old system which are implicitly fixed here.
* Generalise NaN handling and make std.math.nan() give quiet NaNs
* Address uses of std.math.qnan_* and std.math.nan_* consts
* Comment out failing test due to issues with signalling NaN
* Fix issue in c_builtins.zig where we need qnan_u32
The key changes in this commit are:
```diff
- names: []const NullTerminatedString,
+ names: NullTerminatedString.Slice,
- values: []const Index,
+ values: Index.Slice,
```
Which eliminates the slices from `InternPool.Key.EnumType` and replaces
them with structs that contain `start` and `len` indexes. This makes the
lifetime of `EnumType` change from expiring with updates to InternPool,
to expiring when the InternPool is garbage-collected, which is currently
never.
This is gearing up for a larger change I started working on locally
which moves union types into InternPool.
As a bonus, I fixed some unnecessary instances of `@as`.
Some builtin types have a special InternPool index (e.g.
`.type_info_type`) so that AstGen can refer to them before semantic
analysis. Unfortunately, this previously led to a second index existing
to refer to the type once it was resolved, complicating Sema by having
the concept of an "unresolved" type index.
This change makes Sema modify these InternPool indices in-place to
contain the expanded representation when resolved. The analysis of the
corresponding decls is caught in `Module.semaDecl`, and a field is set
on Sema telling it which index to place struct/union/enum types at. This
system could break if `std.builtin` contained complex decls which
evaluate multiple struct types, but this will be caught by the
assertions in `InternPool.resolveBuiltinType`.
The AstGen result types which were disabled in 6917a8c have been
re-enabled.
Resolves: #16603
The panic handler decl_val was previously given a `unneeded` source
location, which was then added to the reference trace, resulting in a
crash if the source location was used in the reference trace. This
commit makes two trivial changes:
* Don't add unneeded source locations to the ref table (panic in debug, silently ignore in release)
* Pass a real source location when analyzing the panic handler
After ff37ccd, interned values are trivial to convert to Air refs, using
`Air.internedToRef`. This made functions like `Sema.addConstant` effectively
redundant. This commit removes `Sema.addConstant` and `Sema.addType`, replacing
them with direct usages of `Air.internedToRef`.
Additionally, a new helper `Module.undefValue` is added, and the following
functions are moved into Module:
* `Sema.addConstUndef` -> `Module.undefRef`
* `Sema.addUnsignedInt` -> `Module.intRef` (now also works for signed types)
The general pattern here is that any `Module.xyzValue` helper may also have a
corresponding `Module.xyzRef` helper, which just wraps the call in
`Air.internedToRef`.
as explainded at https://llvm.org/docs/LangRef.html#vector-type :
"In general vector elements are laid out in memory in the same way as array types.
Such an analogy works fine as long as the vector elements are byte sized.
However, when the elements of the vector aren’t byte sized it gets a bit more complicated.
One way to describe the layout is by describing what happens when a vector such
as <N x iM> is bitcasted to an integer type with N*M bits, and then following the
rules for storing such an integer to memory."
"When <N*M> isn’t evenly divisible by the byte size the exact memory layout
is unspecified (just like it is for an integral type of the same size)."
AstGen provides all function call arguments with a result location,
referenced through the call instruction index. The idea is that this
should be the parameter type, but for `anytype` parameters, we use
generic poison, which is required to be handled correctly.
Previously, generic instantiations and inline calls worked by evaluating
all args in advance, before resolving generic parameter types. This
means any generic parameter (not just `anytype` ones) had generic poison
result types. This caused missing result locations in some cases.
Additionally, the generic instantiation logic caused `zirParam` to
analyze the argument types a second time before coercion. This meant
that for nominal types (struct/enum/etc), a *new* type was created,
distinct to the result type which was previously forwarded to the
argument expression.
This commit fixes both of these issues. Generic parameter type
resolution is now interleaved with argument analysis, so that we don't
have unnecessary generic poison types, and generic instantiation logic
now handles parameters itself rather than falling through to the
standard zirParam logic, so avoids duplicating the types.
Resolves: #16566Resolves: #16258Resolves: #16753
Well, this was a journey!
The original issue I was trying to fix is covered by the new behavior
test in array.zig: in essence, `ty` and `coerced_ty` result locations
were not correctly propagated.
While fixing this, I noticed a similar bug in struct inits: the type was
propagated to *fields* fine, but the actual struct init was
unnecessarily anonymous, which could lead to unnecessary copies. Note
that the behavior test added in struct.zig was already passing - the bug
here didn't change any easy-to-test behavior - but I figured I'd add it
anyway.
This is a little harder than it seems, because the result type may not
itself be an array/struct type: it could be an optional / error union
wrapper. A new ZIR instruction is introduced to unwrap these.
This is also made a little tricky by the fact that it's possible for
result types to be unknown at the time of semantic analysis (due to
`anytype` parameters), leading to generic poison. In these cases, we
must essentially downgrade to an anonymous initialization.
Fixing these issues exposed *another* bug, related to type resolution in
Sema. That issue is now tracked by #16603. As a temporary workaround for
this bug, a few result locations for builtin function operands have been
disabled in AstGen. This is technically a breaking change, but it's very
minor: I doubt it'll cause any breakage in the wild.
This function does not seem to differ in any interesting way from
`!typeRequiresComptime`, other than the `is_extern` param which is only
used in one place, and some differences did not seem correct anyway.
My reasoning for changing opaque types to be comptime-only is that
`explainWhyTypeIsComptime` is quite happy to explain why they are. :D
* Disable runtime calls, since it is not possible to know the proper
stack adjustment to follow the callee abi.
* Disable runtime returns, since it is not possible to know where the
return address is stored in general.
* Allow implicit returns regardless of the return type, which allows
naked functions with a non-void return type to be written.
This change allows the following types to appear in extern structs:
* Zero-bit integers
* void
* zero-sized structs and packed structs
* enums with zero-bit backing integers
* arrays of any length with zero-size elements
This commit does two things which seem unrelated at first, but,
together, solve a miscompilation, and potentially slightly speed up
compiler perf, at the expense of making #2765 trickier to implement in
the future.
Sema: avoid returning a false positive for whether an inferred error set
is comptime-known to be empty.
AstGen: mark function calls as not being interested in a result
location. This prevents the test case "ret_ptr doesn't cause own
inferred error set to be resolved" from being regressed. If we want to
accept and implement #2765 in the future, it will require solving this
problem a different way, but the principle of YAGNI tells us to go ahead
with this change.
Old ZIR looks like this:
%97 = ret_ptr()
%101 = store_node(%97, %100)
%102 = load(%97)
%103 = ret_is_non_err(%102)
New ZIR looks like this:
%97 = ret_type()
%101 = as_node(%97, %100)
%102 = ret_is_non_err(%101)
closes#15669
fixes#10731
Thanks @nektro for previous work in #14878
This change creates a small breaking change:
It removes the `is_pub` field of a decl in `@typeInfo`
The logic incorrectly assumed that adhoc_inferred_error_set_type would
be part of the inferred_error_set InternPool.Key when it actually is
part of `simple_type`.
Regressed in the #16318 branch.
Found from compiling Bun. Unfortunately we do not have a behavior test
reduction for this bug.
Currently, in a debug build of the compiler, `@embedFile("")` is a crash;
in a release build the compiler, `@embedFile("")` is "error: unable to open '': OutOfMemory".
* getOwnedFunctionIndex no longer checks if the value is actually a
function.
* The callsites to `intern` that I added want to avoid the `getCoerced`
call, so I added `intern2`.
* Adding to inferred error sets should not happen if the destination
error set is not the inferred error set of the current Sema instance.
* adhoc_inferred_error_set_type can be seen by the backend. Treat it
like anyerror.
when other function's inferred error set is the return type of a
function, it should not try to insert the error set into it.
this implies that this branch fixes a bug in master branch.
When instantiating a generic function call, in the case of a function
call generated by the language, the LazySrcLoc provided for `call_src`
may not point to anything interesting.
* Introduce InternPool.Tag.func_coerced to handle the case of a
function body coerced to a new type. `InternPool.getCoerced` is now
implemented for function bodies in this branch.
* implement resolution of ad-hoc inferred error sets in
`Sema.analyzeCall`.
* fix generic_owner being set wrong for child Sema bodies of param
expressions.
* fix `Sema.resolveInferredErrorSetTy` when passed `anyerror`.
Previously, they shared function index with the owner decl, but that
would clobber the data stored for inferred error sets of runtime calls.
Now there is an adhoc_inferred_error_set_type which models the problem
much more correctly.