This commit reworks how Sema handles arithmetic on comptime-known
values, fixing many bugs in the process.
The general pattern is that arithmetic on comptime-known values is now
handled by the new namespace `Sema.arith`. Functions handling comptime
arithmetic no longer live on `Value`; this is because some of them can
emit compile errors, so some *can't* go on `Value`. Only semantic
analysis should really be doing arithmetic on `Value`s anyway, so it
makes sense for it to integrate more tightly with `Sema`.
This commit also implements more coherent rules surrounding how
`undefined` interacts with comptime and mixed-comptime-runtime
arithmetic. The rules are as follows.
* If an operation cannot trigger Illegal Behavior, and any operand is
`undefined`, the result is `undefined`. This includes operations like
`0 *| undef`, where the LHS logically *could* be used to determine a
defined result. This is partly to simplify the language, but mostly to
permit codegen backends to represent `undefined` values as completely
invalid states.
* If an operation *can* trigger Illegal Behvaior, and any operand is
`undefined`, then Illegal Behavior results. This occurs even if the
operand in question isn't the one that "decides" illegal behavior; for
instance, `undef / 1` is undefined. This is for the same reasons as
described above.
* An operation which would trigger Illegal Behavior, when evaluated at
comptime, instead triggers a compile error. Additionally, if one
operand is comptime-known undef, such that the other (runtime-known)
operand isn't needed to determine that Illegal Behavior would occur,
the compile error is triggered.
* The only situation in which an operation with one comptime-known
operand has a comptime-known result is if that operand is undefined,
in which case the result is either undefined or a compile error per
the above rules. This could potentially be loosened in future (for
instance, `0 * rt` could be comptime-known 0 with a runtime assertion
that `rt` is not undefined), but at least for now, defining it more
conservatively simplifies the language and allows us to easily change
this in future if desired.
This commit fixes many bugs regarding the handling of `undefined`,
particularly in vectors. Along with a collection of smaller tests, two
very large test cases are added to check arithmetic on `undefined`.
The operations which have been rewritten in this PR are:
* `+`, `+%`, `+|`, `@addWithOverflow`
* `-`, `-%`, `-|`, `@subWithOverflow`
* `*`, `*%`, `*|`, `@mulWithOverflow`
* `/`, `@divFloor`, `@divTrunc`, `@divExact`
* `%`, `@rem`, `@mod`
Other arithmetic operations are currently unchanged.
Resolves: #22743Resolves: #22745Resolves: #22748Resolves: #22749Resolves: #22914
* arm_apcs is the long dead "OABI" which we never had working support for.
* arm_aapcs16_vfp is for arm-watchos-none which is a dead target that we've
dropped support for.
This is all of the expected 0.14.0 progress on #21530, which can now be
postponed once this commit is merged.
This required rewriting the (un)wrap operations since the original
implementations were extremely buggy.
Also adds an easy way to retrigger Sema OPV bugs so that I don't have to
keep updating #22419 all the time.
This commit allows using ZON (Zig Object Notation) in a few ways.
* `@import` can be used to load ZON at comptime and convert it to a
normal Zig value. In this case, `@import` must have a result type.
* `std.zon.parse` can be used to parse ZON at runtime, akin to the
parsing logic in `std.json`.
* `std.zon.stringify` can be used to convert arbitrary data structures
to ZON at runtime, again akin to `std.json`.
This commit effectively reverts 9e683f0, and hence un-accepts #19777.
While nice in theory, this proposal turned out to have a few problems.
Firstly, supplying a result type implicitly coerces the operand to this
type -- that's the main point of result types! But for `try`, this is
actually a bad idea; we want a redundant `try` to be a compile error,
not to silently coerce the non-error value to an error union. In
practice, this didn't always happen, because the implementation was
buggy anyway; but when it did, it was really quite silly. For instance,
`try try ... try .{ ... }` was an accepted expression, with the inner
initializer being initially coerced to `E!E!...E!T`.
Secondly, the result type inference here didn't play nicely with
`return`. If you write `return try`, the operand would actually receive
a result type of `E!E!T`, since the `return` gave a result type of `E!T`
and the `try` wrapped it in *another* error union. More generally, the
problem here is that `try` doesn't know when it should or shouldn't
nest error unions. This occasionally broke code which looked like it
should work.
So, this commit prevents `try` from propagating result types through to
its operand. A key motivation for the original proposal here was decl
literals; so, as a special case, `try .foo(...)` is still an allowed
syntax form, caught by AstGen and specially lowered. This does open the
doors to allowing other special cases for decl literals in future, such
as `.foo(...) catch ...`, but those proposals are for another time.
Resolves: #21991Resolves: #22633
I recently saw a user hit the "comptime call of extern function" error,
and get confused because they didn't know why the scope was `comptime`.
So, use `explainWhyBlockIsComptime` on this and related errors to add
all the relevant notes.
The added test case shows the motivating situation.
* The langspec definition of `@memcpy` has been changed so that the
source and destination element types must be in-memory coercible,
allowing all such calls to be raw copying operations, not actually
applying any coercions.
* Implement aliasing check for comptime `@memcpy`; a compile error will
now be emitted if the arguments alias.
* Implement more efficient comptime `@memcpy` by loading and storing a
whole array at once, similar to how `@memset` is implemented.
* `std.builtin.Panic` -> `std.builtin.panic`, because it is a namespace.
* `root.Panic` -> `root.panic` for the same reason. There are type
checks so that we still allow the legacy `pub fn panic` strategy in
the 0.14.0 release.
* `std.debug.SimplePanic` -> `std.debug.simple_panic`, same reason.
* `std.debug.NoPanic` -> `std.debug.no_panic`, same reason.
* `std.debug.FormattedPanic` is now a function `std.debug.FullPanic`
which takes as input a `panicFn` and returns a namespace with all the
panic functions. This handles the incredibly common case of just
wanting to override how the message is printed, whilst keeping nice
formatted panics.
* Remove `std.builtin.panic.messages`; now, every safety panic has its
own function. This reduces binary bloat, as calls to these functions
no longer need to prepare any arguments (aside from the error return
trace).
* Remove some legacy declarations, since a zig1.wasm update has
happened. Most of these were related to the panic handler, but a quick
grep for "zig1" brought up a couple more results too.
Also, add some missing type checks to Sema.
Resolves: #22584
formatted -> full
The original motivation here was to fix regressions caused by #22414.
However, while working on this, I ended up discussing a language
simplification with Andrew, which changes things a little from how they
worked before #22414.
The main user-facing change here is that any reference to a prior
function parameter, even if potentially comptime-known at the usage
site or even not analyzed, now makes a function generic. This applies
even if the parameter being referenced is not a `comptime` parameter,
since it could still be populated when performing an inline call. This
is a breaking language change.
The detection of this is done in AstGen; when evaluating a parameter
type or return type, we track whether it referenced any prior parameter,
and if so, we mark this type as being "generic" in ZIR. This will cause
Sema to not evaluate it until the time of instantiation or inline call.
A lovely consequence of this from an implementation perspective is that
it eliminates the need for most of the "generic poison" system. In
particular, `error.GenericPoison` is now completely unnecessary, because
we identify generic expressions earlier in the pipeline; this simplifies
the compiler and avoids redundant work. This also entirely eliminates
the concept of the "generic poison value". The only remnant of this
system is the "generic poison type" (`Type.generic_poison` and
`InternPool.Index.generic_poison_type`). This type is used in two
places:
* During semantic analysis, to represent an unknown result type.
* When storing generic function types, to represent a generic parameter/return type.
It's possible that these use cases should instead use `.none`, but I
leave that investigation to a future adventurer.
One last thing. Prior to #22414, inline calls were a little inefficient,
because they re-evaluated even non-generic parameter types whenever they
were called. Changing this behavior is what ultimately led to #22538.
Well, because the new logic will mark a type expression as generic if
there is any change its resolved type could differ in an inline call,
this redundant work is unnecessary! So, this is another way in which the
new design reduces redundant work and complexity.
Resolves: #22494Resolves: #22532Resolves: #22538
We can still often determine a comptime result based on the type, even
if the pointer is runtime-known.
Also, we previously used load -> is non null instead of AIR
`is_non_null_ptr` if the pointer is comptime-known, but that's a bad
heuristic. Instead, we should check for the pointer to be
comptime-known, *and* for the load to be comptime-known, and only in
that case should we call `Sema.analyzeIsNonNull`.
Resolves: #22556
This was done by regex substitution with `sed`. I then manually went
over the entire diff and fixed any incorrect changes.
This diff also changes a lot of `callconv(.C)` to `callconv(.c)`, since
my regex happened to also trigger here. I opted to leave these changes
in, since they *are* a correct migration, even if they're not the one I
was trying to do!
`Type.hasWellDefinedLayout` was in disagreement with pointer loading
logic about auto-layout structs with zero fields, `struct {}`. For
consistency, these types should not have a well-defined layout.
This is technically a breaking change.
`Sema.explainWhyValueContainsReferenceToComptimeVar` (concise name!)
adds notes to an error explaining how to get from a given `Value` to a
pointer to some `comptime var` (or a comptime field). Previously, this
error could be very opaque in any case where it wasn't obvious where the
comptime var pointer came from; particularly for type captures. Now, the
error notes explain this to the user.
This rewrite improves some error messages, hugely simplifies the logic,
and fixes several bugs. One of these bugs is technically a new rule
which Andrew and I agreed on: if a parameter has a comptime-only type
but is not declared `comptime`, then the corresponding call argument
should not be *evaluated* at comptime; only resolved. Implementing this
required changing how function types work a little, which in turn
required allowing a new kind of function coercion for some generic use
cases: function coercions are now allowed to implicitly *remove*
`comptime` annotations from parameters with comptime-only types. This is
okay because removing the annotation affects only the call site.
Resolves: #22262
The old lowering was kind of neat, but it unintentionally allowed the
syntax `for (123) |_| { ... }`, and there wasn't really a way to fix
that. So, instead, we include both the start and the end of the range in
the `for_len` instruction (each operand to `for` now has *two* entries
in this multi-op instruction). This slightly increases the size of ZIR
for loops of predominantly indexables, but the difference is small
enough that it's not worth complicating ZIR to try and fix it.