Also:
* improve the "ambiguous reference" error by swapping the order of
"declared here" and "also declared here" notes.
* improve the "not accessible from inner function" error:
- point out that it has to do with the thing being mutable
- eliminate the incorrect association with it being a function
- note where it crosses a namespace boundary
* struct field types are evaluated in a context that has the struct
namespace visible. Likewise with align expressions, linksection
expressions, enum tag values, and union/enum tag argument
expressions.
Closes#9194Closes#9622
* stage2 AstGen: add missing compile error for declaring a local
that shadows a primitive. Even with `@""` syntax, it may not have
the same name as a primitive.
* stage2 AstGen: add a compile error for a global declaration
whose name matches a primitive. However it is allowed when using
`@""` syntax.
* stage1: delete all "declaration shadows primitive" compile errors
because they are now handled by stage2 AstGen.
* stage1/stage2 AstGen: notice when using `@""` syntax and:
- treat `_` as a regular identifier
- skip checking if an identifire is a primitive
Check the new test cases for clarifications on semantics.
closes#6062
Locals are not allowed to shadow declarations, but declarations are
allowed to shadow each other, as long as there are no ambiguous
references.
closes#678
Previous commit shifted everything down in the start.zig file, and
unfortunately our stage2 test harness depends on absolute line
numbers for a couple tests.
This is a backwards-compatible language change.
Previously, `@intToEnum` coerced its integer operand to the integer tag
type of the destination enum type, often requiring the callsite to
additionally wrap the operand in an `@intCast`. Now, the `@intCast` is
implicit, and any integer operand can be passed to `@intToEnum`.
The same as before, it is illegal behavior to pass any integer which does
not have a corresponding enum tag.
* Introduce `memoized_calls` to `Module` which stores all the comptime
function calls that are cached. It is keyed on the `*Fn` and the
comptime arguments, but it does not yet properly detect comptime function
pointers and avoid memoizing in this case. So it will have false
positives for when a comptime function call mutates data through a
pointer parameter.
* Sema: Add a new helper function: `resolveConstMaybeUndefVal`
* Value: add `enumToInt` method and use it in `zirEnumToInt`. It is
also used by the hashing function.
* Value: fix representation of optionals to match error unions.
Previously it would not handle nested optionals correctly. Now it
matches the memory layout of error unions and supports nested
optionals properly. This required changes in all the backends for
generating optional constants.
* TypedValue gains `eql` and `hash` methods.
* Value: Implement hashing for floats, optionals, and enums.
Additionally, the zig type tag is added to the hash, where it was not
previously, so that values of differing types will get different
hashes.
In some cases (such as bitcast), an operand may be the same MCValue as
the result. If that operand died and was a register, it was freed by
processDeath. We have to "re-allocate" the register.
The big change in this commit is making `semaDecl` resolve the fields if
the Decl ends up being a struct or union. It needs to do this while
the `Sema` is still in scope, because it will have the resolved AIR
instructions that the field type expressions possibly reference. We do
this after the decl is populated and set to `complete` so that a `Decl`
may reference itself.
Everything else is fixes and improvements to make the test suite pass
again after making this change.
* New AIR instruction: `ptr_elem_ptr`
- Implemented for LLVM backend
* New Type tag: `type_info` which represents `std.builtin.TypeInfo`. It
is used by AstGen for the operand type of `@Type`.
* ZIR instruction `set_float_mode` uses `coerced_ty` to avoid
superfluous `as` instruction on operand.
* ZIR instruction `Type` uses `coerced_ty` to properly handle result
location type of operand.
* Fix two instances of `enum_nonexhaustive` Value Tag not handled
properly - it should generally be handled the same as `enum_full`.
* Fix struct and union field resolution not copying Type and Value
objects into its Decl arena.
* Fix enum tag value resolution discarding the ZIR=>AIR instruction map
for the child Sema, when they still needed to be accessed.
* Fix `zirResolveInferredAlloc` use-after-free in the AIR instructions
data array.
* Fix `elemPtrArray` not respecting const/mutable attribute of pointer
in the result type.
* Fix LLVM backend crashing when `updateDeclExports` is called before
`updateDecl`/`updateFunc` (which is, according to the API, perfectly
legal for the frontend to do).
* Fix LLVM backend handling element pointer of pointer-to-array. It
needed another index in the GEP otherwise LLVM saw the wrong type.
* Fix LLVM test cases not returning 0 from main, causing test failures.
Fixes a regression introduced in
6a5094872f10acc629543cc7f10533b438d0283a.
* Implement comptime shift-right.
* Implement `@Type` for integers and `@TypeInfo` for integers.
* Implement union initialization syntax.
* Implement `zirFieldType` for unions.
* Implement `elemPtrArray` for a runtime-known operand.
* Make `zirLog2IntType` support RHS of shift being `comptime_int`. In
this case it returns `comptime_int`.
The motivating test case for this commit was originally:
```zig
test "example" {
var l: List(10) = undefined;
l.array[1] = 1;
}
fn List(comptime L: usize) type {
var T = u8;
return struct {
array: [L]T,
};
}
```
However I changed it to:
```zig
test "example" {
var l: List = undefined;
l.array[1] = 1;
}
const List = blk: {
const T = [10]u8;
break :blk struct {
array: T,
};
};
```
Which ended up being a similar, smaller problem. The former test case
will require a similar solution in the implementation of comptime
function calls - checking if the result of the function call is a struct
or union, and using the child `Sema` before it is destroyed to resolve
the fields.
Previously, I have incorrectly assumed that with two-level namespace
we only need to link in dylibs/frameworks that actually export symbols
which are undefined in the linked image. Turns out, regardless of
whether we link with two-level namespace (default on macOS) or a
flat namespace (more common on other platforms), we always need to
put the dylibs/frameworks as specified by the user from the linker
line into the final linked image.
* Value: rename `error_union` to `eu_payload` and clarify the intended
usage in the doc comments. The way error unions is represented with
Value is fixed to not have ambiguous values.
* Fix codegen for error union constants in all the backends.
* Implement the AIR instructions having to do with error unions in the
LLVM backend.
* New AIR instructions: ptr_add, ptr_sub, ptr_elem_val, ptr_ptr_elem_val
- See the doc comments for details.
* Sema: implement runtime pointer arithmetic.
* Sema: implement elem_val for many-pointers.
* Sema: support coercion from `*[N:s]T` to `[*]T`.
* Type: isIndexable handles many-pointers.
* Introduce `ret_load` ZIR instruction which does return semantics
based on a corresponding `ret_ptr` instruction. If the return type of
the function has storage for the return type, it simply returns.
However if the return type of the function is by-value, it loads the
return value from the `ret_ptr` allocation and returns that.
* AstGen: improve `finishThenElseBlock` to not emit break instructions
after a return instruction in the same block.
* Sema: `ret_ptr` instruction works correctly in comptime contexts.
Same with `alloc_mut`.
The test case with a recursive inline function having an implicitly
comptime return value now has a runtime return value because of the fact
that it calls a function in a non-comptime context.
The `comptime_args` field of Fn has a clarified purpose:
For generic function instantiations, there is a `TypedValue` here
for each parameter of the function:
* Non-comptime parameters are marked with a `generic_poison` for the value.
* Non-anytype parameters are marked with a `generic_poison` for the type.
Sema now has a `fn_ret_ty` field. Doc comments reproduced here:
> When semantic analysis needs to know the return type of the function whose body
> is being analyzed, this `Type` should be used instead of going through `func`.
> This will correctly handle the case of a comptime/inline function call of a
> generic function which uses a type expression for the return type.
> The type will be `void` in the case that `func` is `null`.
Various places in Sema are modified in accordance with this guidance.
Fixed `resolveMaybeUndefVal` not returning `error.GenericPoison` when
Value Tag of `generic_poison` is encountered.
Fixed generic function memoization incorrect equality checking. The
logic now clearly deals properly with any combination of anytype and
comptime parameters.
Fixed not removing generic function instantiation from the table in case
a compile errors in the rest of `call` semantic analysis. This required
introduction of yet another adapter which I have called
`GenericRemoveAdapter`. This one is nice and simple - it's the same hash
function (the same precomputed hash is passed in) but the equality
function checks pointers rather than doing any logic.
Inline/comptime function calls coerce each argument in accordance with
the function parameter type expressions. Likewise the return type
expression is evaluated and provided (see `fn_ret_ty` above).
There's a new compile error "unable to monomorphize function". It's
pretty unhelpful and will need to get improved in the future. It happens
when a type expression in a generic function did not end up getting
resolved at a callsite. This can happen, for example, if a runtime
parameter is attempted to be used where it needed to be comptime known:
```zig
fn foo(x: anytype) [x]u8 { _ = x; }
```
In this example, even if we pass a number such as `10` for `x`, it is
not marked `comptime`, so `x` will have a runtime known value, making
the return type unable to resolve.
In the LLVM backend I implement cmp instructions for float types to pass
some behavior tests that used floats.
When doing a function call, if the return type requires comptime, the
function is analyzed as an inline/comptime call.
There is an important TODO here. I will reproduce the comment from this
commit:
> In the case of a comptime/inline function call of a generic function,
> the function return type needs to be the resolved return type based on
> the function parameter type expressions being evaluated with comptime arguments
> passed in. Otherwise, it ends up being .generic_poison and failing the
> comptime/inline function call analysis.
* ZIR encoding for function instructions have a body for the return
type. This lets Sema for generic functions do the same thing it does
for parameters, handling `error.GenericPoison` in the evaluation of
the return type by marking the function as generic.
* Sema: fix missing block around the new Decl arena finalization. This
led to a memory corruption.
* Added some floating point support to the LLVM backend but didn't get
far enough to pass any new tests.
Module has a new field `monomorphed_funcs` which stores the set of
`*Module.Fn` objects which are generic function instantiations.
The hash is based on hashes of comptime values of parameters known to be
comptime based on an explicit comptime keyword or must-be-comptime
type expressions that can be evaluated without performing monomorphization.
This allows function calls to be semantically analyzed cheaply for
generic functions which are already instantiated.
The table is updated with a single `getOrPutAdapted` in the semantic
analysis of `call` instructions, by pre-allocating the `Fn` object and
passing it to the child `Sema`.
* The `indexable_ptr_len` ZIR instruction now uses a `none_or_ref`
ResultLoc. This prevents an unnecessary `ref` instruction from being
emitted.
* Sema: Fix `analyzeCall` using the incorrect ZIR object for the
generic function callee.
* LLVM backend: `genTypedValue` supports a `Slice` type encoded with
the `decl_ref` `Value`.
AstGen result locations now have a `coerced_ty` tag which is the same as
`ty` except it assumes that Sema will do a coercion, so it does not
redundantly add an `as` instruction into the ZIR code. This results in
cleaner ZIR and about a 14% reduction of ZIR bytes.
param and param_comptime ZIR instructions now have a block body for
their type expressions. This allows Sema to skip evaluation of the
block in the case that the parameter is comptime-provided. It also
allows a new mechanism to function: when evaluating type expressions of
generic functions, if it would depend on another parameter, it returns
`error.GenericPoison` which bubbles up and then is caught by the
param/param_comptime instruction and then handled.
This allows parameters to be evaluated independently so that the type
info for functions which have comptime or anytype parameters will still
have types populated for parameters that do not depend on values of
previous parameters (because evaluation of their param blocks will return
successfully instead of `error.GenericPoison`).
It also makes iteration over the block that contains function parameters
slightly more efficient since it now only contains the param
instructions.
Finally, it fixes the case where a generic function type expression contains
a function prototype. Formerly, this situation would cause shared state
to clobber each other; now it is in a proper tree structure so that
can't happen. This fix also required adding a field to Sema
`comptime_args_fn_inst` to make sure that the `comptime_args` field
passed into Sema is applied to the correct `func` instruction.
Source location for `node_offset_asm_ret_ty` is fixed; it was pointing at
the asm output name rather than the return type as intended.
Generic function instantiation is fixed, notably with respect to
parameter type expressions that depend on previous parameters, and with
respect to types which must be always comptime-known. This involves
passing all the comptime arguments at a callsite of a generic function,
and allowing the generic function semantic analysis to coerce the values
to the proper types (since it has access to the evaluated parameter type
expressions) and then decide based on the type whether the parameter is
runtime known or not. In the case of explicitly marked `comptime`
parameters, there is a check at the semantic analysis of the `call`
instruction.
Semantic analysis of `call` instructions does type coercion on the
arguments, which is needed both for generic functions and to make up for
using `coerced_ty` result locations (mentioned above).
Tasks left in this branch:
* Implement the memoization table.
* Add test coverage.
* Improve error reporting and source locations for compile errors.
#8589 introduced correct handling of signed (possibly negative) array access
of pointers. Since unadorned integer literals in C are signed, this resulted
in inefficient generated code when indexing a pointer by a non-negative
integer literal.
This way, we can explicitly signal if a test requires the presence
of macOS SDK to build. For instance, when testing our in-house
MachO linker for correctly linking Objective-C, we require the
presence of the SDK on the host system, and we can enforce this
with `-Denable-macos-sdk` flag to `zig build test-standalone`.