* Sema: resolve type fully when emitting an alloc AIR instruction to
avoid tripping assertion for checking struct field alignment.
* LLVM backend: keep a reference to the LLVM target data alive during
lowering so that we can ask LLVM what it thinks the ABI alignment
and size of LLVM types are. We need this in order to lower tuples and
structs so that we can put in extra padding bytes when Zig disagrees
with LLVM about the size or alignment of something.
* LLVM backend: make the LLVM struct type packed that contains the most
aligned union field and the padding. This prevents the struct from
being too big according to LLVM. In the future, we may want to
consider instead emitting unions in a "flat" manner; putting the tag,
most aligned union field, and padding all in the same struct field
space.
* LLVM backend: make structs with 2 or fewer fields return isByRef=false.
This results in more efficient codegen. This required lowering of
bitcast to sometimes store the struct into an alloca, ptrcast, and
then load because LLVM does not allow bitcasting structs.
* enable more passing behavior tests.
* Fix compile error for `zirErrorUnionType`.
* Convert zirMergeErrorSets logic to call `Type.errorSetMerge`.
It does not need to create a Decl as the TODO comment hinted.
* Extract out a function called `resolveInferredErrorSetTy`.
* Rework `resolvePeerTypes` with respect to error unions and
error sets. This is a less complex implementation that passes all the
same tests and uses many fewer lines of code by taking advantage of
the function `coerceInMemoryAllowedErrorSets`.
- Always merge error sets in the order that makes sense, even when
that means `@typeInfo` incompatibility with stage1.
* `Type.errorSetMerge` no longer overallocates.
* Don't skip passing tests.
Similar to how Type.eql was reworked in the previous commit, this commit
reworks Type.hash to check all the different kinds of tags that a Type
can be represented with. It also completes the implementation for all
types except error sets, which need to have Type.eql enhanced as well.
Several issues with pointer types are fixed:
Prior to this commit, Zig would not canonicalize a pointer type with
an explicit alignment to alignment=0 if it matched the pointee ABI
alignment. In order to fix this, `Type.ptr` now takes a Target
parameter. I also moved the host_size canonicalization to `Type.ptr`
since target is now available. Similarly, is_allowzero in the case of
C pointers is now treated as a canonicalization done by the function
rather than a precondition.
in-memory coercion for pointers now properly checks ABI alignment
of pointee types instead of incorrectly treating the 0 value as an
alignment.
Type equality is completely reworked based on the tag() rather than the
zigTypeTag(). It's still semantically based on zigTypeTag() but that
knowledge is implied rather than dictating the control flow of the
logic. Importantly, this fixes cases for opaques, structs, tuples,
enums, and unions, where type equality was incorrectly returning based
on whether the tag() values were equal.
Additionally, pointer type equality now takes into account alignment.
Because we canonicalize non-zero alignment which equals pointee type ABI
alignment to alignment=0, this now can be a simple integer comparison.
Type hashing is implemented for pointers and floats. Array types now
additionally hash their sentinels.
This regressed some behavior tests that were passing but only because
of bugs regarding type equality.
The C backend has a noticeable problem with lowering differently-aligned
pointers (particularly slices) as the same type, causing C compilation
errors due to duplicate declarations.
This implements #10113 for the self-hosted compiler only. It removes the
ability to override alignment of packed struct fields, and removes the
ability to put pointers and arrays inside packed structs.
After this commit, nearly all the behavior tests pass for the stage2 llvm
backend that involve packed structs.
I didn't implement the compile errors or compile error tests yet. I'm
waiting until we have stage2 building itself and then I want to rework
the compile error test harness with inspiration from Vexu's arocc test
harness. At that point it should be a much nicer dev experience to work
on compile errors.
Get rid of `std.math.F80Repr`. Instead of trying to match the memory
layout of f80, we treat it as a value, same as the other floating point
types. The functions `make_f80` and `break_f80` are introduced to
compose an f80 value out of its parts, and the inverse operation.
stage2 LLVM backend: fix pointer to zero length array tripping LLVM
assertion. It now checks for when the element type is a zero-bit type
and lowers such thing the same way that pointers to other zero-bit types
are lowered.
Both stage1 and stage2 LLVM backends are adjusted so that f80 is lowered
as x86_fp80 on x86_64 and i386 architectures, and identical to a u80 on
others. LLVM constants are lowered in a less hacky way now that #10860
is fixed, by using the expression `(exp << 64) | fraction` using llvm
constants.
Sema is improved to handle c_longdouble by recursively handling it
correctly for whatever the float bit width is. In both stage1 and
stage2.
When Sema sees a store_node instruction, it now checks for
the possibility of this pattern:
%a = ret_ptr
%b = store(%a, %c)
Where %c is an error union. In such case we need to add to the
current function's inferred error set, if any.
Coercion from error union to error union will be handled ideally if the
operand is comptime known. In such case it does the appropriate
unwrapping, then wraps again.
In the future, coercion from error union to error union should do the
same thing for a runtime value; emitting a runtime branch to check if
the value is an error or not.
`Value.arrayLen` for structs returns the number of fields. This is so
that Liveness can use it for the `vector_init` instruction (soon to be
renamed to `aggregate_init`).
fieldVal handles pointer to pointer to array. This can happen for
example, if a pointer to an array is used as the condition expression of
a for loop.
resolveStructFully handles tuples (by doing nothing).
fixed Type comparison for tuples to handle comptime fields properly.
* resolve_inferred_alloc now gives a proper mutability attribute to the
corresponding alloc instruction. Previously, it would fail to mark
things const.
* slicing: fix the detection for when the end index equals the length
of the underlying object. Previously it was using `end - start` but
it should just use the end index directly. It also takes into account
when slicing a comptime-known slice.
* `Type.sentinel`: fix not handling all slice tags
AstGen:
* rename the known_has_bits flag to known_non_opv to make it better
reflect what it actually means.
* add a known_comptime_only flag.
* make the flags take advantage of identifiers of primitives and the
fact that zig has no shadowing.
* correct the known_non_opv flag for function bodies.
Sema:
* Rename `hasCodeGenBits` to `hasRuntimeBits` to better reflect what it
does.
- This function got a bit more complicated in this commit because of
the duality of function bodies: on one hand they have runtime bits,
but on the other hand they require being comptime known.
* WipAnonDecl now takes a LazySrcDecl parameter and performs the type
resolutions that it needs during finish().
* Implement comptime `@ptrToInt`.
Codegen:
* Improved handling of lowering decl_ref; make it work for
comptime-known ptr-to-int values.
- This same change had to be made many different times; perhaps we
should look into merging the implementations of `genTypedValue`
across x86, arm, aarch64, and riscv.
This commit updates stage2 to enforce the property that the syntax
`fn()void` is a function *body* not a *pointer*. To get a pointer, the
syntax `*const fn()void` is required.
ZIR puts function alignment into the func instruction rather than the
decl because this way it makes it into function types. LLVM backend
respects function alignments.
Struct and Union have methods `fieldSrcLoc` to help look up source
locations of their fields. These trigger full loading, tokenization, and
parsing of source files, so should only be called once it is confirmed
that an error message needs to be printed.
There are some nice new error hints for explaining why a type is
required to be comptime, particularly for structs that contain function
body types.
`Type.requiresComptime` is now moved into Sema because it can fail and
might need to trigger field type resolution. Comptime pointer loading
takes into account types that do not have a well-defined memory layout
and does not try to compute a byte offset for them.
`fn()void` syntax no longer secretly makes a pointer. You get a function
body type, which requires comptime. However a pointer to a function body
can be runtime known (obviously).
Compile errors that report "expected pointer, found ..." are factored
out into convenience functions `checkPtrOperand` and `checkPtrType` and
have a note about function pointers.
Implemented `Value.hash` for functions, enum literals, and undefined values.
stage1 is not updated to this (yet?), so some workarounds and disabled
tests are needed to keep everything working. Should we update stage1 to
these new type semantics? Yes probably because I don't want to add too
much conditional compilation logic in the std lib for the different
backends.
Previously, optional slices returned the pointer size as abi size.
We now account for slices to calculate the correct size which is abi-alignment + slice ABI size.
When asking a struct or union whether the type requires comptime, it may
need to ask itself recursively, for example because of a field which is
a pointer to itself. This commit adds a field to each to keep track of
when computing the "requires comptime" value and returns `false` if the
check is already ongoing.
* AIR instruction vector_init gains the ability to init arrays and
tuples in addition to vectors. This will probably also gain the
ability to initialize structs and be renamed to `aggregate_init`.
* AstGen prefers to use an `anon_array_init` ZIR instruction for
local variables when the init expr is an array literal and there is
no type.
AIR:
* `array_elem_val` is now allowed to be used with a vector as the array
type.
* New instructions: splat, vector_init
AstGen:
* The splat ZIR instruction uses coerced_ty for the ResultLoc, avoiding
an unnecessary `as` instruction, since the coercion will be performed
in Sema.
* Builtins that accept vectors now ignore the type parameter. Comment
from this commit reproduced here:
The accepted proposal #6835 tells us to remove the type parameter from
these builtins. To stay source-compatible with stage1, we still observe
the parameter here, but we do not encode it into the ZIR. To implement
this proposal in stage2, only AstGen code will need to be changed.
Sema:
* `clz` and `ctz` ZIR instructions are now handled by the same function
which accept AIR tag and comptime eval function pointer to
differentiate.
* `@typeInfo` for vectors is implemented.
* `@splat` is implemented. It takes advantage of `Value.Tag.repeated` 😎
* `elemValue` is implemented for vectors, when the index is a scalar.
Handling a vector index is still TODO.
* Element-wise coercion is implemented for vectors. It could probably
be optimized a bit, but it is at least complete & correct.
* `Type.intInfo` supports vectors, returning int info for the element.
* `Value.ctz` initial implementation. Needs work.
* `Value.eql` is implemented for arrays and vectors.
LLVM backend:
* Implement vector support when lowering `array_elem_val`.
* Implement vector support when lowering `ctz` and `clz`.
* Implement `splat` and `vector_init`.
resolveTypeForCodegen is called when we needed to resolve a type fully,
even through pointer. This commit fully implements this, even through
pointer fields on structs and unions.
The function has now also been renamed to resolveTypeFully
This allows stage2 to build more of compiler-rt.
I also changed `-%` to `-` for comptime ints in the div and mul
implementations of compiler-rt. This is clearer code and also happens to
work around a bug in stage2.
This improves readability as well as compatibility with stage2. Most of
compiler-rt is now enabled for stage2 with just a few functions disabled
(until stage2 passes more behavior tests).
* `Module.Union.getLayout`: fixes to support components of the union
being 0 bits.
* Implement `@typeInfo` for unions.
* Add missing calls to `resolveTypeFields`.
* Fix explicitly-provided union tag types passing a `Zir.Inst.Ref`
where an `Air.Inst.Ref` was expected. We don't have any type safety
for this; these typess are aliases.
* Fix explicitly-provided `union(enum)` tag Values allocated to the
wrong arena.
Introduced a new AIR instruction: `tag_name`. Reasons to do this
instead of lowering it in Sema to a switch, function call, array
lookup, or if-else tower:
* Sema is a bottleneck; do less work in Sema whenever possible.
* If any optimization passes run, and the operand to becomes
comptime-known, then it could change to have a comptime result
value instead of lowering to a function or array or something which
would then have to be garbage-collected.
* Backends may want to choose to use a function and a switch branch,
or they may want to use a different strategy.
Codegen for `@tagName` is implemented for the LLVM backend but not any
others yet.
Introduced some new `Type` tags:
* `const_slice_u8_sentinel_0`
* `manyptr_const_u8_sentinel_0`
The motivation for this was to make typeof() on the tag_name AIR
instruction non-allocating.
A bunch more enum tests are passing now.
Layout algorithm: all `align(0)` fields are squished together as if they
were a single integer with a number of bits equal to `@bitSizeOf` each
field added together. Then the natural ABI alignment of that integer is
used for that pseudo-field.
Previously, this function would return an incorrect result for structs
and unions which did not have their fields resolved yet.
This required introducing more logic in Sema to resolve types before
doing certain things such as creating an anonmyous Decl and emitting
function call AIR.
As a result a couple more struct tests pass.
Oh, and I implemented the language change to make sizeOf for pointers
always return pointer size bytes even if the element type is 0 bits.
While this is technically incorrect, proper handling of anyopaque, as well
as regular opaque, is probably best left until pointers to zero-sized types
having no bits is abolished.
This allows the inferred error set of comptime and inline invocations to be
resolved separately from the inferred error set of the runtime version or other
comptime/inline invocations.