This improves the ABI alignment resolution code.
This commit fully enables the MachO linker code in stage3. Note,
however, that there are still miscompilations in stage3.
Prior to this commit, the logic for ABI size and ABI alignment of
integers was naive and incorrect. This resulted in wasted memory as
well as undefined behavior in the LLVM backend, where we would memset
an incorrect number of bytes to 0xaa due to disagreeing with LLVM
about the ABI size of integers.
This commit introduces a "max int align" value which is different per
Target. This value is used to derive the ABI size and alignment of all
integers.
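One plausible way to derive them, as a sketch (illustrative code only;
these are not the compiler's actual functions): an integer's ABI
alignment is its byte size rounded up to a power of two and capped at
the target's max int align, and its ABI size is the byte size rounded
up to that alignment.
```zig
// Illustrative sketch only; names and signatures are made up.
fn intAbiAlignment(bit_count: u32, max_int_align: u32) u32 {
    const byte_count = (bit_count + 7) / 8;
    var alignment: u32 = 1;
    while (alignment < byte_count and alignment < max_int_align) : (alignment *= 2) {}
    return alignment;
}

fn intAbiSize(bit_count: u32, max_int_align: u32) u32 {
    const alignment = intAbiAlignment(bit_count, max_int_align);
    const byte_count = (bit_count + 7) / 8;
    // Round the byte size up to the ABI alignment.
    return (byte_count + alignment - 1) / alignment * alignment;
}
```
With a max int align of 8 bytes, as this commit uses for x86_64-linux,
`intAbiAlignment(128, 8)` is 8 and `intAbiSize(128, 8)` is 16.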
This commit makes an interesting change from stage1, which treats
128-bit integers as 16-byte aligned for x86_64-linux. stage1 is
incorrect: the maximum integer alignment on this system is only 8
bytes.
This change breaks the behavior test called "128-bit cmpxchg" because
on that target, 128-bit cmpxchg does require a 16-byte aligned pointer
to a 128-bit integer. However, this alignment property does not belong
on *all* 128-bit integers - only on the pointer type in the `@cmpxchg`
builtin function prototype. The user can then use an alignment
override annotation on a 128-bit integer variable or struct field to
obtain such a pointer.
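A minimal sketch of that workaround (the variable and function here
are made up for illustration):
```zig
// The 16-byte alignment is requested on this particular variable via the
// alignment override, rather than being a property of every u128.
var shared: u128 align(16) = 0;

fn tryUpdate(expected: u128, new_value: u128) ?u128 {
    // `&shared` is a `*align(16) u128`, which satisfies the alignment that
    // the 128-bit cmpxchg requires on x86_64-linux.
    return @cmpxchgStrong(u128, &shared, expected, new_value, .SeqCst, .SeqCst);
}
```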
Rather than allocating Decl objects with an Allocator, we now allocate
them from a SegmentedList. This provides four advantages:
* Stable memory so that one thread can access a Decl object while another
thread allocates additional Decl objects from this list.
* It allows us to use u32 indexes to reference Decl objects rather than
pointers, saving memory in Type, Value, and dependency sets.
* Using integers to reference Decl objects rather than pointers makes
serialization trivial.
* It provides a unique integer to be used for anonymous symbol names,
avoiding multi-threaded contention on an atomic counter.
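The pattern looks roughly like this (an illustrative sketch; the
compiler's real declarations differ):
```zig
const std = @import("std");

const Decl = struct { name: []const u8 };
const DeclIndex = u32;

// SegmentedList never moves existing elements when it grows, so pointers to
// Decls stay valid and a u32 index is enough to refer to one.
var decls: std.SegmentedList(Decl, 0) = .{};

fn allocateDecl(gpa: std.mem.Allocator, decl: Decl) !DeclIndex {
    const index = @intCast(DeclIndex, decls.len);
    try decls.append(gpa, decl);
    return index;
}

fn declPtr(index: DeclIndex) *Decl {
    return decls.at(index);
}
```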
According to Apple docs, the long double type is a double-precision
IEEE 754 binary floating-point type, which makes it identical to the
double type. This contrasts with the standard specification, in which
long double is a quad-precision IEEE 754 binary floating-point type.
Thus, we need to take this into account when using the compiler
intrinsics so that we select the correct function version for
FloatMulAdd.
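The selection amounts to something like the following sketch (a
hypothetical helper, not the compiler's actual code):
```zig
// Pick the fused multiply-add symbol for C's long double based on what
// long double actually is on the target.
fn fmaSymbolForLongDouble(long_double_bits: u16) []const u8 {
    return switch (long_double_bits) {
        64 => "fma", // Apple platforms: long double is the same as double.
        80, 128 => "fmal",
        else => unreachable, // illustrative sketch; other widths not handled
    };
}
```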
* The `@bitCast` workaround is removed in favor of `@ptrCast` properly
doing element casting for slice element types. This required an
enhancement to both stage1 and stage2.
* stage1 incorrectly accepts `.{}` instead of `{}`. stage2 code that
abused this is fixed.
* Make some parameters comptime to support functions in switch
expressions (as opposed to making them function pointers); see the
sketch after this list.
* Avoid relying on local temporaries being mutable.
* Workarounds for when stage1 and stage2 disagree on function pointer
types.
* Work around a recursive formatting bug with a `@panic("TODO")`.
* Remove unreachable `else` prongs for some inferred error sets.
All in an effort towards #89.
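For the comptime-parameter item above, the workaround has roughly this
shape (a guess at the pattern, with made-up names):
```zig
const Mode = enum { double, square };

fn double(x: u32) u32 {
    return x * 2;
}

fn square(x: u32) u32 {
    return x * x;
}

// Because `mode` is comptime, the switch selects between function bodies at
// compile time; otherwise the prongs would have to be function pointers.
fn apply(comptime mode: Mode, x: u32) u32 {
    const f = comptime switch (mode) {
        .double => double,
        .square => square,
    };
    return f(x);
}
```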
* AstGen: restore the param_type ZIR instruction and pass it to the
expression for function call arguments. This does not solve the
problem for generic function parameters, but it catches stage2 up to
stage1 which also does not solve the problem for generic function
parameters.
- Most of the enhancements in this commit will still be needed for a
more sophisticated further improvement to handle generic function
types.
- In Sema, handling of `as` coercion recognizes the `var_args_param`
Type Tag and passes the operand through without doing any coercion.
- That was the last ZIR tag and we are now using all 256 ZIR tags.
* AstGen: array init and struct init expressions use the anon form even
when the result location has a type. This prevents the type system
from incorrectly believing, for example, that a tuple is actually an
array when the result location is the param_type of a function with an
`anytype` parameter (see the example after this list).
* Sema: add missing coercion in `unionInit` to coerce the init to the
corresponding union field type.
* `Value.fieldValue` now takes a type and does not take an allocator.
Closes #11293.
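The `anytype` case mentioned above looks like this (an illustrative
test):
```zig
const std = @import("std");

fn isTuple(arg: anytype) bool {
    const info = @typeInfo(@TypeOf(arg));
    return info == .Struct and info.Struct.is_tuple;
}

test "anonymous init passed to anytype stays a tuple" {
    // With the anon form, the literal reaches the `anytype` parameter as a
    // tuple instead of being coerced to `[3]comptime_int`.
    try std.testing.expect(isTuple(.{ 1, 2, 3 }));
}
```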
After this commit, stage2 passes all the parser tests.
This commit adds a new optional argument to several Value methods,
which provides the ability to resolve types when needed. This avoids
duplicating the same logic in both Sema and Value.
With this commit, the "struct contains slice of itself" test is passing
by exploiting the new lazy_align Value Tag.
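The test case in question has roughly this shape; computing the
alignment of `Node` would otherwise require resolving `Node` itself,
which is what the lazy alignment value avoids:
```zig
const Node = struct {
    payload: u32,
    // The slice refers back to the containing struct type.
    children: []const Node,
};
```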
* make it always return a fully qualified name. stage1 is inconsistent
about this.
* AstGen: fix anon_name_strategy to correctly be `func` when anon type
creation happens in the operand of the return expression (see the
example below).
* Sema: implement type names for the "function" naming strategy.
* Put "enum", "union", "opaque", or "struct" in place of "anon" when
creating respective anonymous Decl names.
* std.testing: add `expectStringStartsWith`. Didn't end up using it
after all.
This also enables the real test runner for the stage2 LLVM backend
(sans wasm32), since it works now.
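To illustrate the naming-strategy items above (an illustrative test;
the exact rendered name may differ):
```zig
const std = @import("std");

fn List(comptime T: type) type {
    return struct { items: []T };
}

test "anon type in a return expression is named after the function" {
    // With the "function" naming strategy, the type's name is derived from
    // the call, e.g. something like "List(u8)", rather than an anonymous
    // placeholder name.
    const name = @typeName(List(u8));
    try std.testing.expect(std.mem.indexOf(u8, name, "List") != null);
}
```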
* don't store `has_well_defined_layout` in memory.
* remove struct `hasWellDefinedLayout` logic. It's just
`layout != .Auto`. This means we only need one implementation, in
Type.
* fix some cases that were wrong in `hasWellDefinedLayout`, such as
optional pointers.
* move the `tag_ty_inferred` field into a position that makes it more
obvious how the struct layout will be done. Also, we don't have a
compiler that intelligently reorders fields, so this layout is better.
* Sema: don't `resolveTypeLayout` in `zirCoerceResultPtr` unless
necessary.
* Rename the `ComptimePtrLoadKit` field `target` to `pointee`, to
avoid confusion with other uses of the name `target`.
The core change here is that we no longer blindly trust that parent
pointers (.elem_ptr, .field_ptr, .eu_payload_ptr, .union_payload_ptr)
were derived from the "true" type of the underlying decl. When types
diverge, direct dereference fails and we are forced to bitcast, as
usual.
In order to maximize our chances to have a successful bitcast, this
includes several changes to the dereference procedure:
- `root` is now `parent` and is the largest Value containing the
dereference target, with the condition that its layout and the
byte offset of the target within are both well-defined.
- If the target cannot be dereferenced directly, because the
pointers were not derived from the true type of the underlying
decl, then it is returned as null.
- `beginComptimePtrDeref` now accepts an optional array_ty param,
which is used to directly dereference an array from an elem_ptr,
if necessary. This allows us to dereference array types without
well-defined layouts (e.g. `[N]?u8`) at an offset.
The load_ty also allows us to correctly "over-read" an .elem_ptr to an
array of [N]T, if necessary. This makes direct dereference work for
array types even in the presence of an offset, which is necessary if
the array has no well-defined layout (e.g. loading from `[6]?u8`).
This follows the same strategy as sema.typeRequiresComptime() and
type.comptimeOnly(): two versions of the function, one which performs
resolution just-in-time and another which asserts that resolution is
complete.
Thankfully, this doesn't cause very viral type resolution, since
auto-layout structs and unions are very common and are known, without
resolving their fields, not to have a well-defined layout.
Introduce `Module.ensureFuncBodyAnalyzed` and corresponding `Sema`
function. This mirrors `ensureDeclAnalyzed` except also waits until the
function body has been semantically analyzed, meaning that inferred
error sets will have been populated.
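For example, a function like the following (illustrative) has an
inferred error set, which is only known once its body has been
analyzed:
```zig
const std = @import("std");

fn removeScratchFile(dir: std.fs.Dir) !void {
    // The `!void` return type infers its error set from whatever the body
    // can return, here the error set of `deleteFile`.
    try dir.deleteFile("scratch.tmp");
}
```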
Resolving error sets can now emit an "unable to resolve inferred error
set" error instead of producing an incorrect error set type. Resolving
error sets now calls `ensureFuncBodyAnalyzed`. Closes #11046.
`coerceInMemoryAllowedErrorSets` now does a lot more work to avoid
resolving an inferred error set if possible. Same with
`wrapErrorUnionSet`.
Inferred error set types no longer check the `func` field to determine if
they are equal. That was incorrect because an inline or comptime function
call produces a unique error set which has the same `*Module.Fn` value for
this field. Instead we use the `*Module.Fn.InferredErrorSet` pointers to
test equality of inferred error sets.
* use the real start code for the LLVM backend with x86_64-linux
- there is still a check for zig_backend after initializing the TLS
area to skip some stuff.
* introduce new AIR instructions and implement them for the LLVM
backend. They are the same as `call` except with a modifier (see the
example after this list).
- call_always_tail
- call_never_tail
- call_never_inline
* LLVM backend calls hasRuntimeBitsIgnoringComptime in more places to
avoid unnecessarily depending on comptimeOnly being resolved for some
types.
* LLVM backend: remove duplicate code for setting linkage and value
name. The canonical place for this is in `updateDeclExports`.
* LLVM backend: do some assembly template massaging to make `%%`
render as `%`. More hacks will be needed to make inline assembly
catch up with stage1.
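The new call instructions mentioned above correspond to `@call` with
an explicit modifier, for example (illustrative):
```zig
fn countDown(n: u32) u32 {
    if (n == 0) return 0;
    // Lowered via the new call_always_tail instruction in the LLVM backend.
    return @call(.{ .modifier = .always_tail }, countDown, .{n - 1});
}
```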
* Reduce branching in Type.eql and Type.hash for error sets.
* `Type.eql` uses element-wise byte comparison since it can rely on
the error sets being pre-sorted.
* Avoid unnecessarily skipping tests that are passing.
This implements type equality for error sets. This is done
through element-wise error set comparison.
Inferred error sets are always distinct types and other error sets are
always sorted. See #11022.
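Concretely, the element-wise comparison means (illustrative):
```zig
const std = @import("std");

const ReadError = error{ AccessDenied, NotFound };
const OpenError = error{ NotFound, AccessDenied };

comptime {
    // Because explicit error sets are sorted and compared element-wise,
    // these two declarations name the same type even though the errors are
    // written in a different order.
    std.debug.assert(ReadError == OpenError);
}
```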
This required adjusting `Type.nameAlloc` to be used with a
general-purpose allocator and adding `Type.nameAllocArena` for the
arena use case (avoids allocation sometimes).
* mul_add AIR instruction: use `pl_op` instead of `ty_pl`. The type is
always the same as the operand; no need to waste bytes redundantly
storing the type.
* AstGen: use coerced_ty for all the operands except for the one we
use to communicate the type.
* Sema: use the correct source location for requireRuntimeBlock in
handling of `@mulAdd`.
* native backends: handle liveness even for the functions that are
TODO.
* C backend: implement `@mulAdd`. It lowers to libc calls.
* LLVM backend: make `@mulAdd` handle all float types.
- improved fptrunc and fpext to handle f80 with compiler-rt calls.
* Value.mulAdd: handle all float types and use the `@mulAdd` builtin.
* behavior tests: revert the changes to testing `@mulAdd`. These
changes broke the test coverage, making it only tested at
compile-time.
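For reference, here is the builtin exercised at runtime (an
illustrative test):
```zig
const std = @import("std");

test "@mulAdd at runtime" {
    var a: f32 = 2.0;
    // A single fused operation computing (a * 3.0) + 1.0 with one rounding.
    const r = @mulAdd(f32, a, 3.0, 1.0);
    try std.testing.expectEqual(@as(f32, 7.0), r);
}
```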
Improved f80 support:
* std.math.fma handles f80
* move fma functions from freestanding libc to compiler-rt
- add __fmax and fmal
- make __fmax and fmaq only exported when they don't alias fmal.
- make their linkage weak just like the rest of compiler-rt symbols.
* removed `longDoubleIsF128` and replaced it with `longDoubleIs` which
takes a type as a parameter. The implementation is now more accurate
and handles more targets. Similarly, the stage2 function
CTypes.sizeInBits is now more accurate for long double on more targets.
* AIR: use pl_op instead of ty_pl for wasm_memory_size. No need to
store the type because the type is always `u32`.
* AstGen: use coerced_ty for `@wasmMemorySize` and `@wasmMemoryGrow`
and do the coercions in Sema.
* Sema: use more accurate source locations for errors.
* Provide more information in the compiler error message.
* Codegen: use liveness data to avoid lowering unused
`@wasmMemorySize`.
* LLVM backend: add implementation
- I wasn't able to test it because we are hitting a linker error for
`-target wasm32-wasi -fLLVM`.
* C backend: use `zig_unimplemented()` instead of silently doing the
wrong thing for these builtins.
* behavior tests: branch only on stage2_arch for inclusion of the
wasm.zig file. We would like to change it to `builtin.cpu.arch`, but
that is causing a compiler crash on some backends.
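For reference, the builtins in question look like this in user code
(wasm targets only; an illustrative sketch):
```zig
// Wasm memory is measured in 64 KiB pages; index 0 is the default memory.
fn currentPageCount() u32 {
    return @wasmMemorySize(0);
}

fn growByOnePage() i32 {
    // Returns the previous page count, or -1 if the grow failed.
    return @wasmMemoryGrow(0, 1);
}
```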