The core of this change is to re-use the escape sequence parsing logic
for parsing both string and character literals.
The actual fix is that UTF-8 encoding was missing for string literals
with \u{...} escape sequences.
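As an illustration of the idea (not the compiler's actual code), the following Python sketch shares one `\u{...}` parser between string and character literals and UTF-8 encodes the code point in the string case. All function names here are hypothetical, and only the `\u{...}` escape is handled.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"


def encode_utf8(codepoint):
    """Encode one Unicode scalar value as UTF-8 bytes by hand."""
    if codepoint < 0 or codepoint > 0x10FFFF or 0xD800 <= codepoint <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if codepoint < 0x80:
        return bytes([codepoint])
    if codepoint < 0x800:
        return bytes([0xC0 | (codepoint >> 6), 0x80 | (codepoint & 0x3F)])
    if codepoint < 0x10000:
        return bytes([0xE0 | (codepoint >> 12),
                      0x80 | ((codepoint >> 6) & 0x3F),
                      0x80 | (codepoint & 0x3F)])
    return bytes([0xF0 | (codepoint >> 18),
                  0x80 | ((codepoint >> 12) & 0x3F),
                  0x80 | ((codepoint >> 6) & 0x3F),
                  0x80 | (codepoint & 0x3F)])


def parse_unicode_escape(src, i):
    """Shared helper: parse \\u{HEX} at src[i], return (codepoint, next index)."""
    if not src.startswith("\\u{", i):
        raise ValueError("expected \\u{")
    end = src.find("}", i + 3)
    digits = src[i + 3:end] if end != -1 else ""
    if not digits or any(c not in HEX_DIGITS for c in digits):
        raise ValueError("malformed \\u{...} escape")
    return int(digits, 16), end + 1


def unescape_string_literal(body):
    """String literal: every escape is UTF-8 encoded into the output bytes."""
    out = bytearray()
    i = 0
    while i < len(body):
        if body.startswith("\\u{", i):
            codepoint, i = parse_unicode_escape(body, i)
            out += encode_utf8(codepoint)  # the step that was missing
        else:
            out += body[i].encode("utf-8")
            i += 1
    return bytes(out)


def unescape_char_literal(body):
    """Character literal: the same helper, but the result is the code point."""
    codepoint, end = parse_unicode_escape(body, 0)
    if end != len(body):
        raise ValueError("character literal holds more than one escape")
    return codepoint
```

Both literal kinds go through `parse_unicode_escape`; they differ only in what they do with the resulting code point.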
* Sema: resolve type fully when emitting an alloc AIR instruction to
avoid tripping an assertion when checking struct field alignment.
* LLVM backend: keep a reference to the LLVM target data alive during
lowering so that we can ask LLVM what it thinks the ABI alignment
and size of LLVM types are. We need this in order to lower tuples and
structs so that we can put in extra padding bytes when Zig disagrees
with LLVM about the size or alignment of something.
* LLVM backend: make the LLVM struct type that contains the most
aligned union field and the padding packed. This prevents the struct
from being too big according to LLVM. In the future, we may want to
consider instead emitting unions in a "flat" manner, putting the tag,
most aligned union field, and padding all in the same struct field
space.
* LLVM backend: make structs with 2 or fewer fields return isByRef=false.
This results in more efficient codegen. This required lowering of
bitcast to sometimes store the struct into an alloca, ptrcast, and
then load because LLVM does not allow bitcasting structs.
* Enable more passing behavior tests.
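The padding item in the list above can be sketched roughly as follows. This is a simplified Python model, not the backend's code: `lower_struct` and its inputs are hypothetical, and the `llvm_layout` dictionary stands in for the sizes and ABI alignments that would be queried from the LLVM target data.

```python
def align_forward(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def lower_struct(fields, llvm_layout):
    """fields: [(name, zig_size, zig_align)] in declaration order.
    llvm_layout: {name: (llvm_size, llvm_align)} (assumed query results).
    Returns [(name_or_"pad", byte_count)] describing the lowered struct."""
    elements = []
    zig_off = llvm_off = 0
    zig_struct_align = llvm_struct_align = 1
    for name, zig_size, zig_align in fields:
        llvm_size, llvm_align = llvm_layout[name]
        zig_struct_align = max(zig_struct_align, zig_align)
        llvm_struct_align = max(llvm_struct_align, llvm_align)
        zig_off = align_forward(zig_off, zig_align)
        if align_forward(llvm_off, llvm_align) != zig_off:
            # Padding can only move a field forward, and the target offset
            # must still satisfy LLVM's alignment for that field.
            if zig_off < llvm_off or zig_off % llvm_align != 0:
                raise ValueError("cannot reconcile layouts for " + name)
            elements.append(("pad", zig_off - llvm_off))
        elements.append((name, llvm_size))
        llvm_off = zig_off + llvm_size
        zig_off += zig_size
    # Trailing padding when the two sides disagree about the total size.
    zig_total = align_forward(zig_off, zig_struct_align)
    if align_forward(llvm_off, llvm_struct_align) != zig_total:
        if zig_total < llvm_off:
            raise ValueError("LLVM struct is larger than the Zig struct")
        elements.append(("pad", zig_total - llvm_off))
    return elements
```

For example, if the frontend aligns a 16-byte field to 16 while the assumed LLVM layout aligns it to 8, `lower_struct([("tag", 1, 1), ("big", 16, 16)], {"tag": (1, 1), "big": (16, 8)})` places 15 explicit padding bytes between the two fields. When both sides agree, no padding elements are emitted.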
Packed structs were tripping an LLVM assertion due to calling
`LLVMConstZExt` from i16 to i16. Solved by using
`LLVMConstZExtOrBitCast` instead.
Unions were tripping an LLVM assertion due to a typo that used the
union's LLVM type, rather than the tag type, to construct an integer value.
This bug was introduced in d1a46548349a902c30057b3ba66ebad9bc25bdd2: when a
BufSet cloned its keys, it assigned the new pointers to the old
struct. Fix that by assigning the pointers to the correct, i.e. the new,
struct.
This caused a double free when using an arena allocator for the new struct,
including in the test case.
Symbols that have globals used to have their lookup key be the symbol name.
This key is now the offset into the string table.
Imports have both a module name (library name) and a name (of the symbol); those strings are now
also interned. This can save up to 24 bytes per import when both its module name and name are de-duplicated.
Module names are almost entirely the same for all imports, so we have a good chance of saving at least 12 bytes.
Just like imports, exports can have a name that differs from the internal symbol name. Rather than storing the slice,
we now store the offset of this string instead.
All symbol names read from object files, as well as those generated from Zig code,
are now interned and have their offset into the string table saved on the `Symbol` instead.
Besides interning, local symbols now also use a decl's fully qualified name.
When a decl/symbol is extern/to-be-imported, the name of the decl itself will be used for symbol resolving.
Similarly, symbols that will be exported have their 'export name' set.
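A minimal sketch of offset-based interning, written in Python for illustration only. The `StringTable` class and its method names are hypothetical and do not mirror the linker's data structures; names are assumed not to contain NUL bytes.

```python
class StringTable:
    """Stores each distinct string once and hands out byte offsets."""

    def __init__(self):
        self.buffer = bytearray()  # NUL-terminated strings, back to back
        self._offsets = {}         # encoded string -> offset into buffer

    def intern(self, text):
        """Return the offset of text, appending it only on first use."""
        key = text.encode("utf-8")
        offset = self._offsets.get(key)
        if offset is None:
            offset = len(self.buffer)
            self.buffer += key + b"\0"
            self._offsets[key] = offset
        return offset

    def get(self, offset):
        """Read back the string that starts at offset."""
        end = self.buffer.index(0, offset)
        return self.buffer[offset:end].decode("utf-8")


table = StringTable()
# Two imports from the same module share a single copy of the module name.
imports = [
    (table.intern("env"), table.intern("memcpy")),
    (table.intern("env"), table.intern("memset")),
]
```

An import then carries two integer offsets instead of two slices, and repeated module names such as `env` cost nothing after their first occurrence.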
This is preliminary work for string interning in the wasm linker.
Using an arena would defeat the purpose of de-duplicating strings as we wouldn't be able to free memory
of duplicated strings.
This change also means we can simplify wasm binary parsing by creating a general purpose parser that
parses the binary into its sections, but untyped. Doing this allows us to re-use that base for
object file parsing as well as debug info parsing.
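A rough Python sketch of such an untyped section parser. The function names are hypothetical; the framing it relies on (the `\0asm` magic, a 4-byte little-endian version, then sections made of an id byte, a LEB128 size, and a payload) follows the WebAssembly binary format.

```python
import io

WASM_MAGIC = b"\0asm"


def read_uleb128(stream):
    """Decode one unsigned LEB128 integer from a binary stream."""
    result = 0
    shift = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("unexpected end of input inside LEB128 value")
        byte = chunk[0]
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result
        shift += 7


def parse_sections(data):
    """Split a wasm binary into (section_id, raw_payload) pairs.

    Payloads are left undecoded so that typed parsers (object file,
    debug info, ...) can be layered on top of the same base."""
    stream = io.BytesIO(data)
    if stream.read(4) != WASM_MAGIC:
        raise ValueError("missing wasm magic")
    version = int.from_bytes(stream.read(4), "little")
    sections = []
    while True:
        id_byte = stream.read(1)
        if not id_byte:
            break  # clean end of file
        size = read_uleb128(stream)
        payload = stream.read(size)
        if len(payload) != size:
            raise ValueError("truncated section")
        sections.append((id_byte[0], payload))
    return version, sections
```

A consumer that cares only about custom sections (id 0) can filter the returned list without knowing how any other section is encoded.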
* Fix compile error for `zirErrorUnionType`.
* Convert zirMergeErrorSets logic to call `Type.errorSetMerge`.
It does not need to create a Decl as the TODO comment hinted.
* Extract out a function called `resolveInferredErrorSetTy`.
* Rework `resolvePeerTypes` with respect to error unions and
error sets. This is a less complex implementation that passes all the
same tests and uses many fewer lines of code by taking advantage of
the function `coerceInMemoryAllowedErrorSets`.
  - Always merge error sets in the order that makes sense, even when
    that means `@typeInfo` incompatibility with stage1.
* `Type.errorSetMerge` no longer overallocates.
* Don't skip passing tests.
Similar to how Type.eql was reworked in the previous commit, this commit
reworks Type.hash to check all the different kinds of tags that a Type
can be represented with. It also completes the implementation for all
types except error sets, which need to have Type.eql enhanced as well.
Several issues with pointer types are fixed:
* Prior to this commit, Zig would not canonicalize a pointer type with
an explicit alignment to alignment=0 if it matched the pointee ABI
alignment. In order to fix this, `Type.ptr` now takes a Target
parameter. I also moved the host_size canonicalization to `Type.ptr`
since target is now available. Similarly, is_allowzero in the case of
C pointers is now treated as a canonicalization done by the function
rather than a precondition.
* In-memory coercion for pointers now properly checks ABI alignment
of pointee types instead of incorrectly treating the 0 value as an
alignment.
Type equality is completely reworked based on the tag() rather than the
zigTypeTag(). It's still semantically based on zigTypeTag() but that
knowledge is implied rather than dictating the control flow of the
logic. Importantly, this fixes cases for opaques, structs, tuples,
enums, and unions, where type equality was incorrectly returning a
result based solely on whether the tag() values were equal.
Additionally, pointer type equality now takes into account alignment.
Because we canonicalize non-zero alignment which equals pointee type ABI
alignment to alignment=0, this can now be a simple integer comparison.
Type hashing is implemented for pointers and floats. Array types now
additionally hash their sentinels.
This regressed some behavior tests that were passing but only because
of bugs regarding type equality.
The C backend has a noticeable problem with lowering differently-aligned
pointers (particularly slices) as the same type, causing C compilation
errors due to duplicate declarations.