Instead of explicitly creating a `Module.Decl` object for each anonymous
declaration, each `InternPool.Index` value is implicitly understood to
be an anonymous declaration when encountered by backend codegen.
The memory management strategy for these anonymous decls then becomes to
garbage collect them along with standard InternPool garbage.
In the interest of a smooth transition, this commit only implements this
new scheme for string literals and leaves all the previous mechanisms in
place.
My previous change for reading / writing to unions at comptime did not handle
union field read/writes correctly in all cases. Previously, if a field was
written to a union, it would overwrite the entire value. This is problematic
when a field of a larger size is subsequently read, because the value would not
be long enough, causing a panic.
Additionally, the writing behaviour itself was incorrect. Writing to a field of
a packed or extern union should only overwrite the bits corresponding to that
field, allowing for memory reintepretation via field writes / reads.
I addressed these problems as follows:
Add the concept of a "backing type" for extern / packed unions
(`Type.unionBackingType`). For extern unions, this is a `u8` array, for packed
unions it's an integer matching the `bitSize` of the union. Whenever union
memory is read at comptime, it's read as this type.
When union memory is written at comptime, the tag may still be known. If so, the
memory is written using the tagged type. If the tag is unknown (because this
union had previously been read from memory), it's simply written back out as the
backing type.
I added `write_packed` to the `reinterpret` field of
`ComptimePtrMutationKit`. This causes writes of the operand to be packed - which
is necessary when writing to a field of a packed union. Without this, writing a
value to a `u1` field would overwrite the entire byte it occupied.
The final case to address was reading a different (potentially larger) field
from a union when it was written with a known tag. To handle this, a new kind of
bitcast was introduced (`bitCastUnionFieldVal`) which supports reading a larger
field by using a backing buffer that has the unwritten bits set to
undefined. The reason to support this (vs always just writing the union as it's
backing type), is that no reads to larger fields ever occur at comptime, it
would be strictly worse to have spent time writing the full backing type.
This reverts commit 21780899eb17a0cb795ff40e5fae6556c38ea13e.
After this commit, a version of the compiler which supports the new
`@abs` builtin is required.
When the tag is not known, it's set to `.none`. In this case, the value is either an
array of bytes (for extern unions) or an integer (for packed unions).
This changeset fixes the handling of alignment in several places. The
new rules are:
* `@alignOf(T)` where `T` is a runtime zero-bit type is at least 1,
maybe greater.
* Zero-bit fields in `extern` structs *do* force alignment, potentially
offsetting following fields.
* Zero-bit fields *do* have addresses within structs which can be
observed and are consistent with `@offsetOf`.
These are not necessarily all implemented correctly yet (see disabled
test), but this commit fixes all regressions compared to master, and
makes one new test pass.
All of the logic in `Value.elemValue` is quite questionable, but
printing an error is definitely better than crashing. Notably, this
should stop us from hitting crashes when dumping AIR.
Structs were previously using `SegmentedList` to be given indexes, but
were not actually backed by the InternPool arrays.
After this, the only remaining uses of `SegmentedList` in the compiler
are `Module.Decl` and `Module.Namespace`. Once those last two are
migrated to become backed by InternPool arrays as well, we can introduce
state serialization via writing these arrays to disk all at once.
Unfortunately there are a lot of source code locations that touch the
struct type API, so this commit is still work-in-progress. Once I get it
compiling and passing the test suite, I can provide some interesting
data points such as how it affected the InternPool memory size and
performance comparison against master branch.
I also couldn't resist migrating over a bunch of alignment API over to
use the log2 Alignment type rather than a mismash of u32 and u64 byte
units with 0 meaning something implicitly different and special at every
location. Turns out you can do all the math you need directly on the
log2 representation of alignments.
Instead of using actual slices for InternPool.Key.AnonStructType, this
commit changes to use Slice types instead, which store a
long-lived index rather than a pointer.
This is a follow-up to 7ef1eb1c27754cb0349fdc10db1f02ff2dddd99b.
There are a couple concepts here worth understanding:
Key.UnionType - This type is available *before* resolving the union's
fields. The enum tag type, number of fields, and field names, field
types, and field alignments are not available with this.
InternPool.UnionType - This one can be obtained from the above type with
`InternPool.loadUnionType` which asserts that the union's enum tag type
has been resolved. This one has all the information available.
Additionally:
* ZIR: Turn an unused bit into `any_aligned_fields` flag to help
semantic analysis know whether a union has explicit alignment on any
fields (usually not).
* Sema: delete `resolveTypeRequiresComptime` which had the same type
signature and near-duplicate logic to `typeRequiresComptime`.
- Make opaque types not report comptime-only (this was inconsistent
between the two implementations of this function).
* Implement accepted proposal #12556 which is a breaking change.
The key changes in this commit are:
```diff
- names: []const NullTerminatedString,
+ names: NullTerminatedString.Slice,
- values: []const Index,
+ values: Index.Slice,
```
Which eliminates the slices from `InternPool.Key.EnumType` and replaces
them with structs that contain `start` and `len` indexes. This makes the
lifetime of `EnumType` change from expiring with updates to InternPool,
to expiring when the InternPool is garbage-collected, which is currently
never.
This is gearing up for a larger change I started working on locally
which moves union types into InternPool.
As a bonus, I fixed some unnecessary instances of `@as`.
Fixes regression introduced in
db33ee45b7261c9ec62a1087cfc9377bc4e7aa8f.
Found from compiling bun. Unfortunately we do not have a behavior test
reduction for this bug.
* getOwnedFunctionIndex no longer checks if the value is actually a
function.
* The callsites to `intern` that I added want to avoid the `getCoerced`
call, so I added `intern2`.
* Adding to inferred error sets should not happen if the destination
error set is not the inferred error set of the current Sema instance.
* adhoc_inferred_error_set_type can be seen by the backend. Treat it
like anyerror.
Abridged summary:
* Move `Module.Fn` into `InternPool`.
* Delete a lot of confusing and problematic `Sema` logic related to
generic function calls.
This commit removes `Module.Fn` and replaces it with two new
`InternPool.Tag` values:
* `func_decl` - corresponding to a function declared in the source
code. This one contains line/column numbers, zir_body_inst, etc.
* `func_instance` - one for each monomorphization of a generic
function. Contains a reference to the `func_decl` from whence the
instantiation came, along with the `comptime` parameter values (or
types in the case of `anytype`)
Since `InternPool` provides deduplication on these values, these fields
are now deleted from `Module`:
* `monomorphed_func_keys`
* `monomorphed_funcs`
* `align_stack_fns`
Instead of these, Sema logic for generic function instantiation now
unconditionally evaluates the function prototype expression for every
generic callsite. This is technically required in order for type
coercions to work. The previous code had some dubious, probably wrong
hacks to make things work, such as `hashUncoerced`. I'm not 100% sure
how we were able to eliminate that function and still pass all the
behavior tests, but I'm pretty sure things were still broken without
doing type coercion for every generic function call argument.
After the function prototype is evaluated, it produces a deduplicated
`func_instance` `InternPool.Index` which can then be used for the
generic function call.
Some other nice things made by this simplification are the removal of
`comptime_args_fn_inst` and `preallocated_new_func` from `Sema`, and the
messy logic associated with them.
I have not yet been able to measure the perf of this against master
branch. On one hand, it reduces memory usage and pointer chasing of the
most heavily used `InternPool` Tag - function bodies - but on the other
hand, it does evaluate function prototype expressions more than before.
We will soon find out.
All of the std except these few functions call it "eql" instead of "eq".
This has previously tripped me up when I expected the equality check function to be called "eql"
(just like all the rest of the std) instead of "eq".
The motivation is consistency.
If search "eq" on Autodoc, these functions stick out and it looks inconsistent.
I just noticed there are also a few functions spelling it out as "equal" (such as std.mem.allEqual).
Maybe those functions should also spell it "eql" but that can be done in a future PR.
Most of this migration was performed automatically with `zig fmt`. There
were a few exceptions which I had to manually fix:
* `@alignCast` and `@addrSpaceCast` cannot be automatically rewritten
* `@truncate`'s fixup is incorrect for vectors
* Test cases are not formatted, and their error locations change
I achieved this through a major refactor of the logic of analyzeMinMax.
This change should be compatible with vectors of comptime_int, which
Andrew said are supposed to work (but which currently do not).
These are frequently invalidated whenever a string is interned, so avoid
creating pointers to `string_bytes` wherever possible. This is an
attempt to fix random CI failures.
Previously, these checks worked by performing the arithmetic operation,
then checking whether the result fit in the type in question. Since all
values are now typed, this approach was no longer valid, and was
tripping some assertions due to trying to store too-large values in
smaller types.
Now, `intAdd`, `intSub`, `intMul` and `intDiv` all check for overflow,
and if it happens, re-do the operation with the result being a
`comptime_int`, and reporting the error (and vector index) to the caller
so that the error can be reported.
After this change, all test cases are passing.
In an effort to delete `Value.hashUncoerced`, generic instantiation has
been redesigned. Instead of just storing instantiations in
`monomorphed_funcs`, partially instantiated generic argument types are
also cached. This isn't quite the single `getOrPut` that it used to be,
but one `get` per generic argument plus one get for the instantiation,
with an equal number of `put`s per unique instantiation isn't bad.
The main motivation for this commit is eliminating Decl.value_arena.
Everything else is dominoes.
Decl.name used to be stored in the GPA, now it is stored in InternPool.
It ended up being simpler to migrate other strings to be interned as
well, such as struct field names, union field names, and a few others.
This ended up requiring a big diff, sorry about that. But the changes
are pretty nice, we finally start to take advantage of InternPool's
existence.
global_error_set and error_name_list are simplified. Now it is a single
ArrayHashMap(NullTerminatedString, void) and the index is the error tag
value.
Module.tmp_hack_arena is re-introduced (it was removed in
eeff407941560ce8eb5b737b2436dfa93cfd3a0c) in order to deal with
comptime_args, optimized_order, and struct and union fields. After
structs and unions get moved into InternPool properly, tmp_hack_arena
can be deleted again.
This is neither a type nor a value. Simplifies `addStrLit` as well as
the many places that switch on `InternPool.Key`.
This is a partial revert of bec29b9e498e08202679aa29a45dab2a06a69a1e.
This is a bit odd, because this value doesn't actually exist:
see #15909. This gets all the empty enum/union behavior tests passing.
Also adds an assertion to `Sema.analyzeBodyInner` which would have
helped figure out the issue here much more quickly.
Key.PtrType is now an extern struct so that hashing it can be done by
reinterpreting bytes directly. It also uses the same representation for
type_pointer Tag encoding and the Key. Accessing pointer attributes now
requires packed struct access, however, many operations are now a copy
of a u32 rather than several independent fields.
This function moves the top two most used Key variants - pointer types
and pointer values - to use a single-shot hash function that branches
for small keys instead of calling memcpy.
As a result, perf against merge-base went from 1.17x ± 0.04 slower to
1.12x ± 0.04 slower. After the pointer value hashing was changed, total
CPU instructions spent in memcpy went from 4.40% to 4.08%, and after
additionally improving pointer type hashing, it further decreased to
3.72%.