Instead of explicitly creating a `Module.Decl` object for each anonymous
declaration, each `InternPool.Index` value is implicitly understood to
be an anonymous declaration when encountered by backend codegen.
The memory management strategy for these anonymous decls then becomes to
garbage collect them along with standard InternPool garbage.
In the interest of a smooth transition, this commit only implements this
new scheme for string literals and leaves all the previous mechanisms in
place.
Instead of linear search every time a packed struct field's bit or byte
offset is wanted, they are computed once during resolution of the packed
struct's backing int type, and stored in InternPool for O(1) lookup.
Closes#17178
When the tag is not known, it's set to `.none`. In this case, the value is either an
array of bytes (for extern unions) or an integer (for packed unions).
When struct types have no field names, the names are implicitly
understood to be strings corresponding to the field indexes in
declaration order. It used to be the case that a NullTerminatedString
would be stored for each field in this case, however, now, callers must
handle the possibility that there are no names stored at all. This
commit introduces `legacyStructFieldName`, a function to fake the
previous behavior. Probably something better could be done by reworking
all the callsites of this function.
This also modifies AstGen so that struct types use 1 bit each from the
flags to communicate if there are nonzero inits, alignments, or comptime
fields. This allows adding a struct type to the InternPool without
looking ahead in memory to find out the answers to these questions,
which is easier for CPUs as well as for me, coding this logic right now.
Structs were previously using `SegmentedList` to be given indexes, but
were not actually backed by the InternPool arrays.
After this, the only remaining uses of `SegmentedList` in the compiler
are `Module.Decl` and `Module.Namespace`. Once those last two are
migrated to become backed by InternPool arrays as well, we can introduce
state serialization via writing these arrays to disk all at once.
Unfortunately there are a lot of source code locations that touch the
struct type API, so this commit is still work-in-progress. Once I get it
compiling and passing the test suite, I can provide some interesting
data points such as how it affected the InternPool memory size and
performance comparison against master branch.
I also couldn't resist migrating over a bunch of alignment API over to
use the log2 Alignment type rather than a mismash of u32 and u64 byte
units with 0 meaning something implicitly different and special at every
location. Turns out you can do all the math you need directly on the
log2 representation of alignments.
Instead of using actual slices for InternPool.Key.AnonStructType, this
commit changes to use Slice types instead, which store a
long-lived index rather than a pointer.
This is a follow-up to 7ef1eb1c27754cb0349fdc10db1f02ff2dddd99b.
There are a couple concepts here worth understanding:
Key.UnionType - This type is available *before* resolving the union's
fields. The enum tag type, number of fields, and field names, field
types, and field alignments are not available with this.
InternPool.UnionType - This one can be obtained from the above type with
`InternPool.loadUnionType` which asserts that the union's enum tag type
has been resolved. This one has all the information available.
Additionally:
* ZIR: Turn an unused bit into `any_aligned_fields` flag to help
semantic analysis know whether a union has explicit alignment on any
fields (usually not).
* Sema: delete `resolveTypeRequiresComptime` which had the same type
signature and near-duplicate logic to `typeRequiresComptime`.
- Make opaque types not report comptime-only (this was inconsistent
between the two implementations of this function).
* Implement accepted proposal #12556 which is a breaking change.
The key changes in this commit are:
```diff
- names: []const NullTerminatedString,
+ names: NullTerminatedString.Slice,
- values: []const Index,
+ values: Index.Slice,
```
Which eliminates the slices from `InternPool.Key.EnumType` and replaces
them with structs that contain `start` and `len` indexes. This makes the
lifetime of `EnumType` change from expiring with updates to InternPool,
to expiring when the InternPool is garbage-collected, which is currently
never.
This is gearing up for a larger change I started working on locally
which moves union types into InternPool.
As a bonus, I fixed some unnecessary instances of `@as`.
Some builtin types have a special InternPool index (e.g.
`.type_info_type`) so that AstGen can refer to them before semantic
analysis. Unfortunately, this previously led to a second index existing
to refer to the type once it was resolved, complicating Sema by having
the concept of an "unresolved" type index.
This change makes Sema modify these InternPool indices in-place to
contain the expanded representation when resolved. The analysis of the
corresponding decls is caught in `Module.semaDecl`, and a field is set
on Sema telling it which index to place struct/union/enum types at. This
system could break if `std.builtin` contained complex decls which
evaluate multiple struct types, but this will be caught by the
assertions in `InternPool.resolveBuiltinType`.
The AstGen result types which were disabled in 6917a8c have been
re-enabled.
Resolves: #16603
Since the same Key.Func data structure is used for coerced function
bodies as well as uncoerced function bodies, there is danger of them
being hashed and equality-checked as the same. When that happens, the
type of a function body value might be wrong, causing lots of problems.
In this instance, it causes an assertion failure.
This commit fixes it by introducing an `uncoerced_ty` field which
is different than `ty` in the case of `func_coerced` and is used to
differentiate when doing hashing and equality checking.
I have a new behavior test to cover this bug, but it revealed *another*
bug at the same time, so I will fix it in the next commit and and add
the new test therein.
* Introduce InternPool.Tag.func_coerced to handle the case of a
function body coerced to a new type. `InternPool.getCoerced` is now
implemented for function bodies in this branch.
* implement resolution of ad-hoc inferred error sets in
`Sema.analyzeCall`.
* fix generic_owner being set wrong for child Sema bodies of param
expressions.
* fix `Sema.resolveInferredErrorSetTy` when passed `anyerror`.
Previously, they shared function index with the owner decl, but that
would clobber the data stored for inferred error sets of runtime calls.
Now there is an adhoc_inferred_error_set_type which models the problem
much more correctly.
Before, it incorrectly passed an InternPool.Index where an extra array
index was expected (to the function which is renamed to `extraErrorSet`
in this commit).
There is one case where function types may be inequal but we still want
to find the same function body instance in InternPool.
In the case of the functions having an inferred error set, the key used
to find an existing function body will necessarily have a unique
inferred error set type, because it refers to the function body
InternPool Index. To make this case work we omit the inferred error set
from the equality and hashing functions.
* move inferred error sets into InternPool.
- they are now represented by pointing directly at the corresponding
function body value.
* inferred error set working memory is now in Sema and expires after
the Sema for the function corresponding to the inferred error set is
finished having its body analyzed.
* error sets use a InternPool.Index.Slice rather than an actual slice
to avoid lifetime issues.
The idea here is to move towards a future where anonymous decls are
represented entirely by an `InternPool.Index`. This was needed to start
implementing `InternPool.getFuncDecl` which requires moving creation and
deletion of Decl objects into InternPool.
* remove `Namespace.anon_decls`
* remove the concept of cleaning up resources from anonymous decls,
relying on InternPool instead.
* move namespace and decl object allocation into InternPool
Abridged summary:
* Move `Module.Fn` into `InternPool`.
* Delete a lot of confusing and problematic `Sema` logic related to
generic function calls.
This commit removes `Module.Fn` and replaces it with two new
`InternPool.Tag` values:
* `func_decl` - corresponding to a function declared in the source
code. This one contains line/column numbers, zir_body_inst, etc.
* `func_instance` - one for each monomorphization of a generic
function. Contains a reference to the `func_decl` from whence the
instantiation came, along with the `comptime` parameter values (or
types in the case of `anytype`)
Since `InternPool` provides deduplication on these values, these fields
are now deleted from `Module`:
* `monomorphed_func_keys`
* `monomorphed_funcs`
* `align_stack_fns`
Instead of these, Sema logic for generic function instantiation now
unconditionally evaluates the function prototype expression for every
generic callsite. This is technically required in order for type
coercions to work. The previous code had some dubious, probably wrong
hacks to make things work, such as `hashUncoerced`. I'm not 100% sure
how we were able to eliminate that function and still pass all the
behavior tests, but I'm pretty sure things were still broken without
doing type coercion for every generic function call argument.
After the function prototype is evaluated, it produces a deduplicated
`func_instance` `InternPool.Index` which can then be used for the
generic function call.
Some other nice things made by this simplification are the removal of
`comptime_args_fn_inst` and `preallocated_new_func` from `Sema`, and the
messy logic associated with them.
I have not yet been able to measure the perf of this against master
branch. On one hand, it reduces memory usage and pointer chasing of the
most heavily used `InternPool` Tag - function bodies - but on the other
hand, it does evaluate function prototype expressions more than before.
We will soon find out.
All of the std except these few functions call it "eql" instead of "eq".
This has previously tripped me up when I expected the equality check function to be called "eql"
(just like all the rest of the std) instead of "eq".
The motivation is consistency.
If search "eq" on Autodoc, these functions stick out and it looks inconsistent.
I just noticed there are also a few functions spelling it out as "equal" (such as std.mem.allEqual).
Maybe those functions should also spell it "eql" but that can be done in a future PR.
Most of this migration was performed automatically with `zig fmt`. There
were a few exceptions which I had to manually fix:
* `@alignCast` and `@addrSpaceCast` cannot be automatically rewritten
* `@truncate`'s fixup is incorrect for vectors
* Test cases are not formatted, and their error locations change