This reverts commit 78855bd21866b515018259a2194e036e4b3120df.
This commit did not replace uses of `Type.err_int` of which there are
currently 60 uses.
Re-opens #786
* add Module instances for each package's build.zig and attach it to the
dependencies.zig module with the hash digest hex string as the name.
* fix skipping the wrong packages when creating
dependencies.zig
* a couple more renamings of "package" to "module"
Finish the work started in 4c4fb839972f66f55aa44fc0aca5f80b0608c731.
Now the compiler compiles again.
Wire up dependency tree fetching code in the CLI for `zig build`.
Everything is hooked up except `createDependenciesModule`, which is not yet
implemented.
* start renaming "package" to "module" (see #14307)
- build system gains `main_mod_path` and `main_pkg_path` is still
there but it is deprecated.
* eliminate the object-oriented memory management style of what was
previously `*Package`. Now it is `*Package.Module` and all pointers
point to externally managed memory.
* fixes to get the new Fetch.zig code working. The previous commit was
work-in-progress. There are still two commented out code paths, the
one that leads to `Compilation.create` and the one for `zig build`
that fetches the entire dependency tree and creates the required
modules for the build runner.
Instead of explicitly creating a `Module.Decl` object for each anonymous
declaration, each `InternPool.Index` value is implicitly understood to
be an anonymous declaration when encountered by backend codegen.
The memory management strategy for these anonymous decls then becomes to
garbage collect them along with standard InternPool garbage.
In the interest of a smooth transition, this commit only implements this
new scheme for string literals and leaves all the previous mechanisms in
place.
My previous change for reading / writing to unions at comptime did not handle
union field read/writes correctly in all cases. Previously, if a field was
written to a union, it would overwrite the entire value. This is problematic
when a field of a larger size is subsequently read, because the value would not
be long enough, causing a panic.
Additionally, the writing behaviour itself was incorrect. Writing to a field of
a packed or extern union should only overwrite the bits corresponding to that
field, allowing for memory reinterpretation via field writes / reads.
I addressed these problems as follows:
Add the concept of a "backing type" for extern / packed unions
(`Type.unionBackingType`). For extern unions, this is a `u8` array, for packed
unions it's an integer matching the `bitSize` of the union. Whenever union
memory is read at comptime, it's read as this type.
When union memory is written at comptime, the tag may still be known. If so, the
memory is written using the tagged type. If the tag is unknown (because this
union had previously been read from memory), it's simply written back out as the
backing type.
I added `write_packed` to the `reinterpret` field of
`ComptimePtrMutationKit`. This causes writes of the operand to be packed - which
is necessary when writing to a field of a packed union. Without this, writing a
value to a `u1` field would overwrite the entire byte it occupied.
The final case to address was reading a different (potentially larger) field
from a union when it was written with a known tag. To handle this, a new kind of
bitcast was introduced (`bitCastUnionFieldVal`) which supports reading a larger
field by using a backing buffer that has the unwritten bits set to
undefined. The reason to support this (vs always just writing the union as its
backing type) is that if no reads of larger fields ever occur at comptime, it
would be strictly worse to have spent time writing the full backing type.
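The field-granular write behaviour described above can be illustrated with a small sketch. `Overlay` and `overwriteFirst` are hypothetical names invented for this example, and the test exercises the runtime layout only; the description above suggests comptime evaluation is meant to produce the same bytes, but that is an assumption here rather than something this sketch verifies.

```zig
const std = @import("std");

// Hypothetical extern union used only for illustration.
const Overlay = extern union {
    whole: [4]u8,
    first: u8,
};

// Writes all four bytes, then overwrites only the one-byte field.
fn overwriteFirst() [4]u8 {
    var u = Overlay{ .whole = .{ 1, 2, 3, 4 } };
    u.first = 0xAA;
    return u.whole;
}

test "writing a small field leaves the remaining bytes intact" {
    const bytes = overwriteFirst();
    try std.testing.expectEqualSlices(u8, &.{ 0xAA, 2, 3, 4 }, &bytes);
}
```

Byte arrays are used on both sides so the result does not depend on the target's endianness.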
Instead of doing a linear search every time a packed struct field's bit or byte
offset is wanted, they are computed once during resolution of the packed
struct's backing int type, and stored in InternPool for O(1) lookup.
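As a rough sketch of the idea (not the compiler's actual code), the offsets can be computed in a single pass as a running sum of field bit sizes and then looked up by index. `computeBitOffsets` and `Flags` are hypothetical names for this example.

```zig
const std = @import("std");

// Hypothetical helper: computes every field's bit offset in one pass,
// so later lookups are a plain array index instead of a linear scan.
fn computeBitOffsets(comptime n: usize, bit_sizes: [n]u16) [n]u16 {
    var offsets: [n]u16 = undefined;
    var running: u16 = 0;
    for (bit_sizes, 0..) |size, i| {
        offsets[i] = running;
        running += size;
    }
    return offsets;
}

// Illustrative packed struct with fields of 3, 5, and 8 bits.
const Flags = packed struct {
    a: u3,
    b: u5,
    c: u8,
};

test "precomputed offsets agree with @bitOffsetOf" {
    const offsets = computeBitOffsets(3, .{ 3, 5, 8 });
    try std.testing.expectEqual(@as(u16, @bitOffsetOf(Flags, "a")), offsets[0]);
    try std.testing.expectEqual(@as(u16, @bitOffsetOf(Flags, "b")), offsets[1]);
    try std.testing.expectEqual(@as(u16, @bitOffsetOf(Flags, "c")), offsets[2]);
}
```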
Closes #17178
When the tag is not known, it's set to `.none`. In this case, the value is either an
array of bytes (for extern unions) or an integer (for packed unions).
Previously it would canonicalize or not depending on some volatile
internal state of the compiler. Now it forces resolution of the element
type to determine the alignment if it needs to.
Structs were previously using `SegmentedList` to be given indexes, but
were not actually backed by the InternPool arrays.
After this, the only remaining uses of `SegmentedList` in the compiler
are `Module.Decl` and `Module.Namespace`. Once those last two are
migrated to become backed by InternPool arrays as well, we can introduce
state serialization via writing these arrays to disk all at once.
Unfortunately there are a lot of source code locations that touch the
struct type API, so this commit is still work-in-progress. Once I get it
compiling and passing the test suite, I can provide some interesting
data points such as how it affected the InternPool memory size and
performance comparison against master branch.
I also couldn't resist migrating a bunch of the alignment API over to
use the log2 Alignment type rather than a mishmash of u32 and u64 byte
units with 0 meaning something implicitly different and special at every
location. Turns out you can do all the math you need directly on the
log2 representation of alignments.
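To make the last point concrete, here is a minimal sketch of a log2 alignment type. It is only loosely modeled on the description above: the `Alignment` enum below and all of its method names are assumptions made for this example, not the compiler's definitions.

```zig
const std = @import("std");

// Hypothetical log2 alignment type: the stored value is the exponent,
// so a 16-byte alignment is stored as 4.
const Alignment = enum(u6) {
    one = 0,
    _,

    fn fromByteUnits(n: u64) Alignment {
        std.debug.assert(std.math.isPowerOfTwo(n));
        return @enumFromInt(std.math.log2_int(u64, n));
    }

    fn toByteUnits(a: Alignment) u64 {
        return @as(u64, 1) << @intFromEnum(a);
    }

    // The stricter of two alignments is simply the larger exponent.
    fn max(a: Alignment, b: Alignment) Alignment {
        return @enumFromInt(@max(@intFromEnum(a), @intFromEnum(b)));
    }

    // Rounds `addr` up to the next multiple of the alignment.
    fn forward(a: Alignment, addr: u64) u64 {
        const mask = a.toByteUnits() - 1;
        return (addr + mask) & ~mask;
    }
};

test "alignment math on the log2 representation" {
    const a4 = Alignment.fromByteUnits(4);
    const a16 = Alignment.fromByteUnits(16);
    try std.testing.expectEqual(@as(u6, 4), @intFromEnum(a16));
    try std.testing.expectEqual(@as(u64, 16), a4.max(a16).toByteUnits());
    try std.testing.expectEqual(@as(u64, 16), Alignment.fromByteUnits(8).forward(13));
}
```

With this encoding there is no special meaning for 0 bytes: every stored value names a valid power-of-two alignment.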
* Use 32-bit integers instead of pointers for compactness and
serialization friendliness.
* Use a separate hash map for runtime and comptime capture scopes,
avoiding the 1-bit union tag.
* Use a compact array representation instead of a tree of hash maps.
* Eliminate the only instance of ref-counting in the compiler, instead
relying on garbage collection (not implemented yet but is the plan for
almost all long-lived objects related to incremental compilation).
Because a code modification may need to access capture scope data, this
makes capture scope data long-lived state. My goal is to get incremental
compilation state serialization down to a single pwritev syscall, by
unifying the on-disk representation with the in-memory representation.
This commit eliminates the last remaining pointer field of
`Module.Decl`.
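A minimal sketch of the pointer-free layout the bullets above describe: scopes live in one flat array and refer to their parent by a 32-bit index. `ScopeIndex`, `Scope`, and `depth` are hypothetical names for this illustration and do not reflect the real capture scope fields.

```zig
const std = @import("std");

// Hypothetical 32-bit handle; `none` marks the root's missing parent.
const ScopeIndex = enum(u32) {
    none = std.math.maxInt(u32),
    _,
};

// Hypothetical scope record: plain integers, no pointers.
const Scope = struct {
    parent: ScopeIndex,
    payload: u32,
};

// Counts the scopes on the path from `start` up to the root.
fn depth(scopes: []const Scope, start: ScopeIndex) u32 {
    var n: u32 = 0;
    var cur = start;
    while (cur != .none) : (cur = scopes[@intFromEnum(cur)].parent) {
        n += 1;
    }
    return n;
}

test "walking parents through a flat array" {
    const scopes = [_]Scope{
        .{ .parent = .none, .payload = 100 },
        .{ .parent = @enumFromInt(0), .payload = 101 },
        .{ .parent = @enumFromInt(1), .payload = 102 },
    };
    try std.testing.expectEqual(@as(u32, 3), depth(&scopes, @enumFromInt(2)));

    // The whole array is one contiguous byte range.
    const bytes = std.mem.asBytes(&scopes);
    try std.testing.expectEqual(@as(usize, 3 * @sizeOf(Scope)), bytes.len);
}
```

Because the records contain no addresses, the same bytes stay meaningful after being written out and loaded at a different location, which is the property the serialization goal depends on.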
This makes the call sites easier to read, reduces the number of `catch`
expressions required, and prepares for comptime reasons to appear
earlier in the list of notes.
There are a couple concepts here worth understanding:
Key.UnionType - This type is available *before* resolving the union's
fields. The enum tag type, number of fields, field names, field
types, and field alignments are not available with this.
InternPool.UnionType - This one can be obtained from the above type with
`InternPool.loadUnionType` which asserts that the union's enum tag type
has been resolved. This one has all the information available.
Additionally:
* ZIR: Turn an unused bit into `any_aligned_fields` flag to help
semantic analysis know whether a union has explicit alignment on any
fields (usually not).
* Sema: delete `resolveTypeRequiresComptime` which had the same type
signature and near-duplicate logic to `typeRequiresComptime`.
- Make opaque types not report comptime-only (this was inconsistent
between the two implementations of this function).
* Implement accepted proposal #12556 which is a breaking change.
The key changes in this commit are:
```diff
- names: []const NullTerminatedString,
+ names: NullTerminatedString.Slice,
- values: []const Index,
+ values: Index.Slice,
```
Which eliminates the slices from `InternPool.Key.EnumType` and replaces
them with structs that contain `start` and `len` indexes. This makes the
lifetime of `EnumType` change from expiring with updates to InternPool,
to expiring when the InternPool is garbage-collected, which is currently
never.
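The lifetime difference can be sketched as follows. This `Slice` is a hypothetical stand-in for the `start`/`len` structs mentioned above, and the two arrays simulate a backing store that was moved when it grew; none of this is the actual `InternPool` code.

```zig
const std = @import("std");

// Hypothetical stand-in for `Index.Slice`: a range into a shared array,
// stored as indexes rather than as a pointer and length.
const Slice = struct {
    start: u32,
    len: u32,

    // Resolves the range against whatever the backing array currently is.
    fn get(s: Slice, items: []const u32) []const u32 {
        return items[s.start..][0..s.len];
    }
};

test "start/len stays valid after the backing array moves" {
    const original = [_]u32{ 7, 8, 9 };
    const handle = Slice{ .start = 1, .len = 2 };
    try std.testing.expectEqualSlices(u32, &.{ 8, 9 }, handle.get(&original));

    // Simulates growth: same leading contents at a different address.
    // A `[]const u32` taken from `original` would not refer to this storage,
    // but the handle resolves correctly against the new array.
    const relocated = [_]u32{ 7, 8, 9, 100, 101 };
    try std.testing.expectEqualSlices(u32, &.{ 8, 9 }, handle.get(&relocated));
}
```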
This is gearing up for a larger change I started working on locally
which moves union types into InternPool.
As a bonus, I fixed some unnecessary instances of `@as`.
Some builtin types have a special InternPool index (e.g.
`.type_info_type`) so that AstGen can refer to them before semantic
analysis. Unfortunately, this previously led to a second index existing
to refer to the type once it was resolved, complicating Sema by having
the concept of an "unresolved" type index.
This change makes Sema modify these InternPool indices in-place to
contain the expanded representation when resolved. The analysis of the
corresponding decls is caught in `Module.semaDecl`, and a field is set
on Sema telling it which index to place struct/union/enum types at. This
system could break if `std.builtin` contained complex decls which
evaluate multiple struct types, but this will be caught by the
assertions in `InternPool.resolveBuiltinType`.
The AstGen result types which were disabled in 6917a8c have been
re-enabled.
Resolves: #16603
After ff37ccd, interned values are trivial to convert to Air refs, using
`Air.internedToRef`. This made functions like `Sema.addConstant` effectively
redundant. This commit removes `Sema.addConstant` and `Sema.addType`, replacing
them with direct usages of `Air.internedToRef`.
Additionally, a new helper `Module.undefValue` is added, and the following
functions are moved into Module:
* `Sema.addConstUndef` -> `Module.undefRef`
* `Sema.addUnsignedInt` -> `Module.intRef` (now also works for signed types)
The general pattern here is that any `Module.xyzValue` helper may also have a
corresponding `Module.xyzRef` helper, which just wraps the call in
`Air.internedToRef`.
AstGen provides all function call arguments with a result location,
referenced through the call instruction index. The idea is that this
should be the parameter type, but for `anytype` parameters, we use
generic poison, which is required to be handled correctly.
Previously, generic instantiations and inline calls worked by evaluating
all args in advance, before resolving generic parameter types. This
means any generic parameter (not just `anytype` ones) had generic poison
result types. This caused missing result locations in some cases.
Additionally, the generic instantiation logic caused `zirParam` to
analyze the argument types a second time before coercion. This meant
that for nominal types (struct/enum/etc), a *new* type was created,
distinct from the result type which was previously forwarded to the
argument expression.
This commit fixes both of these issues. Generic parameter type
resolution is now interleaved with argument analysis, so that we don't
have unnecessary generic poison types, and generic instantiation logic
now handles parameters itself rather than falling through to the
standard zirParam logic, so avoids duplicating the types.
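A small user-level illustration of why the parameter type matters as a result type: the literals below only become a concrete type through the parameter type `T`, which is determined by an earlier argument. `identity`, `Color`, and `Point` are hypothetical names for this sketch, and it does not reproduce the specific failures from the linked issues.

```zig
const std = @import("std");

const Color = enum { red, green };
const Point = struct { x: i32, y: i32 };

// `T` is known from the first argument, so `value` can be given `T` as
// its result type when the call is analyzed.
fn identity(comptime T: type, value: T) T {
    return value;
}

test "later arguments take their type from the resolved parameter" {
    // `.green` and `.{ ... }` become `Color` and `Point` via the parameter type.
    try std.testing.expectEqual(Color.green, identity(Color, .green));

    const p = identity(Point, .{ .x = 1, .y = 2 });
    try std.testing.expectEqual(@as(i32, 1), p.x);
    try std.testing.expectEqual(@as(i32, 2), p.y);
}
```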
Resolves: #16566
Resolves: #16258
Resolves: #16753
* Disable runtime calls, since it is not possible to know the proper
stack adjustment to follow the callee abi.
* Disable runtime returns, since it is not possible to know where the
return address is stored in general.
* Allow implicit returns regardless of the return type, which allows
naked functions with a non-void return type to be written.
* getOwnedFunctionIndex no longer checks if the value is actually a
function.
* The callsites to `intern` that I added want to avoid the `getCoerced`
call, so I added `intern2`.
* Adding to inferred error sets should not happen if the destination
error set is not the inferred error set of the current Sema instance.
* adhoc_inferred_error_set_type can be seen by the backend. Treat it
like anyerror.
* move inferred error sets into InternPool.
- they are now represented by pointing directly at the corresponding
function body value.
* inferred error set working memory is now in Sema and expires after
the Sema for the function corresponding to the inferred error set is
finished having its body analyzed.
* error sets use an InternPool.Index.Slice rather than an actual slice
to avoid lifetime issues.
The idea here is to move towards a future where anonymous decls are
represented entirely by an `InternPool.Index`. This was needed to start
implementing `InternPool.getFuncDecl` which requires moving creation and
deletion of Decl objects into InternPool.
* remove `Namespace.anon_decls`
* remove the concept of cleaning up resources from anonymous decls,
relying on InternPool instead.
* move namespace and decl object allocation into InternPool
Abridged summary:
* Move `Module.Fn` into `InternPool`.
* Delete a lot of confusing and problematic `Sema` logic related to
generic function calls.
This commit removes `Module.Fn` and replaces it with two new
`InternPool.Tag` values:
* `func_decl` - corresponding to a function declared in the source
code. This one contains line/column numbers, zir_body_inst, etc.
* `func_instance` - one for each monomorphization of a generic
function. Contains a reference to the `func_decl` from whence the
instantiation came, along with the `comptime` parameter values (or
types in the case of `anytype`)
Since `InternPool` provides deduplication on these values, these fields
are now deleted from `Module`:
* `monomorphed_func_keys`
* `monomorphed_funcs`
* `align_stack_fns`
Instead of these, Sema logic for generic function instantiation now
unconditionally evaluates the function prototype expression for every
generic callsite. This is technically required in order for type
coercions to work. The previous code had some dubious, probably wrong
hacks to make things work, such as `hashUncoerced`. I'm not 100% sure
how we were able to eliminate that function and still pass all the
behavior tests, but I'm pretty sure things were still broken without
doing type coercion for every generic function call argument.
After the function prototype is evaluated, it produces a deduplicated
`func_instance` `InternPool.Index` which can then be used for the
generic function call.
Some other nice things made possible by this simplification are the removal of
`comptime_args_fn_inst` and `preallocated_new_func` from `Sema`, and the
messy logic associated with them.
I have not yet been able to measure the perf of this against master
branch. On one hand, it reduces memory usage and pointer chasing of the
most heavily used `InternPool` Tag - function bodies - but on the other
hand, it does evaluate function prototype expressions more than before.
We will soon find out.
All of the std except these few functions call it "eql" instead of "eq".
This has previously tripped me up when I expected the equality check function to be called "eql"
(just like all the rest of the std) instead of "eq".
The motivation is consistency.
If you search for "eq" in Autodoc, these functions stick out and it looks inconsistent.
I just noticed there are also a few functions spelling it out as "equal" (such as std.mem.allEqual).
Maybe those functions should also spell it "eql" but that can be done in a future PR.
This actually used to be how it worked in stage1, and there was this
issue to change it: #2649
So this commit is a reversal of that idea. One motivation for that issue
was avoiding emitting the panic handler in compilations that do not have
any calls to panic. This commit only resolves the panic handler in the
event of a safety check function being emitted, so it does not have that
flaw.
The other reason given in that issue was for optimizations that elide
safety checks. It's yet to be determined whether that was a good idea or
not; this can get re-explored when we start adding optimization passes
to AIR.
This commit adds these AIR instructions, which are only emitted if
`backendSupportsFeature(.safety_checked_arithmetic)` is true:
* add_safe
* sub_safe
* mul_safe
It removes these nonsensical AIR instructions:
* addwrap_optimized
* subwrap_optimized
* mulwrap_optimized
The safety-checked arithmetic functions push the burden of invoking the
panic handler into the backend. This makes for a messier compiler
implementation, but it reduces the number of AIR instructions emitted by
Sema, which reduces time spent in the secondary bottleneck of the
compiler. It also generates more compact LLVM IR, reducing time spent in
the primary bottleneck of the compiler.
Finally, it eliminates 1 stack allocation per safety-check which was
being used to store the resulting tuple. These allocations were going to
be annoying when combined with suspension points.
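For readers unfamiliar with the tuple-based form, the following sketch writes an overflow-checked add by hand using `@addWithOverflow`, which returns the wrapped result and an overflow bit as a tuple. It is only an analogy for the per-operation check the description says was previously emitted; `checkedAdd` is a hypothetical helper that returns an error where the compiler would invoke the panic handler.

```zig
const std = @import("std");

// Hand-written overflow-checked add: compute the wrapped result and an
// overflow bit as a tuple, then branch on the bit.
fn checkedAdd(a: u8, b: u8) error{Overflow}!u8 {
    const result = @addWithOverflow(a, b);
    if (result[1] != 0) return error.Overflow;
    return result[0];
}

test "checked add reports overflow instead of wrapping" {
    try std.testing.expectEqual(@as(u8, 255), try checkedAdd(200, 55));
    try std.testing.expectError(error.Overflow, checkedAdd(200, 56));
}
```

With a dedicated safe-add instruction, the intermediate tuple and the explicit branch no longer need to appear as separate instructions; the backend is responsible for the check.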