This implements the accepted proposal #18816. Namespace-owning types
(struct, enum, union, opaque) are no longer unique whenever analysed;
instead, their identity is determined based on their AST node and the
set of values they capture.
Reified types (`@Type`) are deduplicated based on the structure of the
type created. For instance, if two structs are created by the same
reification with identical fields, layout, etc, they will be the same
type.
This commit does not produce a working compiler; the next commit, adding
captures for decl references, is necessary. It felt appropriate to split
this up.
Resolves: #18816
Namespace types (`struct`, `enum`, `union`, `opaque`) do not use
structural equality - equivalence is based on their Decl index (and soon
will change to AST node + captures). However, we previously stored all
other information in the corresponding `InternPool.Key` anyway. For
logical consistency, it makes sense to have the key only be the true key
(that is, the Decl index) and to load all other data through another
function. This introduces those functions, by the name of
`loadStructType` etc. It's a big diff, but most of it is no-brainer
changes.
In future, it might be nice to eliminate a bunch of the loaded state in
favour of accessor functions on the `LoadedXyzType` types (like how we
have `LoadedUnionType.size()`), but that can be explored at a later
date.
This changes the representation of closures in Zir and Sema. Rather than
a pair of instructions `closure_capture` and `closure_get`, the system
now works as follows:
* Each ZIR type declaration (`struct_decl` etc) contains a list of
captures in the form of ZIR indices (or, for efficiency, direct
references to parent captures). This is an ordered list; indexes into
it are used to refer to captured values.
* The `extended(closure_get)` ZIR instruction refers to a value in this
list via a 16-bit index (limiting this index to 16 bits allows us to
store this in `extended`).
* `Module.Namespace` has a new field `captures` which contains the list
of values captured in a given namespace. This is initialized based on
the ZIR capture list whenever a type declaration is analyzed.
This change eliminates `CaptureScope` from semantic analysis, which is a
nice simplification; but the main motivation here is that this change is
a prerequisite for #18816.
* Introduce `-Ddebug-extensions` for enabling compiler debug helpers
* Replace safety mode checks with `std.debug.runtime_safety`
* Replace debugger helper checks with `!builtin.strip_debug_info`
Sometimes, you just have to debug optimized compilers...
The signature and variants of Sema's main loop have evolved over time to
what was a quite confusing state of affairs. This commit makes minor
changes to how `analyzeBodyInner` works, and restructures/renames the
wrapper functions, adding doc comments to clarify their purposes. The
most notable change is that `analyzeBodyInner` now returns
`CompileError!void`; inline breaks are now all communicated via
`error.ComptimeBreak`.
* make test names contain the fully qualified name
* make test filters match the fully qualified name
* allow multiple test filters, where a test is skipped if it does not
match any of the specified filters
Windows paths now use WTF-16 <-> WTF-8 conversion everywhere, which is lossless. Previously, conversion of ill-formed UTF-16 paths would either fail or invoke illegal behavior.
WASI paths must be valid UTF-8, and the relevant function calls have been updated to handle the possibility of failure due to paths not being encoded/encodable as valid UTF-8.
Closes#18694Closes#1774Closes#2565
If an adapted string key with embedded nulls was put in a hash map with
`std.hash_map.StringIndexAdapter`, then an incorrect hash would be
entered for that entry such that it is possible that when looking for
the exact key that matches the prefix of the original key up to the
first null would sometimes match this entry due to hash collisions and
sometimes not if performed later after a grow + rehash, causing the same
key to exist with two different indices breaking every string equality
comparison ever, for example claiming that a container type doesn't
contain a field because the field name string in the struct and the
string representing the identifier to lookup might be equal strings but
have different string indices. This could maybe be fixed by changing
`std.hash_map.StringIndexAdapter.hash` to only hash up to the first
null, therefore ensuring that the entry's hash is correct and that all
future lookups will be consistent, but I don't trust anything so instead
I assert that there are no embedded nulls.
When coercing the operand of a `ret_node` etc instruction, the source
location for errors used to point to the entire `return` statement.
Instead, we now point to the operand, as would be expected if there was
an explicit `as_node` instruction (like there used to be).
This logic (currently) has a non-trivial cost (particularly in terms of
peak RSS) for tracking dependencies. Until incremental compilation is in
use in the wild, it doesn't make sense for users to pay that cost.
* Functions failing codegen now set this failure on the function
analysis state. Decl analysis `codegen_failure` is reserved for
failures generating constant values.
* `liveness_failure` is consolidated into `codegen_failure`, as we do
not need to distinguish these, and Liveness.Verify is just a debugging
feature anyway.
* `sema_failure_retryable` and `codegen_failure_retryable` are removed.
Instead, retryable failures are recorded in the new
`Zcu.retryable_failures` list. On an incremental update, this list is
flushed, and all elements are marked as outdated so that we re-attempt
analysis and code generation.
Also remove the `generation` fields from `Zcu` and `Decl` as these are
not needed by our new strategy for incremental updates.
* Mark root Decls for re-analysis separately
* Check for re-analysis of root Decls
* Remove `outdated` entry when analyzing fn body
* Remove legacy `outdated` field from Decl analysis state
* Invalidate `decl_val` dependencies
* Recursively mark and un-mark all dependencies correctly
* Queue analysis of outdated dependers in `Compilation.performAllTheWork`
Introduces logic to invalidate `decl_val` dependencies after
`Zcu.semaDecl` completes. Also, recursively un-mark dependencies as PO
where needed.
With this, all dependency invalidation logic is in place. The next step
is analyzing outdated dependencies and triggering appropriate
re-analysis.
Sema now tracks dependencies appropriately. Early logic in Zcu for
resolving outdated decls/functions is in place. The setup used does not
support `usingnamespace`; compilations using this construct are not yet
supported by this incremental compilation model.
It is problematic for the cached `InternPool` state to directly
reference ZIR instruction indices, as these are not stable across
incremental updates. The existing ZIR mapping logic attempts to handle
this by iterating the existing Decl graph for a file after `AstGen` and
update ZIR indices on `Decl`s, struct types, etc. However, this is
unreliable due to generic instantiations, and relies on specialized
logic for everything which may refer to a ZIR instruction (e.g. a
struct's owner decl). I therefore determined that a prerequisite change
for incremental compilation would be to rework how we store these
indices.
This commit introduces a `TrackedInst` type which provides a stable
index (`TrackedInst.Index`) for a single ZIR instruction in the
compilation. The `InternPool` now stores these values in place of ZIR
instruction indices. This makes the ZIR mapping logic relatively
trivial: after `AstGen` completes, we simply iterate all `TrackedInst`
values and update those indices which have changed. In future, if the
corresponding ZIR instruction has been removed, we must also invalidate
any dependencies on this instruction to trigger any required
re-analysis, however the dependency system does not yet exist.
This commit changes how declarations (`const`, `fn`, `usingnamespace`,
etc) are represented in ZIR. Previously, these were represented in the
container type's extra data (e.g. as trailing data on a `struct_decl`).
However, this introduced the complexity of the ZIR mapping logic having
to also correlate some ZIR extra data indices. That isn't really a
problem today, but it's tricky for the introduction of `TrackedInst` in
the commit following this one. Instead, these type declarations now
simply contain a trailing list of ZIR indices to `declaration`
instructions, which directly encode all data related to the declaration
(including containing the declaration's body). Additionally, the ZIR for
`align` etc have been split out into their own bodies. This is not
strictly necessary, but it's much simpler to understand for an
insignificant cost in bytes, and will simplify the resolution of #131
(where we may need to evaluate the pointer type, including align etc,
without immediately evaluating the value body).
Fixes edge cases where the `startsWith` that was used previously would return a false positive on a resolved path like `foo.zig` when the resolved root was `foo`. Before this commit, such a path would be treated as a sub path of 'foo' with a resolved sub file path of 'zig' (and the `.` would be assumed to be a path separator). After this commit, `foo.zig` will be correctly treated as outside of the root of `foo`.
Closes#18355
Now, link.File will always be null when -fno-emit-bin is specified, and
in the case that LLVM artifacts are still required, the Zcu instance has
an LlvmObject.
implement builtin.zig file population for all modules rather than
assuming there is only one global builtin.zig module.
move some fields from link.File to Compilation
move some fields from Module to Compilation
compute debug_format in global Compilation config resolution
wire up C compilation to the concept of owner modules
make whole cache mode call link.File.createEmpty() instead of
link.File.open()
Much of the logic from Compilation.create() is extracted into
Compilation.Config.resolve() which accepts many optional settings and
produces concrete settings. This separate step is needed by API users of
Compilation so that they can pass the resolved global settings to the
Module creation function, which itself needs to resolve per-Module
settings.
Since the target and other things are no longer global settings, I did
not want them stored in link.File (in the `options` field). That options
field was already a kludge; those options should be resolved into
concrete settings. This commit also starts to work on that, deleting
link.Options, moving the fields into Compilation and
ObjectFormat-specific structs instead. Some fields were ephemeral and
should not have been stored at all, such as symbol_size_hint.
The link.File object of Compilation is now a `?*link.File` and `null`
when -fno-emit-bin is passed. It is now arena-allocated along with
Compilation itself, avoiding some messy cleanup code that was there
before.
On the command line, it is now possible to configure the standard
library itself by using `--mod std` just like any other module. This
meant that the CLI needed to create the standard library module rather
than having Compilation create it.
There are a lot of changes in this commit and it's still not done. I
didn't realize how quickly this changeset was going to balloon out of
control, and there are still many lines that need to be changed before
it even compiles successfully.
* introduce std.Build.Cache.HashHelper.oneShot
* add error_tracing to std.Build.Module
* extract build.zig file generation into src/Builtin.zig
* each CSourceFile and RcSourceFile now has a Module owner, which
determines some of the C compiler flags.