If an adapted string key with embedded nulls was put in a hash map with
`std.hash_map.StringIndexAdapter`, then an incorrect hash would be
entered for that entry such that it is possible that when looking for
the exact key that matches the prefix of the original key up to the
first null would sometimes match this entry due to hash collisions and
sometimes not if performed later after a grow + rehash, causing the same
key to exist with two different indices breaking every string equality
comparison ever, for example claiming that a container type doesn't
contain a field because the field name string in the struct and the
string representing the identifier to lookup might be equal strings but
have different string indices. This could maybe be fixed by changing
`std.hash_map.StringIndexAdapter.hash` to only hash up to the first
null, therefore ensuring that the entry's hash is correct and that all
future lookups will be consistent, but I don't trust anything so instead
I assert that there are no embedded nulls.
When coercing the operand of a `ret_node` etc instruction, the source
location for errors used to point to the entire `return` statement.
Instead, we now point to the operand, as would be expected if there was
an explicit `as_node` instruction (like there used to be).
This logic (currently) has a non-trivial cost (particularly in terms of
peak RSS) for tracking dependencies. Until incremental compilation is in
use in the wild, it doesn't make sense for users to pay that cost.
* Functions failing codegen now set this failure on the function
analysis state. Decl analysis `codegen_failure` is reserved for
failures generating constant values.
* `liveness_failure` is consolidated into `codegen_failure`, as we do
not need to distinguish these, and Liveness.Verify is just a debugging
feature anyway.
* `sema_failure_retryable` and `codegen_failure_retryable` are removed.
Instead, retryable failures are recorded in the new
`Zcu.retryable_failures` list. On an incremental update, this list is
flushed, and all elements are marked as outdated so that we re-attempt
analysis and code generation.
Also remove the `generation` fields from `Zcu` and `Decl` as these are
not needed by our new strategy for incremental updates.
* Mark root Decls for re-analysis separately
* Check for re-analysis of root Decls
* Remove `outdated` entry when analyzing fn body
* Remove legacy `outdated` field from Decl analysis state
* Invalidate `decl_val` dependencies
* Recursively mark and un-mark all dependencies correctly
* Queue analysis of outdated dependers in `Compilation.performAllTheWork`
Introduces logic to invalidate `decl_val` dependencies after
`Zcu.semaDecl` completes. Also, recursively un-mark dependencies as PO
where needed.
With this, all dependency invalidation logic is in place. The next step
is analyzing outdated dependencies and triggering appropriate
re-analysis.
Sema now tracks dependencies appropriately. Early logic in Zcu for
resolving outdated decls/functions is in place. The setup used does not
support `usingnamespace`; compilations using this construct are not yet
supported by this incremental compilation model.
It is problematic for the cached `InternPool` state to directly
reference ZIR instruction indices, as these are not stable across
incremental updates. The existing ZIR mapping logic attempts to handle
this by iterating the existing Decl graph for a file after `AstGen` and
update ZIR indices on `Decl`s, struct types, etc. However, this is
unreliable due to generic instantiations, and relies on specialized
logic for everything which may refer to a ZIR instruction (e.g. a
struct's owner decl). I therefore determined that a prerequisite change
for incremental compilation would be to rework how we store these
indices.
This commit introduces a `TrackedInst` type which provides a stable
index (`TrackedInst.Index`) for a single ZIR instruction in the
compilation. The `InternPool` now stores these values in place of ZIR
instruction indices. This makes the ZIR mapping logic relatively
trivial: after `AstGen` completes, we simply iterate all `TrackedInst`
values and update those indices which have changed. In future, if the
corresponding ZIR instruction has been removed, we must also invalidate
any dependencies on this instruction to trigger any required
re-analysis, however the dependency system does not yet exist.
This commit changes how declarations (`const`, `fn`, `usingnamespace`,
etc) are represented in ZIR. Previously, these were represented in the
container type's extra data (e.g. as trailing data on a `struct_decl`).
However, this introduced the complexity of the ZIR mapping logic having
to also correlate some ZIR extra data indices. That isn't really a
problem today, but it's tricky for the introduction of `TrackedInst` in
the commit following this one. Instead, these type declarations now
simply contain a trailing list of ZIR indices to `declaration`
instructions, which directly encode all data related to the declaration
(including containing the declaration's body). Additionally, the ZIR for
`align` etc have been split out into their own bodies. This is not
strictly necessary, but it's much simpler to understand for an
insignificant cost in bytes, and will simplify the resolution of #131
(where we may need to evaluate the pointer type, including align etc,
without immediately evaluating the value body).
Fixes edge cases where the `startsWith` that was used previously would return a false positive on a resolved path like `foo.zig` when the resolved root was `foo`. Before this commit, such a path would be treated as a sub path of 'foo' with a resolved sub file path of 'zig' (and the `.` would be assumed to be a path separator). After this commit, `foo.zig` will be correctly treated as outside of the root of `foo`.
Closes#18355
Now, link.File will always be null when -fno-emit-bin is specified, and
in the case that LLVM artifacts are still required, the Zcu instance has
an LlvmObject.
implement builtin.zig file population for all modules rather than
assuming there is only one global builtin.zig module.
move some fields from link.File to Compilation
move some fields from Module to Compilation
compute debug_format in global Compilation config resolution
wire up C compilation to the concept of owner modules
make whole cache mode call link.File.createEmpty() instead of
link.File.open()
Much of the logic from Compilation.create() is extracted into
Compilation.Config.resolve() which accepts many optional settings and
produces concrete settings. This separate step is needed by API users of
Compilation so that they can pass the resolved global settings to the
Module creation function, which itself needs to resolve per-Module
settings.
Since the target and other things are no longer global settings, I did
not want them stored in link.File (in the `options` field). That options
field was already a kludge; those options should be resolved into
concrete settings. This commit also starts to work on that, deleting
link.Options, moving the fields into Compilation and
ObjectFormat-specific structs instead. Some fields were ephemeral and
should not have been stored at all, such as symbol_size_hint.
The link.File object of Compilation is now a `?*link.File` and `null`
when -fno-emit-bin is passed. It is now arena-allocated along with
Compilation itself, avoiding some messy cleanup code that was there
before.
On the command line, it is now possible to configure the standard
library itself by using `--mod std` just like any other module. This
meant that the CLI needed to create the standard library module rather
than having Compilation create it.
There are a lot of changes in this commit and it's still not done. I
didn't realize how quickly this changeset was going to balloon out of
control, and there are still many lines that need to be changed before
it even compiles successfully.
* introduce std.Build.Cache.HashHelper.oneShot
* add error_tracing to std.Build.Module
* extract build.zig file generation into src/Builtin.zig
* each CSourceFile and RcSourceFile now has a Module owner, which
determines some of the C compiler flags.
The motivating problem here was a memory leak in the hash maps of
Module.Namespace.
The commit deletes more of the legacy incremental compilation
implementation. It had things like use of orderedRemove and trying to do
too much OOP-style creation and deletion of objects.
Instead, this commit iterates over all the namespaces on Module deinit
and calls deinit on the hash map fields. This logic is much simpler to
reason about.
Similarly, change global inline assembly to an array hash map since
iterating over the values is a primary use of it, and clean up the
remaining values on Module deinit, solving another memory leak.
After this there are no more memory leaks remaining when using the
x86 backend in a libc-less compiler.
This incremental compilation logic will need to be reworked so that it
does not depend on buried pointers - that is, long-lived pointers that
are owned by non-top-level objects such as Decl.
In the meantime, this fixes memory leaks since the memory management of
these dependencies has bitrotted.
- Add resolveUnionAlignment, to resolve a union's alignment only, without triggering layout resolution.
- Update resolveUnionLayout to cache size, alignment, and padding. abiSizeAdvanced and abiAlignmentAdvanced
now use this information instead of computing it each time.
This commit starts by making Zir.Inst.Index a nonexhaustive enum rather
than a u32 alias for type safety purposes, and the rest of the changes
are needed to get everything compiling again.
This field had the wrong type. It's not a `Zir.Inst.Index`, it's
actually a `Zir.OptionalExtraIndex`. Also, the former is currently
aliased to `u32` while the latter is a nonexhaustive enum that gives us
more type checking.
This commit is preparation for making this field non-optional. Now it
can be changed to `Zir.ExtraIndex` and then the compiler will point out
all the places that the non-optional assumption is being violated.