This fixes an issue with the implementation of #18816. Consider the
following code:
```zig
pub fn Wrap(comptime T: type) type {
return struct {
pub const T1 = T;
inner: struct { x: T1 },
};
}
```
Previously, the type of `inner` was not considered to be "capturing" any
value, as `T1` is a decl. However, since it is declared within a generic
function, this decl reference depends on the context, and thus should be
treated as a capture.
AstGen has been augmented to tunnel references to decls through closure
when the decl was declared in a potentially-generic context (i.e. within
a function).
This implements the accepted proposal #18816. Namespace-owning types
(struct, enum, union, opaque) are no longer unique whenever analysed;
instead, their identity is determined based on their AST node and the
set of values they capture.
Reified types (`@Type`) are deduplicated based on the structure of the
type created. For instance, if two structs are created by the same
reification with identical fields, layout, etc, they will be the same
type.
This commit does not produce a working compiler; the next commit, adding
captures for decl references, is necessary. It felt appropriate to split
this up.
Resolves: #18816
Namespace types (`struct`, `enum`, `union`, `opaque`) do not use
structural equality - equivalence is based on their Decl index (and soon
will change to AST node + captures). However, we previously stored all
other information in the corresponding `InternPool.Key` anyway. For
logical consistency, it makes sense to have the key only be the true key
(that is, the Decl index) and to load all other data through another
function. This introduces those functions, by the name of
`loadStructType` etc. It's a big diff, but most of it is no-brainer
changes.
In future, it might be nice to eliminate a bunch of the loaded state in
favour of accessor functions on the `LoadedXyzType` types (like how we
have `LoadedUnionType.size()`), but that can be explored at a later
date.
This changes the representation of closures in Zir and Sema. Rather than
a pair of instructions `closure_capture` and `closure_get`, the system
now works as follows:
* Each ZIR type declaration (`struct_decl` etc) contains a list of
captures in the form of ZIR indices (or, for efficiency, direct
references to parent captures). This is an ordered list; indexes into
it are used to refer to captured values.
* The `extended(closure_get)` ZIR instruction refers to a value in this
list via a 16-bit index (limiting this index to 16 bits allows us to
store this in `extended`).
* `Module.Namespace` has a new field `captures` which contains the list
of values captured in a given namespace. This is initialized based on
the ZIR capture list whenever a type declaration is analyzed.
This change eliminates `CaptureScope` from semantic analysis, which is a
nice simplification; but the main motivation here is that this change is
a prerequisite for #18816.
Fixes a mismatch where the manifests would be written to the local cache dir, but the .def/.lib files themselves would be written to the global cache dir. This meant that if you cleared your global cache, your local cache would still think that the .lib files existed in the cache and it'd lead to 'No such file or directory' errors at linktime.
This mismatch was introduced in the stage1 -> stage2 transition of this code.
* Introduce `-Ddebug-extensions` for enabling compiler debug helpers
* Replace safety mode checks with `std.debug.runtime_safety`
* Replace debugger helper checks with `!builtin.strip_debug_info`
Sometimes, you just have to debug optimized compilers...
Thanks to jacobly0 for figuring this out. The chain of events causing
the failure this triggered is as follows.
* As of a recent commit, certain bodies no longer emit a redundant
`block`, meaning there are more likely to be "interesting"
instructions (i.e. not blocks) at the end of parent GenZir scopes.
* When emitting the first `dbg_stmt` in such a body, the elision logic
incorrectly looks at a tag from an instruction in an enclosing scope.
* The tag of this instruction may be `undefined`, meaning that in unsafe
builds it may be incorrectly identified as a `dbg_stmt` instruction.
* This instruction from another body is clobbered rather than emitting
an actual `dbg_stmt` instruction. Note that this does not produce
invalid ZIR, since the creator of the undefined instruction replaces
the previously-undefined payload later.
In `ld -r` mode, the linker will emit `N_GSYM` for any defined
external symbols as well as private externals. In the former case,
the thing is easy since `N_EXT` bit will be set in the nlist's type.
In the latter however we will encounter a local symbol with `N_PEXT`
bit set (non-extern, but was private external) which we also need
to include when resolving symbol stabs.
The major change in the logic for parsing symbol stabs per input
object file is that we no longer try to force-resolve a `N_GSYM`
as a global symbol. This was a mistake since every symbol stab
always describes a symbol defined within the parsed input object file.
We then work out if we should forward `N_GSYM` in the output symtab
after we have resolved all symbols, but never before - intel we lack
when initially parsing symbol stabs. Therefore, we simply record
which symbol has a debug symbol stab, and work out its precise type
when emitting output symtab after symbol resolution has been done.
Now, all the tests that use `testWithAllSupportedPathTypes` will also run each test with both `/` and `\` as the path separator on Windows.
Also, removes the now-redundant "Dir.symLink with relative target that has a / path separator" since the same thing is now tested in the "Dir.readLink" test
Since we now elide more ZIR blocks in AstGen, care must be taken in
codegen to introduce lexical scopes for every body, not just `block`s.
Also, elide a few unnecessary AIR blocks in Sema.
The signature and variants of Sema's main loop have evolved over time to
what was a quite confusing state of affairs. This commit makes minor
changes to how `analyzeBodyInner` works, and restructures/renames the
wrapper functions, adding doc comments to clarify their purposes. The
most notable change is that `analyzeBodyInner` now returns
`CompileError!void`; inline breaks are now all communicated via
`error.ComptimeBreak`.
In the code `if (cond) { ... }`, the "then body" of the `if` is
technically a block. However, we don't need to emit a real ZIR `block`
corresponding to it, because we are already within a condbr body; we
have a separate gz, and appropriate scoping for allocs and debug
variables. In this case, and many like it, we can trivially elide the
block here, instead emitting the block statements directly into the
current `GenZir`. This results in a significant decrease in ZIR bytes
for real code.
Certain symbols were left unmarked, meaning they would not be emit into
the final binary incorrectly. We now mark the synthetic symbols to ensure
they are emit as they are already created under the circumstance they're
needed for. This also re-enables disabled tests that were left disabled
in a previous merge conflict.
Lastly, this adds the shared-memory test to the test harnass as it was
previously forgotten and therefore regressed.
Rather than using the logger, we now emit proper 'compiler'-errors just
like the ELF and MachO linkers with notes. We now also support emitting
multiple errors before quiting the linking process in certain phases,
such as symbol resolution. This means we will print all symbols which
were resolved incorrectly, rather than the first one we encounter.
This introduces some type safety so we cannot accidently give an atom
index as a symbol index. This also means we do not have to store any
optionals and therefore allow for memory optimizations. Lastly, we can
now always simply access the symbol index of an atom, rather than having
to call `getSymbolIndex` as it is easy to forget.
We now use a single function to use the in-house WebAssembly linker
rather than wasm-ld. For both incremental compilation and traditional
linking we use the same codepath.
Previously we could directly write the type index because we used the
index that was known in the final binary. However, as we now process
the Zig module as its own relocatable object file, we must ensure to
generate a relocation for type indexes. This also ensures that we can
later link the relocatable object file as a standalone also.
This also fixes generating indirect function table entries for ZigObject
as it now correctly points to the relocation symbol index rather than
the symbol index that owns the relocation.