Since we track `reify` instructions across incremental updates, it is
acceptable to treat it as the baseline for a relative source location.
This turns out to be a good idea, since it makes it easy to define the
source location for a reified type.
🦀 src_decl is gone 🦀
This commit eliminates the `src_decl` field from `Sema.Block`. This
change goes further to eliminating unnecessary responsibilities of
`Decl` in preparation for its major upcoming refactor.
The two main remaining reponsibilities had to do with namespace types:
`src_decl` was used to determine their line number and their name. The
former use case is solved by storing the line number alongside type
declarations (and reifications) in ZIR; this is actually more correct,
since previously the line number assigned to the type was really the
line number of the source declaration it was syntactically contained
within, which does not necessarily line up. Consequently, this change
makes debug info for namespace types more correct, although I am not
sure how debuggers actually utilize this line number, if at all. Naming
types was solved by a new field on `Block`, called `type_name_ctx`. In a
sense, it represents the "namespace" we are currently within, including
comptime function calls etc. We might want to revisit this in future,
since the type naming rules seem to be a bit hand-wavey right now.
As far as I can tell, there isn't any more preliminary work needed for
me to start work on the behemoth task of splitting `Zcu.Decl` into the
new `Nav` (Named Addressable Value) and `Cau` (Comptime Analysis Unit)
types. This will be a sweeping change, impacting essentially every part
of the pipeline after `AstGen`.
The justification for using relative source nodes in ZIR is that it
allows source locations -- which may be serialized across incremental
updates -- to be relative to the source location of their containing
declaration. However, having those "baseline" instructions themselves be
relative to their own parent is counterproductive, since the source
location updating problem is only being moved to `Decl`. Storing the
absolute node here instead makes more sense, since it allows for this
source location update logic to be elided entirely in the future by
storing a `TrackedInst.Index` to resolve a source location relative to
rather than a `Decl.Index`.
The surrogate code points U+D800 to U+DFFF are valid code points but are not Unicode scalar values. This commit makes the error message more accurately reflect what is actually allowed in `\u` escape sequences.
From https://www.unicode.org/versions/Unicode15.0.0/ch03.pdf:
> D71 High-surrogate code point: A Unicode code point in the range U+D800 to U+DBFF.
> D73 Low-surrogate code point: A Unicode code point in the range U+DC00 to U+DFFF.
>
> 3.9 Unicode Encoding Forms
> D76 Unicode scalar value: Any Unicode code point except high-surrogate and low-surrogate code points.
Related: #20270
These instructions are not emitted by AstGen. They also would have no
effect even if they did appear in ZIR: the Sema handling for these
instructions creates a Decl which the name strategy is applied to, and
proceeds to never use it. This pointless CPU heater is now gone, saving
2 ZIR tags in the process.
This reverts commit a7de02e05216db9a04e438703ddf1b6b12f3fbef.
This did not implement the accepted proposal, and I did not sign off
on the changes. I would like a chance to review this, please.
this patch renames ComptimeStringMap to StaticStringMap, makes it
accept only a single type parameter, and return a known struct type
instead of an anonymous struct. initial motivation for these changes
was to reduce the 'very long type names' issue described here
https://github.com/ziglang/zig/pull/19682.
this breaks the previous API. users will now need to write:
`const map = std.StaticStringMap(T).initComptime(kvs_list);`
* move `kvs_list` param from type param to an `initComptime()` param
* new public methods
* `keys()`, `values()` helpers
* `init(allocator)`, `deinit(allocator)` for runtime data
* `getLongestPrefix(str)`, `getLongestPrefixIndex(str)` - i'm not sure
these belong but have left in for now incase they are deemed useful
* performance notes:
* i posted some benchmarking results here:
https://github.com/travisstaloch/comptime-string-map-revised/issues/1
* i noticed a speedup reducing the size of the struct from 48 to 32
bytes and thus use u32s instead of usize for all length fields
* i noticed speedup storing KVs as a struct of arrays
* latest benchmark shows these wall_time improvements for
debug/safe/small/fast builds: -6.6% / -10.2% / -19.1% / -8.9%. full
output in link above.
This commit changes how we represent comptime-mutable memory
(`comptime var`) in the compiler in order to implement the intended
behavior that references to such memory can only exist at comptime.
It does *not* clean up the representation of mutable values, improve the
representation of comptime-known pointers, or fix the many bugs in the
comptime pointer access code. These will be future enhancements.
Comptime memory lives for the duration of a single Sema, and is not
permitted to escape that one analysis, either by becoming runtime-known
or by becoming comptime-known to other analyses. These restrictions mean
that we can represent comptime allocations not via Decl, but with state
local to Sema - specifically, the new `Sema.comptime_allocs` field. All
comptime-mutable allocations, as well as any comptime-known const allocs
containing references to such memory, live in here. This allows for
relatively fast checking of whether a value references any
comptime-mtuable memory, since we need only traverse values up to
pointers: pointers to Decls can never reference comptime-mutable memory,
and pointers into `Sema.comptime_allocs` always do.
This change exposed some faulty pointer access logic in `Value.zig`.
I've fixed the important cases, but there are some TODOs I've put in
which are definitely possible to hit with sufficiently esoteric code. I
plan to resolve these by auditing all direct accesses to pointers (most
of them ought to use Sema to perform the pointer access!), but for now
this is sufficient for all realistic code and to get tests passing.
This change eliminates `Zcu.tmp_hack_arena`, instead using the Sema
arena for comptime memory mutations, which is possible since comptime
memory is now local to the current Sema.
This change should allow `Decl` to store only an `InternPool.Index`
rather than a full-blown `ty: Type, val: Value`. This commit does not
perform this refactor.
A pointer type already has an alignment, so this information does not
need to be duplicated on the function type. This already has precedence
with addrspace which is already disallowed on function types for this
reason. Also fixes `@TypeOf(&func)` to have the correct addrspace and
alignment.
There is no reason to perform this detection during semantic analysis.
In fact, doing so is problematic, because we wish to utilize detection
of existing decls in a namespace in incremental compilation.
This fixes an issue with the implementation of #18816. Consider the
following code:
```zig
pub fn Wrap(comptime T: type) type {
return struct {
pub const T1 = T;
inner: struct { x: T1 },
};
}
```
Previously, the type of `inner` was not considered to be "capturing" any
value, as `T1` is a decl. However, since it is declared within a generic
function, this decl reference depends on the context, and thus should be
treated as a capture.
AstGen has been augmented to tunnel references to decls through closure
when the decl was declared in a potentially-generic context (i.e. within
a function).
This changes the representation of closures in Zir and Sema. Rather than
a pair of instructions `closure_capture` and `closure_get`, the system
now works as follows:
* Each ZIR type declaration (`struct_decl` etc) contains a list of
captures in the form of ZIR indices (or, for efficiency, direct
references to parent captures). This is an ordered list; indexes into
it are used to refer to captured values.
* The `extended(closure_get)` ZIR instruction refers to a value in this
list via a 16-bit index (limiting this index to 16 bits allows us to
store this in `extended`).
* `Module.Namespace` has a new field `captures` which contains the list
of values captured in a given namespace. This is initialized based on
the ZIR capture list whenever a type declaration is analyzed.
This change eliminates `CaptureScope` from semantic analysis, which is a
nice simplification; but the main motivation here is that this change is
a prerequisite for #18816.
Thanks to jacobly0 for figuring this out. The chain of events causing
the failure this triggered is as follows.
* As of a recent commit, certain bodies no longer emit a redundant
`block`, meaning there are more likely to be "interesting"
instructions (i.e. not blocks) at the end of parent GenZir scopes.
* When emitting the first `dbg_stmt` in such a body, the elision logic
incorrectly looks at a tag from an instruction in an enclosing scope.
* The tag of this instruction may be `undefined`, meaning that in unsafe
builds it may be incorrectly identified as a `dbg_stmt` instruction.
* This instruction from another body is clobbered rather than emitting
an actual `dbg_stmt` instruction. Note that this does not produce
invalid ZIR, since the creator of the undefined instruction replaces
the previously-undefined payload later.
In the code `if (cond) { ... }`, the "then body" of the `if` is
technically a block. However, we don't need to emit a real ZIR `block`
corresponding to it, because we are already within a condbr body; we
have a separate gz, and appropriate scoping for allocs and debug
variables. In this case, and many like it, we can trivially elide the
block here, instead emitting the block statements directly into the
current `GenZir`. This results in a significant decrease in ZIR bytes
for real code.