34 Commits

Author SHA1 Message Date
Andrew Kelley
25198810c8 add new builtin: @disableInstrumentation
This is needed to ensure that start code does not try to access thread
local storage before it has set up thread local storage.
2024-07-22 13:07:02 -07:00
gooncreeper
c50f300387 Tokenizer bug fixes and improvements
Fixes many error messages corresponding to invalid bytes displaying the
wrong byte. Additionaly improves handling of UTF-8 in some places.
2024-07-15 11:31:19 +03:00
Jora Troosh
13070448f5
std: fix typos (#20560) 2024-07-09 14:25:42 -07:00
wooster0
5e3bad3556 Make 0e.0 and 0xp0 not crash
This fixes those sequences of characters crashing.
2024-07-03 02:53:37 -04:00
mlugg
5b523d0469
Zir: make src_line absolute for declaration instructions
We need special logic for updating line numbers anyway, so it's fine to
just use absolute numbers here. This eliminates a field from `Decl`.
2024-06-26 05:28:03 +01:00
Matthew Lugg
f73be120f4
Merge pull request #20299 from mlugg/the-great-decl-split
The Great Decl Split (preliminary work): refactor source locations and eliminate `Sema.Block.src_decl`.
2024-06-20 11:07:17 +01:00
mlugg
1fdf13a148 AstGen: error for redundant @inComptime() 2024-06-19 03:43:13 +01:00
mlugg
0486aa5081
Zir: provide absolute node for reify
Since we track `reify` instructions across incremental updates, it is
acceptable to treat it as the baseline for a relative source location.
This turns out to be a good idea, since it makes it easy to define the
source location for a reified type.
2024-06-18 04:48:24 +01:00
mlugg
b8d2323b88
Sema: eliminate Block.src_decl
🦀 src_decl is gone 🦀

This commit eliminates the `src_decl` field from `Sema.Block`. This
change goes further to eliminating unnecessary responsibilities of
`Decl` in preparation for its major upcoming refactor.

The two main remaining reponsibilities had to do with namespace types:
`src_decl` was used to determine their line number and their name. The
former use case is solved by storing the line number alongside type
declarations (and reifications) in ZIR; this is actually more correct,
since previously the line number assigned to the type was really the
line number of the source declaration it was syntactically contained
within, which does not necessarily line up. Consequently, this change
makes debug info for namespace types more correct, although I am not
sure how debuggers actually utilize this line number, if at all. Naming
types was solved by a new field on `Block`, called `type_name_ctx`. In a
sense, it represents the "namespace" we are currently within, including
comptime function calls etc. We might want to revisit this in future,
since the type naming rules seem to be a bit hand-wavey right now.

As far as I can tell, there isn't any more preliminary work needed for
me to start work on the behemoth task of splitting `Zcu.Decl` into the
new `Nav` (Named Addressable Value) and `Cau` (Comptime Analysis Unit)
types. This will be a sweeping change, impacting essentially every part
of the pipeline after `AstGen`.
2024-06-15 00:58:35 +01:00
mlugg
e39cc0dff7
Zir: use absolute nodes for declarations and type declarations
The justification for using relative source nodes in ZIR is that it
allows source locations -- which may be serialized across incremental
updates -- to be relative to the source location of their containing
declaration. However, having those "baseline" instructions themselves be
relative to their own parent is counterproductive, since the source
location updating problem is only being moved to `Decl`. Storing the
absolute node here instead makes more sense, since it allows for this
source location update logic to be elided entirely in the future by
storing a `TrackedInst.Index` to resolve a source location relative to
rather than a `Decl.Index`.
2024-06-15 00:57:52 +01:00
Ryan Liptak
0cef727e59 More precise error message for unencodable \u escapes
The surrogate code points U+D800 to U+DFFF are valid code points but are not Unicode scalar values. This commit makes the error message more accurately reflect what is actually allowed in `\u` escape sequences.

From https://www.unicode.org/versions/Unicode15.0.0/ch03.pdf:

> D71 High-surrogate code point: A Unicode code point in the range U+D800 to U+DBFF.
> D73 Low-surrogate code point: A Unicode code point in the range U+DC00 to U+DFFF.
>
> 3.9 Unicode Encoding Forms
> D76 Unicode scalar value: Any Unicode code point except high-surrogate and low-surrogate code points.

Related: #20270
2024-06-12 16:49:00 -04:00
mlugg
d4bc64038c Zir: remove legacy error_set_decl variants
These instructions are not emitted by AstGen. They also would have no
effect even if they did appear in ZIR: the Sema handling for these
instructions creates a Decl which the name strategy is applied to, and
proceeds to never use it. This pointless CPU heater is now gone, saving
2 ZIR tags in the process.
2024-06-10 05:02:34 +01:00
Andrew Kelley
9be8a9000f Revert "implement @expect builtin (#19658)"
This reverts commit a7de02e05216db9a04e438703ddf1b6b12f3fbef.

This did not implement the accepted proposal, and I did not sign off
on the changes. I would like a chance to review this, please.
2024-05-22 09:57:43 -07:00
David Rubin
a7de02e052
implement @expect builtin (#19658)
* implement `@expect`

* add docs

* add a second arg for expected bool

* fix typo

* move `expect` to use BinOp

* update to newer langref format
2024-05-22 10:51:16 -05:00
Veikka Tuominen
0fb2015fd3 llvm: fix @wasmMemory{Size,Grow} for wasm64
Closes #19942
2024-05-22 09:48:52 -04:00
Dominic
1550b5b16d
astgen: fix result info for catch switch_block_err_union 2024-05-11 12:06:13 +03:00
Travis Staloch
8af59d1f98 ComptimeStringMap: return a regular struct and optimize
this patch renames ComptimeStringMap to StaticStringMap, makes it
accept only a single type parameter, and return a known struct type
instead of an anonymous struct.  initial motivation for these changes
was to reduce the 'very long type names' issue described here
https://github.com/ziglang/zig/pull/19682.

this breaks the previous API.  users will now need to write:
`const map = std.StaticStringMap(T).initComptime(kvs_list);`

* move `kvs_list` param from type param to an `initComptime()` param
* new public methods
  * `keys()`, `values()` helpers
  * `init(allocator)`, `deinit(allocator)` for runtime data
  * `getLongestPrefix(str)`, `getLongestPrefixIndex(str)` - i'm not sure
     these belong but have left in for now incase they are deemed useful
* performance notes:
  * i posted some benchmarking results here:
    https://github.com/travisstaloch/comptime-string-map-revised/issues/1
  * i noticed a speedup reducing the size of the struct from 48 to 32
    bytes and thus use u32s instead of usize for all length fields
  * i noticed speedup storing KVs as a struct of arrays
  * latest benchmark shows these wall_time improvements for
    debug/safe/small/fast builds: -6.6% / -10.2% / -19.1% / -8.9%. full
    output in link above.
2024-04-22 15:31:41 -07:00
Jacob Young
eb723a4070 Update uses of @fieldParentPtr to use RLS 2024-03-30 20:50:48 -04:00
Jacob Young
17673dcd6e AstGen: use RLS to infer the first argument of @fieldParentPtr 2024-03-30 20:50:48 -04:00
Jacob Young
e409afb79b Update uses of @fieldParentPtr to pass a pointer type 2024-03-30 20:50:48 -04:00
mlugg
9c3670fc93
compiler: implement analysis-local comptime-mutable memory
This commit changes how we represent comptime-mutable memory
(`comptime var`) in the compiler in order to implement the intended
behavior that references to such memory can only exist at comptime.

It does *not* clean up the representation of mutable values, improve the
representation of comptime-known pointers, or fix the many bugs in the
comptime pointer access code. These will be future enhancements.

Comptime memory lives for the duration of a single Sema, and is not
permitted to escape that one analysis, either by becoming runtime-known
or by becoming comptime-known to other analyses. These restrictions mean
that we can represent comptime allocations not via Decl, but with state
local to Sema - specifically, the new `Sema.comptime_allocs` field. All
comptime-mutable allocations, as well as any comptime-known const allocs
containing references to such memory, live in here. This allows for
relatively fast checking of whether a value references any
comptime-mtuable memory, since we need only traverse values up to
pointers: pointers to Decls can never reference comptime-mutable memory,
and pointers into `Sema.comptime_allocs` always do.

This change exposed some faulty pointer access logic in `Value.zig`.
I've fixed the important cases, but there are some TODOs I've put in
which are definitely possible to hit with sufficiently esoteric code. I
plan to resolve these by auditing all direct accesses to pointers (most
of them ought to use Sema to perform the pointer access!), but for now
this is sufficient for all realistic code and to get tests passing.

This change eliminates `Zcu.tmp_hack_arena`, instead using the Sema
arena for comptime memory mutations, which is possible since comptime
memory is now local to the current Sema.

This change should allow `Decl` to store only an `InternPool.Index`
rather than a full-blown `ty: Type, val: Value`. This commit does not
perform this refactor.
2024-03-25 14:49:41 +00:00
Jacob Young
c11b6adf13 Ast: fix comptime destructure
A preceding `comptime` keyword was being ignored if the first
destructure variable was an expression.
2024-03-17 15:23:16 -07:00
Jacob Young
d10c52c194 AstGen: disallow alignment on function types
A pointer type already has an alignment, so this information does not
need to be duplicated on the function type.  This already has precedence
with addrspace which is already disallowed on function types for this
reason.  Also fixes `@TypeOf(&func)` to have the correct addrspace and
alignment.
2024-03-17 03:06:17 +01:00
mlugg
00969062a9
compiler: detect duplicate test names in AstGen
There is no reason to perform this detection during semantic analysis.
In fact, doing so is problematic, because we wish to utilize detection
of existing decls in a namespace in incremental compilation.
2024-03-14 07:40:05 +00:00
Tristan Ross
aab84a3dec
std.builtin: make float mode fields lowercase 2024-03-11 07:09:10 -07:00
Tristan Ross
099f3c4039
std.builtin: make container layout fields lowercase 2024-03-11 07:09:07 -07:00
mlugg
2c4ac44f25
compiler: treat decl_val/decl_ref of potentially generic decls as captures
This fixes an issue with the implementation of #18816. Consider the
following code:

```zig
pub fn Wrap(comptime T: type) type {
    return struct {
        pub const T1 = T;
        inner: struct { x: T1 },
    };
}
```

Previously, the type of `inner` was not considered to be "capturing" any
value, as `T1` is a decl. However, since it is declared within a generic
function, this decl reference depends on the context, and thus should be
treated as a capture.

AstGen has been augmented to tunnel references to decls through closure
when the decl was declared in a potentially-generic context (i.e. within
a function).
2024-03-06 21:26:38 +00:00
mlugg
a6ca20b9a1
compiler: change representation of closures
This changes the representation of closures in Zir and Sema. Rather than
a pair of instructions `closure_capture` and `closure_get`, the system
now works as follows:

* Each ZIR type declaration (`struct_decl` etc) contains a list of
  captures in the form of ZIR indices (or, for efficiency, direct
  references to parent captures). This is an ordered list; indexes into
  it are used to refer to captured values.
* The `extended(closure_get)` ZIR instruction refers to a value in this
  list via a 16-bit index (limiting this index to 16 bits allows us to
  store this in `extended`).
* `Module.Namespace` has a new field `captures` which contains the list
  of values captured in a given namespace. This is initialized based on
  the ZIR capture list whenever a type declaration is analyzed.

This change eliminates `CaptureScope` from semantic analysis, which is a
nice simplification; but the main motivation here is that this change is
a prerequisite for #18816.
2024-03-06 21:26:37 +00:00
mlugg
6a87e42c2e
AstGen: fix latent bug causing incorrect elision of dbg_stmt instructions
Thanks to jacobly0 for figuring this out. The chain of events causing
the failure this triggered is as follows.

* As of a recent commit, certain bodies no longer emit a redundant
  `block`, meaning there are more likely to be "interesting"
  instructions (i.e. not blocks) at the end of parent GenZir scopes.

* When emitting the first `dbg_stmt` in such a body, the elision logic
  incorrectly looks at a tag from an instruction in an enclosing scope.

* The tag of this instruction may be `undefined`, meaning that in unsafe
  builds it may be incorrectly identified as a `dbg_stmt` instruction.

* This instruction from another body is clobbered rather than emitting
  an actual `dbg_stmt` instruction. Note that this does not produce
  invalid ZIR, since the creator of the undefined instruction replaces
  the previously-undefined payload later.
2024-03-01 23:54:31 +00:00
mlugg
eefa60e376
AstGen: optimize ZIR for -1 literal 2024-03-01 06:01:53 +00:00
mlugg
07d8740882
AstGen: do not generate defers at unreachable end of block
Resolves: #8822
2024-02-29 23:38:17 +00:00
mlugg
f6abf022b7
AstGen: elide block instruction when already in empty body
In the code `if (cond) { ... }`, the "then body" of the `if` is
technically a block. However, we don't need to emit a real ZIR `block`
corresponding to it, because we are already within a condbr body; we
have a separate gz, and appropriate scoping for allocs and debug
variables. In this case, and many like it, we can trivially elide the
block here, instead emitting the block statements directly into the
current `GenZir`. This results in a significant decrease in ZIR bytes
for real code.
2024-02-29 23:38:17 +00:00
mlugg
f0a4bb6bd1
AstGen: avoid unnecessary coercion instructions
Coercions such as `@as(usize, 0)` can be trivially elided by matching
these cases and translating to fixed InternPool indices.
2024-02-29 23:38:17 +00:00
Andrew Kelley
b116063e02 move AstGen to std.zig.AstGen
Part of an effort to ship more of the compiler in source form.
2024-02-26 21:51:19 -07:00