44 Commits

Author SHA1 Message Date
Andrew Kelley
fcafc63f3d inline assembly: use types
until now these were stringly typed.

it's kinda obvious when you think about it.
2025-07-16 10:23:02 -07:00
Andrew Kelley
5378fdb153 std.fmt: fully remove format string from format methods
Introduces `std.fmt.alt` which is a helper for calling alternate format
methods besides one named "format".
2025-07-07 22:43:53 -07:00
Andrew Kelley
30c2921eb8 compiler: update a bunch of format strings 2025-07-07 22:43:52 -07:00
Andrew Kelley
f409457925 compiler: fix a bunch of format strings 2025-07-07 22:43:52 -07:00
Andrew Kelley
3afc6fbac6 std.zig.llvm.Builder: update format API 2025-07-07 22:43:52 -07:00
Andrew Kelley
756a2dbf1a compiler: upgrade various std.io API usage 2025-07-07 22:43:52 -07:00
Andrew Kelley
0e37ff0d59 std.fmt: breaking API changes
added adapter to AnyWriter and GenericWriter to help bridge the gap
between old and new API

make std.testing.expectFmt work at compile-time

std.fmt no longer has a dependency on std.unicode. Formatted printing
was never properly unicode-aware. Now it no longer pretends to be.

Breakage/deprecations:
* std.fs.File.reader -> std.fs.File.deprecatedReader
* std.fs.File.writer -> std.fs.File.deprecatedWriter
* std.io.GenericReader -> std.io.Reader
* std.io.GenericWriter -> std.io.Writer
* std.io.AnyReader -> std.io.Reader
* std.io.AnyWriter -> std.io.Writer
* std.fmt.format -> std.fmt.deprecatedFormat
* std.fmt.fmtSliceEscapeLower -> std.ascii.hexEscape
* std.fmt.fmtSliceEscapeUpper -> std.ascii.hexEscape
* std.fmt.fmtSliceHexLower -> {x}
* std.fmt.fmtSliceHexUpper -> {X}
* std.fmt.fmtIntSizeDec -> {B}
* std.fmt.fmtIntSizeBin -> {Bi}
* std.fmt.fmtDuration -> {D}
* std.fmt.fmtDurationSigned -> {D}
* {} -> {f} when there is a format method
* format method signature
  - anytype -> *std.io.Writer
  - inferred error set -> error{WriteFailed}
  - options -> (deleted)
* std.fmt.Formatted
  - now takes context type explicitly
  - no fmt string
2025-07-07 22:43:51 -07:00
Andrew Kelley
0b3f0124dc std.io: move getStdIn, getStdOut, getStdErr functions to fs.File
preparing to rearrange std.io namespace into an interface

how to upgrade:

std.io.getStdIn() -> std.fs.File.stdin()
std.io.getStdOut() -> std.fs.File.stdout()
std.io.getStdErr() -> std.fs.File.stderr()
2025-07-07 22:43:51 -07:00
Jacob Young
1f98c98fff x86_64: increase passing test coverage on windows
Now that codegen has no references to linker state this is much easier.

Closes #24153
2025-06-19 18:41:12 -04:00
mlugg
6ffa285fc3 compiler: fix @intFromFloat safety check
This safety check was completely broken; it triggered unchecked illegal
behavior *in order to implement the safety check*. You definitely can't
do that! Instead, we must explicitly check the boundaries. This is a
tiny bit fiddly, because we need to make sure we do floating-point
rounding in the correct direction, and also handle the fact that the
operation truncates so the boundary works differently for min vs max.

Instead of implementing this safety check in Sema, there are now
dedicated AIR instructions for safety-checked intfromfloat (two
instructions; which one is used depends on the float mode). Currently,
no backend directly implements them; instead, a `Legalize.Feature` is
added which expands the safety check, and this feature is enabled for
all backends we currently test, including the LLVM backend.

The `u0` case is still handled in Sema, because Sema needs to check for
that anyway due to the comptime-known result. The old safety check here
was also completely broken and has therefore been rewritten. In that
case, we just check for 'abs(input) < 1.0'.

I've added a bunch of test coverage for the boundary cases of
`@intFromFloat`, both for successes (in `test/behavior/cast.zig`) and
failures (in `test/cases/safety/`).

Resolves: #24161
2025-06-15 14:15:18 -04:00
Jacob Young
6e72026e3b Legalize: make the feature set comptime-known in zig1
This allows legalizations to be added that aren't used by zig1 without
affecting the size of zig1.
2025-06-15 11:42:03 -04:00
Jacob Young
c95b1bf2d3
x86_64: remove air references from mir 2025-06-12 13:55:41 +01:00
Jacob Young
0bf8617d96 x86_64: add support for pie executables 2025-06-06 23:42:14 -07:00
Jacob Young
80170d017b Legalize: handle packed semantics
Closes #22915
2025-06-03 15:04:43 -04:00
mlugg
c1a5caa454
compiler: combine @intCast safety checks
`castTruncatedData` was a poorly worded error (all shrinking casts
"truncate bits", it's just that we assume those bits to be zext/sext of
the other bits!), and `negativeToUnsigned` was a pointless distinction
which forced the compiler to emit worse code (since two separate safety
checks were required for casting e.g. 'i32' to 'u16') and wasn't even
implemented correctly. This commit combines those safety panics into one
function, `integerOutOfBounds`. The name maybe isn't perfect, but that's
not hugely important; what matters is the new default message, which is
clearer than the old ones: "integer does not fit in destination type".
2025-06-01 12:10:57 +01:00
Jacob Young
9edfccb9a7
Legalize: implement scalarization of overflow intrinsics 2025-06-01 08:24:01 +01:00
Jacob Young
ec579aa0f3
Legalize: implement scalarization of @shuffle 2025-06-01 08:24:01 +01:00
mlugg
add2976a9b
compiler: implement better shuffle AIR
Runtime `@shuffle` has two cases which backends generally want to handle
differently for efficiency:

* One runtime vector operand; some result elements may be comptime-known
* Two runtime vector operands; some result elements may be undefined

The latter case happens if both vectors given to `@shuffle` are
runtime-known and they are both used (i.e. the mask refers to them).
Otherwise, if the result is not entirely comptime-known, we are in the
former case. `Sema` now diffentiates these two cases in the AIR so that
backends can easily handle them however they want to. Note that this
*doesn't* really involve Sema doing any more work than it would
otherwise need to, so there's not really a negative here!

Most existing backends have their lowerings for `@shuffle` migrated in
this commit. The LLVM backend uses new lowerings suggested by Jacob as
ones which it will handle effectively. The x86_64 backend has not yet
been migrated; for now there's a panic in there. Jacob will implement
that before this is merged anywhere.
2025-06-01 08:24:01 +01:00
Jacob Young
b48d6ff619
Legalize: implement scalarization of @select 2025-06-01 08:24:01 +01:00
Jacob Young
32a57bfeaa
Legalize: update for new Block API 2025-06-01 08:24:01 +01:00
mlugg
4c4dacf81a
Legalize: replace safety_checked_instructions
This adds 4 `Legalize.Feature`s:
* `expand_intcast_safe`
* `expand_add_safe`
* `expand_sub_safe`
* `expand_mul_safe`

These do pretty much what they say on the tin. This logic was previously
in Sema, used when `Zcu.Feature.safety_checked_instructions` was not
supported by the backend. That `Zcu.Feature` has been removed in favour
of this legalization.
2025-06-01 08:24:00 +01:00
Jacob Young
77e6513030 cbe: implement stdbool.h reserved identifiers
Also remove the legalize pass from zig1.
2025-05-31 18:54:28 -04:00
Jacob Young
6198f7afb7 Sema: remove all_vector_instructions logic
Backends can instead ask legalization on a per-instruction basis.
2025-05-31 18:54:28 -04:00
Jacob Young
b483defc5a Legalize: implement scalarization of binary operations 2025-05-31 18:54:28 -04:00
Jacob Young
c1e9ef9eaa Legalize: implement scalarization of unary operations 2025-05-31 18:54:28 -04:00
Jacob Young
c04be630d9 Legalize: introduce a new pass before liveness
Each target can opt into different sets of legalize features.
By performing these transformations before liveness, instructions
that become unreferenced will have up-to-date liveness information.
2025-05-29 03:57:48 -04:00
mlugg
92c63126e8 compiler: tlv pointers are not comptime-known
Pointers to thread-local variables do not have their addresses known
until runtime, so it is nonsensical for them to be comptime-known. There
was logic in the compiler which was essentially attempting to treat them
as not being comptime-known despite the pointer being an interned value.
This was a bit of a mess, the check was frequent enough to actually show
up in compiler profiles, and it was very awkward for backends to deal
with, because they had to grapple with the fact that a "constant" they
were lowering might actually require runtime operations.

So, instead, do not consider these pointers to be comptime-known in
*any* way. Never intern such a pointer; instead, when the address of a
threadlocal is taken, emit an AIR instruction which computes the pointer
at runtime. This avoids lots of special handling for TLVs across
basically all codegen backends; of all somewhat-functional backends, the
only one which wasn't improved by this change was the LLVM backend,
because LLVM pretends this complexity around threadlocals doesn't exist.

This change simplifies Sema and codegen, avoids a potential source of
bugs, and potentially improves Sema performance very slightly by
avoiding a non-trivial check on a hot path.
2025-05-27 19:23:11 +01:00
dweiller
898ca82458 compiler: add @memmove builtin 2025-04-26 13:34:16 +10:00
Jacob Young
470e2b63d9 Dwarf: handle undefined type values
Closes #23461
2025-04-05 21:42:33 -04:00
Jacob Young
afa74c6b21 Sema: introduce all_vector_instructions backend feature
Sema is arbitrarily scalarizing some operations, which means that when I
try to implement vectorized versions of those operations in a backend,
they are impossible to test due to Sema not producing them. Now, I can
implement them and then temporarily enable the new feature for that
backend in order to test them. Once the backend supports all of them,
the feature can be permanently enabled.

This also deletes the Air instructions `int_from_bool` and
`int_from_ptr`, which are just bitcasts with a fixed result type, since
changing `un_op` to `ty_op` takes up the same amount of memory.
2025-01-31 23:00:34 -05:00
mlugg
b01d6b156c compiler: add intcast_safe AIR instruction
This instruction is like `intcast`, but includes two safety checks:

* Checks that the int is in range of the destination type
* If the destination type is an exhaustive enum, checks that the int
  is a named enum value

This instruction is locked behind the `safety_checked_instructions`
backend feature; if unsupported, Sema will emit a fallback, as with
other safety-checked instructions.

This instruction is used to add a missing safety check for `@enumFromInt`
truncating bits. This check also has a fallback for backends which do
not yet support `safety_checked_instructions`.

Resolves: #21946
2025-01-30 14:47:59 +00:00
mlugg
0ec6b2dd88 compiler: simplify generic functions, fix issues with inline calls
The original motivation here was to fix regressions caused by #22414.
However, while working on this, I ended up discussing a language
simplification with Andrew, which changes things a little from how they
worked before #22414.

The main user-facing change here is that any reference to a prior
function parameter, even if potentially comptime-known at the usage
site or even not analyzed, now makes a function generic. This applies
even if the parameter being referenced is not a `comptime` parameter,
since it could still be populated when performing an inline call. This
is a breaking language change.

The detection of this is done in AstGen; when evaluating a parameter
type or return type, we track whether it referenced any prior parameter,
and if so, we mark this type as being "generic" in ZIR. This will cause
Sema to not evaluate it until the time of instantiation or inline call.

A lovely consequence of this from an implementation perspective is that
it eliminates the need for most of the "generic poison" system. In
particular, `error.GenericPoison` is now completely unnecessary, because
we identify generic expressions earlier in the pipeline; this simplifies
the compiler and avoids redundant work. This also entirely eliminates
the concept of the "generic poison value". The only remnant of this
system is the "generic poison type" (`Type.generic_poison` and
`InternPool.Index.generic_poison_type`). This type is used in two
places:

* During semantic analysis, to represent an unknown result type.
* When storing generic function types, to represent a generic parameter/return type.

It's possible that these use cases should instead use `.none`, but I
leave that investigation to a future adventurer.

One last thing. Prior to #22414, inline calls were a little inefficient,
because they re-evaluated even non-generic parameter types whenever they
were called. Changing this behavior is what ultimately led to #22538.
Well, because the new logic will mark a type expression as generic if
there is any change its resolved type could differ in an inline call,
this redundant work is unnecessary! So, this is another way in which the
new design reduces redundant work and complexity.

Resolves: #22494
Resolves: #22532
Resolves: #22538
2025-01-21 02:41:42 +00:00
Jacob Young
c894ac09a3 dwarf: fix stepping through an inline loop containing one statement
Previously, stepping from the single statement within the loop would
always exit the loop because all of the code unrolled from the loop is
associated with the same line and treated by the debugger as one line.
2024-11-24 17:28:12 -05:00
mlugg
d11bbde5f9
compiler: remove anonymous struct types, unify all tuples
This commit reworks how anonymous struct literals and tuples work.

Previously, an untyped anonymous struct literal
(e.g. `const x = .{ .a = 123 }`) was given an "anonymous struct type",
which is a special kind of struct which coerces using structural
equivalence. This mechanism was a holdover from before we used
RLS / result types as the primary mechanism of type inference. This
commit changes the language so that the type assigned here is a "normal"
struct type. It uses a form of equivalence based on the AST node and the
type's structure, much like a reified (`@Type`) type.

Additionally, tuples have been simplified. The distinction between
"simple" and "complex" tuple types is eliminated. All tuples, even those
explicitly declared using `struct { ... }` syntax, use structural
equivalence, and do not undergo staged type resolution. Tuples are very
restricted: they cannot have non-`auto` layouts, cannot have aligned
fields, and cannot have default values with the exception of `comptime`
fields. Tuples currently do not have optimized layout, but this can be
changed in the future.

This change simplifies the language, and fixes some problematic
coercions through pointers which led to unintuitive behavior.

Resolves: #16865
2024-10-31 20:42:53 +00:00
David Rubin
043b1adb8d
remove @fence (#21585)
closes #11650
2024-10-04 22:21:27 +00:00
mlugg
5e12ca9fe3
compiler: implement labeled switch/continue 2024-09-01 18:30:31 +01:00
mlugg
5fb4a7df38
Air: add explicit repeat instruction to repeat loops
This commit introduces a new AIR instruction, `repeat`, which causes
control flow to move back to the start of a given AIR loop. `loop`
instructions will no longer automatically perform this operation after
control flow reaches the end of the body.

The motivation for making this change now was really just consistency
with the upcoming implementation of #8220: it wouldn't make sense to
have this feature work significantly differently. However, there were
already some TODOs kicking around which wanted this feature. It's useful
for two key reasons:

* It allows loops over AIR instruction bodies to loop precisely until
  they reach a `noreturn` instruction. This allows for tail calling a
  few things, and avoiding a range check on each iteration of a hot
  path, plus gives a nice assertion that validates AIR structure a
  little. This is a very minor benefit, which this commit does apply to
  the LLVM and C backends.

* It should allow for more compact ZIR and AIR to be emitted by having
  AstGen emit `repeat` instructions more often rather than having
  `continue` statements `break` to a `block` which is *followed* by a
  `repeat`. This is done in status quo because `repeat` instructions
  only ever cause the direct parent block to repeat. Now that AIR is
  more flexible, this flexibility can be pretty trivially extended to
  ZIR, and we can then emit better ZIR. This commit does not implement
  this.

Support for this feature is currently regressed on all self-hosted
native backends, including x86_64. This support will be added where
necessary before this branch is merged.
2024-09-01 18:30:31 +01:00
mlugg
1b000b90c9
Air: direct representation of ranges in switch cases
This commit modifies the representation of the AIR `switch_br`
instruction to represent ranges in cases. Previously, Sema emitted
different AIR in the case of a range, where the `else` branch of the
`switch_br` contained a simple `cond_br` for each such case which did a
simple range check (`x > a and x < b`). Not only does this add
complexity to Sema, which we would like to minimize, but it also gets in
the way of the implementation of #8220. That proposal turns certain
`switch` statements into a looping construct, and for optimization
purposes, we want to lower this to AIR fairly directly (i.e. without
involving a `loop` instruction). That means we would ideally like a
single instruction to represent the entire `switch` statement, so that
we can dispatch back to it with a different operand as in #8220. This is
not really possible to do correctly under the status quo system.

This commit implements lowering of this new `switch_br` usage in the
LLVM and C backends. The C backend just turns any case containing ranges
entirely into conditionals, as before. The LLVM backend is a little
smarter, and puts scalar items into the `switch` instruction, only using
conditionals for the range cases (which direct to the same bb). All
remaining self-hosted backends are temporarily regressed in the presence
of switch range cases. This functionality will be restored for at least
the x86_64 backend before merge.
2024-09-01 18:30:31 +01:00
mlugg
0fe3fd01dd
std: update std.builtin.Type fields to follow naming conventions
The compiler actually doesn't need any functional changes for this: Sema
does reification based on the tag indices of `std.builtin.Type` already!
So, no zig1.wasm update is necessary.

This change is necessary to disallow name clashes between fields and
decls on a type, which is a prerequisite of #9938.
2024-08-28 08:39:59 +01:00
Jacob Young
26d4fd5276 Zcu: avoid trying to link failed container types and contained navs 2024-08-27 02:09:59 -04:00
mlugg
457c94d353
compiler: implement @branchHint, replacing @setCold
Implements the accepted proposal to introduce `@branchHint`. This
builtin is permitted as the first statement of a block if that block is
the direct body of any of the following:

* a function (*not* a `test`)
* either branch of an `if`
* the RHS of a `catch` or `orelse`
* a `switch` prong
* an `or` or `and` expression

It lowers to the ZIR instruction `extended(branch_hint(...))`. When Sema
encounters this instruction, it sets `sema.branch_hint` appropriately,
and `zirCondBr` etc are expected to reset this value as necessary. The
state is on `Sema` rather than `Block` to make it automatically
propagate up non-conditional blocks without special handling. If
`@panic` is reached, the branch hint is set to `.cold` if none was
already set; similarly, error branches get a hint of `.unlikely` if no
hint is explicitly provided. If a condition is comptime-known, `cold`
hints from the taken branch are allowed to propagate up, but other hints
are discarded. This is because a `likely`/`unlikely` hint just indicates
the direction this branch is likely to go, which is redundant
information when the branch is known at comptime; but `cold` hints
indicate that control flow is unlikely to ever reach this branch,
meaning if the branch is always taken from its parent, then the parent
is also unlikely to ever be reached.

This branch information is stored in AIR `cond_br` and `switch_br`. In
addition, `try` and `try_ptr` instructions have variants `try_cold` and
`try_ptr_cold` which indicate that the error case is cold (rather than
just unlikely); this is reachable through e.g. `errdefer unreachable` or
`errdefer @panic("")`.

A new API `unwrapSwitch` is introduced to `Air` to make it more
convenient to access `switch_br` instructions. In time, I plan to update
all AIR instructions to be accessed via an `unwrap` method which returns
a convenient tagged union a la `InternPool.indexToKey`.

The LLVM backend lowers branch hints for conditional branches and
switches as follows:

* If any branch is marked `unpredictable`, the instruction is marked
  `!unpredictable`.
* Any branch which is marked as `cold` gets a
  `llvm.assume(i1 true) [ "cold"() ]` call to mark the code path cold.
* If any branch is marked `likely` or `unlikely`, branch weight metadata
  is attached with `!prof`. Likely branches get a weight of 2000, and
  unlikely branches a weight of 1. In `switch` statements, un-annotated
  branches get a weight of 1000 as a "middle ground" hint, since there
  could be likely *and* unlikely *and* un-annotated branches.

For functions, a `cold` hint corresponds to the `cold` function
attribute, and other hints are currently ignored -- as far as I can tell
LLVM doesn't really have a way to lower them. (Ideally, we would want
the branch hint given in the function to propagate to call sites.)

The compiler and standard library do not yet use this new builtin.

Resolves: #21148
2024-08-27 00:41:49 +01:00
Jacob Young
62f7276501 Dwarf: emit info about inline call sites 2024-08-20 08:09:33 -04:00
Jacob Young
a1053e8e1d InternPool: add and use a mutate mutex for each list
This allows the mutate mutex to only be locked during actual grows,
which are rare. For the lists that didn't previously have a mutex, this
change has little effect since grows are rare and there is zero
contention on a mutex that is only ever locked by one thread.  This
change allows `extra` to be mutated without racing with a grow.
2024-07-13 04:47:38 -04:00
mlugg
0e5335aaf5
compiler: rework type resolution, fully resolve all types
I'm so sorry.

This commit was just meant to be making all types fully resolve by
queueing resolution at the moment of their creation. Unfortunately, a
lot of dominoes ended up falling. Here's what happened:

* I added a work queue job to fully resolve a type.
* I realised that from here we could eliminate `Sema.types_to_resolve`
  if we made function codegen a separate job. This is desirable for
  simplicity of both spec and implementation.
* This led to a new AIR traversal to detect whether any required type is
  unresolved. If a type in the AIR failed to resolve, then we can't run
  codegen.
* Because full type resolution now occurs by the work queue job, a bug
  was exposed whereby error messages for type resolution were associated
  with the wrong `Decl`, resulting in duplicate error messages when the
  type was also resolved "by" its owner `Decl` (which really *all*
  resolution should be done on).
* A correct fix for this requires using a different `Sema` when
  performing type resolution: we need a `Sema` owned by the type. Also
  note that this fix is necessary for incremental compilation.
* This means a whole bunch of functions no longer need to take `Sema`s.
  * First-order effects: `resolveTypeFields`, `resolveTypeLayout`, etc
  * Second-order effects: `Type.abiAlignmentAdvanced`, `Value.orderAgainstZeroAdvanced`, etc

The end result of this is, in short, a more correct compiler and a
simpler language specification. This regressed a few error notes in the
test cases, but nothing that seems worth blocking this change.

Oh, also, I ripped out the old code in `test/src/Cases.zig` which
introduced a dependency on `Compilation`. This dependency was
problematic at best, and this code has been unused for a while. When we
re-enable incremental test cases, we must rewrite their executor to use
the compiler server protocol.
2024-07-04 21:01:42 +01:00