1104 Commits

Author SHA1 Message Date
Alex Rønne Petersen
aa4ac2f85f
std.builtin: Rename CallingConvention.propeller1_sysv to propeller_sysv. 2025-02-17 19:17:56 +01:00
Alex Rønne Petersen
e0f8d4e68e
std.builtin: Rename CallingConvention.wasm_watc to wasm_mvp. 2025-02-17 19:17:56 +01:00
Alex Rønne Petersen
9c015e6c2b
std.builtin: Remove CallingConvention.arm_(apcs,aapcs16_vfp).
* arm_apcs is the long dead "OABI" which we never had working support for.
* arm_aapcs16_vfp is for arm-watchos-none which is a dead target that we've
  dropped support for.
2025-02-17 19:17:56 +01:00
Alex Rønne Petersen
2fe32ef847
std.Target: Remove Cpu.Arch.propeller2 and use a CPU feature instead. 2025-02-17 19:17:55 +01:00
Alex Rønne Petersen
b541a7af11
std.Target: Remove Cpu.Arch.spu_2.
This was for a hobby project that appears to be dormant for now. This can be
added back if the project is resumed in the future.
2025-02-17 19:17:55 +01:00
Jacob Young
afa74c6b21 Sema: introduce all_vector_instructions backend feature
Sema is arbitrarily scalarizing some operations, which means that when I
try to implement vectorized versions of those operations in a backend,
they are impossible to test due to Sema not producing them. Now, I can
implement them and then temporarily enable the new feature for that
backend in order to test them. Once the backend supports all of them,
the feature can be permanently enabled.

This also deletes the Air instructions `int_from_bool` and
`int_from_ptr`, which are just bitcasts with a fixed result type, since
changing `un_op` to `ty_op` takes up the same amount of memory.
2025-01-31 23:00:34 -05:00
mlugg
b01d6b156c compiler: add intcast_safe AIR instruction
This instruction is like `intcast`, but includes two safety checks:

* Checks that the int is in range of the destination type
* If the destination type is an exhaustive enum, checks that the int
  is a named enum value

This instruction is locked behind the `safety_checked_instructions`
backend feature; if unsupported, Sema will emit a fallback, as with
other safety-checked instructions.

This instruction is used to add a missing safety check for `@enumFromInt`
truncating bits. This check also has a fallback for backends which do
not yet support `safety_checked_instructions`.

Resolves: #21946
2025-01-30 14:47:59 +00:00
mlugg
83991efe10
compiler: yet more panic handler changes
* `std.builtin.Panic` -> `std.builtin.panic`, because it is a namespace.
* `root.Panic` -> `root.panic` for the same reason. There are type
  checks so that we still allow the legacy `pub fn panic` strategy in
  the 0.14.0 release.
* `std.debug.SimplePanic` -> `std.debug.simple_panic`, same reason.
* `std.debug.NoPanic` -> `std.debug.no_panic`, same reason.
* `std.debug.FormattedPanic` is now a function `std.debug.FullPanic`
  which takes as input a `panicFn` and returns a namespace with all the
  panic functions. This handles the incredibly common case of just
  wanting to override how the message is printed, whilst keeping nice
  formatted panics.
* Remove `std.builtin.panic.messages`; now, every safety panic has its
  own function. This reduces binary bloat, as calls to these functions
  no longer need to prepare any arguments (aside from the error return
  trace).
* Remove some legacy declarations, since a zig1.wasm update has
  happened. Most of these were related to the panic handler, but a quick
  grep for "zig1" brought up a couple more results too.

Also, add some missing type checks to Sema.

Resolves: #22584

formatted -> full
2025-01-24 19:29:51 +00:00
mlugg
1bce01de97 compiler: pass error return traces everywhere 2025-01-22 02:22:56 -05:00
mlugg
0ec6b2dd88 compiler: simplify generic functions, fix issues with inline calls
The original motivation here was to fix regressions caused by #22414.
However, while working on this, I ended up discussing a language
simplification with Andrew, which changes things a little from how they
worked before #22414.

The main user-facing change here is that any reference to a prior
function parameter, even if potentially comptime-known at the usage
site or even not analyzed, now makes a function generic. This applies
even if the parameter being referenced is not a `comptime` parameter,
since it could still be populated when performing an inline call. This
is a breaking language change.

The detection of this is done in AstGen; when evaluating a parameter
type or return type, we track whether it referenced any prior parameter,
and if so, we mark this type as being "generic" in ZIR. This will cause
Sema to not evaluate it until the time of instantiation or inline call.

A lovely consequence of this from an implementation perspective is that
it eliminates the need for most of the "generic poison" system. In
particular, `error.GenericPoison` is now completely unnecessary, because
we identify generic expressions earlier in the pipeline; this simplifies
the compiler and avoids redundant work. This also entirely eliminates
the concept of the "generic poison value". The only remnant of this
system is the "generic poison type" (`Type.generic_poison` and
`InternPool.Index.generic_poison_type`). This type is used in two
places:

* During semantic analysis, to represent an unknown result type.
* When storing generic function types, to represent a generic parameter/return type.

It's possible that these use cases should instead use `.none`, but I
leave that investigation to a future adventurer.

One last thing. Prior to #22414, inline calls were a little inefficient,
because they re-evaluated even non-generic parameter types whenever they
were called. Changing this behavior is what ultimately led to #22538.
Well, because the new logic will mark a type expression as generic if
there is any change its resolved type could differ in an inline call,
this redundant work is unnecessary! So, this is another way in which the
new design reduces redundant work and complexity.

Resolves: #22494
Resolves: #22532
Resolves: #22538
2025-01-21 02:41:42 +00:00
Jacob Young
5cfcb01503 llvm: convert @divFloor and @mod to forms llvm will recognize
On x86_64, the `@divFloor` change is a strict improvement, and the
`@mod` change adds one zero latency instruction.  In return, once we
upgrade to LLVM 20, when the optimizer discovers one of these operations
has a power-of-two constant rhs, it will be able to optimize the entire
operation into an `ashr` or `and`, respectively.

                      #I   CPL   CPT
    old `@divFloor` |  8 | 15 | .143 |
    new `@divFloor` |  7 | 15 | .148 |
    old `@mod`      |  9 | 17 | .134 | (rip llvm
    new `@mod`      | 10 | 17 | .138 |  scheduler)
2025-01-19 22:10:39 -05:00
Jacob Young
b9c4400776 x86_64: implement fallback for pcmpeqq 2025-01-16 20:42:08 -05:00
mlugg
d00e05f186
all: update to std.builtin.Type.Pointer.Size field renames
This was done by regex substitution with `sed`. I then manually went
over the entire diff and fixed any incorrect changes.

This diff also changes a lot of `callconv(.C)` to `callconv(.c)`, since
my regex happened to also trigger here. I opted to leave these changes
in, since they *are* a correct migration, even if they're not the one I
was trying to do!
2025-01-16 12:46:29 +00:00
Andrew Kelley
e521879e47 rewrite wasm/Emit.zig
mainly, rework how relocations works. This is the point at which symbol
indexes are known - not before. And don't emit unnecessary relocations!
They're only needed when emitting an object file.

Changes wasm linker to keep MIR around long-lived so that fixups can be
reapplied after linker garbage collection.

use labeled switch while we're at it
2025-01-15 15:11:35 -08:00
Andrew Kelley
943dac3e85 compiler: add type safety for export indices 2025-01-15 15:11:35 -08:00
Andrew Kelley
795e7c64d5 wasm linker: aggressive DODification
The goals of this branch are to:
* compile faster when using the wasm linker and backend
* enable saving compiler state by directly copying in-memory linker
  state to disk.
* more efficient compiler memory utilization
* introduce integer type safety to wasm linker code
* generate better WebAssembly code
* fully participate in incremental compilation
* do as much work as possible outside of flush(), while continuing to do
  linker garbage collection.
* avoid unnecessary heap allocations
* avoid unnecessary indirect function calls

In order to accomplish this goals, this removes the ZigObject
abstraction, as well as Symbol and Atom. These abstractions resulted
in overly generic code, doing unnecessary work, and needless
complications that simply go away by creating a better in-memory data
model and emitting more things lazily.

For example, this makes wasm codegen emit MIR which is then lowered to
wasm code during linking, with optimal function indexes etc, or
relocations are emitted if outputting an object. Previously, this would
always emit relocations, which are fully unnecessary when emitting an
executable, and required all function calls to use the maximum size LEB
encoding.

This branch introduces the concept of the "prelink" phase which occurs
after all object files have been parsed, but before any Zcu updates are
sent to the linker. This allows the linker to fully parse all objects
into a compact memory model, which is guaranteed to be complete when Zcu
code is generated.

This commit is not a complete implementation of all these goals; it is
not even passing semantic analysis.
2025-01-15 15:11:35 -08:00
mlugg
4b910e525d Sema: more validation for builtin decl types
Also improve the source locations when this validation fails.

Resolves: #22465
2025-01-14 22:44:18 +00:00
Travis Lange
82e7f23c49 Added support for thin lto 2025-01-05 18:08:11 +01:00
mlugg
b039a8b615 compiler: slightly simplify builtin decl memoization
Rather than `Zcu.BuiltinDecl.Memoized` being a struct with fields, it
can instead just be an array, indexed by the enum. This allows runtime
indexing, avoiding a few now-unnecessary `inline` switch cases.
2025-01-05 05:52:02 +00:00
mlugg
f01029c4af
incremental: new AnalUnit to group dependencies on std.builtin decls
This commit reworks how values like the panic handler function are
memoized during a compiler invocation. Previously, the value was
resolved by whichever analysis requested it first, and cached on `Zcu`.
This is problematic for incremental compilation, as after the initial
resolution, no dependencies are marked by users of this memoized state.
This is arguably acceptable for `std.builtin`, but it's definitely not
acceptable for the panic handler/messages, because those can be set by
the user (`std.builtin.Panic` checks `@import("root").Panic`).

So, here we introduce a new kind of `AnalUnit`, called `memoized_state`.
There are 3 such units:
* `.{ .memoized_state = .va_list }` resolves the type `std.builtin.VaList`
* `.{ .memoized_state = .panic }` resolves `std.Panic`
* `.{ .memoized_state = .main }` resolves everything else we want

These units essentially "bundle" the resolution of their corresponding
declarations, storing the results into fields on `Zcu`. This way, when,
for instance, a function wants to call the panic handler, it simply runs
`ensureMemoizedStateResolved`, registering one dependency, and pulls the
values from the `Zcu`. This "bundling" minimizes dependency edges. The 3
units are separated to allow them to act independently: for instance,
the panic handler can use `std.builtin.Type` without triggering a
dependency loop.
2025-01-04 07:51:19 +00:00
mlugg
3afda4322c
compiler: analyze type and value of global declaration separately
This commit separates semantic analysis of the annotated type vs value
of a global declaration, therefore allowing recursive and mutually
recursive values to be declared.

Every `Nav` which undergoes analysis now has *two* corresponding
`AnalUnit`s: `.{ .nav_val = n }` and `.{ .nav_ty = n }`. The `nav_val`
unit is responsible for *fully resolving* the `Nav`: determining its
value, linksection, addrspace, etc. The `nav_ty` unit, on the other
hand, resolves only the information necessary to construct a *pointer*
to the `Nav`: its type, addrspace, etc. (It does also analyze its
linksection, but that could be moved to `nav_val` I think; it doesn't
make any difference).

Analyzing a `nav_ty` for a declaration with no type annotation will just
mark a dependency on the `nav_val`, analyze it, and finish. Conversely,
analyzing a `nav_val` for a declaration *with* a type annotation will
first mark a dependency on the `nav_ty` and analyze it, using this as
the result type when evaluating the value body.

The `nav_val` and `nav_ty` units always have references to one another:
so, if a `Nav`'s type is referenced, its value implicitly is too, and
vice versa. However, these dependencies are trivial, so, to save memory,
are only known implicitly by logic in `resolveReferences`.

In general, analyzing ZIR `decl_val` will only analyze `nav_ty` of the
corresponding `Nav`. There are two exceptions to this. If the
declaration is an `extern` declaration, then we immediately ensure the
`Nav` value is resolved (which doesn't actually require any more
analysis, since such a declaration has no value body anyway).
Additionally, if the resolved type has type tag `.@"fn"`, we again
immediately resolve the `Nav` value. The latter restriction is in place
for two reasons:

* Functions are special, in that their externs are allowed to trivially
  alias; i.e. with a declaration `extern fn foo(...)`, you can write
  `const bar = foo;`. This is not allowed for non-function externs, and
  it means that function types are the only place where it is possible
  for a declaration `Nav` to have a `.@"extern"` value without actually
  being declared `extern`. We need to identify this situation
  immediately so that the `decl_ref` can create a pointer to the *real*
  extern `Nav`, not this alias.
* In certain situations, such as taking a pointer to a `Nav`, Sema needs
  to queue analysis of a runtime function if the value is a function. To
  do this, the function value needs to be known, so we need to resolve
  the value immediately upon `&foo` where `foo` is a function.

This restriction is simple to codify into the eventual language
specification, and doesn't limit the utility of this feature in
practice.

A consequence of this commit is that codegen and linking logic needs to
be more careful when looking at `Nav`s. In general:

* When `updateNav` or `updateFunc` is called, it is safe to assume that
  the `Nav` being updated (the owner `Nav` for `updateFunc`) is fully
  resolved.
* Any `Nav` whose value is/will be an `@"extern"` or a function is fully
  resolved; see `Nav.getExtern` for a helper for a common case here.
* Any other `Nav` may only have its type resolved.

This didn't seem to be too tricky to satisfy in any of the existing
codegen/linker backends.

Resolves: #131
2024-12-24 02:18:41 +00:00
mlugg
18362ebe13
Zir: refactor declaration instruction representation
The new representation is often more compact. It is also more
straightforward to understand: for instance, `extern` is represented on
the `declaration` instruction itself rather than using a special
instruction. The same applies to `var`, making both of these far more
compact.

This commit also separates the type and value bodies of a `declaration`
instruction. This is a prerequisite for #131.

In general, `declaration` now directly encodes details of the syntax
form used, and the embedded ZIR bodies are for actual expressions. The
only exception to this is functions, where ZIR is effectively designed
as if we had #1717. `extern fn` declarations are modeled as
`extern const` with a function type, and normal `fn` definitions are
modeled as `const` with a `func{,_fancy,_inferred}` instruction. This
may change in the future, but improving on this was out of scope for
this commit.
2024-12-23 21:09:17 +00:00
Alex Rønne Petersen
8af82621d7
compiler: Improve the handling of unwind table levels.
The goal here is to support both levels of unwind tables (sync and async) in
zig cc and zig build. Previously, the LLVM backend always used async tables
while zig cc was partially influenced by whatever was Clang's default.
2024-12-11 00:10:15 +01:00
Alex Rønne Petersen
4e29c67eed
llvm: Remove dead targetArch() and targetOs() functions.
These were leftovers from when we used the LLVM API to create modules.
2024-12-03 20:43:20 +01:00
Alex Rønne Petersen
09b39f77b7
std.Target: Remove Os.Tag.bridgeos.
It doesn't appear that targeting bridgeOS is meaningfully supported by Apple.
Even LLVM/Clang appear to have incomplete support for it, suggesting that Apple
never bothered to upstream that support. So there's really no sense in us
pretending to support this.
2024-12-03 20:43:15 +01:00
Alex Rønne Petersen
f0f2dc52cc
llvm: Lower ohoseabi to ohos instead of verbatim.
LLVM doesn't recognize ohoseabi.
2024-11-28 22:04:00 +01:00
Alex Rønne Petersen
8594f179f9
Merge pull request #22067 from alexrp/pie-tests
Add PIC/PIE tests and fix some bugs + some improvements to the test harness
2024-11-28 14:07:28 +01:00
Jacob Young
c894ac09a3 dwarf: fix stepping through an inline loop containing one statement
Previously, stepping from the single statement within the loop would
always exit the loop because all of the code unrolled from the loop is
associated with the same line and treated by the debugger as one line.
2024-11-24 17:28:12 -05:00
Alex Rønne Petersen
24ecf45569
std.Target: Add Os.HurdVersionRange for Os.Tag.hurd.
This is necessary since isGnuLibC() is true for hurd, so we need to be able to
represent a glibc version for it.

Also add an Os.TaggedVersionRange.gnuLibCVersion() convenience function.
2024-11-24 22:11:16 +01:00
Alex Rønne Petersen
c9052ef931
llvm: Disable lowering to f16 on sparc. 2024-11-08 14:57:16 +01:00
Alex Rønne Petersen
bdca2d0f48
llvm: Also apply the nobuiltin attribute for the no_builtin module option.
From `zig build-exe --help`:

  -fno-builtin              Disable implicit builtin knowledge of functions

It seems entirely reasonable and even expected that this option should imply
both no-builtins on functions (which disables transformation of recognized code
patterns to libcalls) and nobuiltin on call sites (which disables transformation
of libcalls to intrinsics). We now match Clang's behavior for -fno-builtin.

In both cases, we're painting with a fairly broad brush by applying this to an
entire module, but it's better than nothing. #21833 proposes a more fine-grained
way to apply nobuiltin.
2024-11-05 22:41:06 +01:00
Alex Rønne Petersen
b57819118d
Compilation: Move no_builtin to Package.Module.
This option, by its very nature, needs to be attached to a module. If it isn't,
the code in a module could break at random when compiled into an application
that doesn't have this option set.

After this change, skip_linker_dependencies no longer implies no_builtin in the
LLVM backend.
2024-11-05 14:43:02 +01:00
Alex Rønne Petersen
bd7dda0c55
Merge pull request #21907 from alexrp/valgrind-stuff
Add client request support for all architectures supported by Valgrind
2024-11-05 10:03:44 +01:00
Alex Rønne Petersen
bd8ef0036d llvm: Use no-builtins attribute instead of nobuiltin.
The former prevents recognizing code patterns and turning them into libcalls,
which is what we want for compiler-rt. The latter is meant to be used on call
sites to prevent them from being turned into intrinsics.

Context: https://github.com/ziglang/zig/issues/21833
2024-11-04 14:17:57 +01:00
Alex Rønne Petersen
8a73a965d3
llvm: Add client request support for all archs supported by Valgrind. 2024-11-04 13:53:20 +01:00
Alex Rønne Petersen
3054486d1d
Merge pull request #21843 from alexrp/callconv-followup
Some follow-up work for #21697
2024-11-03 14:27:09 +01:00
Alex Rønne Petersen
2f003f39b2
Merge pull request #21599 from alexrp/thumb-porting 2024-11-03 14:25:30 +01:00
Alex Rønne Petersen
947b7195bf llvm: Update the list of address spaces for LLVM 19.
Mainly affects amdgcn.
2024-11-03 13:44:38 +01:00
Alex Rønne Petersen
c9e67e71c1
std.Target: Replace isARM() with isArmOrThumb() and rename it to isArm().
The old isARM() function was a portability trap. With the name it had, it seemed
like the obviously correct function to use, but it didn't include Thumb. In the
vast majority of cases where someone wants to ask "is the target Arm?", Thumb
*should* be included.

There are exactly 3 cases in the codebase where we do actually need to exclude
Thumb, although one of those is in Aro and mirrors a check in Clang that is
itself likely a bug. These rare cases can just add an extra isThumb() check.
2024-11-03 09:29:30 +01:00
Alex Rønne Petersen
3a5142af8d
compiler: Handle arm_aapcs16_vfp alongside arm_aapcs_vfp in some places. 2024-11-02 10:44:18 +01:00
Alex Rønne Petersen
a5bc9cbb15
llvm: Fix lowering of gnuilp32 ABI to be gnu_ilp32.
LLVM doesn't even recognize the gnuilp32 spelling as an alternative.
2024-11-02 10:42:53 +01:00
Alex Rønne Petersen
270fbbcd86
std.Target: Add muslabin32 and muslabi64 tags to Abi.
Once we upgrade to LLVM 20, these should be lowered verbatim rather than to
simply musl. Similarly, the special case in llvmMachineAbi() should go away.
2024-11-02 10:42:53 +01:00
Alex Rønne Petersen
fccf15fc9f std.Target: Remove armv7k/armv7s.
Like d1d95294fd657f771657ea671a6984b860347fb0, this is more Apple nonsense where
they abused the arch component of the triple to encode what's really an ABI.

Handling this correctly in Zig's target triple model would take quite a bit of
work. Fortunately, the last Armv7-based Apple Watch was released in 2017 and
these targets are now considered legacy. By the time Zig hits 1.0, they will be
a distant memory. So just remove them.
2024-11-02 10:25:40 +01:00
Alex Rønne Petersen
3c1ccbdf39
std.Target: Add support for specifying Android API level. 2024-11-01 06:23:59 +01:00
Alex Rønne Petersen
c5395f7cd9
llvm: Set OS min version and libc version in ~all cases.
Except Windows, because that just doesn't really fit into LLVM's target triple
format currently.
2024-11-01 06:23:59 +01:00
Alex Rønne Petersen
60c0f522d9
llvm: Set vendor type in LLVM target triple for more OSs.
Annoyingly, LLVM and Clang have various checks throughout that depend on these
vendor types being set.
2024-11-01 06:23:59 +01:00
mlugg
d11bbde5f9
compiler: remove anonymous struct types, unify all tuples
This commit reworks how anonymous struct literals and tuples work.

Previously, an untyped anonymous struct literal
(e.g. `const x = .{ .a = 123 }`) was given an "anonymous struct type",
which is a special kind of struct which coerces using structural
equivalence. This mechanism was a holdover from before we used
RLS / result types as the primary mechanism of type inference. This
commit changes the language so that the type assigned here is a "normal"
struct type. It uses a form of equivalence based on the AST node and the
type's structure, much like a reified (`@Type`) type.

Additionally, tuples have been simplified. The distinction between
"simple" and "complex" tuple types is eliminated. All tuples, even those
explicitly declared using `struct { ... }` syntax, use structural
equivalence, and do not undergo staged type resolution. Tuples are very
restricted: they cannot have non-`auto` layouts, cannot have aligned
fields, and cannot have default values with the exception of `comptime`
fields. Tuples currently do not have optimized layout, but this can be
changed in the future.

This change simplifies the language, and fixes some problematic
coercions through pointers which led to unintuitive behavior.

Resolves: #16865
2024-10-31 20:42:53 +00:00
Alex Rønne Petersen
f1f804e532
zig_llvm: Reduce our exposure to LLVM API breakage.
LLVM recently introduced new Triple::ArchType members in 19.1.3 which broke our
static assertions in zig_llvm.cpp. When implementing a fix for that, I realized
that we don't even need a lot of the stuff we have in zig_llvm.(cpp,h) anymore.
This commit trims the interface down considerably.
2024-10-31 01:27:22 +01:00
David Rubin
93e5ea033c
implement new interrupts in the llvm backend 2024-10-27 13:33:27 -07:00
Alex Rønne Petersen
03d0e296cb
Merge pull request #21710 from alexrp/function-alignment
Some improvements to the compiler's handling of function alignment
2024-10-25 11:10:28 +02:00