1076 Commits

Author SHA1 Message Date
Mason Remaley
13c6eb0d71
compiler,std: implement ZON support
This commit allows using ZON (Zig Object Notation) in a few ways.

* `@import` can be used to load ZON at comptime and convert it to a
  normal Zig value. In this case, `@import` must have a result type.
* `std.zon.parse` can be used to parse ZON at runtime, akin to the
  parsing logic in `std.json`.
* `std.zon.stringify` can be used to convert arbitrary data structures
  to ZON at runtime, again akin to `std.json`.
2025-02-03 09:14:37 +00:00
Andrew Kelley
37037b269e frontend: use main Compilation code_model when building libxx
as well as libtsan, libunwind, and libc files
2025-01-27 17:07:56 -08:00
Matthew Lugg
3767b08039
Merge pull request #22602 from mlugg/incr-embedfile
incremental: handle `@embedFile`
2025-01-26 01:41:56 +00:00
Alex Rønne Petersen
46a1a15861
Compilation: Remove the _tls_index hack for MinGW.
No longer needed since we disable LTO for mingw32.
2025-01-25 14:58:44 +01:00
Alex Rønne Petersen
1e8b82f6b4
compiler: Rework LTO settings for some Zig-provided libraries.
* compiler-rt and mingw32 have both run into LLD bugs, and LLVM disables LTO for
  its compiler-rt, so disable LTO for these.
* While we haven't run into any bugs in it, LLVM disables LTO for its libtsan,
  so follow suit just to be safe.
* Allow LTO for libfuzzer as LLVM does.
2025-01-25 14:57:33 +01:00
mlugg
f47b8de2ad
incremental: handle @embedFile
Uses of `@embedFile` register dependencies on the corresponding
`Zcu.EmbedFile`. At the start of every update, we iterate all embedded
files and update them if necessary, and invalidate the dependencies if
they changed.

In order to properly integrate with the lazy analysis model, failed
embed files are now reported by the `AnalUnit` which actually used
`@embedFile`; the filesystem error is stored in the `Zcu.EmbedFile`.

An incremental test is added covering incremental updates to embedded
files, and I have verified locally that dependency invalidation is
working correctly.
2025-01-25 06:07:08 +00:00
Andrew Kelley
d3f4e9d4ac Compilation pipeline: repeat failed prelink tasks
and remove faulty assertion. When a prelink task fails, the
completed_prelink_tasks counter will not decrement.

A future improvement will be needed to make the pipeline fully robust
and handle failed prelink tasks, followed by updates in which those
tasks succeed, and compilation proceeds like normal.

Currently if a prelink task fails, the Compilation will be left in a
state unrecoverable by an incremental update.
2025-01-24 12:22:00 -08:00
Alex Rønne Petersen
ef4d7f01a5
compiler: Fix computation of Compilation.Config.any_unwind_tables.
This moves the default value logic to Package.Module.create() instead and makes
it so that Compilation.Config.any_unwind_tables is computed similarly to
any_sanitize_thread, any_fuzz, etc. It turns out that for any_unwind_tables, we
only actually care if unwind tables are enabled at all, not at what level.
2025-01-23 23:22:38 +00:00
Alex Rønne Petersen
280ced66eb
std.Target: Define and use lime1 as the baseline CPU model for WebAssembly.
See: https://github.com/WebAssembly/tool-conventions/pull/235

This is not *quite* using the same features as the spec'd lime1 model because
LLVM 19 doesn't have the level of feature granularity that we need for that.
This will be fixed once we upgrade to LLVM 20.

Part of #21818.
2025-01-22 03:01:05 +01:00
Andrew Kelley
874e17fe60 embrace the future slightly less
Turns out that even modern Debian aarch64 glibc libc_nonshared.a has
references to _init, meaning that the previous commit caused a
regression when trying to build any -lc executable on that target.

This commit backs out the changes to LibCInstallation.

There is still a fork in the road coming up when the self-hosted ELF
linker becomes load bearing on that target.
2025-01-20 20:59:52 -08:00
Andrew Kelley
f5485a52bc reject crti.o/crtn.o, embrace the future
crti.o/crtn.o is a legacy strategy for calling constructor functions
upon object loading that has been superseded by the
init_array/fini_array mechanism.

Zig code depends on neither, since the language intentionally has no way
to initialize data at runtime, but alas the Zig linker still must
support this feature since popular languages depend on it.

Anyway, the way it works is that crti.o has the machine code prelude of
two functions called _init and _fini, each in their own section with the
respective name. crtn.o has the machine code instructions comprising the
exitlude for each function. In between, objects use the .init and .fini
link section to populate the function body.

This function is then expected to be called upon object initialization
and deinitialization.

This mechanism is depended on by libc, for example musl and glibc, but
only for older ISAs. By the time the libcs gained support for newer
ISAs, they had moved on to the init_array/fini_array mechanism instead.

For the Zig linker, we are trying to move the linker towards
order-independent objects which is incompatible with the legacy
crti/crtn mechanism.

Therefore, this commit drops support entirely for crti/crtn mechanism,
which is necessary since the other commits in this branch make it
nondeterministic in which order the libc objects and the other link
inputs are sent to the linker.

The linker is still expected to produce a deterministic output, however,
by ignoring object input order for the purposes of symbol resolution.
2025-01-20 20:59:52 -08:00
Andrew Kelley
77af309cb6 Compilation: take advantage of @splat 2025-01-20 20:59:52 -08:00
Andrew Kelley
28da530271 Compilation pipeline: linker input producing Job representation
Move all the remaining Jobs that produce linker inputs to be spawned
earlier in the pipeline and registered with link_task_wait_group.
2025-01-20 20:59:52 -08:00
Andrew Kelley
966169fa68 compilation pipeline: do glibc jobs earlier 2025-01-20 20:59:52 -08:00
Andrew Kelley
ce00e91aa5 Compilation pipeline: do musl jobs earlier
This means doing more work in parallel which is already good, but it's
also a correctnes fix because we need link_task_wait_group.wait() to
ensure that no more linker inputs will be generated.
2025-01-20 20:59:52 -08:00
Jacob Young
af1191ea8b x86_64: rewrite 2025-01-16 20:42:07 -05:00
Andrew Kelley
e4868693a5 Compilation: windows doesn't prelink yet 2025-01-15 19:59:53 -08:00
Andrew Kelley
c535422423 wasm linker: implement hidden visibility 2025-01-15 15:11:36 -08:00
Andrew Kelley
f89ef2f7cd Compilation.saveState implement saving wasm linker state 2025-01-15 15:11:36 -08:00
Andrew Kelley
0dd0ebb6e2 frontend: don't increment remaining_prelink_tasks for windows implibs
yet because they aren't hooked up to the new linker API
2025-01-15 15:11:36 -08:00
Andrew Kelley
a4895f3c42 wasm object parsing: fix handling of weak functions and globals 2025-01-15 15:11:36 -08:00
Andrew Kelley
4fccb5ae7a wasm linker: improve error messages by making source locations more lazy 2025-01-15 15:11:36 -08:00
Andrew Kelley
9cd7cad42e Compilation: account for C objects and resources in prelink 2025-01-15 15:11:36 -08:00
Andrew Kelley
2dbf66dd69 wasm linker: implement stack pointer global 2025-01-15 15:11:36 -08:00
Andrew Kelley
d1cde847a3 implement the prelink phase in the frontend
this strategy uses a "postponed" queue to handle codegen tasks that
spawn too early. there's probably a better way.
2025-01-15 15:11:36 -08:00
Andrew Kelley
070b973c4a wasm linker: allow undefined imports when lib name is provided
and expose object_host_name as an option for setting the lib name for
object files, since the wasm linking standards don't specify a way to do
it.
2025-01-15 15:11:36 -08:00
Andrew Kelley
77accf597d elf linker: conform to explicit error sets 2025-01-15 15:11:35 -08:00
Andrew Kelley
795e7c64d5 wasm linker: aggressive DODification
The goals of this branch are to:
* compile faster when using the wasm linker and backend
* enable saving compiler state by directly copying in-memory linker
  state to disk.
* more efficient compiler memory utilization
* introduce integer type safety to wasm linker code
* generate better WebAssembly code
* fully participate in incremental compilation
* do as much work as possible outside of flush(), while continuing to do
  linker garbage collection.
* avoid unnecessary heap allocations
* avoid unnecessary indirect function calls

In order to accomplish this goals, this removes the ZigObject
abstraction, as well as Symbol and Atom. These abstractions resulted
in overly generic code, doing unnecessary work, and needless
complications that simply go away by creating a better in-memory data
model and emitting more things lazily.

For example, this makes wasm codegen emit MIR which is then lowered to
wasm code during linking, with optimal function indexes etc, or
relocations are emitted if outputting an object. Previously, this would
always emit relocations, which are fully unnecessary when emitting an
executable, and required all function calls to use the maximum size LEB
encoding.

This branch introduces the concept of the "prelink" phase which occurs
after all object files have been parsed, but before any Zcu updates are
sent to the linker. This allows the linker to fully parse all objects
into a compact memory model, which is guaranteed to be complete when Zcu
code is generated.

This commit is not a complete implementation of all these goals; it is
not even passing semantic analysis.
2025-01-15 15:11:35 -08:00
Jacob Young
02692ad78c cbe: fix miscomps of the compiler 2025-01-10 06:10:15 -05:00
Reuben Dunnington
a7a5f3506b fix win32 manifest ID for DLLs
* MSDN documentation page covering what resource IDs manifests should have:
  https://learn.microsoft.com/en-us/windows/win32/sbscs/using-side-by-side-assemblies-as-a-resource
* This change ensures shared libraries that embed win32 manifests use the
  proper ID of 2 instead of 1, which is only allowed for .exes. If the manifest
  uses the wrong ID, it will not be found and is essentially ignored.
2025-01-06 15:56:21 +01:00
Travis Lange
82e7f23c49 Added support for thin lto 2025-01-05 18:08:11 +01:00
mlugg
065e10c95c
link: new incremental line number update API 2025-01-05 02:20:56 +00:00
mlugg
f01029c4af
incremental: new AnalUnit to group dependencies on std.builtin decls
This commit reworks how values like the panic handler function are
memoized during a compiler invocation. Previously, the value was
resolved by whichever analysis requested it first, and cached on `Zcu`.
This is problematic for incremental compilation, as after the initial
resolution, no dependencies are marked by users of this memoized state.
This is arguably acceptable for `std.builtin`, but it's definitely not
acceptable for the panic handler/messages, because those can be set by
the user (`std.builtin.Panic` checks `@import("root").Panic`).

So, here we introduce a new kind of `AnalUnit`, called `memoized_state`.
There are 3 such units:
* `.{ .memoized_state = .va_list }` resolves the type `std.builtin.VaList`
* `.{ .memoized_state = .panic }` resolves `std.Panic`
* `.{ .memoized_state = .main }` resolves everything else we want

These units essentially "bundle" the resolution of their corresponding
declarations, storing the results into fields on `Zcu`. This way, when,
for instance, a function wants to call the panic handler, it simply runs
`ensureMemoizedStateResolved`, registering one dependency, and pulls the
values from the `Zcu`. This "bundling" minimizes dependency edges. The 3
units are separated to allow them to act independently: for instance,
the panic handler can use `std.builtin.Type` without triggering a
dependency loop.
2025-01-04 07:51:19 +00:00
mlugg
f0d5e0df4d
incremental: fix errors not being deleted upon re-analysis
Previously, logic in `Compilation.getAllErrorsAlloc` was corrupting the
`failed_analysis` hashmap. This meant that on updates after the initial
update, attempts to remove entries from this map (because the `AnalUnit`
in question is being re-analyzed) silently failed. This resulted in
compile errors from earlier updates wrongly getting "stuck", i.e. never
being removed.

This commit also adds a few log calls which helped me to find this bug.
2025-01-01 15:49:37 +00:00
mlugg
6026a5f217
compiler: ensure result of block_comptime is comptime-known
To avoid this PR regressing error messages, most of the work here has
gone towards improving error notes for why code was comptime-evaluated.
ZIR `block_comptime` now stores a "comptime reason", the enum for which
is also used by Sema. There are two types in Sema:

* `ComptimeReason` represents the reason we started evaluating something
  at comptime.
* `BlockComptimeReason` represents the reason a given block is evaluated
  at comptime; it's either a `ComptimeReason` with an attached source
  location, or it's because we're in a function which was called at
  comptime (and that function's `Block` should be consulted for the
  "parent" reason).

Every `Block` stores a `?BlockComptimeReason`. The old `is_comptime`
field is replaced with a trivial `isComptime()` method which returns
whether that reason is non-`null`.

Lastly, the handling for `block_comptime` has been simplified. It was
previously going through an unnecessary runtime-handling path; now, it
is a trivial sub block exited through a `break_inline` instruction.

Resolves: #22296
2024-12-31 09:55:03 +00:00
mlugg
3afda4322c
compiler: analyze type and value of global declaration separately
This commit separates semantic analysis of the annotated type vs value
of a global declaration, therefore allowing recursive and mutually
recursive values to be declared.

Every `Nav` which undergoes analysis now has *two* corresponding
`AnalUnit`s: `.{ .nav_val = n }` and `.{ .nav_ty = n }`. The `nav_val`
unit is responsible for *fully resolving* the `Nav`: determining its
value, linksection, addrspace, etc. The `nav_ty` unit, on the other
hand, resolves only the information necessary to construct a *pointer*
to the `Nav`: its type, addrspace, etc. (It does also analyze its
linksection, but that could be moved to `nav_val` I think; it doesn't
make any difference).

Analyzing a `nav_ty` for a declaration with no type annotation will just
mark a dependency on the `nav_val`, analyze it, and finish. Conversely,
analyzing a `nav_val` for a declaration *with* a type annotation will
first mark a dependency on the `nav_ty` and analyze it, using this as
the result type when evaluating the value body.

The `nav_val` and `nav_ty` units always have references to one another:
so, if a `Nav`'s type is referenced, its value implicitly is too, and
vice versa. However, these dependencies are trivial, so, to save memory,
are only known implicitly by logic in `resolveReferences`.

In general, analyzing ZIR `decl_val` will only analyze `nav_ty` of the
corresponding `Nav`. There are two exceptions to this. If the
declaration is an `extern` declaration, then we immediately ensure the
`Nav` value is resolved (which doesn't actually require any more
analysis, since such a declaration has no value body anyway).
Additionally, if the resolved type has type tag `.@"fn"`, we again
immediately resolve the `Nav` value. The latter restriction is in place
for two reasons:

* Functions are special, in that their externs are allowed to trivially
  alias; i.e. with a declaration `extern fn foo(...)`, you can write
  `const bar = foo;`. This is not allowed for non-function externs, and
  it means that function types are the only place where it is possible
  for a declaration `Nav` to have a `.@"extern"` value without actually
  being declared `extern`. We need to identify this situation
  immediately so that the `decl_ref` can create a pointer to the *real*
  extern `Nav`, not this alias.
* In certain situations, such as taking a pointer to a `Nav`, Sema needs
  to queue analysis of a runtime function if the value is a function. To
  do this, the function value needs to be known, so we need to resolve
  the value immediately upon `&foo` where `foo` is a function.

This restriction is simple to codify into the eventual language
specification, and doesn't limit the utility of this feature in
practice.

A consequence of this commit is that codegen and linking logic needs to
be more careful when looking at `Nav`s. In general:

* When `updateNav` or `updateFunc` is called, it is safe to assume that
  the `Nav` being updated (the owner `Nav` for `updateFunc`) is fully
  resolved.
* Any `Nav` whose value is/will be an `@"extern"` or a function is fully
  resolved; see `Nav.getExtern` for a helper for a common case here.
* Any other `Nav` may only have its type resolved.

This didn't seem to be too tricky to satisfy in any of the existing
codegen/linker backends.

Resolves: #131
2024-12-24 02:18:41 +00:00
mlugg
40aafcd6a8
compiler: remove Cau
The `Cau` abstraction originated from noting that one of the two primary
roles of the legacy `Decl` type was to be the subject of comptime
semantic analysis. However, the data stored in `Cau` has always had some
level of redundancy. While preparing for #131, I went to remove that
redundany, and realised that `Cau` now had exactly one field: `owner`.

This led me to conclude that `Cau` is, in fact, an unnecessary level of
abstraction over what are in reality *fundamentally different* kinds of
analysis unit (`AnalUnit`). Types, `Nav` vals, and `comptime`
declarations are all analyzed in different ways, and trying to treat
them as the same thing is counterproductive!

So, these 3 cases are now different alternatives in `AnalUnit`. To avoid
stealing bits from `InternPool`-based IDs, which are already a little
starved for bits due to the sharding datastructures, `AnalUnit` is
expanded to 64 bits (30 of which are currently unused). This doesn't
impact memory usage too much by default, because we don't store
`AnalUnit`s all too often; however, we do store them a lot under
`-fincremental`, so a non-trivial bump to peak RSS can be observed
there. This will be improved in the future when I made
`InternPool.DepEntry` less memory-inefficient.

`Zcu.PerThread.ensureCauAnalyzed` is split into 3 functions, for each of
the 3 new types of `AnalUnit`. The new logic is much easier to
understand, because it avoids conflating the logic of these
fundamentally different cases.
2024-12-24 02:18:41 +00:00
Jacob Young
5c76e08f49 lldb: add pretty printer for intern pool indices 2024-12-20 22:51:20 -05:00
Alex Rønne Petersen
5f34224b2b
zig cc: Remove broken CUDA C/C++ support. 2024-12-15 05:45:53 +01:00
Alex Rønne Petersen
39c4efa2a7
Compilation: Clean up addCCArgs().
The goal of this commit is to get rid of some "unused command line argument"
warnings that Clang would give for various file types previously. This cleanup
also has the side effect of making the order of flags more understandable,
especially as it pertains to include paths.

Since a lot of code was shuffled around in this commit, I recommend reviewing
the old and new versions of the function side-by-side rather than trying to make
sense of the diff.
2024-12-14 06:49:45 +01:00
Alex Rønne Petersen
d74e87aab1
Compilation: Use Clang dependency file for preprocessed assembly files. 2024-12-13 06:48:35 +01:00
Alex Rønne Petersen
12a289c1dd
Compilation: Use a better canonical file extension for header files. 2024-12-13 06:48:35 +01:00
Alex Rønne Petersen
88f324ebe7
Compilation: Override Clang's language type for header files.
Clang seems to treat them as linker input without this.
2024-12-13 06:48:32 +01:00
Alex Rønne Petersen
b6ece854c9
Compilation: Improve classification of various C/C++/Objective-C files. 2024-12-13 05:05:36 +01:00
Alex Rønne Petersen
130f7c2ed8
Merge pull request #22035 from alexrp/unwind-fixes
Better unwind table support + unwind protection in `_start()` and `clone()`
2024-12-13 03:09:24 +01:00
Andrew Kelley
7ff42eff91 std.Build.Cache.hit: work around macOS kernel bug
The previous commit cast doubt upon the initial report about macOS
kernel behavior, identifying another reason that ENOENT could be
returned from file creation.

However, it is demonstrable that ENOENT can be returned for both cases:
1. create file race
2. handle refers to deleted directory

This commit re-introduces the workaround for the file creation race on
macOS however it does not unconditionally retry - it first tries again
with O_EXCL to disambiguate the error condition that has occurred.
2024-12-11 11:56:44 -08:00
Andrew Kelley
d37ee79535 std.Build.Cache.hit: more discipline in error handling
Previous commits

2b0929929d67e222ca6a9523a3a594ed456c4a51
4ea2f441df36cec61e1017f4d795d4037326c98c

had this text:

> There are no dir components, so you would think that this was
> unreachable, however we have observed on macOS two processes racing to
> do openat() with O_CREAT manifest in ENOENT.

This appears to have been a misunderstanding based on the issue
report #12138 and corresponding PR #12139 in which the steps to
reproduce removed the cache directory in a loop which also executed
detached Zig compiler processes.

There is no evidence for the macOS kernel bug however the ENOENT is
easily explained by the removal of the cache directory.

This commit reverts those commits, ultimately reporting the ENOENT as an
error rather than repeating the create file operation. However this
commit also adds an explicit error set to `std.Build.Cache.hit` as well
as changing the `failed_file_index` to a proper diagnostic field that
fully communicates what failed, leading to more informative error
messages on failure to check the cache.

The equivalent failure when occuring for AstGen performs a fatal process
kill, reasoning being that the compiler has an invariant of the cache
directory not being yanked out from underneath it while executing. This
could be made a more granular error in the future but I suspect such
thing is not valuable to pursue.

Related to #18340 but does not solve it.
2024-12-10 18:11:12 -08:00
Alex Rønne Petersen
8af82621d7
compiler: Improve the handling of unwind table levels.
The goal here is to support both levels of unwind tables (sync and async) in
zig cc and zig build. Previously, the LLVM backend always used async tables
while zig cc was partially influenced by whatever was Clang's default.
2024-12-11 00:10:15 +01:00
Travis Lange
414fc3b207 fix unknown file extension with rmeta 2024-12-10 21:12:55 +01:00
Andrew Kelley
7575f21212
Merge pull request #22157 from mlugg/astgen-error-lazy
compiler: allow semantic analysis of files with AstGen errors
2024-12-09 18:32:23 -05:00