mirror/zig

mirror of https://github.com/ziglang/zig.git synced 2025-12-12 17:23:09 +00:00

Author	SHA1	Message	Date
Andrew Kelley	756a2dbf1a	compiler: upgrade various std.io API usage	2025-07-07 22:43:52 -07:00
Andrew Kelley	941bc37193	compiler: update all instances of std.fmt.Formatter	2025-07-07 22:43:52 -07:00
Jacob Young	917640810e	Target: pass and use locals by pointer instead of by value This struct is larger than 256 bytes and code that copies it consistently shows up in profiles of the compiler.	2025-06-19 11:45:06 -04:00
mlugg	6ffa285fc3	compiler: fix `@intFromFloat` safety check This safety check was completely broken; it triggered unchecked illegal behavior in order to implement the safety check. You definitely can't do that! Instead, we must explicitly check the boundaries. This is a tiny bit fiddly, because we need to make sure we do floating-point rounding in the correct direction, and also handle the fact that the operation truncates so the boundary works differently for min vs max. Instead of implementing this safety check in Sema, there are now dedicated AIR instructions for safety-checked intfromfloat (two instructions; which one is used depends on the float mode). Currently, no backend directly implements them; instead, a `Legalize.Feature` is added which expands the safety check, and this feature is enabled for all backends we currently test, including the LLVM backend. The `u0` case is still handled in Sema, because Sema needs to check for that anyway due to the comptime-known result. The old safety check here was also completely broken and has therefore been rewritten. In that case, we just check for 'abs(input) < 1.0'. I've added a bunch of test coverage for the boundary cases of `@intFromFloat`, both for successes (in `test/behavior/cast.zig`) and failures (in `test/cases/safety/`). Resolves: #24161	2025-06-15 14:15:18 -04:00
Jacob Young	6e72026e3b	Legalize: make the feature set comptime-known in zig1 This allows legalizations to be added that aren't used by zig1 without affecting the size of zig1.	2025-06-15 11:42:03 -04:00
mlugg	9eb400ef19	compiler: rework backend pipeline to separate codegen and link The idea here is that instead of the linker calling into codegen, codegen should run before we touch the linker, and after MIR is produced, it is sent to the linker. Aside from simplifying the call graph (by preventing N linkers from each calling into M codegen backends!), this has the huge benefit that it is possible to parallelize codegen separately from linking. The threading model can look like this: * 1 semantic analysis thread, which generates AIR * N codegen threads, which process AIR into MIR * 1 linker thread, which emits MIR to the binary The codegen threads are also responsible for `Air.Legalize` and `Air.Liveness`; it's more efficient to do this work here instead of blocking the main thread for this trivially parallel task. I have repurposed the `Zcu.Feature.separate_thread` backend feature to indicate support for this 1:N:1 threading pattern. This commit makes the C backend support this feature, since it was relatively easy to divorce from `link.C`: it just required eliminating some shared buffers. Other backends don't currently support this feature. In fact, they don't even compile -- the next few commits will fix them back up.	2025-06-12 13:55:40 +01:00
Jacob Young	0bf8617d96	x86_64: add support for pie executables	2025-06-06 23:42:14 -07:00
mlugg	add2976a9b	compiler: implement better shuffle AIR Runtime `@shuffle` has two cases which backends generally want to handle differently for efficiency: * One runtime vector operand; some result elements may be comptime-known * Two runtime vector operands; some result elements may be undefined The latter case happens if both vectors given to `@shuffle` are runtime-known and they are both used (i.e. the mask refers to them). Otherwise, if the result is not entirely comptime-known, we are in the former case. `Sema` now differentiates these two cases in the AIR so that backends can easily handle them however they want to. Note that this doesn't really involve Sema doing any more work than it would otherwise need to, so there's not really a negative here! Most existing backends have their lowerings for `@shuffle` migrated in this commit. The LLVM backend uses new lowerings suggested by Jacob as ones which it will handle effectively. The x86_64 backend has not yet been migrated; for now there's a panic in there. Jacob will implement that before this is merged anywhere.	2025-06-01 08:24:01 +01:00
Jacob Young	d9b6d1ed33	cbe: legalize safety instructions in non-zig1 builds This is valid if the bootstrap dev env doesn't need to support runtime safety. Another solution can always be implemented if needs change.	2025-06-01 08:24:00 +01:00
Jacob Young	77e6513030	cbe: implement `stdbool.h` reserved identifiers Also remove the legalize pass from zig1.	2025-05-31 18:54:28 -04:00
Jacob Young	6198f7afb7	Sema: remove `all_vector_instructions` logic Backends can instead ask legalization on a per-instruction basis.	2025-05-31 18:54:28 -04:00
Jacob Young	b483defc5a	Legalize: implement scalarization of binary operations	2025-05-31 18:54:28 -04:00
Jacob Young	c04be630d9	Legalize: introduce a new pass before liveness Each target can opt into different sets of legalize features. By performing these transformations before liveness, instructions that become unreferenced will have up-to-date liveness information.	2025-05-29 03:57:48 -04:00
mlugg	92c63126e8	compiler: tlv pointers are not comptime-known Pointers to thread-local variables do not have their addresses known until runtime, so it is nonsensical for them to be comptime-known. There was logic in the compiler which was essentially attempting to treat them as not being comptime-known despite the pointer being an interned value. This was a bit of a mess, the check was frequent enough to actually show up in compiler profiles, and it was very awkward for backends to deal with, because they had to grapple with the fact that a "constant" they were lowering might actually require runtime operations. So, instead, do not consider these pointers to be comptime-known in any way. Never intern such a pointer; instead, when the address of a threadlocal is taken, emit an AIR instruction which computes the pointer at runtime. This avoids lots of special handling for TLVs across basically all codegen backends; of all somewhat-functional backends, the only one which wasn't improved by this change was the LLVM backend, because LLVM pretends this complexity around threadlocals doesn't exist. This change simplifies Sema and codegen, avoids a potential source of bugs, and potentially improves Sema performance very slightly by avoiding a non-trivial check on a hot path.	2025-05-27 19:23:11 +01:00
mlugg	37a9a4e0f1	compiler: refactor `Zcu.File` and path representation This commit makes some big changes to how we track state for Zig source files. In particular, it changes: * How `File` tracks its path on-disk * How AstGen discovers files * How file-level errors are tracked * How `builtin.zig` files and modules are created The original motivation here was to address incremental compilation bugs with the handling of files, such as #22696. To fix this, a few changes are necessary. Just like declarations may become unreferenced on an incremental update, meaning we suppress analysis errors associated with them, it is also possible for all imports of a file to be removed on an incremental update, in which case file-level errors for that file should be suppressed. As such, after AstGen, the compiler must traverse files (starting from analysis roots) and discover the set of "live files" for this update. Additionally, the compiler's previous handling of retryable file errors was not very good; the source location the error was reported as was based only on the first discovered import of that file. This source location also disappeared on future incremental updates. So, as a part of the file traversal above, we also need to figure out the source locations of imports which errors should be reported against. Another observation I made is that the "file exists in multiple modules" error was not implemented in a particularly good way (I get to say that because I wrote it!). It was subject to races, where the order in which different imports of a file were discovered affects both how errors are printed, and which module the file is arbitrarily assigned, with the latter in turn affecting which other files are considered for import. The thing I realised here is that while the AstGen worker pool is running, we cannot know for sure which module(s) a file is in; we could always discover an import later which changes the answer. So, here's how the AstGen workers have changed. 
We initially ensure that `zcu.import_table` contains the root files for all modules in this Zcu, even if we don't know any imports for them yet. Then, the AstGen workers do not need to be aware of modules. Instead, they simply ignore module imports, and only spin off more workers when they see a by-path import. During AstGen, we can't use module-root-relative paths, since we don't know which modules files are in; but we don't want to unnecessarily use absolute files either, because those are non-portable and can make `error.NameTooLong` more likely. As such, I have introduced a new abstraction, `Compilation.Path`. This type is a way of representing a filesystem path which has a canonical form. The path is represented relative to one of a few special directories: the lib directory, the global cache directory, or the local cache directory. As a fallback, we use absolute (or cwd-relative on WASI) paths. This is kind of similar to `std.Build.Cache.Path` with a pre-defined list of possible `std.Build.Cache.Directory`, but has stricter canonicalization rules based on path resolution to make sure deduplicating files works properly. A `Compilation.Path` can be trivially converted to a `std.Build.Cache.Path` from a `Compilation`, but is smaller, has a canonical form, and has a digest which will be consistent across different compiler processes with the same lib and cache directories (important when we serialize incremental compilation state in the future). `Zcu.File` and `Zcu.EmbedFile` both contain a `Compilation.Path`, which is used to access the file on-disk; module-relative sub paths are used quite rarely (`EmbedFile` doesn't even have one now for simplicity). After the AstGen workers all complete, we know that any file which might be imported is definitely in `import_table` and up-to-date. So, we perform a single-threaded graph traversal; similar to what `resolveReferences` plays for `AnalUnit`s, but for files instead. 
We figure out which files are alive, and which module each file is in. If a file turns out to be in multiple modules, we set a field on `Zcu` to indicate this error. If a file is in a different module to a prior update, we set a flag instructing `updateZirRefs` to invalidate all dependencies on the file. This traversal also discovers "import errors"; these are errors associated with a specific `@import`. With Zig's current design, there is only one possible error here: "import outside of module root". This must be identified during this traversal instead of during AstGen, because it depends on which module the file is in. I tried also representing "module not found" errors in this same way, but it turns out to be much more useful to report those in Sema, because of use cases like optional dependencies where a module import is behind a comptime-known build option. For simplicity, `failed_files` now just maps to `?[]u8`, since the source location is always the whole file. In fact, this allows removing `LazySrcLoc.Offset.entire_file` completely, slightly simplifying some error reporting logic. File-level errors are now directly built in the `std.zig.ErrorBundle.Wip`. If the payload is not `null`, it is the message for a retryable error (i.e. an error loading the source file), and will be reported with a "file imported here" note pointing to the import site discovered during the single-threaded file traversal. The last piece of fallout here is how `Builtin` works. Rather than constructing "builtin" modules when creating `Package.Module`s, they are now constructed on-the-fly by `Zcu`. The map `Zcu.builtin_modules` maps from digests to `Package.Module`s. These digests are abstract hashes of the `Builtin` value; i.e. all of the options which are placed into "builtin.zig". During the file traversal, we populate `builtin_modules` as needed, so that when we see these imports in Sema, we just grab the relevant entry from this map. 
This eliminates a bunch of awkward state tracking during construction of the module graph. It's also now clearer exactly what options the builtin module has, since previously it inherited some options arbitrarily from the first-created module with that "builtin" module! The user-visible effects of this commit are: * retryable file errors are now consistently reported against the whole file, with a note pointing to a live import of that file * some theoretical bugs where imports are wrongly considered distinct (when the import path moves out of the cwd and then back in) are fixed * some consistency issues with how file-level errors are reported are fixed; these errors will now always be printed in the same order regardless of how the AstGen pass assigns file indices * incremental updates do not print retryable file errors differently between updates or depending on file structure/contents * incremental updates support files changing modules * incremental updates support files becoming unreferenced Resolves: #22696	2025-05-18 17:37:02 +01:00
Matthew Lugg	f4e9846bca	Merge pull request #23263 from mlugg/comptime-field-ptr Sema: fix pointers to comptime fields of comptime-known aggregate pointers	2025-05-03 20:10:42 +01:00
Andrew Kelley	8be4511061	C backend: less branching	2025-04-27 23:30:00 -07:00
mlugg	81277b5487	cbe: aggregate assignment does not need a second cast `writeCValue` already emits a cast; including another here is, in fact, invalid, and emits errors under MSVC. Probably this code was originally added to work around the incorrect `.Initializer` location which was fixed in the previous commit.	2025-04-28 02:38:07 +01:00
Jacob Young	029cc0640f	cbe: assignment is not initialization Turns out the backend currently never emits a non-static initializer, but the handling is kept in case it is needed again in the future.	2025-04-28 01:14:24 +01:00
dweiller	898ca82458	compiler: add @memmove builtin	2025-04-26 13:34:16 +10:00
Jacob Young	ed284c1f98	big.int: fix yet another truncate bug Too many bugs have been found with `truncate` at this point, so it was rewritten from scratch. Based on the doc comment, the utility of `convertToTwosComplement` over `r.truncate(a, .unsigned, bit_count)` is unclear and it has a subtle behavior difference that is almost certainly a bug, so it was deleted.	2025-03-21 21:51:08 -04:00
Alex Rønne Petersen	e11ac02662	cbe: Implement support for -fno-builtin and @disableIntrinsics().	2025-02-23 04:08:58 +01:00
Andrew Kelley	eb3c7f5706	zig build fmt	2025-02-22 17:09:20 -08:00
Alex Rønne Petersen	9c015e6c2b	std.builtin: Remove CallingConvention.arm_(apcs,aapcs16_vfp). * arm_apcs is the long dead "OABI" which we never had working support for. * arm_aapcs16_vfp is for arm-watchos-none which is a dead target that we've dropped support for.	2025-02-17 19:17:56 +01:00
Jacob Young	74fbcd22e6	cbe: fix crash rendering argument names in lazy functions Closes #19905	2025-02-10 17:20:52 -08:00
Jacob Young	eb7963e4c7	cbe: emit linksection for `@export` Closes #21490	2025-02-10 17:20:09 -08:00
Meghan Denny	a8af36ab10	std.ArrayHashMap: popOrNull() -> pop()	2025-02-07 17:52:19 -08:00
Jacob Young	afa74c6b21	Sema: introduce all_vector_instructions backend feature Sema is arbitrarily scalarizing some operations, which means that when I try to implement vectorized versions of those operations in a backend, they are impossible to test due to Sema not producing them. Now, I can implement them and then temporarily enable the new feature for that backend in order to test them. Once the backend supports all of them, the feature can be permanently enabled. This also deletes the Air instructions `int_from_bool` and `int_from_ptr`, which are just bitcasts with a fixed result type, since changing `un_op` to `ty_op` takes up the same amount of memory.	2025-01-31 23:00:34 -05:00
mlugg	b01d6b156c	compiler: add `intcast_safe` AIR instruction This instruction is like `intcast`, but includes two safety checks: * Checks that the int is in range of the destination type * If the destination type is an exhaustive enum, checks that the int is a named enum value This instruction is locked behind the `safety_checked_instructions` backend feature; if unsupported, Sema will emit a fallback, as with other safety-checked instructions. This instruction is used to add a missing safety check for `@enumFromInt` truncating bits. This check also has a fallback for backends which do not yet support `safety_checked_instructions`. Resolves: #21946	2025-01-30 14:47:59 +00:00
Jacob Young	b1fa89439a	x86_64: rewrite float vector `@abs` and equality comparisons	2025-01-24 20:56:11 -05:00
mlugg	0ec6b2dd88	compiler: simplify generic functions, fix issues with inline calls The original motivation here was to fix regressions caused by #22414. However, while working on this, I ended up discussing a language simplification with Andrew, which changes things a little from how they worked before #22414. The main user-facing change here is that any reference to a prior function parameter, even if potentially comptime-known at the usage site or even not analyzed, now makes a function generic. This applies even if the parameter being referenced is not a `comptime` parameter, since it could still be populated when performing an inline call. This is a breaking language change. The detection of this is done in AstGen; when evaluating a parameter type or return type, we track whether it referenced any prior parameter, and if so, we mark this type as being "generic" in ZIR. This will cause Sema to not evaluate it until the time of instantiation or inline call. A lovely consequence of this from an implementation perspective is that it eliminates the need for most of the "generic poison" system. In particular, `error.GenericPoison` is now completely unnecessary, because we identify generic expressions earlier in the pipeline; this simplifies the compiler and avoids redundant work. This also entirely eliminates the concept of the "generic poison value". The only remnant of this system is the "generic poison type" (`Type.generic_poison` and `InternPool.Index.generic_poison_type`). This type is used in two places: * During semantic analysis, to represent an unknown result type. * When storing generic function types, to represent a generic parameter/return type. It's possible that these use cases should instead use `.none`, but I leave that investigation to a future adventurer. One last thing. Prior to #22414, inline calls were a little inefficient, because they re-evaluated even non-generic parameter types whenever they were called. 
Changing this behavior is what ultimately led to #22538. Well, because the new logic will mark a type expression as generic if there is any chance its resolved type could differ in an inline call, this redundant work is unnecessary! So, this is another way in which the new design reduces redundant work and complexity. Resolves: #22494 Resolves: #22532 Resolves: #22538	2025-01-21 02:41:42 +00:00
mlugg	d00e05f186	all: update to `std.builtin.Type.Pointer.Size` field renames This was done by regex substitution with `sed`. I then manually went over the entire diff and fixed any incorrect changes. This diff also changes a lot of `callconv(.C)` to `callconv(.c)`, since my regex happened to also trigger here. I opted to leave these changes in, since they are a correct migration, even if they're not the one I was trying to do!	2025-01-16 12:46:29 +00:00
Andrew Kelley	943dac3e85	compiler: add type safety for export indices	2025-01-15 15:11:35 -08:00
Jacob Young	02692ad78c	cbe: fix miscomps of the compiler	2025-01-10 06:10:15 -05:00
Jacob Young	3f95003d4c	cbe: fix miscomps of x86_64 backend	2025-01-08 19:33:45 -05:00
mlugg	3afda4322c	compiler: analyze type and value of global declaration separately This commit separates semantic analysis of the annotated type vs value of a global declaration, therefore allowing recursive and mutually recursive values to be declared. Every `Nav` which undergoes analysis now has two corresponding `AnalUnit`s: `.{ .nav_val = n }` and `.{ .nav_ty = n }`. The `nav_val` unit is responsible for fully resolving the `Nav`: determining its value, linksection, addrspace, etc. The `nav_ty` unit, on the other hand, resolves only the information necessary to construct a pointer to the `Nav`: its type, addrspace, etc. (It does also analyze its linksection, but that could be moved to `nav_val` I think; it doesn't make any difference). Analyzing a `nav_ty` for a declaration with no type annotation will just mark a dependency on the `nav_val`, analyze it, and finish. Conversely, analyzing a `nav_val` for a declaration with a type annotation will first mark a dependency on the `nav_ty` and analyze it, using this as the result type when evaluating the value body. The `nav_val` and `nav_ty` units always have references to one another: so, if a `Nav`'s type is referenced, its value implicitly is too, and vice versa. However, these dependencies are trivial, so, to save memory, are only known implicitly by logic in `resolveReferences`. In general, analyzing ZIR `decl_val` will only analyze `nav_ty` of the corresponding `Nav`. There are two exceptions to this. If the declaration is an `extern` declaration, then we immediately ensure the `Nav` value is resolved (which doesn't actually require any more analysis, since such a declaration has no value body anyway). Additionally, if the resolved type has type tag `.@"fn"`, we again immediately resolve the `Nav` value. The latter restriction is in place for two reasons: * Functions are special, in that their externs are allowed to trivially alias; i.e. 
with a declaration `extern fn foo(...)`, you can write `const bar = foo;`. This is not allowed for non-function externs, and it means that function types are the only place where it is possible for a declaration `Nav` to have a `.@"extern"` value without actually being declared `extern`. We need to identify this situation immediately so that the `decl_ref` can create a pointer to the real extern `Nav`, not this alias. * In certain situations, such as taking a pointer to a `Nav`, Sema needs to queue analysis of a runtime function if the value is a function. To do this, the function value needs to be known, so we need to resolve the value immediately upon `&foo` where `foo` is a function. This restriction is simple to codify into the eventual language specification, and doesn't limit the utility of this feature in practice. A consequence of this commit is that codegen and linking logic needs to be more careful when looking at `Nav`s. In general: * When `updateNav` or `updateFunc` is called, it is safe to assume that the `Nav` being updated (the owner `Nav` for `updateFunc`) is fully resolved. * Any `Nav` whose value is/will be an `@"extern"` or a function is fully resolved; see `Nav.getExtern` for a helper for a common case here. * Any other `Nav` may only have its type resolved. This didn't seem to be too tricky to satisfy in any of the existing codegen/linker backends. Resolves: #131	2024-12-24 02:18:41 +00:00
Jacob Young	bd0ace5c4e	cbe: prevent tautological-compare warnings in generated code	2024-12-08 10:53:50 +00:00
Jacob Young	c894ac09a3	dwarf: fix stepping through an inline loop containing one statement Previously, stepping from the single statement within the loop would always exit the loop because all of the code unrolled from the loop is associated with the same line and treated by the debugger as one line.	2024-11-24 17:28:12 -05:00
Alex Rønne Petersen	3054486d1d	Merge pull request #21843 from alexrp/callconv-followup Some follow-up work for #21697	2024-11-03 14:27:09 +01:00
Alex Rønne Petersen	c9e67e71c1	std.Target: Replace isARM() with isArmOrThumb() and rename it to isArm(). The old isARM() function was a portability trap. With the name it had, it seemed like the obviously correct function to use, but it didn't include Thumb. In the vast majority of cases where someone wants to ask "is the target Arm?", Thumb should be included. There are exactly 3 cases in the codebase where we do actually need to exclude Thumb, although one of those is in Aro and mirrors a check in Clang that is itself likely a bug. These rare cases can just add an extra isThumb() check.	2024-11-03 09:29:30 +01:00
Alex Rønne Petersen	c217fd2b9c	cbe: Support some more calling conventions.	2024-11-02 10:44:18 +01:00
Alex Rønne Petersen	3a5142af8d	compiler: Handle arm_aapcs16_vfp alongside arm_aapcs_vfp in some places.	2024-11-02 10:44:18 +01:00
mlugg	d11bbde5f9	compiler: remove anonymous struct types, unify all tuples This commit reworks how anonymous struct literals and tuples work. Previously, an untyped anonymous struct literal (e.g. `const x = .{ .a = 123 }`) was given an "anonymous struct type", which is a special kind of struct which coerces using structural equivalence. This mechanism was a holdover from before we used RLS / result types as the primary mechanism of type inference. This commit changes the language so that the type assigned here is a "normal" struct type. It uses a form of equivalence based on the AST node and the type's structure, much like a reified (`@Type`) type. Additionally, tuples have been simplified. The distinction between "simple" and "complex" tuple types is eliminated. All tuples, even those explicitly declared using `struct { ... }` syntax, use structural equivalence, and do not undergo staged type resolution. Tuples are very restricted: they cannot have non-`auto` layouts, cannot have aligned fields, and cannot have default values with the exception of `comptime` fields. Tuples currently do not have optimized layout, but this can be changed in the future. This change simplifies the language, and fixes some problematic coercions through pointers which led to unintuitive behavior. Resolves: #16865	2024-10-31 20:42:53 +00:00
mlugg	cb48376bec	cbe,translate-c: support more callconvs There are several more that we could support here, but I didn't feel like going down the rabbit-hole of figuring them out. In particular, some of the Clang enum fields aren't specific enough for us, so we'll have to switch on the target to figure out how to translate-c them. That can be a future enhancement.	2024-10-19 19:15:24 +01:00
mlugg	2d9a167cd2	std.Target: rename `defaultCCallingConvention` and `Cpu.Arch.fromCallconv`	2024-10-19 19:15:23 +01:00
mlugg	bc797a97b1	std: update for new `CallingConvention` The old `CallingConvention` type is replaced with the new `NewCallingConvention`. References to `NewCallingConvention` in the compiler are updated accordingly. In addition, a few parts of the standard library are updated to use the new type correctly.	2024-10-19 19:15:23 +01:00
mlugg	51706af908	compiler: introduce new `CallingConvention` This commit begins implementing accepted proposal #21209 by making `std.builtin.CallingConvention` a tagged union. The stage1 dance here is a little convoluted. This commit introduces the new type as `NewCallingConvention`, keeping the old `CallingConvention` around. The compiler uses `std.builtin.NewCallingConvention` exclusively, but when fetching the type from `std` when running the compiler (e.g. with `getBuiltinType`), the name `CallingConvention` is used. This allows a prior build of Zig to be used to build this commit. The next commit will update `zig1.wasm`, and then the compiler and standard library can be updated to completely replace `CallingConvention` with `NewCallingConvention`. The second half of #21209 is to remove `@setAlignStack`, which will be implemented in another commit after updating `zig1.wasm`.	2024-10-19 19:08:59 +01:00
David Rubin	043b1adb8d	remove `@fence` (#21585) closes #11650	2024-10-04 22:21:27 +00:00
Linus Groh	8588964972	Replace deprecated default initializations with decl literals	2024-09-12 16:01:23 +01:00
mlugg	289c704b60	cbe: don't emit 'x = x' in switch dispatch loop	2024-09-01 20:31:01 +01:00

702 Commits