mirror/zig - zig - Bouvais Git

mirror/zig

mirror of https://github.com/ziglang/zig.git synced 2025-12-13 01:33:09 +00:00

Author	SHA1	Message	Date
Justus Klausecker	277e4a8337	fix: emit vector instead of scalar u1_zero in shl_with_overflow logic	2025-08-12 16:33:58 +02:00
Justus Klausecker	4ec421372f	add remaining undef value tests ; fix `@truncate` undef retval type	2025-08-12 16:33:58 +02:00
Justus Klausecker	79e5c138c6	replace even more aggregate interns	2025-08-12 16:33:57 +02:00
Justus Klausecker	05762ca02f	address most comments	2025-08-12 16:33:57 +02:00
Justus Klausecker	0ef26d113a	make `>>` a compile error with any undef arg ; add a bunch of test cases	2025-08-12 16:33:57 +02:00
Justus Klausecker	d0586da18e	Sema: Improve comptime arithmetic undef handling This commit expands on the foundations laid by https://github.com/ziglang/zig/pull/23177 and moves even more `Sema`-only functionality from `Value` to `Sema.arith`. Specifically all shift and bitwise operations, `@truncate`, `@bitReverse` and `@byteSwap` have been moved and adapted to the new rules around `undefined`. Especially the comptime shift operations have been basically rewritten, fixing many open issues in the process. New rules applied to operators: * `<<`, `@shlExact`, `@shlWithOverflow`, `>>`, `@shrExact`: compile error if any operand is undef * `<<\|`, `~`, `^`, `@truncate`, `@bitReverse`, `@byteSwap`: return undef if any operand is undef * `&`, `\|`: Return undef if both operands are undef, turn undef into actual `0xAA` bytes otherwise Additionally this commit canonicalizes the representation of aggregates with all-undefined members in the `InternPool` by disallowing them and enforcing the usage of a single typed `undef` value instead. This reduces the amount of edge cases and fixes a bunch of bugs related to partially undefined vecs. List of operations directly affected by this patch: * `<<`, `<<\|`, `@shlExact`, `@shlWithOverflow` * `>>`, `@shrExact` * `&`, `\|`, `~`, `^` and their atomic rmw + reduce pendants * `@truncate`, `@bitReverse`, `@byteSwap`	2025-08-12 16:33:57 +02:00
Andrew Kelley	749f10af49	std.ArrayList: make unmanaged the default	2025-08-11 15:52:49 -07:00
Andrew Kelley	7e2a26c0c4	std.io.Writer.printValue: rework logic Alignment and fill options only apply to numbers. Rework the implementation to mainly branch on the format string rather than the type information. This is more straightforward to maintain and more straightforward for comptime evaluation. Enums support being printed as decimal, hexadecimal, octal, and binary. `formatInteger` is another possible format method that is unconditionally called when the value type is struct and one of the integer-printing format specifiers are used.	2025-07-07 22:43:53 -07:00
Andrew Kelley	30c2921eb8	compiler: update a bunch of format strings	2025-07-07 22:43:52 -07:00
Andrew Kelley	f409457925	compiler: fix a bunch of format strings	2025-07-07 22:43:52 -07:00
Andrew Kelley	0e37ff0d59	std.fmt: breaking API changes added adapter to AnyWriter and GenericWriter to help bridge the gap between old and new API make std.testing.expectFmt work at compile-time std.fmt no longer has a dependency on std.unicode. Formatted printing was never properly unicode-aware. Now it no longer pretends to be. Breakage/deprecations: * std.fs.File.reader -> std.fs.File.deprecatedReader * std.fs.File.writer -> std.fs.File.deprecatedWriter * std.io.GenericReader -> std.io.Reader * std.io.GenericWriter -> std.io.Writer * std.io.AnyReader -> std.io.Reader * std.io.AnyWriter -> std.io.Writer * std.fmt.format -> std.fmt.deprecatedFormat * std.fmt.fmtSliceEscapeLower -> std.ascii.hexEscape * std.fmt.fmtSliceEscapeUpper -> std.ascii.hexEscape * std.fmt.fmtSliceHexLower -> {x} * std.fmt.fmtSliceHexUpper -> {X} * std.fmt.fmtIntSizeDec -> {B} * std.fmt.fmtIntSizeBin -> {Bi} * std.fmt.fmtDuration -> {D} * std.fmt.fmtDurationSigned -> {D} * {} -> {f} when there is a format method * format method signature - anytype -> std.io.Writer - inferred error set -> error{WriteFailed} - options -> (deleted) std.fmt.Formatted - now takes context type explicitly - no fmt string	2025-07-07 22:43:51 -07:00
Jacob Young	6b41beb370	big.int: implement float conversions These conversion routines accept a `round` argument to control how the result is rounded and return whether the result is exact. Most callers wanted this functionality and had hacks around it being missing. Also delete `std.math.big.rational` because it was only being used for float conversion, and using rationals for that is a lot more complex than necessary. It also required an allocator, whereas the new integer routines only need to be passed enough memory to store the result.	2025-06-15 14:15:18 -04:00
mlugg	71baa5e769	compiler: improve progress output Update the estimated total items for the codegen and link progress nodes earlier. Rather than waiting for the main thread to dispatch the tasks, we can add the item to the estimated total as soon as we queue the main task. The only difference is we need to complete it even in error cases.	2025-06-12 17:51:31 +01:00
mlugg	c4ec382fc8	InternPool: store the Nav types are named after When the name strategy is `.parent`, the DWARF info really wants to know what `Nav` we were named after to emit a more optimal hierarchy.	2025-06-12 13:55:41 +01:00
mlugg	424e6ac54b	compiler: minor refactors to ZCU linking * The `codegen_nav`, `codegen_func`, `codegen_type` tasks are renamed to `link_nav`, `link_func`, and `link_type`, to more accurately reflect their purpose of sending data to the linker. Currently, `link_func` remains responsible for codegen; this will change in an upcoming commit. * Don't go on a pointless detour through `PerThread` when linking ZCU functions/`Nav`s; so, the `linkerUpdateNav` etc logic now lives in `link.zig`. Currently, `linkerUpdateFunc` is an exception, because it has broader responsibilities including codegen, but this will be solved in an upcoming commit.	2025-06-12 13:55:39 +01:00
Jacob Young	b483defc5a	Legalize: implement scalarization of binary operations	2025-05-31 18:54:28 -04:00
mlugg	d717c96877	compiler: include inline calls in the reference trace Inline calls which happened in the erroring `AnalUnit` still show as error notes, because they tend to make very important context (e.g. to see how comptime values propagate through them). However, "earlier" inline calls are still useful to see to understand how something is being referenced, so we should include them in the reference trace.	2025-05-16 13:28:15 +01:00
mlugg	f83fe2714b	compiler: fix comptime memory store bugs * When storing a zero-bit type, we should short-circuit almost immediately. Zero-bit stores do not need to do any work. * The bit size computation for arrays is incorrect; the `abiSize` will already be appropriately aligned, but the logic to do so here incorrectly assumes that zero-bit types have an alignment of 0. They don't; their alignment is 1. Resolves: #21202 Resolves: #21508 Resolves: #23307	2025-05-03 20:10:26 +01:00
Mun Maks	4fc783670a	Sema/arith.zig: Fixing more typos from #23177 . This is a complementary PR to #23487 (I had only found one typo before). Now I've looked at the whole `arith.zig` file, trying to find other potential problems. Discussion about these changes: https://github.com/ziglang/zig/pull/23177#discussion_r1997957095	2025-04-09 12:53:11 +01:00
Maksat	4995509028	#23177 , maintainter 'mlugg' wanted to fix that typo, 4 weeks without changes, might be forgotten	2025-04-07 16:50:28 +01:00
Mason Remaley	06ee383da9	compiler: allow `@import` of ZON without a result type In particular, this allows importing `build.zig.zon` at comptime.	2025-04-02 05:53:22 +01:00
mlugg	2a4e06bcb3	Sema: rewrite comptime arithmetic This commit reworks how Sema handles arithmetic on comptime-known values, fixing many bugs in the process. The general pattern is that arithmetic on comptime-known values is now handled by the new namespace `Sema.arith`. Functions handling comptime arithmetic no longer live on `Value`; this is because some of them can emit compile errors, so some can't go on `Value`. Only semantic analysis should really be doing arithmetic on `Value`s anyway, so it makes sense for it to integrate more tightly with `Sema`. This commit also implements more coherent rules surrounding how `undefined` interacts with comptime and mixed-comptime-runtime arithmetic. The rules are as follows. * If an operation cannot trigger Illegal Behavior, and any operand is `undefined`, the result is `undefined`. This includes operations like `0 \| undef`, where the LHS logically could* be used to determine a defined result. This is partly to simplify the language, but mostly to permit codegen backends to represent `undefined` values as completely invalid states. * If an operation can trigger Illegal Behvaior, and any operand is `undefined`, then Illegal Behavior results. This occurs even if the operand in question isn't the one that "decides" illegal behavior; for instance, `undef / 1` is undefined. This is for the same reasons as described above. * An operation which would trigger Illegal Behavior, when evaluated at comptime, instead triggers a compile error. Additionally, if one operand is comptime-known undef, such that the other (runtime-known) operand isn't needed to determine that Illegal Behavior would occur, the compile error is triggered. * The only situation in which an operation with one comptime-known operand has a comptime-known result is if that operand is undefined, in which case the result is either undefined or a compile error per the above rules. This could potentially be loosened in future (for instance, `0 * rt` could be comptime-known 0 with a runtime assertion that `rt` is not undefined), but at least for now, defining it more conservatively simplifies the language and allows us to easily change this in future if desired. This commit fixes many bugs regarding the handling of `undefined`, particularly in vectors. Along with a collection of smaller tests, two very large test cases are added to check arithmetic on `undefined`. The operations which have been rewritten in this PR are: * `+`, `+%`, `+\|`, `@addWithOverflow` * `-`, `-%`, `-\|`, `@subWithOverflow` * ``, `%`, `\|`, `@mulWithOverflow` `/`, `@divFloor`, `@divTrunc`, `@divExact` * `%`, `@rem`, `@mod` Other arithmetic operations are currently unchanged. Resolves: #22743 Resolves: #22745 Resolves: #22748 Resolves: #22749 Resolves: #22914	2025-03-16 08:17:50 +00:00
mlugg	55a2e535fd	compiler: integrate ZON with the ZIR caching system This came with a big cleanup to `Zcu.PerThread.updateFile` (formerly `astGenFile`). Also, change how the cache manifest works for files in the import table. Instead of being added to the manifest when we call `semaFile` on them, we iterate the import table after running the AstGen workers and add all the files to the cache manifest then. The downside is that this is a bit more eager to include files in the manifest; in particular, files which are imported but not actually referenced are now included in analysis. So, for instance, modifying any standard library file will invalidate all Zig compilations using that standard library, even if they don't use that file. The original motivation here was simply that the old logic in `semaFile` didn't translate nicely to ZON. However, it turns out to actually be necessary for correctness. Because `@import("foo.zig")` is an AstGen-level error if `foo.zig` does not exist, we need to invalidate the cache when an imported but unreferenced file is removed to make sure this error is triggered when it needs to be. Resolves: #22746	2025-02-04 16:20:29 +00:00
Mason Remaley	13c6eb0d71	compiler,std: implement ZON support This commit allows using ZON (Zig Object Notation) in a few ways. * `@import` can be used to load ZON at comptime and convert it to a normal Zig value. In this case, `@import` must have a result type. * `std.zon.parse` can be used to parse ZON at runtime, akin to the parsing logic in `std.json`. * `std.zon.stringify` can be used to convert arbitrary data structures to ZON at runtime, again akin to `std.json`.	2025-02-03 09:14:37 +00:00
mlugg	3afda4322c	compiler: analyze type and value of global declaration separately This commit separates semantic analysis of the annotated type vs value of a global declaration, therefore allowing recursive and mutually recursive values to be declared. Every `Nav` which undergoes analysis now has two corresponding `AnalUnit`s: `.{ .nav_val = n }` and `.{ .nav_ty = n }`. The `nav_val` unit is responsible for fully resolving the `Nav`: determining its value, linksection, addrspace, etc. The `nav_ty` unit, on the other hand, resolves only the information necessary to construct a pointer to the `Nav`: its type, addrspace, etc. (It does also analyze its linksection, but that could be moved to `nav_val` I think; it doesn't make any difference). Analyzing a `nav_ty` for a declaration with no type annotation will just mark a dependency on the `nav_val`, analyze it, and finish. Conversely, analyzing a `nav_val` for a declaration with a type annotation will first mark a dependency on the `nav_ty` and analyze it, using this as the result type when evaluating the value body. The `nav_val` and `nav_ty` units always have references to one another: so, if a `Nav`'s type is referenced, its value implicitly is too, and vice versa. However, these dependencies are trivial, so, to save memory, are only known implicitly by logic in `resolveReferences`. In general, analyzing ZIR `decl_val` will only analyze `nav_ty` of the corresponding `Nav`. There are two exceptions to this. If the declaration is an `extern` declaration, then we immediately ensure the `Nav` value is resolved (which doesn't actually require any more analysis, since such a declaration has no value body anyway). Additionally, if the resolved type has type tag `.@"fn"`, we again immediately resolve the `Nav` value. The latter restriction is in place for two reasons: * Functions are special, in that their externs are allowed to trivially alias; i.e. with a declaration `extern fn foo(...)`, you can write `const bar = foo;`. This is not allowed for non-function externs, and it means that function types are the only place where it is possible for a declaration `Nav` to have a `.@"extern"` value without actually being declared `extern`. We need to identify this situation immediately so that the `decl_ref` can create a pointer to the real extern `Nav`, not this alias. * In certain situations, such as taking a pointer to a `Nav`, Sema needs to queue analysis of a runtime function if the value is a function. To do this, the function value needs to be known, so we need to resolve the value immediately upon `&foo` where `foo` is a function. This restriction is simple to codify into the eventual language specification, and doesn't limit the utility of this feature in practice. A consequence of this commit is that codegen and linking logic needs to be more careful when looking at `Nav`s. In general: * When `updateNav` or `updateFunc` is called, it is safe to assume that the `Nav` being updated (the owner `Nav` for `updateFunc`) is fully resolved. * Any `Nav` whose value is/will be an `@"extern"` or a function is fully resolved; see `Nav.getExtern` for a helper for a common case here. * Any other `Nav` may only have its type resolved. This didn't seem to be too tricky to satisfy in any of the existing codegen/linker backends. Resolves: #131	2024-12-24 02:18:41 +00:00
mlugg	d11bbde5f9	compiler: remove anonymous struct types, unify all tuples This commit reworks how anonymous struct literals and tuples work. Previously, an untyped anonymous struct literal (e.g. `const x = .{ .a = 123 }`) was given an "anonymous struct type", which is a special kind of struct which coerces using structural equivalence. This mechanism was a holdover from before we used RLS / result types as the primary mechanism of type inference. This commit changes the language so that the type assigned here is a "normal" struct type. It uses a form of equivalence based on the AST node and the type's structure, much like a reified (`@Type`) type. Additionally, tuples have been simplified. The distinction between "simple" and "complex" tuple types is eliminated. All tuples, even those explicitly declared using `struct { ... }` syntax, use structural equivalence, and do not undergo staged type resolution. Tuples are very restricted: they cannot have non-`auto` layouts, cannot have aligned fields, and cannot have default values with the exception of `comptime` fields. Tuples currently do not have optimized layout, but this can be changed in the future. This change simplifies the language, and fixes some problematic coercions through pointers which led to unintuitive behavior. Resolves: #16865	2024-10-31 20:42:53 +00:00
Andrew Kelley	4f8d244e7e	remove formatted panics implements #17969	2024-09-26 12:35:14 -07:00
mlugg	0fe3fd01dd	std: update `std.builtin.Type` fields to follow naming conventions The compiler actually doesn't need any functional changes for this: Sema does reification based on the tag indices of `std.builtin.Type` already! So, no zig1.wasm update is necessary. This change is necessary to disallow name clashes between fields and decls on a type, which is a prerequisite of #9938.	2024-08-28 08:39:59 +01:00
David Rubin	80cd53d3bb	sema: clean-up `{union,struct}FieldAlignment` and friends My main gripes with this design were that it was incorrectly namespaced, the naming was inconsistent and a bit wrong (`fooAlign` vs `fooAlignment`). This commit moves all the logic from `PerThread.zig` to use the zcu + tid system that the previous couple commits introduce. I've organized and merged the functions to be a bit more specific to their own purpose. - `fieldAlignment` takes a struct or union type, an index, and a Zcu (or the Sema version which takes a Pt), and gives you the alignment of the field at the index. - `structFieldAlignment` takes the field type itself, and provides the logic to handle special cases, such as externs. A design goal I had in mind was to avoid using the word 'struct' in the function name, when it worked for things that aren't structs, such as unions.	2024-08-25 15:16:46 -07:00
David Rubin	b4bb64ce78	sema: rework type resolution to use Zcu when possible	2024-08-25 15:16:42 -07:00
mlugg	548a087faf	compiler: split Decl into Nav and Cau The type `Zcu.Decl` in the compiler is problematic: over time it has gained many responsibilities. Every source declaration, container type, generic instantiation, and `@extern` has a `Decl`. The functions of these `Decl`s are in some cases entirely disjoint. After careful analysis, I determined that the two main responsibilities of `Decl` are as follows: * A `Decl` acts as the "subject" of semantic analysis at comptime. A single unit of analysis is either a runtime function body, or a `Decl`. It registers incremental dependencies, tracks analysis errors, etc. * A `Decl` acts as a "global variable": a pointer to it is consistent, and it may be lowered to a specific symbol by the codegen backend. This commit eliminates `Decl` and introduces new types to model these responsibilities: `Cau` (Comptime Analysis Unit) and `Nav` (Named Addressable Value). Every source declaration, and every container type requiring resolution (so not including `opaque`), has a `Cau`. For a source declaration, this `Cau` performs the resolution of its value. (When #131 is implemented, it is unsolved whether type and value resolution will share a `Cau` or have two distinct `Cau`s.) For a type, this `Cau` is the context in which type resolution occurs. Every non-`comptime` source declaration, every generic instantiation, and every distinct `extern` has a `Nav`. These are sent to codegen/link: the backends by definition do not care about `Cau`s. This commit has some minor technically-breaking changes surrounding `usingnamespace`. I don't think they'll impact anyone, since the changes are fixes around semantics which were previously inconsistent (the behavior changed depending on hashmap iteration order!). Aside from that, this changeset has no significant user-facing changes. Instead, it is an internal refactor which makes it easier to correctly model the responsibilities of different objects, particularly regarding incremental compilation. The performance impact should be negligible, but I will take measurements before merging this work into `master`. Co-authored-by: Jacob Young <jacobly0@users.noreply.github.com> Co-authored-by: Jakub Konka <kubkon@jakubkonka.com>	2024-08-11 07:29:41 +01:00
Will Lillis	a9d544575d	Sema: add error note for failed coercions to optional types and error unions	2024-07-16 16:42:13 +00:00
Jacob Young	525f341f33	Zcu: introduce `PerThread` and pass to all the functions	2024-07-07 22:59:52 -04:00
mlugg	0e5335aaf5	compiler: rework type resolution, fully resolve all types I'm so sorry. This commit was just meant to be making all types fully resolve by queueing resolution at the moment of their creation. Unfortunately, a lot of dominoes ended up falling. Here's what happened: * I added a work queue job to fully resolve a type. * I realised that from here we could eliminate `Sema.types_to_resolve` if we made function codegen a separate job. This is desirable for simplicity of both spec and implementation. * This led to a new AIR traversal to detect whether any required type is unresolved. If a type in the AIR failed to resolve, then we can't run codegen. * Because full type resolution now occurs by the work queue job, a bug was exposed whereby error messages for type resolution were associated with the wrong `Decl`, resulting in duplicate error messages when the type was also resolved "by" its owner `Decl` (which really all resolution should be done on). * A correct fix for this requires using a different `Sema` when performing type resolution: we need a `Sema` owned by the type. Also note that this fix is necessary for incremental compilation. * This means a whole bunch of functions no longer need to take `Sema`s. * First-order effects: `resolveTypeFields`, `resolveTypeLayout`, etc * Second-order effects: `Type.abiAlignmentAdvanced`, `Value.orderAgainstZeroAdvanced`, etc The end result of this is, in short, a more correct compiler and a simpler language specification. This regressed a few error notes in the test cases, but nothing that seems worth blocking this change. Oh, also, I ripped out the old code in `test/src/Cases.zig` which introduced a dependency on `Compilation`. This dependency was problematic at best, and this code has been unused for a while. When we re-enable incremental test cases, we must rewrite their executor to use the compiler server protocol.	2024-07-04 21:01:42 +01:00
mlugg	2f0f1efa6f	compiler: type.zig -> Type.zig	2024-07-04 21:01:42 +01:00
Andrew Kelley	0fcd59eada	rename src/Module.zig to src/Zcu.zig This patch is a pure rename plus only changing the file path in `@import` sites, so it is expected to not create version control conflicts, even when rebasing.	2024-06-22 22:59:56 -04:00
mlugg	1eaeb4a0a8	Zcu: rework source locations `LazySrcLoc` now stores a reference to the "base AST node" to which it is relative. The previous tagged union is `LazySrcLoc.Offset`. To make working with this structure convenient, `Sema.Block` contains a convenience `src` method which takes an `Offset` and returns a `LazySrcLoc`. The "base node" of a source location is no longer given by a `Decl`, but rather a `TrackedInst` representing either a `declaration`, `struct_decl`, `union_decl`, `enum_decl`, or `opaque_decl`. This is a more appropriate model, and removes an unnecessary responsibility from `Decl` in preparation for the upcoming refactor which will split it into `Nav` and `Cau`. As a part of these `Decl` reworks, the `src_node` field is eliminated. This change aids incremental compilation, and simplifies `Decl`. In some cases -- particularly in backends -- the source location of a declaration is desired. This was previously `Decl.srcLoc` and worked for any `Decl`. Now, it is `Decl.navSrcLoc` in reference to the upcoming refactor, since the set of `Decl`s this works for precisely corresponds to what will in future become a `Nav` -- that is, source-level declarations and generic function instantiations, but not type owner Decls. This commit introduces more tags to `LazySrcLoc.Offset` so as to eliminate the concept of `error.NeededSourceLocation`. Now, `.unneeded` should only be used to assert that an error path is unreachable. In the future, uses of `.unneeded` can probably be replaced with `undefined`. The `src_decl` field of `Sema.Block` no longer has a role in type resolution. Its main remaining purpose is to handle namespacing of type names. It will be eliminated entirely in a future commit to remove another undue responsibility from `Decl`. It is worth noting that in future, the `Zcu.SrcLoc` type should probably be eliminated entirely in favour of storing `Zcu.LazySrcLoc` values. This is because `Zcu.SrcLoc` is not valid across incremental updates, and we want to be able to reuse error messages from previous updates even if the source file in question changed. The error reporting logic should instead simply resolve the location from the `LazySrcLoc` on the fly.	2024-06-15 00:57:52 +01:00
mlugg	07a24bec9a	compiler: move LazySrcLoc out of std This is in preparation for some upcoming changes to how we represent source locations in the compiler. The bulk of the change here is dealing with the removal of `src()` methods from `Zir` types.	2024-06-15 00:57:52 +01:00
mlugg	03ad862197	compiler: un-implement #19634 This commit reverts the handling of partially-undefined values in bitcasting to transform these bits into an arbitrary numeric value, like happens on `master` today. As @andrewrk rightly points out, #19634 has unfortunate consequences for the standard library, and likely requires more thought. To avoid a major breaking change, it has been decided to revert this design decision for now, and make a more informed decision further down the line.	2024-04-17 13:41:25 +01:00
mlugg	d0e74ffe52	compiler: rework comptime pointer representation and access We've got a big one here! This commit reworks how we represent pointers in the InternPool, and rewrites the logic for loading and storing from them at comptime. Firstly, the pointer representation. Previously, pointers were represented in a highly structured manner: pointers to fields, array elements, etc, were explicitly represented. This works well for simple cases, but is quite difficult to handle in the cases of unusual reinterpretations, pointer casts, offsets, etc. Therefore, pointers are now represented in a more "flat" manner. For types without well-defined layouts -- such as comptime-only types, automatic-layout aggregates, and so on -- we still use this "hierarchical" structure. However, for types with well-defined layouts, we use a byte offset associated with the pointer. This allows the comptime pointer access logic to deal with reinterpreted pointers far more gracefully, because the "base address" of a pointer -- for instance a `field` -- is a single value which pointer accesses cannot exceed since the parent has undefined layout. This strategy is also more useful to most backends -- see the updated logic in `codegen.zig` and `codegen/llvm.zig`. For backends which do prefer a chain of field and elements accesses for lowering pointer values, such as SPIR-V, there is a helpful function in `Value` which creates a strategy to derive a pointer value using ideally only field and element accesses. This is actually more correct than the previous logic, since it correctly handles pointer casts which, after the dust has settled, end up referring exactly to an aggregate field or array element. In terms of the pointer access code, it has been rewritten from the ground up. The old logic had become rather a mess of special cases being added whenever bugs were hit, and was still riddled with bugs. The new logic was written to handle the "difficult" cases correctly, the most notable of which is restructuring of a comptime-only array (for instance, converting a `[3][2]comptime_int` to a `[2][3]comptime_int`. Currently, the logic for loading and storing work somewhat differently, but a future change will likely improve the loading logic to bring it more in line with the store strategy. As far as I can tell, the rewrite has fixed all bugs exposed by #19414. As a part of this, the comptime bitcast logic has also been rewritten. Previously, bitcasts simply worked by serializing the entire value into an in-memory buffer, then deserializing it. This strategy has two key weaknesses: pointers, and undefined values. Representations of these values at comptime cannot be easily serialized/deserialized whilst preserving data, which means many bitcasts would become runtime-known if pointers were involved, or would turn `undefined` values into `0xAA`. The new logic works by "flattening" the datastructure to be cast into a sequence of bit-packed atomic values, and then "unflattening" it; using serialization when necessary, but with special handling for `undefined` values and for pointers which align in virtual memory. The resulting code is definitely slower -- more on this later -- but it is correct. The pointer access and bitcast logic required some helper functions and types which are not generally useful elsewhere, so I opted to split them into separate files `Sema/comptime_ptr_access.zig` and `Sema/bitcast.zig`, with simple re-exports in `Sema.zig` for their small public APIs. Whilst working on this branch, I caught various unrelated bugs with transitive Sema errors, and with the handling of `undefined` values. These bugs have been fixed, and corresponding behavior test added. In terms of performance, I do anticipate that this commit will regress performance somewhat, because the new pointer access and bitcast logic is necessarily more complex. I have not yet taken performance measurements, but will do shortly, and post the results in this PR. If the performance regression is severe, I will do work to to optimize the new logic before merge. Resolves: #19452 Resolves: #19460	2024-04-17 13:41:25 +01:00

40 Commits