mirror/zig - zig - Bouvais Git

mirror of https://github.com/ziglang/zig.git synced 2025-12-09 15:53:08 +00:00

Author	SHA1	Message	Date
Andrew Kelley	377e8579f9	std.zig.tokenizer: simplify I pointed a fuzzer at the tokenizer and it crashed immediately. Upon inspection, I was dissatisfied with the implementation. This commit removes several mechanisms: * Removes the "invalid byte" compile error note. * Dramatically simplifies tokenizer recovery by making recovery always occur at newlines, and never otherwise. * Removes UTF-8 validation. * Moves some character validation logic to `std.zig.parseCharLiteral`. Removing UTF-8 validation is a regression of #663, however, the existing implementation was already buggy. When adding this functionality back, it must be fuzz-tested while checking the property that it matches an independent Unicode validation implementation on the same file. While we're at it, fuzzing should check the other properties of that proposal, such as no ASCII control characters existing inside the source code. Other changes included in this commit: * Deprecate `std.unicode.utf8Decode` and its WTF-8 counterpart. This function has an awkward API that is too easy to misuse. * Make `utf8Decode2` and friends use arrays as parameters, eliminating a runtime assertion in favor of using the type system. After this commit, the crash found by fuzzing, which was "\x07\xd5\x80\xc3=o\xda\|a\xfc{\x9a\xec\x91\xdf\x0f\\\x1a^\xbe;\x8c\xbf\xee\xea" no longer causes a crash. However, I did not feel the need to add this test case because the simplified logic eradicates most crashes of this nature.	2024-07-31 16:57:42 -07:00
Bogdan Romanyuk	42d9017feb	Sema: fix OOB access in coerceTupleToStruct (#19620) Co-authored-by: Veikka Tuominen <git@vexu.eu>	2024-07-21 23:56:04 +00:00
Will Lillis	7e76818132	fix(fmt): pointer type syntax to index (take 2) (#20336) * Change main token for many-item and c style pointers from asterisk to l brace, update main token in c translation	2024-07-21 01:55:52 -07:00
WillLillis	18d412ab2f	fix: remove misleading error note for failed array coercions	2024-07-21 00:10:36 -07:00
Will Lillis	9b292c0949	fix: Add error notes for method calls on double pointers (#20686)	2024-07-21 00:03:23 -07:00
Will Lillis	a9d544575d	Sema: add error note for failed coercions to optional types and error unions	2024-07-16 16:42:13 +00:00
Wooster	888708ec8a	Sema: support pointer subtraction	2024-07-15 18:18:38 +00:00
gooncreeper	c50f300387	Tokenizer bug fixes and improvements Fixes many error messages corresponding to invalid bytes displaying the wrong byte. Additionally improves handling of UTF-8 in some places.	2024-07-15 11:31:19 +03:00
fmaggi	583e698256	Sema: disallow casting to opaque	2024-07-15 10:58:33 +03:00
Jacob Young	2ff49751aa	Compilation: introduce work stages for better work distribution	2024-07-13 04:47:38 -04:00
Jacob Young	3ad81c40c0	Zcu: allow atomic operations on packed structs Same validation rules as the backing integer would have.	2024-07-12 00:43:38 -07:00
mlugg	0e5335aaf5	compiler: rework type resolution, fully resolve all types I'm so sorry. This commit was just meant to be making all types fully resolve by queueing resolution at the moment of their creation. Unfortunately, a lot of dominoes ended up falling. Here's what happened: * I added a work queue job to fully resolve a type. * I realised that from here we could eliminate `Sema.types_to_resolve` if we made function codegen a separate job. This is desirable for simplicity of both spec and implementation. * This led to a new AIR traversal to detect whether any required type is unresolved. If a type in the AIR failed to resolve, then we can't run codegen. * Because full type resolution now occurs by the work queue job, a bug was exposed whereby error messages for type resolution were associated with the wrong `Decl`, resulting in duplicate error messages when the type was also resolved "by" its owner `Decl` (which really all resolution should be done on). * A correct fix for this requires using a different `Sema` when performing type resolution: we need a `Sema` owned by the type. Also note that this fix is necessary for incremental compilation. * This means a whole bunch of functions no longer need to take `Sema`s. * First-order effects: `resolveTypeFields`, `resolveTypeLayout`, etc * Second-order effects: `Type.abiAlignmentAdvanced`, `Value.orderAgainstZeroAdvanced`, etc The end result of this is, in short, a more correct compiler and a simpler language specification. This regressed a few error notes in the test cases, but nothing that seems worth blocking this change. Oh, also, I ripped out the old code in `test/src/Cases.zig` which introduced a dependency on `Compilation`. This dependency was problematic at best, and this code has been unused for a while. When we re-enable incremental test cases, we must rewrite their executor to use the compiler server protocol.	2024-07-04 21:01:42 +01:00
mlugg	5f03c02505	Zcu: key compile errors on `AnalUnit` where appropriate This change seeks to more appropriately model the way semantic analysis works by drawing a more clear line between errors emitted by analyzing a `Decl` (in future a `Cau`) and errors emitted by analyzing a runtime function. This does change a few compile errors surrounding compile logs by adding more "also here" notes. The new notes are more technically correct, but perhaps not so helpful. They're not doing enough harm for me to put extensive thought into this for now.	2024-07-04 21:01:41 +01:00
wooster0	5e3bad3556	Make 0e.0 and 0xp0 not crash This fixes those sequences of characters crashing.	2024-07-03 02:53:37 -04:00
Matthew Lugg	f73be120f4	Merge pull request #20299 from mlugg/the-great-decl-split The Great Decl Split (preliminary work): refactor source locations and eliminate `Sema.Block.src_decl`.	2024-06-20 11:07:17 +01:00
mlugg	1fdf13a148	AstGen: error for redundant `@inComptime()`	2024-06-19 03:43:13 +01:00
mlugg	6a8cf25a8a	cases: un-regress some notes Since we now have source locations for reified types again, some error notes have returned which were previously regressed by this branch.	2024-06-18 04:55:39 +01:00
mlugg	1eaeb4a0a8	Zcu: rework source locations `LazySrcLoc` now stores a reference to the "base AST node" to which it is relative. The previous tagged union is `LazySrcLoc.Offset`. To make working with this structure convenient, `Sema.Block` contains a convenience `src` method which takes an `Offset` and returns a `LazySrcLoc`. The "base node" of a source location is no longer given by a `Decl`, but rather a `TrackedInst` representing either a `declaration`, `struct_decl`, `union_decl`, `enum_decl`, or `opaque_decl`. This is a more appropriate model, and removes an unnecessary responsibility from `Decl` in preparation for the upcoming refactor which will split it into `Nav` and `Cau`. As a part of these `Decl` reworks, the `src_node` field is eliminated. This change aids incremental compilation, and simplifies `Decl`. In some cases -- particularly in backends -- the source location of a declaration is desired. This was previously `Decl.srcLoc` and worked for any `Decl`. Now, it is `Decl.navSrcLoc` in reference to the upcoming refactor, since the set of `Decl`s this works for precisely corresponds to what will in future become a `Nav` -- that is, source-level declarations and generic function instantiations, but not type owner Decls. This commit introduces more tags to `LazySrcLoc.Offset` so as to eliminate the concept of `error.NeededSourceLocation`. Now, `.unneeded` should only be used to assert that an error path is unreachable. In the future, uses of `.unneeded` can probably be replaced with `undefined`. The `src_decl` field of `Sema.Block` no longer has a role in type resolution. Its main remaining purpose is to handle namespacing of type names. It will be eliminated entirely in a future commit to remove another undue responsibility from `Decl`. It is worth noting that in future, the `Zcu.SrcLoc` type should probably be eliminated entirely in favour of storing `Zcu.LazySrcLoc` values. 
This is because `Zcu.SrcLoc` is not valid across incremental updates, and we want to be able to reuse error messages from previous updates even if the source file in question changed. The error reporting logic should instead simply resolve the location from the `LazySrcLoc` on the fly.	2024-06-15 00:57:52 +01:00
Veikka Tuominen	15791b8b1a	Sema: validate function signature for Signal calling convention	2024-06-02 21:42:13 +03:00
Veikka Tuominen	17a0458e53	Sema: add missing error for runtime `@ptrFromInt` to comptime-only type Closes #20083	2024-06-02 21:42:13 +03:00
Andrew Kelley	9be8a9000f	Revert "implement `@expect` builtin (#19658)" This reverts commit a7de02e05216db9a04e438703ddf1b6b12f3fbef. This did not implement the accepted proposal, and I did not sign off on the changes. I would like a chance to review this, please.	2024-05-22 09:57:43 -07:00
David Rubin	a7de02e052	implement `@expect` builtin (#19658) * implement `@expect` * add docs * add a second arg for expected bool * fix typo * move `expect` to use BinOp * update to newer langref format	2024-05-22 10:51:16 -05:00
wooster0	ac55685a94	Sema: add missing declared here note	2024-05-22 02:16:56 +09:00
Wooster	f14cf13ff8	Sema: suggest using try/catch/if on method call on error union	2024-05-14 01:13:49 +09:00
r00ster91	9ae43567a3	Sema: improve error set/union discard/ignore errors Previously the error had a note suggesting to use `try`, `catch`, or `if`, even for error sets where none of those work. Instead, in case of an error set the way you can handle the error depends very much on the specific case. For example you might be in a `catch` where you are discarding or ignoring the error set capture value, in which case one way to handle the error might be to `return` the error. So, in that case, we do not attach that error note. Additionally, this makes the error tell you what kind of an error it is: is it an error set or an error union? This distinction is very relevant in how to handle the error.	2024-05-14 01:13:49 +09:00
r00ster91	8579904ddd	Sema: add error note for !?Type types when optional type is expected	2024-05-14 01:13:49 +09:00
r00ster91	60830e36e3	Sema error: talk about discarding instead of suppressing Maybe I'm just being pedantic here (most likely) but I don't like how we're just telling the user here how to "suppress this error" by "assigning the value to '_'". I think it's better if we use the word "discard" here which I think is the official terminology and also tells the user what it actually means to "assign the value to '_'". Also, using the value would also be a way to "suppress the error". It is just one of the two options: discard or use.	2024-05-14 01:13:48 +09:00
David Rubin	d9e0cafe64	riscv: add stage2_riscv to test matrix and bypass failing tests	2024-05-11 02:17:24 -07:00
David Rubin	b87baad0ff	error on `undefined` end index	2024-04-23 19:25:49 +03:00
David Rubin	187f0c1e26	Sema: correctly make inferred allocs constant Resolves: #19677	2024-04-18 04:45:14 +00:00
mlugg	03ad862197	compiler: un-implement #19634 This commit reverts the handling of partially-undefined values in bitcasting to transform these bits into an arbitrary numeric value, like happens on `master` today. As @andrewrk rightly points out, #19634 has unfortunate consequences for the standard library, and likely requires more thought. To avoid a major breaking change, it has been decided to revert this design decision for now, and make a more informed decision further down the line.	2024-04-17 13:41:25 +01:00
mlugg	d0e74ffe52	compiler: rework comptime pointer representation and access We've got a big one here! This commit reworks how we represent pointers in the InternPool, and rewrites the logic for loading and storing from them at comptime. Firstly, the pointer representation. Previously, pointers were represented in a highly structured manner: pointers to fields, array elements, etc, were explicitly represented. This works well for simple cases, but is quite difficult to handle in the cases of unusual reinterpretations, pointer casts, offsets, etc. Therefore, pointers are now represented in a more "flat" manner. For types without well-defined layouts -- such as comptime-only types, automatic-layout aggregates, and so on -- we still use this "hierarchical" structure. However, for types with well-defined layouts, we use a byte offset associated with the pointer. This allows the comptime pointer access logic to deal with reinterpreted pointers far more gracefully, because the "base address" of a pointer -- for instance a `field` -- is a single value which pointer accesses cannot exceed since the parent has undefined layout. This strategy is also more useful to most backends -- see the updated logic in `codegen.zig` and `codegen/llvm.zig`. For backends which do prefer a chain of field and elements accesses for lowering pointer values, such as SPIR-V, there is a helpful function in `Value` which creates a strategy to derive a pointer value using ideally only field and element accesses. This is actually more correct than the previous logic, since it correctly handles pointer casts which, after the dust has settled, end up referring exactly to an aggregate field or array element. In terms of the pointer access code, it has been rewritten from the ground up. The old logic had become rather a mess of special cases being added whenever bugs were hit, and was still riddled with bugs. 
The new logic was written to handle the "difficult" cases correctly, the most notable of which is restructuring of a comptime-only array (for instance, converting a `[3][2]comptime_int` to a `[2][3]comptime_int`). Currently, the logic for loading and storing works somewhat differently, but a future change will likely improve the loading logic to bring it more in line with the store strategy. As far as I can tell, the rewrite has fixed all bugs exposed by #19414. As a part of this, the comptime bitcast logic has also been rewritten. Previously, bitcasts simply worked by serializing the entire value into an in-memory buffer, then deserializing it. This strategy has two key weaknesses: pointers, and undefined values. Representations of these values at comptime cannot be easily serialized/deserialized whilst preserving data, which means many bitcasts would become runtime-known if pointers were involved, or would turn `undefined` values into `0xAA`. The new logic works by "flattening" the data structure to be cast into a sequence of bit-packed atomic values, and then "unflattening" it; using serialization when necessary, but with special handling for `undefined` values and for pointers which align in virtual memory. The resulting code is definitely slower -- more on this later -- but it is correct. The pointer access and bitcast logic required some helper functions and types which are not generally useful elsewhere, so I opted to split them into separate files `Sema/comptime_ptr_access.zig` and `Sema/bitcast.zig`, with simple re-exports in `Sema.zig` for their small public APIs. Whilst working on this branch, I caught various unrelated bugs with transitive Sema errors, and with the handling of `undefined` values. These bugs have been fixed, and corresponding behavior tests added. In terms of performance, I do anticipate that this commit will regress performance somewhat, because the new pointer access and bitcast logic is necessarily more complex. 
I have not yet taken performance measurements, but will do shortly, and post the results in this PR. If the performance regression is severe, I will do work to optimize the new logic before merge. Resolves: #19452 Resolves: #19460	2024-04-17 13:41:25 +01:00
Jacob Young	eb723a4070	Update uses of `@fieldParentPtr` to use RLS	2024-03-30 20:50:48 -04:00
Jacob Young	17673dcd6e	AstGen: use RLS to infer the first argument of `@fieldParentPtr`	2024-03-30 20:50:48 -04:00
Jacob Young	9b2345e182	Sema: rework `@fieldParentPtr` to accept a pointer type There is no way to know the expected parent pointer attributes (most notably alignment) from the type of the field pointer, so provide them in the first argument.	2024-03-30 20:50:48 -04:00
Veikka Tuominen	9106fdffaf	Sema: check error union payload types in `@errorCast`	2024-03-28 15:39:47 +02:00
Veikka Tuominen	60614b2a85	add tests for fixed stage1 bugs Closes #10357 Closes #11236 Closes #11615 Closes #12055	2024-03-28 15:24:01 +02:00
HydroH	7aa42f47b7	allow `@errorCast` to cast error sets to error unions	2024-03-28 10:23:32 +00:00
mlugg	845226a7c9	cases: necessary changes from branch	2024-03-26 17:06:14 +00:00
mlugg	9c3670fc93	compiler: implement analysis-local comptime-mutable memory This commit changes how we represent comptime-mutable memory (`comptime var`) in the compiler in order to implement the intended behavior that references to such memory can only exist at comptime. It does not clean up the representation of mutable values, improve the representation of comptime-known pointers, or fix the many bugs in the comptime pointer access code. These will be future enhancements. Comptime memory lives for the duration of a single Sema, and is not permitted to escape that one analysis, either by becoming runtime-known or by becoming comptime-known to other analyses. These restrictions mean that we can represent comptime allocations not via Decl, but with state local to Sema - specifically, the new `Sema.comptime_allocs` field. All comptime-mutable allocations, as well as any comptime-known const allocs containing references to such memory, live in here. This allows for relatively fast checking of whether a value references any comptime-mutable memory, since we need only traverse values up to pointers: pointers to Decls can never reference comptime-mutable memory, and pointers into `Sema.comptime_allocs` always do. This change exposed some faulty pointer access logic in `Value.zig`. I've fixed the important cases, but there are some TODOs I've put in which are definitely possible to hit with sufficiently esoteric code. I plan to resolve these by auditing all direct accesses to pointers (most of them ought to use Sema to perform the pointer access!), but for now this is sufficient for all realistic code and to get tests passing. This change eliminates `Zcu.tmp_hack_arena`, instead using the Sema arena for comptime memory mutations, which is possible since comptime memory is now local to the current Sema. This change should allow `Decl` to store only an `InternPool.Index` rather than a full-blown `ty: Type, val: Value`. This commit does not perform this refactor.	2024-03-25 14:49:41 +00:00
Andrew Kelley	95cb939440	Merge pull request #19333 from Vexu/fixes Miscellaneous error fixes	2024-03-17 15:26:55 -07:00
Veikka Tuominen	f983adfc10	Sema: fix printing of inferred error set of generic fn Closes #19332	2024-03-17 13:33:05 +02:00
Jacob Young	d10c52c194	AstGen: disallow alignment on function types A pointer type already has an alignment, so this information does not need to be duplicated on the function type. This already has precedent with addrspace which is already disallowed on function types for this reason. Also fixes `@TypeOf(&func)` to have the correct addrspace and alignment.	2024-03-17 03:06:17 +01:00
mlugg	48af67c152	Zcu: rename implicitly-named decls to avoid overriding by explicit decls	2024-03-14 07:40:05 +00:00
mlugg	00969062a9	compiler: detect duplicate test names in AstGen There is no reason to perform this detection during semantic analysis. In fact, doing so is problematic, because we wish to utilize detection of existing decls in a namespace in incremental compilation.	2024-03-14 07:40:05 +00:00
Tristan Ross	6067d39522	std.builtin: make atomic order fields lowercase	2024-03-11 07:09:10 -07:00
Tristan Ross	099f3c4039	std.builtin: make container layout fields lowercase	2024-03-11 07:09:07 -07:00
mlugg	e1d8187028	cases: correct after #18816 I changed an error message and fixed a minor bug while implementing this proposal, which led to a few compile error cases failing.	2024-03-06 21:26:38 +00:00
John Schmidt	7a045ede7c	Check for inactive union field when calling fn at comptime Reuse `unionFieldPtr` here to ensure that all the safety checks are included. Closes https://github.com/ziglang/zig/issues/18546.	2024-02-26 16:55:17 -08:00
Andrew Kelley	3e79c0f18c	Merge pull request #18859 from schmee/switch-union-capture-align Sema: preserve field alignment in union pointer captures	2024-02-26 16:52:39 -08:00

518 Commits