838 Commits

Author SHA1 Message Date
Andrew Kelley
d34fae26d5 LLVM 18 std lib updates and fixes
* some manual fixes to generated CPU features code. In the future it
  would be nice to make the script do those automatically.

* add to various target OS switches. Some of the values I was unsure of
  and added TODO panics, for example in the case of spirv CPU arch.
2024-05-08 19:37:28 -07:00
Nameless
aecd9cc6d1 std.posix.iovec: use .base and .len instead of .iov_base and .iov_len 2024-04-28 00:20:30 -07:00
Jacob Young
5d745d94fb x86_64: fix C abi for unions
Closes #19721
2024-04-22 15:24:29 -07:00
mlugg
d0e74ffe52
compiler: rework comptime pointer representation and access
We've got a big one here! This commit reworks how we represent pointers
in the InternPool, and rewrites the logic for loading and storing from
them at comptime.

Firstly, the pointer representation. Previously, pointers were
represented in a highly structured manner: pointers to fields, array
elements, etc, were explicitly represented. This works well for simple
cases, but is quite difficult to handle in the cases of unusual
reinterpretations, pointer casts, offsets, etc. Therefore, pointers are
now represented in a more "flat" manner. For types without well-defined
layouts -- such as comptime-only types, automatic-layout aggregates, and
so on -- we still use this "hierarchical" structure. However, for types
with well-defined layouts, we use a byte offset associated with the
pointer. This allows the comptime pointer access logic to deal with
reinterpreted pointers far more gracefully, because the "base address"
of a pointer -- for instance a `field` -- is a single value which
pointer accesses cannot exceed since the parent has undefined layout.
This strategy is also more useful to most backends -- see the updated
logic in `codegen.zig` and `codegen/llvm.zig`. For backends which do
prefer a chain of field and elements accesses for lowering pointer
values, such as SPIR-V, there is a helpful function in `Value` which
creates a strategy to derive a pointer value using ideally only field
and element accesses. This is actually more correct than the previous
logic, since it correctly handles pointer casts which, after the dust
has settled, end up referring exactly to an aggregate field or array
element.

In terms of the pointer access code, it has been rewritten from the
ground up. The old logic had become rather a mess of special cases being
added whenever bugs were hit, and was still riddled with bugs. The new
logic was written to handle the "difficult" cases correctly, the most
notable of which is restructuring of a comptime-only array (for
instance, converting a `[3][2]comptime_int` to a `[2][3]comptime_int`.
Currently, the logic for loading and storing work somewhat differently,
but a future change will likely improve the loading logic to bring it
more in line with the store strategy. As far as I can tell, the rewrite
has fixed all bugs exposed by #19414.

As a part of this, the comptime bitcast logic has also been rewritten.
Previously, bitcasts simply worked by serializing the entire value into
an in-memory buffer, then deserializing it. This strategy has two key
weaknesses: pointers, and undefined values. Representations of these
values at comptime cannot be easily serialized/deserialized whilst
preserving data, which means many bitcasts would become runtime-known if
pointers were involved, or would turn `undefined` values into `0xAA`.
The new logic works by "flattening" the datastructure to be cast into a
sequence of bit-packed atomic values, and then "unflattening" it; using
serialization when necessary, but with special handling for `undefined`
values and for pointers which align in virtual memory. The resulting
code is definitely slower -- more on this later -- but it is correct.

The pointer access and bitcast logic required some helper functions and
types which are not generally useful elsewhere, so I opted to split them
into separate files `Sema/comptime_ptr_access.zig` and
`Sema/bitcast.zig`, with simple re-exports in `Sema.zig` for their small
public APIs.

Whilst working on this branch, I caught various unrelated bugs with
transitive Sema errors, and with the handling of `undefined` values.
These bugs have been fixed, and corresponding behavior test added.

In terms of performance, I do anticipate that this commit will regress
performance somewhat, because the new pointer access and bitcast logic
is necessarily more complex. I have not yet taken performance
measurements, but will do shortly, and post the results in this PR. If
the performance regression is severe, I will do work to to optimize the
new logic before merge.

Resolves: #19452
Resolves: #19460
2024-04-17 13:41:25 +01:00
Jacob Young
7611d90ba0 InternPool: remove slice from byte aggregate keys
This deletes a ton of lookups and avoids many UAF bugs.

Closes #19485
2024-04-08 13:24:08 -04:00
Jacob Young
eb723a4070 Update uses of @fieldParentPtr to use RLS 2024-03-30 20:50:48 -04:00
Jacob Young
5a41704f7e cbe: rewrite CType
Closes #14904
2024-03-30 20:50:48 -04:00
mlugg
5132549565
Zcu: remove some unused functions 2024-03-26 21:35:21 +00:00
mlugg
a61def10c6
compiler: eliminate most usages of TypedValue 2024-03-26 13:48:07 +00:00
mlugg
b8d114a29e
Zcu: use Value instead of TypedValue when initializing legacy anon decls
Also removes some unnecessary uses of legacy anon decls for constructing
the array of test functions for the test runner.
2024-03-26 13:48:07 +00:00
mlugg
0d8c7ae007
Zcu.Decl: replace typedValue with valueOrFail
Now that the legacy `Value` representation is eliminated, we can begin
to phase out the redundant `TypedValue` type.
2024-03-26 13:48:07 +00:00
mlugg
920f2c7794
compiler: minor cleanups 2024-03-26 13:48:07 +00:00
mlugg
26a94e8481
Zcu: eliminate Decl.alive field
Legacy anon decls now have three uses:
* Type owner decls
* Function owner decls
* `@export` and `@extern`

Therefore, there are no longer any cases where we wish to explicitly
omit legacy anon decls from the binary. This means we can remove the
concept of an "alive" vs "dead" `Decl`, which also allows us to remove
the separate `anon_work_queue` in `Compilation`.
2024-03-26 13:48:06 +00:00
mlugg
884d957b6c
compiler: eliminate legacy Value representation
Good riddance!

Most of these changes are trivial. There's a fix for a minor bug this
exposed in `Value.readFromPackedMemory`, but aside from that, it's all
just things like changing `intern` calls to `toIntern`.
2024-03-26 13:48:06 +00:00
mlugg
5ec6e3036b
Sema: introduce separate MutableValue representation for comptime-mutable memory
Perhaps someday, we will make Sema operate on mutable values more
generally. For now, it makes sense to split out this representation,
since it is only used in comptime pointer accesses.

There are some currently unused methods on `MutableValue` which will
be used once I rewrite the comptime pointer access logic to be less
terrible.

The commit following this one will - at long last - delete the legacy
Value representation
2024-03-26 13:48:06 +00:00
mlugg
c6f3e9d79c
Zcu.Decl: remove ty field
`Decl` can no longer store un-interned values, so this field is now
unnecessary. The type can instead be fetched with the new `typeOf`
helper method, which just gets the type of the Decl's `Value`.
2024-03-26 13:48:06 +00:00
mlugg
9c3670fc93
compiler: implement analysis-local comptime-mutable memory
This commit changes how we represent comptime-mutable memory
(`comptime var`) in the compiler in order to implement the intended
behavior that references to such memory can only exist at comptime.

It does *not* clean up the representation of mutable values, improve the
representation of comptime-known pointers, or fix the many bugs in the
comptime pointer access code. These will be future enhancements.

Comptime memory lives for the duration of a single Sema, and is not
permitted to escape that one analysis, either by becoming runtime-known
or by becoming comptime-known to other analyses. These restrictions mean
that we can represent comptime allocations not via Decl, but with state
local to Sema - specifically, the new `Sema.comptime_allocs` field. All
comptime-mutable allocations, as well as any comptime-known const allocs
containing references to such memory, live in here. This allows for
relatively fast checking of whether a value references any
comptime-mtuable memory, since we need only traverse values up to
pointers: pointers to Decls can never reference comptime-mutable memory,
and pointers into `Sema.comptime_allocs` always do.

This change exposed some faulty pointer access logic in `Value.zig`.
I've fixed the important cases, but there are some TODOs I've put in
which are definitely possible to hit with sufficiently esoteric code. I
plan to resolve these by auditing all direct accesses to pointers (most
of them ought to use Sema to perform the pointer access!), but for now
this is sufficient for all realistic code and to get tests passing.

This change eliminates `Zcu.tmp_hack_arena`, instead using the Sema
arena for comptime memory mutations, which is possible since comptime
memory is now local to the current Sema.

This change should allow `Decl` to store only an `InternPool.Index`
rather than a full-blown `ty: Type, val: Value`. This commit does not
perform this refactor.
2024-03-25 14:49:41 +00:00
Andrew Kelley
cd62005f19 extract std.posix from std.os
closes #5019
2024-03-19 11:45:09 -07:00
Andrew Kelley
95cb939440
Merge pull request #19333 from Vexu/fixes
Miscellaneous error fixes
2024-03-17 15:26:55 -07:00
Veikka Tuominen
f983adfc10 Sema: fix printing of inferred error set of generic fn
Closes #19332
2024-03-17 13:33:05 +02:00
Jacob Young
d10c52c194 AstGen: disallow alignment on function types
A pointer type already has an alignment, so this information does not
need to be duplicated on the function type.  This already has precedence
with addrspace which is already disallowed on function types for this
reason.  Also fixes `@TypeOf(&func)` to have the correct addrspace and
alignment.
2024-03-17 03:06:17 +01:00
mlugg
7c32370194
Zcu: preserve ordering of usingnamespace decls
See comment. This slightly regresses a previous fix from this branch. A
proper solution will come soon, with the splitting up of `Decl` into
`Cau` and `Nav`.
2024-03-14 07:40:08 +00:00
mlugg
347196f905
Zcu: perform orphan checks against uncoerced function 2024-03-14 07:40:08 +00:00
mlugg
075c103332
compiler: add func_ies incremental dependencies
This was an oversight in my original design. This new form of dependency
is invalidated when the resolved IES for a runtime function changes.
2024-03-14 07:40:08 +00:00
mlugg
1421d329a3
compiler: progress towards incremental
Most basic re-analysis logic is now in place. Trivial updates are
hitting linker assertions.
2024-03-14 07:40:07 +00:00
mlugg
7ba8641d19
Zcu: handle updates to file root struct 2024-03-14 07:40:07 +00:00
mlugg
42fedb3f6c
Zcu: handle incremental updates (more) correctly when scanning namespaces 2024-03-14 07:40:06 +00:00
mlugg
d4ce1ec8dd
Zcu: convert Decl.zir_inst_index to a TrackedInst.Index.Optional
This makes tracking easier across incremental updates: `scanDecl` can
now tell whether an existing decl in a namespace was mapped to the one
it's analyzing in the new ZIR.
2024-03-14 07:40:06 +00:00
mlugg
7c4793f23c
Zcu: remove old decls after scanning namespace 2024-03-14 07:40:06 +00:00
mlugg
48af67c152
Zcu: rename implicitly-named decls to avoid overriding by explicit decls 2024-03-14 07:40:05 +00:00
mlugg
00969062a9
compiler: detect duplicate test names in AstGen
There is no reason to perform this detection during semantic analysis.
In fact, doing so is problematic, because we wish to utilize detection
of existing decls in a namespace in incremental compilation.
2024-03-14 07:40:05 +00:00
Tristan Ross
c260b4c753
std.builtin: make global linkage fields lowercase 2024-03-11 07:09:10 -07:00
Tristan Ross
099f3c4039
std.builtin: make container layout fields lowercase 2024-03-11 07:09:07 -07:00
mlugg
d0c022f734
compiler: namespace type equivalence based on AST node + captures
This implements the accepted proposal #18816. Namespace-owning types
(struct, enum, union, opaque) are no longer unique whenever analysed;
instead, their identity is determined based on their AST node and the
set of values they capture.

Reified types (`@Type`) are deduplicated based on the structure of the
type created. For instance, if two structs are created by the same
reification with identical fields, layout, etc, they will be the same
type.

This commit does not produce a working compiler; the next commit, adding
captures for decl references, is necessary. It felt appropriate to split
this up.

Resolves: #18816
2024-03-06 21:26:37 +00:00
mlugg
8ec6f730ef
compiler: represent captures directly in InternPool
These were previously associated with the type's namespace, but we need
to store them directly in the InternPool for #18816.
2024-03-06 21:26:37 +00:00
mlugg
975b859377
InternPool: create specialized functions for loading namespace types
Namespace types (`struct`, `enum`, `union`, `opaque`) do not use
structural equality - equivalence is based on their Decl index (and soon
will change to AST node + captures). However, we previously stored all
other information in the corresponding `InternPool.Key` anyway. For
logical consistency, it makes sense to have the key only be the true key
(that is, the Decl index) and to load all other data through another
function. This introduces those functions, by the name of
`loadStructType` etc. It's a big diff, but most of it is no-brainer
changes.

In future, it might be nice to eliminate a bunch of the loaded state in
favour of accessor functions on the `LoadedXyzType` types (like how we
have `LoadedUnionType.size()`), but that can be explored at a later
date.
2024-03-06 21:26:37 +00:00
mlugg
a6ca20b9a1
compiler: change representation of closures
This changes the representation of closures in Zir and Sema. Rather than
a pair of instructions `closure_capture` and `closure_get`, the system
now works as follows:

* Each ZIR type declaration (`struct_decl` etc) contains a list of
  captures in the form of ZIR indices (or, for efficiency, direct
  references to parent captures). This is an ordered list; indexes into
  it are used to refer to captured values.
* The `extended(closure_get)` ZIR instruction refers to a value in this
  list via a 16-bit index (limiting this index to 16 bits allows us to
  store this in `extended`).
* `Module.Namespace` has a new field `captures` which contains the list
  of values captured in a given namespace. This is initialized based on
  the ZIR capture list whenever a type declaration is analyzed.

This change eliminates `CaptureScope` from semantic analysis, which is a
nice simplification; but the main motivation here is that this change is
a prerequisite for #18816.
2024-03-06 21:26:37 +00:00
Matthew Lugg
9d500bda2d
Merge pull request #19117 from mlugg/dbg-var-blocks
Major ZIR size optimizations & small cleanups in Sema
2024-03-02 04:52:19 +00:00
Jacob Young
b60fc16b4f compiler: audit debug mode checks
* Introduce `-Ddebug-extensions` for enabling compiler debug helpers
 * Replace safety mode checks with `std.debug.runtime_safety`
 * Replace debugger helper checks with `!builtin.strip_debug_info`

Sometimes, you just have to debug optimized compilers...
2024-03-01 17:42:54 -08:00
mlugg
f51d9ab892
Sema: simplify and clarify analyzeBodyInner and wrapper functions
The signature and variants of Sema's main loop have evolved over time to
what was a quite confusing state of affairs. This commit makes minor
changes to how `analyzeBodyInner` works, and restructures/renames the
wrapper functions, adding doc comments to clarify their purposes. The
most notable change is that `analyzeBodyInner` now returns
`CompileError!void`; inline breaks are now all communicated via
`error.ComptimeBreak`.
2024-02-29 23:38:18 +00:00
Andrew Kelley
d661f0f35b compiler: JIT zig fmt
See #19063
2024-02-26 22:26:19 -07:00
Andrew Kelley
b116063e02 move AstGen to std.zig.AstGen
Part of an effort to ship more of the compiler in source form.
2024-02-26 21:51:19 -07:00
Andrew Kelley
7b37bc771b move Zir to std.zig.Zir
Part of an effort to ship more of the compiler in source form.
2024-02-26 21:35:30 -07:00
Andrew Kelley
f7143e18e3 move Zcu.LazySrcLoc to std.zig.LazySrcLoc
Part of an effort to ship more of the compiler in source form.
2024-02-26 21:35:30 -07:00
Jacob Young
d656c2a7ab test: rework how filtering works
* make test names contain the fully qualified name
 * make test filters match the fully qualified name
 * allow multiple test filters, where a test is skipped if it does not
   match any of the specified filters
2024-02-25 19:12:08 -08:00
Ryan Liptak
68b87918df Fix handling of Windows (WTF-16) and WASI (UTF-8) paths
Windows paths now use WTF-16 <-> WTF-8 conversion everywhere, which is lossless. Previously, conversion of ill-formed UTF-16 paths would either fail or invoke illegal behavior.

WASI paths must be valid UTF-8, and the relevant function calls have been updated to handle the possibility of failure due to paths not being encoded/encodable as valid UTF-8.

Closes #18694
Closes #1774
Closes #2565
2024-02-24 14:05:24 -08:00
Jacob Young
e60d667111 Module: fix @embedFile of files containing zero bytes
If an adapted string key with embedded nulls was put in a hash map with
`std.hash_map.StringIndexAdapter`, then an incorrect hash would be
entered for that entry such that it is possible that when looking for
the exact key that matches the prefix of the original key up to the
first null would sometimes match this entry due to hash collisions and
sometimes not if performed later after a grow + rehash, causing the same
key to exist with two different indices breaking every string equality
comparison ever, for example claiming that a container type doesn't
contain a field because the field name string in the struct and the
string representing the identifier to lookup might be equal strings but
have different string indices.  This could maybe be fixed by changing
`std.hash_map.StringIndexAdapter.hash` to only hash up to the first
null, therefore ensuring that the entry's hash is correct and that all
future lookups will be consistent, but I don't trust anything so instead
I assert that there are no embedded nulls.
2024-02-22 12:33:53 -08:00
mlugg
e6cf3ce24c
Sema: correct source location for return value coercion errors
When coercing the operand of a `ret_node` etc instruction, the source
location for errors used to point to the entire `return` statement.
Instead, we now point to the operand, as would be expected if there was
an explicit `as_node` instruction (like there used to be).
2024-02-16 11:26:35 +00:00
Andrew Kelley
54bbc73f85
Merge pull request #18712 from Vexu/std.options
std: make options a struct instance instead of a namespace
2024-02-09 13:38:42 -08:00
Matthew Lugg
0c80725068
Merge pull request #18814 from mlugg/incremental-dependencies
Begin re-implementing incremental compilation
2024-02-06 11:33:07 +00:00