16273 Commits

Author SHA1 Message Date
Andrew Kelley
3e79315d19 x86 backend: don't read bogus safety flag
Safety is not a global flag that should be enabled or disabled for all
stores - it's lowered by the frontend directly into AIR instruction
semantics. The flag for this is communicated via the `store` vs
`store_safe` AIR instructions, and whether to write 0xaa bytes or not
should be decided in `airStore` and passed down via function parameters.
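
A minimal sketch of the idea in C, not the backend's actual code: the instruction tag alone decides whether the 0xaa fill happens, and that decision reaches the helper as a parameter rather than through a global flag. The names `store_tag`, `lower_undef_store`, and `air_store_undef` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the `store` and `store_safe` instruction tags. */
enum store_tag { STORE, STORE_SAFE };

/* Lower-level helper: the caller decides whether the safety fill is wanted. */
void lower_undef_store(unsigned char *dst, size_t len, bool safety) {
    assert(dst != NULL || len == 0);
    if (safety) {
        memset(dst, 0xaa, len); /* make reads of undefined memory recognizable */
    }
    /* Without safety, the destination bytes are left untouched. */
}

/* Entry point: the tag is translated into the flag exactly once, here. */
void air_store_undef(enum store_tag tag, unsigned char *dst, size_t len) {
    lower_undef_store(dst, len, tag == STORE_SAFE);
}
```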

This commit is a step backwards since it removes functionality but it
aims our feet towards a better mountain to climb.
2023-09-19 00:43:21 -07:00
Jay Petacat
f91ff9a746 translate-c: Struct fields default to zero value
C99 introduced designated initializers for structs. Omitted fields are
implicitly initialized to zero. Some C APIs are designed with this in
mind. Defaulting to zero values for translated struct fields permits Zig
code to comfortably use such an API.
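
For illustration, a C API of the kind described might look like the sketch below. The struct, its fields, and `DEFAULT_FLAGS` are hypothetical; the only behavior relied on is the C99 rule that fields omitted from a designated initializer are zero-initialized.

```c
#include <assert.h>
#include <stddef.h>

#define DEFAULT_FLAGS 3 /* hypothetical default chosen by the library */

/* Hypothetical options struct designed so that zero means "use the default". */
struct window_options {
    int width;
    int height;
    int flags;
    const char *title;
};

/* Only width and height are named; flags and title are implicitly 0 and NULL. */
struct window_options default_sized_window(void) {
    struct window_options opts = { .width = 640, .height = 480 };
    return opts;
}

/* The library treats a zero field as a request for its default value. */
int effective_flags(const struct window_options *opts) {
    assert(opts != NULL);
    return opts->flags == 0 ? DEFAULT_FLAGS : opts->flags;
}
```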

Closes #8165
2023-09-18 23:09:47 -07:00
mlugg
9ea2076663 translate-c: prevent variable names conflicting with type names
This introduces the concept of a "weak global name" into translate-c.

translate-c consists of two passes. The first is important, because it
discovers all global names, which are used to prevent naming conflicts:
whenever we see an identifier in the second pass, we can mangle it if it
conflicts with any global or any other in-scope identifier.

Unfortunately, this is a bit tricky for structs, unions, and enums. In
C, these types are not represented by normal identifiers, but by separate
tags - `struct foo` does not prevent an unrelated identifier `foo`
existing. In general, we want to translate type names to user-friendly
ones such as `struct_foo` and `foo` where possible, but we can't
guarantee such names will not conflict with real variable names.
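
A small C example of the situation described (the names are made up for illustration): the tag `foo` and the variable `foo` live in separate namespaces, so both declarations are valid in the same scope.

```c
#include <assert.h>
#include <stddef.h>

struct foo { int value; };  /* `foo` in the tag namespace */
int foo = 7;                /* unrelated `foo` in the ordinary identifier namespace */

int sum_tag_and_variable(const struct foo *f) {
    assert(f != NULL);
    return f->value + foo;  /* `struct foo` and `foo` name different things */
}
```

A translation that names the type plain `foo` would collide with the variable here, which is the conflict the rest of this message addresses.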

This is where weak global names come in. In the initial pass, when a
global type declaration is seen, `struct_foo` and `foo` are both added
as weak global names. This essentially means that we will use these
names for the type *if possible*, but if there is another global with
the same name, we will mangle the type name instead. Then, when actually
translating the declaration, we check whether there's a "true" global
with a conflicting name, in which case we mangle our name. If the
user-friendly alias `foo` conflicts, we do not attempt to mangle it: we
just don't emit it, because a mangled alias isn't particularly helpful.
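
The decision described above could be sketched roughly as follows. This is C pseudocode made runnable, not translate-c's implementation: the function names are hypothetical, and the `_N` suffix is only an illustrative mangling scheme.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* True if `name` is one of the "true" (non-weak) globals found in pass one. */
bool is_true_global(const char *const *globals, size_t count, const char *name) {
    for (size_t i = 0; i < count; i++) {
        if (strcmp(globals[i], name) == 0) return true;
    }
    return false;
}

/* Use the friendly type name if it is free; otherwise mangle it.
 * The numeric suffix is an assumption made for this sketch. */
void choose_type_name(const char *const *globals, size_t count,
                      const char *friendly, char *out, size_t out_len) {
    assert(out != NULL && out_len > 0);
    snprintf(out, out_len, "%s", friendly);
    for (unsigned n = 1; is_true_global(globals, count, out); n++) {
        snprintf(out, out_len, "%s_%u", friendly, n);
    }
}

/* The short alias is never mangled: it is emitted only when it is free. */
bool should_emit_alias(const char *const *globals, size_t count,
                       const char *alias) {
    return !is_true_global(globals, count, alias);
}
```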
2023-09-18 14:12:33 +03:00
Andrew Kelley
c970dbdfec LLVM: cache LLVM struct field indexes
This is an optimization to avoid O(N) field index lookups. It's also
nicer in terms of DRY; the only tradeoff is memory usage.
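
A rough C sketch of the tradeoff, under the assumption that some source fields are not lowered to a backend field, so a source field index and a lowered field index can differ. The function names and the `is_lowered` array are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* O(N) lookup: count how many earlier fields were actually lowered. */
unsigned lowered_index_scan(const bool *is_lowered, size_t field) {
    unsigned index = 0;
    for (size_t i = 0; i < field; i++) {
        if (is_lowered[i]) index++;
    }
    return index;
}

/* One-time pass that fills a table, so each later lookup is one array read.
 * The table is the extra memory the message mentions. */
void build_index_cache(const bool *is_lowered, size_t count, unsigned *cache) {
    assert(is_lowered != NULL && cache != NULL);
    unsigned next = 0;
    for (size_t i = 0; i < count; i++) {
        cache[i] = next;
        if (is_lowered[i]) next++;
    }
}
```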
2023-09-18 00:09:30 -07:00
Andrew Kelley
48e2ba3b3c
Merge pull request #17179 from mlugg/fix/translate-c
translate-c fixes
2023-09-17 13:54:35 -07:00
Ryan Liptak
471f279cd6 Fix rc preprocessing when using the MinGW includes and targeting the GNU abi
Also update the standalone test so that this failure would have been detected on any host system.
2023-09-17 13:09:16 -07:00
Loris Cro
c1e94b28a3 autodoc: split json payload per field
this will make s3 re-enable compression for the stdlib's autodoc
and improve loading times (and data usage) for users

alongside this commit the deploy script for the official website is also
being updated
2023-09-17 18:23:21 +02:00
mlugg
0fa8cf44f6
translate-c: do not translate macros which use arguments as struct/union/enum names
Consider this C macro:
```c
#define FOO(x) struct x
```

Previously, translate-c did not detect that the `x` in the body referred
to the argument, so wrongly translated this code as using the
nonexistent `struct_x`. Since undefined identifiers are noticed in
AstGen, this prevents the translated file from being usable at all.

translate-c now instead detects this case and emits an appropriate
compile error in the macro's place.
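
For context, this is how such a macro is typically used on the C side. The struct and function names are made up for illustration; the point is that `x` is substituted into the tag position, so no type called `struct x` ever exists.

```c
#include <assert.h>
#include <stddef.h>

#define FOO(x) struct x  /* the argument is used as a struct tag */

struct point { int px; int py; };

/* `const FOO(point) *p` expands to `const struct point *p`. */
int point_sum(const FOO(point) *p) {
    assert(p != NULL);
    return p->px + p->py;
}
```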
2023-09-17 12:41:11 +01:00
mlugg
28caaea093
AstGen: allow closure over known-runtime values within @TypeOf
AstGen emits an error when a closure over a known-runtime value crosses
a namespace boundary. This usually makes sense: however, this usage is
actually valid if the capture is within a `@TypeOf` operand. Sema
already has a special case to allow such closure within `@TypeOf` when
AstGen could not determine a value to be runtime-known. This commit
simply introduces analogous logic to AstGen to allow `var`s to cross
namespace boundaries within `@TypeOf`.
2023-09-17 12:41:11 +01:00
Ryan Liptak
0168ed7bf1 rc compilation: Use MSVC includes if present, fallback to mingw
The include directories used when preprocessing .rc files are now separate from the target, and by default will use the system MSVC include paths if the MSVC + Windows SDK are present, otherwise it will fall back to the MinGW includes distributed with Zig. This default behavior can be overridden by the `-rcincludes` option (possible values: any (the default), msvc, gnu, or none).

This behavior is useful because Windows resource files may `#include` files that only exist within the MSVC include dirs (e.g. in `<MSVC install directory>/atlmfc/include` which can contain other .rc files, images, icons, cursors, etc). So, by defaulting to the `any` behavior (MSVC if present, MinGW fallback), users will by default get behavior that is most-likely-to-work.

It also should be okay that the include directories used when compiling .rc files differ from the include directories used when compiling the main binary, since the .res format is not dependent on anything ABI-related. The only relevant differences would be things like `#define` constants being different values in the MinGW headers vs the MSVC headers, but any such differences would likely be a MinGW bug.
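
The selection rule for `-rcincludes` can be summarized with a short C sketch. The enum and function names are hypothetical, and the sketch covers only what is stated above: `any` resolves to MSVC when it is detected and to MinGW otherwise, while the other values are taken as given. What happens when `msvc` is requested but not installed is not covered here.

```c
#include <assert.h>
#include <stdbool.h>

/* The four documented values of -rcincludes; the enum names are hypothetical. */
enum rc_includes { RC_ANY, RC_MSVC, RC_GNU, RC_NONE };

/* Resolve the requested mode. Only `any` depends on detection. */
enum rc_includes resolve_rc_includes(enum rc_includes requested,
                                     bool msvc_present) {
    assert(requested <= RC_NONE);
    if (requested == RC_ANY) {
        return msvc_present ? RC_MSVC : RC_GNU;
    }
    return requested;
}
```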
2023-09-17 03:09:58 -07:00
Ryan Liptak
4fac7a5263 Only populate rc_include_dirs if there are .rc files in the compilation 2023-09-17 03:09:45 -07:00
Ryan Liptak
73f581b7bc Disallow .rc/.res files unless the object format is coff 2023-09-17 03:09:45 -07:00
Ryan Liptak
28f6559947 Add the ATLMFC include directory to the libc include dir list
https://learn.microsoft.com/en-us/cpp/mfc/mfc-and-atl

Note that this include directory gets added to %INCLUDE% by vcvarsall.bat, and is especially crucial when working with resource files (many .rc files within the https://github.com/microsoft/Windows-classic-samples/ set reference files from the ATLMFC include directory).
2023-09-17 03:09:45 -07:00
Ryan Liptak
2a56fe1175 Add a .rc -> .res compiler to the Zig compiler 2023-09-17 03:09:45 -07:00
Krzysztof Wolicki
f2026e7dd6 autodoc: Implement builtin function rendering.
Implement unary ops handling.
Fix getType in main.js
Minor cleanup of builtin function handling.
2023-09-16 17:35:11 +02:00
Krzysztof Wolicki
da28379d6c autodoc: Remove unnecessary Expr tag 2023-09-16 17:30:49 +02:00
Krzysztof Wolicki
8a9aa9e112 autodoc: Handle ref ZIR instruction 2023-09-16 17:30:49 +02:00
Ryan Liptak
8e35be0640 ErrorBundle: rename addBundle to addBundleAsNotes, add addBundleAsRoots 2023-09-15 23:36:44 -07:00
mlugg
6df78c3bc1 Sema: mark pointers to inline functions as comptime-only
This is supposed to be the case, similar to how pointers to generic
functions are comptime-only (several pieces of logic already assumed
this). These types being considered runtime was causing `dbg_var_val`
AIR instructions to be wrongly emitted for such values, causing codegen
backends to create a runtime reference to the inline function, which (at
least on the LLVM backend) triggers an error.

Resolves: #38
2023-09-15 21:46:38 -07:00
Andrew Kelley
61b70778bd
Merge pull request #17156 from mlugg/destructure
compiler: implement destructuring syntax
2023-09-15 14:51:52 -07:00
mlugg
94529ffb62 package manager: write deps in a flat format, eliminating the FQN concept
The new `@dependencies` module contains generated code like the
following (where strings like "abc123" represent hashes):

```zig
pub const root_deps = [_]struct { []const u8, []const u8 }{
    .{ "foo", "abc123" },
};

pub const packages = struct {
    pub const abc123 = struct {
        pub const build_root = "/home/mlugg/.cache/zig/blah/abc123";
        pub const build_zig = @import("abc123");
        pub const deps = [_]struct { []const u8, []const u8 }{
            .{ "bar", "abc123" },
            .{ "name", "ghi789" },
        };
    };
};
```

Each package contains a build root string, the build.zig import, and a
mapping from dependency names to package hashes. There is also such a
mapping for the root package dependencies.

In theory, we could now remove the `dep_prefix` field from `std.Build`,
since its main purpose is now handled differently. I believe this is a
desirable goal, as it doesn't really make sense to assign a single FQN
to any package (because it may appear in many different places in the
package hierarchy). This commit does not remove that field, as it's used
non-trivially in a few places in the build runner and compiler tests:
this will be a future enhancement.

Resolves: #16354
Resolves: #17135
2023-09-15 14:04:23 -07:00
mlugg
f366d9f879 compiler: start using destructure syntax 2023-09-15 11:42:08 -07:00
mlugg
88f5315ddf compiler: implement destructuring syntax
This change implements the following syntax into the compiler:

```zig
const x: u32, var y, foo.bar = .{ 1, 2, 3 };
```

A destructure expression may only appear within a block (i.e. not at
container scope). The LHS consists of a sequence of comma-separated var
decls and/or lvalue expressions. The RHS is a normal expression.

A new result location type, `destructure`, is used, which contains
result pointers for each component of the destructure. This means that
when the RHS is a more complicated expression, peer type resolution is
not used: each result value is individually destructured and written to
the result pointers. RLS is always used for destructure expressions,
meaning every `const` on the LHS of such an expression creates a true
stack allocation.

Aside from anonymous array literals, Sema is capable of destructuring
the following types:
* Tuples
* Arrays
* Vectors

A destructure may be prefixed with the `comptime` keyword, in which case
the entire destructure is evaluated at comptime: this means all `var`s
in the LHS are `comptime var`s, every lvalue expression is evaluated at
comptime, and the RHS is evaluated at comptime. If every LHS is a
`const`, this is not allowed: as with single declarations, the user
should instead mark the RHS as `comptime`.

There are a few subtleties in the grammar changes here. For one thing,
if every LHS is an lvalue expression (rather than a var decl), a
destructure is considered an expression. This makes, for instance,
`if (cond) x, y = .{ 1, 2 };` valid Zig code. A destructure is allowed
in almost every context where a standard assignment expression is
permitted. The exception is `switch` prongs, which cannot be
destructures as the comma is ambiguous with the end of the prong.

A follow-up commit will begin utilizing this syntax in the Zig compiler.

Resolves: #498
2023-09-15 11:33:53 -07:00
mlugg
50ef10eb49
Sema: add missing compile error for runtime-known const with comptime-only type
When RLS is used to initialize a value with a comptime-only type, the
usual "value with comptime-only type depends on runtime control flow"
error message isn't hit, because we don't use results from a block. When
we reach `make_ptr_const`, we must validate that the value is
comptime-known.
2023-09-15 14:29:57 +01:00
mlugg
cba7e8a4e9 AstGen: do not forward result pointers through @as
The `coerce_result_ptr` instruction is highly problematic and leads to
unintentional memory reinterpretation in some cases. It is more correct
to simply not forward result pointers through this builtin.

`coerce_result_ptr` is still used for struct and array initializations,
where it can still cause issues. Eliminating this usage will be a future
change.

Resolves: #16991
2023-09-15 01:05:02 -07:00
Andrew Kelley
8592c5cdac compiler: rework capture scopes in-memory layout
* Use 32-bit integers instead of pointers for compactness and
  serialization friendliness.
* Use a separate hash map for runtime and comptime capture scopes,
  avoiding the 1-bit union tag.
* Use a compact array representation instead of a tree of hash maps.
* Eliminate the only instance of ref-counting in the compiler, instead
  relying on garbage collection (not implemented yet but is the plan for
  almost all long-lived objects related to incremental compilation).
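
As a generic illustration of the first and third points (in C, with hypothetical type and function names rather than the compiler's actual data structures), scopes can live in one flat array and refer to their parent by a 32-bit index, with a sentinel value in place of a null pointer:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t scope_index;
#define SCOPE_NONE UINT32_MAX  /* sentinel used instead of a NULL parent */

/* One entry per scope; every entry lives in the same flat array. */
struct capture_scope {
    scope_index parent;
};

/* Walk parent links by index. No pointers are stored, so the array can be
 * written to disk or moved in memory without fixing anything up. */
uint32_t scope_depth(const struct capture_scope *scopes, size_t count,
                     scope_index start) {
    uint32_t depth = 0;
    for (scope_index i = start; i != SCOPE_NONE; i = scopes[i].parent) {
        assert(i < count);
        depth++;
    }
    return depth;
}
```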

Because a code modification may need to access capture scope data, this
makes capture scope data long-lived state. My goal is to get incremental
compilation state serialization down to a single pwritev syscall, by
unifying the on-disk representation with the in-memory representation.
This commit eliminates the last remaining pointer field of
`Module.Decl`.
2023-09-15 00:55:07 -07:00
Ryan Liptak
30e1883834 Add -includes option to zig libc
Prints the detected libc include paths for the target and exits
2023-09-14 11:04:34 -07:00
Jakub Konka
fc86b80b3b elf: correctly handle overflows on non-64bit hosts 2023-09-13 22:38:44 +02:00
Jakub Konka
d4c1e85a13 elf: skip writing non-alloc and zerofill atoms 2023-09-13 21:51:43 +02:00
Jakub Konka
a9f1b994bd elf: allocate locals and globals in objects 2023-09-13 21:51:43 +02:00
Jakub Konka
d37cb60621 elf: re-enable linking compiler_rt 2023-09-13 21:51:43 +02:00
Jakub Konka
ce88df497c elf: do not store Symbol's index in Symbol 2023-09-13 21:51:43 +02:00
Jakub Konka
dbde746f9d elf: parse archives 2023-09-13 21:51:43 +02:00
Jakub Konka
5eef7577d1 elf: handle more relocs with GOT relaxation 2023-09-13 21:51:43 +02:00
Jakub Konka
7a16a97671 x86_64: add simple disassembler interface to the encoder 2023-09-13 21:51:43 +02:00
Jakub Konka
9de0df76a8 elf: allocate .bss section and matching PHDR 2023-09-13 21:51:43 +02:00
Jakub Konka
0d924d2da6 elf: look for entry point globally if not set by incremental compiler 2023-09-13 12:33:51 +02:00
Jakub Konka
31f363d51f elf: enable linker for non-incremental code paths 2023-09-13 12:17:02 +02:00
Jakub Konka
4d29b39678
Merge pull request #17113 from ziglang/elf-linker
elf: upstream zld/ELF functionality, part 1
2023-09-13 10:07:07 +02:00
Andrew Kelley
cb6201715a InternPool: prevent anon struct UAF bugs with type safety
Instead of using actual slices for InternPool.Key.AnonStructType, this
commit changes to use Slice types instead, which store a
long-lived index rather than a pointer.
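
The general hazard and the fix can be shown with a small C sketch. The `pool` and `index_slice` types are hypothetical and are not InternPool's: a pointer into a growable array dangles once the array is reallocated, while a (start, len) pair stays valid because it is resolved against the current storage on every access.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct pool { uint32_t *items; uint32_t len; uint32_t cap; };

/* Index-based slice: remains valid even if `items` is moved by realloc. */
struct index_slice { uint32_t start; uint32_t len; };

/* Append one value, growing (and possibly moving) the backing array.
 * Returns 0 on success and -1 on allocation failure. */
int pool_append(struct pool *p, uint32_t value) {
    assert(p != NULL);
    if (p->len == p->cap) {
        uint32_t new_cap = p->cap ? p->cap * 2 : 4;
        uint32_t *moved = realloc(p->items, new_cap * sizeof *moved);
        if (moved == NULL) return -1;
        p->items = moved;
        p->cap = new_cap;
    }
    p->items[p->len++] = value;
    return 0;
}

/* Resolve element `i` of a slice against the pool's current storage. */
uint32_t slice_at(const struct pool *p, struct index_slice s, uint32_t i) {
    assert(p != NULL && i < s.len && s.start + s.len <= p->len);
    return p->items[s.start + i];
}
```

Because callers never hold a raw pointer between appends, there is nothing left to dangle when the pool grows.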

This is a follow-up to 7ef1eb1c27754cb0349fdc10db1f02ff2dddd99b.
2023-09-12 20:08:56 -04:00
Jakub Konka
6910a50ae5 elf: add u64 to usize casts where required 2023-09-13 00:31:41 +02:00
Jakub Konka
1a6d12ea92 elf: clean up and unify symbol ref handling in relocs
Also, this lets us re-enable proper undefined symbols tracking.
2023-09-12 23:27:14 +02:00
Jakub Konka
9719fa7412 elf: include C compilation artifacts on the linker line 2023-09-12 19:26:51 +02:00
Jakub Konka
ae74a36af0 elf: resolve and write objects to file 2023-09-12 19:17:57 +02:00
Jakub Konka
652ebf3b6a elf: allocate objects, currently atom-by-atom 2023-09-12 18:07:10 +02:00
Jakub Konka
9db472cff6 elf: set output section index of a global when resolving 2023-09-12 17:52:55 +02:00
Jakub Konka
472d326a8c elf: set output section index when parsing objects 2023-09-12 17:35:56 +02:00
Jakub Konka
44e84af874 elf: add simplistic reloc scanning mechanism 2023-09-12 16:32:55 +02:00
Jakub Konka
c654f3b0ee elf: claim unresolved dangling symbols that can be claimed 2023-09-12 15:44:16 +02:00
Jakub Konka
b478a0dd1a elf: mark imports-exports; populate symtab with objects 2023-09-12 15:14:38 +02:00