414 Commits

Author SHA1 Message Date
Andrew Kelley
e501cf51a0 link.File.Wasm: unify the string tables
Before, the wasm struct had a string table, the ZigObject had a string
table, and each Object had a string table.

Now there is just the one. This makes for more efficient use of memory
and simplifies logic, particularly with regards to linker state
serialization.

This commit additionally adds significantly more integer type safety.
2024-10-31 22:01:10 -07:00
Andrew Kelley
f2dcfe0e40 link.File.Wasm: parse inputs in compilation pipeline
Primarily, this moves linker input parsing from flush() into the linker
task queue, which is executed simultaneously with the frontend.

I also made it avoid redundantly opening the same archive file N times
for each object file inside. Furthermore, hard code fixed buffer stream
rather than using a generic stream type.

Finally, I fixed the error handling of the Wasm.Archive.parse function.
Please pay attention to this pattern of returning a struct rather than
accepting a mutable struct as an argument. This ensures function-level
atomicity and makes resource management straightforward.

Deletes the file and path fields from Archive and Object.

Removed a well-meaning but ultimately misguided suggestion about how to
think about ZigObject since thinking about it that way has led to
problematic anti-DOD patterns.
2024-10-30 23:43:53 -07:00
Andrew Kelley
ba2d006634 link.File.Wasm: remove the "files" abstraction
Removes the `files` field from the Wasm linker, storing the ZigObject
as its own field instead using a tagged union.

This removes a layer of indirection when accessing the ZigObject, and
untangles logic so that we can introduce a "pre-link" phase that
prepares the linker state to handle only incremental updates to the
ZigObject and then minimize logic inside flush().

Furthermore, don't make array elements store their own indexes, that's
always a waste.

Flattens some of the file system hierarchy and unifies variable names
for easier refactoring.

Introduces type safety for optional object indexes.
2024-10-30 19:34:58 -07:00
Andrew Kelley
12f3a7c3c2 fix wasm crt logic 2024-10-23 16:27:38 -07:00
Andrew Kelley
e567abb339 rework linker inputs
* Compilation.objects changes to Compilation.link_inputs which stores
  objects, archives, windows resources, shared objects, and strings
  intended to be put directly into the dynamic section. Order is now
  preserved between all of these kinds of linker inputs. If it is
  determined the order does not matter for a particular kind of linker
  input, that item should be moved to a different array.
* rename system_libs to windows_libs
* untangle library lookup from CLI types
* when doing library lookup, instead of using access syscalls, go ahead
  and open the files and keep the handles around for passing to the
  cache system and the linker.
* during library lookup and cache file hashing, use positioned reads to
  avoid affecting the file seek position.
* library directories are opened in the CLI and converted to Directory
  objects, warnings emitted for those that cannot be opened.
2024-10-23 16:27:38 -07:00
Andrew Kelley
13fb68c064 link: consolidate diagnostics
By organizing linker diagnostics into this struct, it becomes possible
to share more code between linker backends, and more importantly it
becomes possible to pass only the Diag struct to some functions, rather
than passing the entire linker state object in. This makes data
dependencies more obvious, making it easier to rearrange code and to
multithread.

Also fix MachO code abusing an atomic variable. Not only was it using
the wrong atomic operation, it is unnecessary additional state since
the state is already being protected by a mutex.
2024-10-11 10:36:19 -07:00
Andrew Kelley
14c8e270bb link: fix false positive crtbegin/crtend detection
Embrace the Path abstraction, doing more operations based on directory
handles rather than absolute file paths. Most of the diff noise here
comes from this one.

Fix sorting of crtbegin/crtend atoms. Previously it would look at all
path components for those strings.

Make the C runtime path detection partially a pure function, and move
some logic to glibc.zig where it belongs.
2024-10-10 14:21:52 -07:00
Andrew Kelley
2c41c453b6 link.Elf: avoid converting rpath data in flush()
The goal is to minimize as much as possible how much logic is inside
flush(). So let's start by moving out obvious stuff. This data can be
preformatted before flush().
2024-10-08 18:02:59 -07:00
Linus Groh
8588964972 Replace deprecated default initializations with decl literals 2024-09-12 16:01:23 +01:00
mlugg
0fe3fd01dd
std: update std.builtin.Type fields to follow naming conventions
The compiler actually doesn't need any functional changes for this: Sema
does reification based on the tag indices of `std.builtin.Type` already!
So, no zig1.wasm update is necessary.

This change is necessary to disallow name clashes between fields and
decls on a type, which is a prerequisite of #9938.
2024-08-28 08:39:59 +01:00
David Rubin
863f74dcd2
comp: rename module to zcu 2024-08-25 15:17:21 -07:00
Robin Voetter
b4343074d2
replace Compilation.Emit with std.Build.Cache.Path
This type is exactly the same as std.Build.Cache.Path, except for
one function which is not used anymore. Therefore we can replace
it without consequences.
2024-08-19 19:09:12 +02:00
mlugg
548a087faf
compiler: split Decl into Nav and Cau
The type `Zcu.Decl` in the compiler is problematic: over time it has
gained many responsibilities. Every source declaration, container type,
generic instantiation, and `@extern` has a `Decl`. The functions of
these `Decl`s are in some cases entirely disjoint.

After careful analysis, I determined that the two main responsibilities
of `Decl` are as follows:
* A `Decl` acts as the "subject" of semantic analysis at comptime. A
  single unit of analysis is either a runtime function body, or a
  `Decl`. It registers incremental dependencies, tracks analysis errors,
  etc.
* A `Decl` acts as a "global variable": a pointer to it is consistent,
  and it may be lowered to a specific symbol by the codegen backend.

This commit eliminates `Decl` and introduces new types to model these
responsibilities: `Cau` (Comptime Analysis Unit) and `Nav` (Named
Addressable Value).

Every source declaration, and every container type requiring resolution
(so *not* including `opaque`), has a `Cau`. For a source declaration,
this `Cau` performs the resolution of its value. (When #131 is
implemented, it is unsolved whether type and value resolution will share
a `Cau` or have two distinct `Cau`s.) For a type, this `Cau` is the
context in which type resolution occurs.

Every non-`comptime` source declaration, every generic instantiation,
and every distinct `extern` has a `Nav`. These are sent to codegen/link:
the backends by definition do not care about `Cau`s.

This commit has some minor technically-breaking changes surrounding
`usingnamespace`. I don't think they'll impact anyone, since the changes
are fixes around semantics which were previously inconsistent (the
behavior changed depending on hashmap iteration order!).

Aside from that, this changeset has no significant user-facing changes.
Instead, it is an internal refactor which makes it easier to correctly
model the responsibilities of different objects, particularly regarding
incremental compilation. The performance impact should be negligible,
but I will take measurements before merging this work into `master`.

Co-authored-by: Jacob Young <jacobly0@users.noreply.github.com>
Co-authored-by: Jakub Konka <kubkon@jakubkonka.com>
2024-08-11 07:29:41 +01:00
Alex Rønne Petersen
642cd730c8 link: Accept -Brepro linker option and pass it to LLD.
Enable it by default when building Zig code in release modes.

Contributes to #9432.
2024-07-28 14:37:03 +02:00
Jakub Konka
cba3389d90 macho: redo input file parsing in prep for multithreading 2024-07-22 12:05:56 +02:00
sobolevn
4c71d3f29e
Fix typos in code comments in src/ 2024-07-20 20:23:18 +03:00
Jacob Young
2e65244cae dev: fix llvm backend checks 2024-07-20 07:43:40 -04:00
Jacob Young
4f742c4cfc dev: introduce dev environments that enable compiler feature sets 2024-07-19 22:35:33 -07:00
Jacob Young
ca02266157 Zcu: pass PerThread to intern pool string functions 2024-07-07 22:59:52 -04:00
Jacob Young
525f341f33 Zcu: introduce PerThread and pass to all the functions 2024-07-07 22:59:52 -04:00
mlugg
2f0f1efa6f
compiler: type.zig -> Type.zig 2024-07-04 21:01:42 +01:00
mlugg
ded5c759f8
Zcu: store LazySrcLoc in error messages
This change modifies `Zcu.ErrorMsg` to store a `Zcu.LazySrcLoc` rather
than a `Zcu.SrcLoc`. Everything else is dominoes.

The reason for this change is incremental compilation. If a failed
`AnalUnit` is up-to-date on an update, we want to re-use the old error
messages. However, the file containing the error location may have been
modified, and `SrcLoc` cannot survive such a modification. `LazySrcLoc`
is designed to be correct across incremental updates. Therefore, we
defer source location resolution until `Compilation` gathers the compile
errors into the `ErrorBundle`.
2024-07-04 21:01:41 +01:00
mlugg
7e552dc1e9
Zcu: rework exports
This commit reworks our representation of exported Decls and values in
Zcu to be memory-optimized and trivially serialized.

All exports are now stored in the `all_exports` array on `Zcu`. An
`AnalUnit` which performs an export (either through an `export`
annotation or by containing an analyzed `@export`) gains an entry into
`single_exports` if it performs only one export, or `multi_exports` if
it performs multiple.

We no longer store a persistent mapping from a `Decl`/value to all
exports of that entity; this state is not necessary for the majority of
the pipeline. Instead, we construct it in `Zcu.processExports`, just
before flush. This does not affect the algorithmic complexity of
`processExports`, since this function already iterates all exports in
the `Zcu`.

The elimination of `decl_exports` and `value_exports` led to a few
non-trivial backend changes. The LLVM backend has been wrangled into a
more reasonable state in general regarding exports and externs. The C
backend is currently disabled in this commit, because its support for
`export` was quite broken, and that was exposed by this work -- I'm
hoping @jacobly0 will be able to pick this up!
2024-07-04 21:01:41 +01:00
Michael Bradshaw
642093e04b Rename *[UI]LEB128 functions to *[UI]leb128 2024-06-23 04:30:12 +01:00
Andrew Kelley
0fcd59eada rename src/Module.zig to src/Zcu.zig
This patch is a pure rename plus only changing the file path in
`@import` sites, so it is expected to not create version control
conflicts, even when rebasing.
2024-06-22 22:59:56 -04:00
Andrew Kelley
f97c2f28fd update the codebase for the new std.Progress API 2024-05-27 20:56:48 -07:00
Andrew Kelley
f47824f24d std: restructure child process namespace 2024-05-26 09:31:55 -07:00
Nameless
aecd9cc6d1 std.posix.iovec: use .base and .len instead of .iov_base and .iov_len 2024-04-28 00:20:30 -07:00
Jacob Young
5a41704f7e cbe: rewrite CType
Closes #14904
2024-03-30 20:50:48 -04:00
mlugg
a61def10c6
compiler: eliminate most usages of TypedValue 2024-03-26 13:48:07 +00:00
Andrew Kelley
cd62005f19 extract std.posix from std.os
closes #5019
2024-03-19 11:45:09 -07:00
Tristan Ross
9d70d614ae
std.builtin: make link mode fields lowercase 2024-03-11 07:09:10 -07:00
Dillen Meijboom
377ecc6afb feat: add support for --enable-new-dtags and --disable-new-dtags 2024-03-06 17:52:05 -08:00
Luuk de Gram
202ed7330f
fix memory leaks 2024-02-29 15:52:43 +01:00
Luuk de Gram
196ba706a0
wasm: gc fixes and re-enable linker tests
Certain symbols were left unmarked, meaning they would not be emit into
the final binary incorrectly. We now mark the synthetic symbols to ensure
they are emit as they are already created under the circumstance they're
needed for. This also re-enables disabled tests that were left disabled
in a previous merge conflict.
Lastly, this adds the shared-memory test to the test harnass as it was
previously forgotten and therefore regressed.
2024-02-29 15:24:08 +01:00
Luuk de Gram
5ba5a2c133
wasm: integrate linker errors with Compilation
Rather than using the logger, we now emit proper 'compiler'-errors just
like the ELF and MachO linkers with notes. We now also support emitting
multiple errors before quiting the linking process in certain phases,
such as symbol resolution. This means we will print all symbols which
were resolved incorrectly, rather than the first one we encounter.
2024-02-29 15:24:07 +01:00
Luuk de Gram
5ef8321338
wasm: make symbol indexes a non-exhaustive enum
This introduces some type safety so we cannot accidently give an atom
index as a symbol index. This also means we do not have to store any
optionals and therefore allow for memory optimizations. Lastly, we can
now always simply access the symbol index of an atom, rather than having
to call `getSymbolIndex` as it is easy to forget.
2024-02-29 15:24:07 +01:00
Luuk de Gram
c99ef23862
wasm: consolidate flushModule and linkWithZld
We now use a single function to use the in-house WebAssembly linker
rather than wasm-ld. For both incremental compilation and traditional
linking we use the same codepath.
2024-02-29 15:24:04 +01:00
Luuk de Gram
5aec88fa41
wasm: correctly generate relocations for type index
Previously we could directly write the type index because we used the
index that was known in the final binary. However, as we now process
the Zig module as its own relocatable object file, we must ensure to
generate a relocation for type indexes. This also ensures that we can
later link the relocatable object file as a standalone also.

This also fixes generating indirect function table entries for ZigObject
as it now correctly points to the relocation symbol index rather than
the symbol index that owns the relocation.
2024-02-29 15:23:05 +01:00
Luuk de Gram
5a0f2af7e4
wasm: reimplement Zig errors in linker 2024-02-29 15:23:04 +01:00
Luuk de Gram
c153f94c89
wasm: ensure unique function indexes
We cannot keep function indexes as maxInt(u32) due to functions being
dedupliated when they point to the same function. For this reason we now
use a regular arraylist which will have new functions appended to, and
when deleted, its index is appended to the free list, allowing us to
re-use slots in the function list.
2024-02-29 15:23:04 +01:00
Luuk de Gram
fde8c2f41a
wasm: reimplement deleteDeclExport
Removes the symbol from the decl's list of exports, marks it as dead,
as well as appends it to the symbol free list. Also removes it from
the list of global symbols as all exports are global.

In the future we should perhaps use a map for the export list to prevent
linear lookups. But this requires a benchmark as having more than 1
export for the same decl is very rare.
2024-02-29 15:23:04 +01:00
Luuk de Gram
8f96e7eec1
wasm: re-implement updateExports
We now correctly create a symbol for each exported decl with its export-
name. The symbol points to the same linker-object. We store a map from
decl to all of its exports so we can update exports if it already exists
rather than infinitely create new exports.
2024-02-29 15:23:04 +01:00
Luuk de Gram
a028b10b9f
wasm: update freeDecl and finishDecl
We now parse the decls right away into atoms and allocate the
corresponding linker-object, such as segment and function, rather than
waiting until `flushModule`.
2024-02-29 15:23:04 +01:00
Luuk de Gram
4d14374a66
wasm: Move createFunction to ZigObject
This function was previously only called by the backend which generates
a synthetical function that is not represented by any AIR or Zig code.
For this reason, the ownership is moved to the zig-object and stored
there so it can be linked with the other object files without the driver
having to specialize it.
2024-02-29 15:23:04 +01:00
Luuk de Gram
0a030d6598
wasm: Use File.Index for symbol locations
Rather than using the optional, we now directly use `File.Index` which
can already represent an unknown file due to its `.null` value. This
means we do not pay for the memory cost.

This type of index is now used for:
- SymbolLoc
- Key of the functions map
- InitFunc

Now we can simply pass things like atom.file, object.file, loc.file etc
whenever we need to access its representing object file which makes it
a lot easier.
2024-02-29 15:23:03 +01:00
Luuk de Gram
cbc8d33062
wasm: fix symbol resolution and atom processing 2024-02-29 15:23:03 +01:00
Luuk de Gram
143e9599d6
wasm: use File abstraction instead of object
When merging sections we now make use of the `File` abstraction so all
objects such as globals, functions, imports, etc are also merged from
the `ZigObject` module. This allows us to use a singular way to perform
each link action without having to check the kind of the file.
The logic is mostly handled in the abstract file module, unless its
complexity warrants the handling within the corresponding module itself.
2024-02-29 15:23:03 +01:00
Luuk de Gram
12505c6d3d
wasm: store File.Index on the Atom
Also, consolidate the creation of Atoms so they all use `createAtom`.
2024-02-29 15:23:03 +01:00
Luuk de Gram
f6896ef218
wasm: create linking objects in correct module
CodeGen will create linking objects such as symbols, function types, etc
in ZigObject, rather than in the linker driver where the final result
will be stored. They will end up in the linker driver module during
the `flush` phase instead.

This must mean we must call functions such as `addOrGetFuncType` in the
correct namespace or else it will be created in the incorrect list and
therefore return incorrect indexes.
2024-02-29 15:23:03 +01:00