This change modifies `Zcu.ErrorMsg` to store a `Zcu.LazySrcLoc` rather
than a `Zcu.SrcLoc`. Everything else is dominoes.
The reason for this change is incremental compilation. If a failed
`AnalUnit` is up-to-date on an update, we want to re-use the old error
messages. However, the file containing the error location may have been
modified, and `SrcLoc` cannot survive such a modification. `LazySrcLoc`
is designed to be correct across incremental updates. Therefore, we
defer source location resolution until `Compilation` gathers the compile
errors into the `ErrorBundle`.
This commit reworks our representation of exported Decls and values in
Zcu to be memory-optimized and trivially serialized.
All exports are now stored in the `all_exports` array on `Zcu`. An
`AnalUnit` which performs an export (either through an `export`
annotation or by containing an analyzed `@export`) gains an entry into
`single_exports` if it performs only one export, or `multi_exports` if
it performs multiple.
We no longer store a persistent mapping from a `Decl`/value to all
exports of that entity; this state is not necessary for the majority of
the pipeline. Instead, we construct it in `Zcu.processExports`, just
before flush. This does not affect the algorithmic complexity of
`processExports`, since this function already iterates all exports in
the `Zcu`.
The elimination of `decl_exports` and `value_exports` led to a few
non-trivial backend changes. The LLVM backend has been wrangled into a
more reasonable state in general regarding exports and externs. The C
backend is currently disabled in this commit, because its support for
`export` was quite broken, and that was exposed by this work -- I'm
hoping @jacobly0 will be able to pick this up!
This patch is a pure rename plus only changing the file path in
`@import` sites, so it is expected to not create version control
conflicts, even when rebasing.
Certain symbols were left unmarked, meaning they would not be emit into
the final binary incorrectly. We now mark the synthetic symbols to ensure
they are emit as they are already created under the circumstance they're
needed for. This also re-enables disabled tests that were left disabled
in a previous merge conflict.
Lastly, this adds the shared-memory test to the test harnass as it was
previously forgotten and therefore regressed.
Rather than using the logger, we now emit proper 'compiler'-errors just
like the ELF and MachO linkers with notes. We now also support emitting
multiple errors before quiting the linking process in certain phases,
such as symbol resolution. This means we will print all symbols which
were resolved incorrectly, rather than the first one we encounter.
This introduces some type safety so we cannot accidently give an atom
index as a symbol index. This also means we do not have to store any
optionals and therefore allow for memory optimizations. Lastly, we can
now always simply access the symbol index of an atom, rather than having
to call `getSymbolIndex` as it is easy to forget.
We now use a single function to use the in-house WebAssembly linker
rather than wasm-ld. For both incremental compilation and traditional
linking we use the same codepath.
Previously we could directly write the type index because we used the
index that was known in the final binary. However, as we now process
the Zig module as its own relocatable object file, we must ensure to
generate a relocation for type indexes. This also ensures that we can
later link the relocatable object file as a standalone also.
This also fixes generating indirect function table entries for ZigObject
as it now correctly points to the relocation symbol index rather than
the symbol index that owns the relocation.
We cannot keep function indexes as maxInt(u32) due to functions being
dedupliated when they point to the same function. For this reason we now
use a regular arraylist which will have new functions appended to, and
when deleted, its index is appended to the free list, allowing us to
re-use slots in the function list.
Removes the symbol from the decl's list of exports, marks it as dead,
as well as appends it to the symbol free list. Also removes it from
the list of global symbols as all exports are global.
In the future we should perhaps use a map for the export list to prevent
linear lookups. But this requires a benchmark as having more than 1
export for the same decl is very rare.
We now correctly create a symbol for each exported decl with its export-
name. The symbol points to the same linker-object. We store a map from
decl to all of its exports so we can update exports if it already exists
rather than infinitely create new exports.
We now parse the decls right away into atoms and allocate the
corresponding linker-object, such as segment and function, rather than
waiting until `flushModule`.
This function was previously only called by the backend which generates
a synthetical function that is not represented by any AIR or Zig code.
For this reason, the ownership is moved to the zig-object and stored
there so it can be linked with the other object files without the driver
having to specialize it.
Rather than using the optional, we now directly use `File.Index` which
can already represent an unknown file due to its `.null` value. This
means we do not pay for the memory cost.
This type of index is now used for:
- SymbolLoc
- Key of the functions map
- InitFunc
Now we can simply pass things like atom.file, object.file, loc.file etc
whenever we need to access its representing object file which makes it
a lot easier.
When merging sections we now make use of the `File` abstraction so all
objects such as globals, functions, imports, etc are also merged from
the `ZigObject` module. This allows us to use a singular way to perform
each link action without having to check the kind of the file.
The logic is mostly handled in the abstract file module, unless its
complexity warrants the handling within the corresponding module itself.
CodeGen will create linking objects such as symbols, function types, etc
in ZigObject, rather than in the linker driver where the final result
will be stored. They will end up in the linker driver module during
the `flush` phase instead.
This must mean we must call functions such as `addOrGetFuncType` in the
correct namespace or else it will be created in the incorrect list and
therefore return incorrect indexes.
When we have a ZigCompileUnit and don't use LLVM, we initialize the
ZigObject which will encapsulate the Zig Module as an object file in-
memory. During initialization we also create symbols which the object
will need such as the stack pointer.
* make test names contain the fully qualified name
* make test filters match the fully qualified name
* allow multiple test filters, where a test is skipped if it does not
match any of the specified filters
We delay atom allocation for the code section until we write the actual
atoms. We do this to ensure the offset of the atom also includes the
'size' field which is leb128-encoded and therefore variable. We need this
correct offset to ensure debug info works correctly.
The ordering of the code section is now automatic due to iterating the
function section and then finding the corresponding atom to each
function. This also ensures each function corresponds to the right atom,
and they do not go out-of-sync.
Lastly, we removed the `next` field as it is no longer required and also
removed manually setting the offset in synthetic functions. This means
atoms use less memory and synthetic functions are less prone. They will
also be placed in order of function order correctly.
Not all custom sections are represented by a symbol, which means the
section will not be parsed by the lazy parsing and therefore get garbage-
collected. This is problematic as it may contain debug information that
should not be garbage-collected. To resolve this, we manually create
local symbols for those sections and also ensure they do not get garbage-
collected.
Co-authored-by: Motiejus Jakštys <motiejus@jakstys.lt>
Co-authored-by: Jakub Konka <kubkon@jakubkonka.com>
Co-authored-by: Samuel Cantero <scanterog@gmail.com>
Co-authored-by: Giorgos Georgiou <giorgos.georgiou@datadoghq.com>
Co-authored-by: Carl Åstholm <carl@astholm.se>
This branch introduced an arena allocator for temporary allocations in
Compilation.update. Almost every implementation of flush() inside the
linker code was already creating a local arena that had the lifetime of
the function call. This commit passes the update arena so that all those
local ones can be deleted, resulting in slightly more efficient memory
usage with every compilation update.
While at it, this commit also removes the Compilation parameter from the
linker flush function API since a reference to the Compilation is now
already stored in `link.File`.