Previously we used a single array list per debug section for debug
information that was generated from Zig code (i.e. when `Module` is available).
This information is now stored in Atoms, similarly to debug information
from object files. This will allow us to link them together and resolve
debug relocations.
This correctly performs a relocation for debug sections.
The result is that the wasm-linker can now correctly create
a binary from object files while preserving all debug information.
We now link relocatable debug sections with the correct
section symbol and then allocate and resolve the debug atoms
before writing them into the final binary.
Although this does perform the relocation step, the relocated
values themselves are not yet correct.
Rather than storing the name of a debug section into the structure
`RelocatableData`, we use the `index` field as an offset into the
debug names table. This means we do not have to store an extra 16 bytes
for non-debug sections, which can add up quickly for object files where each
data symbol has its own data section. The name of a debug section
can then be retrieved again when needed by using the offset and
then reading until the 0-delimiter.
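The offset-based lookup described above can be sketched as follows. This is a
minimal illustration in Python, not the linker's actual code: the function
names and the use of a plain byte buffer for the names table are assumptions.

```python
def append_debug_name(names_table: bytearray, name: str) -> int:
    """Append a 0-terminated name and return the offset where it starts."""
    offset = len(names_table)
    names_table += name.encode("utf-8") + b"\x00"
    return offset


def read_debug_name(names_table: bytes, offset: int) -> str:
    """Read from `offset` up to (not including) the 0-delimiter."""
    end = names_table.index(b"\x00", offset)
    return names_table[offset:end].decode("utf-8")


# Only the integer offset needs to be stored per section.
table = bytearray()
info_off = append_debug_name(table, ".debug_info")
line_off = append_debug_name(table, ".debug_line")
```

Calling `read_debug_name(bytes(table), line_off)` recovers the section name
from the stored offset alone.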
This means we can enable ASLR by default, as other COFF linkers
do. Currently, we write the base relocations in bulk; however,
given that there is a mechanism for padding in place in PE/COFF
I believe there might be room for making it an incremental operation
(write base relocation whenever we add/update a pointer that would
require it).
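For reference, a bulk write of base relocations can be sketched like this.
The block layout (page RVA, block size, 16-bit entries, padding with
`IMAGE_REL_BASED_ABSOLUTE` entries to keep blocks 32-bit aligned) follows the
PE/COFF format; the function name and the assumption that every pointer is a
64-bit `DIR64` fixup are illustrative only.

```python
import struct
from collections import defaultdict

IMAGE_REL_BASED_ABSOLUTE = 0   # padding entry, skipped by the loader
IMAGE_REL_BASED_DIR64 = 10     # 64-bit absolute address fixup
PAGE_SIZE = 0x1000


def build_base_relocs(rvas):
    """Group pointer RVAs by 4 KiB page and emit one block per page."""
    pages = defaultdict(list)
    for rva in sorted(set(rvas)):
        pages[rva & ~(PAGE_SIZE - 1)].append(rva & (PAGE_SIZE - 1))

    out = bytearray()
    for page_rva in sorted(pages):
        entries = [(IMAGE_REL_BASED_DIR64 << 12) | off for off in pages[page_rva]]
        if len(entries) % 2 == 1:
            # Pad so that the next block starts on a 32-bit boundary.
            entries.append(IMAGE_REL_BASED_ABSOLUTE << 12)
        block_size = 8 + 2 * len(entries)  # 8-byte header + 2 bytes per entry
        out += struct.pack("<II", page_rva, block_size)
        out += struct.pack(f"<{len(entries)}H", *entries)
    return bytes(out)


blob = build_base_relocs([0x1008, 0x1010, 0x2000])
```

The padding entries are what might make an incremental scheme possible: a
block could reserve spare `ABSOLUTE` slots and overwrite them later.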
This also fixes performing relocations for data symbols
of which the target symbol exists in an external object file.
We do this by checking if the target symbol was discarded
and, if so, getting the new location so that we can find the
corresponding atom that belongs to said new location. Previously
it would always assume the symbol would live in the same file
as the atom/symbol that is doing the relocation.
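A minimal sketch of that lookup, assuming a map from discarded symbol
locations to their replacements and a map from symbol locations to atoms. The
names `SymbolLoc`, `discarded` and `symbol_atoms` are hypothetical and only
serve the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SymbolLoc:
    file: int   # index of the object file that owns the symbol
    index: int  # symbol index within that file


def atom_for_target(reloc_file, target_index, discarded, symbol_atoms):
    """Find the atom of a relocation target, following a discarded symbol."""
    loc = SymbolLoc(reloc_file, target_index)
    # If the symbol was discarded during resolution, use its new location,
    # which may live in a different object file.
    loc = discarded.get(loc, loc)
    return symbol_atoms[loc]


discarded = {SymbolLoc(0, 3): SymbolLoc(1, 7)}
symbol_atoms = {SymbolLoc(0, 1): "atom_a", SymbolLoc(1, 7): "atom_b"}
```

Here a relocation in file 0 against symbol 3 ends up at the atom owned by
file 1, rather than being looked up in file 0.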
Generate symbols for extern variables and try to resolve them.
Unresolved 'data' symbols generate an error as they cannot be
exported from the Wasm runtime into a Wasm module. This means
they can only be resolved by other object files, such as those produced
from other Zig or C code compiled to Wasm.
Given that COFF will want to support PIC from ground-up, there is no
point in leaving outdated code for COFF in other backends such as
arm or aarch64. Instead, when we are ready to look into those, we
can start figuring out what to add and where.
This is not technically correct, but given that we are not yet able
to link against the CRT, it's a good default until then.
Add basic logging of generated symbol table in the linker.
Regardless of the build mode (build-exe, build-lib), always
set the default stack size to 1MB. Previously, this was only
done when using build-exe, which made the behavior inconsistent and confusing.
The user can still override this behavior by providing the
`--stack <size>` flag.
This commit enables `-u <symbol>` for ELF and `-include:<symbol>` for
COFF linkers for internal use. We do not expose these flags to
users just yet; however, we make use of them internally
whenever required. One such use case is forcing inclusion of
`_tls_index` when linking for Windows with mingw and both LTO and dead
code stripping enabled. This ensures we add `_tls_index` to the symbol
resolver as an undefined symbol and force the linker to include an atom
that provides it, marking it as a dead-code-stripping root, meaning it will
not be garbage collected by the linker no matter what.
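The effect of a forced-undefined symbol on dead code stripping can be
sketched as a plain mark phase. This is an illustration under assumed data
structures (`refs`, `providers`), not the linker's implementation.

```python
def live_atoms(refs, providers, entry_roots, forced_undefined):
    """Return the set of atoms that survive dead code stripping.

    refs:             atom -> list of atoms it references
    providers:        symbol name -> atom that defines it
    entry_roots:      atoms that are always kept (e.g. the entry point)
    forced_undefined: symbol names passed via -u / -include:
    """
    roots = set(entry_roots)
    for symbol in forced_undefined:
        if symbol not in providers:
            raise LookupError(f"undefined symbol: {symbol}")
        # The providing atom becomes a root, so it can never be collected.
        roots.add(providers[symbol])

    live = set()
    stack = list(roots)
    while stack:
        atom = stack.pop()
        if atom in live:
            continue
        live.add(atom)
        stack.extend(refs.get(atom, ()))
    return live


refs = {"main": ["helper"], "tls": [], "unused": []}
providers = {"_tls_index": "tls"}
```

Without the forced symbol, `tls` is unreachable from `main` and would be
stripped; with it, `tls` is kept.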
We now do not allocate memory for headers and other metadata unless
requested by the caller. Instead, we read-in the entire contents
of the image into memory and operate on pointers and casts wherever
possible. I have left a TODO to hook up Windows' memory-mapped API
here in place of the standard `readToEndAlloc`, which should be more
robust on memory-constrained hosts.
This commit also supplements our `std.coff` with a lot of missing basic
extern structs required by our COFF linker.
* Added support for stroffsetsptr class in Dwarf stdlib
* Proper initialization of debug_str_offsets in DwarfInfo
* Added missing null initializer to DwarfInfo in Macho
* Added missing is_64 field to getAttrString in DwarfInfo
* Fixed formatting
* Added missing is_64 param to getAttrString
* Added required cast to usize
* Adding missing .debug_str_offsets initialization
* getAttrString now uses the str_offsets_base attr
When an object file is being parsed from within an archive
file, we provide the object file size to ensure we do not
read past the object file. This is because subsequent object
files can follow it, as well as an LF character that marks
the end of the file. Such a character is invalid
within the object file.
This also fixes a bug in getting the function/global type
for defined globals/functions from object files as it was missing
the subtraction of the import count of the respective type.
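The subtraction follows from how Wasm index spaces are laid out: imported
functions (or globals) come first, followed by the defined ones. A minimal
sketch, with hypothetical list-based inputs standing in for the parsed import
and function sections:

```python
def function_type_index(func_index, imported_func_types, defined_func_types):
    """Return the type index of a function given its index-space position."""
    import_count = len(imported_func_types)
    if func_index < import_count:
        return imported_func_types[func_index]
    # Defined functions are stored without the imports in front of them,
    # so the import count must be subtracted first.
    return defined_func_types[func_index - import_count]


imported = [4, 4]      # type indices of two imported functions
defined = [0, 2, 1]    # type indices of three defined functions
```

Function index 3 is the second defined function, so its type index is
`defined[1]`. The same reasoning applies to globals.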
Wasm archive files are encoded the same way as GNU.
This means that the header denotes the character index within
the long file name list rather than the length of the name.
The entire name is then delimited by an LF character (0x0a).
This also makes a cosmetic update, renaming `self` to
`archive`.
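The long-name lookup can be sketched as follows. The `/<index>` header form
and the LF delimiter are as described above; stripping a trailing `/` before
the LF is an assumption based on the GNU convention, and the function name is
hypothetical. The special `/` and `//` members are not handled here.

```python
def member_name(header_name: bytes, long_names: bytes) -> str:
    """Resolve a member name from a 16-byte, space-padded header field."""
    field = header_name.rstrip(b" ")
    if field[:1] == b"/" and field[1:].isdigit():
        # The digits are a character index into the long file name list,
        # not the length of the name.
        start = int(field[1:])
        end = long_names.index(b"\n", start)
        name = long_names[start:end]
    else:
        name = field
    # Assumption: GNU-style names carry a trailing '/' terminator.
    if name.endswith(b"/"):
        name = name[:-1]
    return name.decode("utf-8")


long_names = b"a_very_long_object_name.o/\nanother_long_object_name.o/\n"
second = long_names.index(b"another")
header = b"/" + str(second).encode() + b" " * 12
```

`member_name(header, long_names)` reads from the indexed position up to the
next LF and yields the second entry.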
Firstly, opening a file handle is not really needed since we won't even
use it, and secondly, this can cause AccessDenied errors on Windows
when trying to move a directory from zig-cache/tmp/ to zig-cache/o/
since, without POSIX semantics, it is illegal to move a directory
while a handle to any of its contents is still open.
This adds additional checks during symbol resolution:
- Ensures function signatures match when a symbol will be replaced.
- Ensures global types match when the symbol is being replaced.
- When both symbols are undefined, ensures they have a matching module name.
Those changes ensure the result will pass the validator when
the runtime compiles the Wasm module.
Additionally, this slightly changes the behavior when
the existing symbol and the new symbol are both defined. Rather than
always resulting in a collision, it only results in a collision
when both are also weak. Else, the non-weak symbol will be picked.
Add handling for these additional `MCValue`s:
* `.immediate` - lower to `DW.OP.consts` or `DW.OP.constu` depending
on signedness followed by popping off the DWARF stack with
`DW.OP.stack_value`
* `.undef` - lower to `DW.OP.implicit_value`
* `.none` - lower to `DW.OP.lit0` followed by popping off the DWARF
stack with `DW.OP.stack_value`
For any remaining unhandled case, we generate `DW.OP.nop` in order
not to mess up remaining DWARF info.
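The lowering above can be sketched in Python using the standard DWARF opcode
values and LEB128 encodings. The `lower` function, its string-based `kind`
argument, and the choice of `0xaa` filler bytes for `.undef` are assumptions
made for illustration; they do not mirror the backend's actual types.

```python
DW_OP_constu = 0x10
DW_OP_consts = 0x11
DW_OP_lit0 = 0x30
DW_OP_nop = 0x96
DW_OP_implicit_value = 0x9E
DW_OP_stack_value = 0x9F


def uleb128(value: int) -> bytes:
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value == 0:
            out.append(byte)
            return bytes(out)
        out.append(byte | 0x80)


def sleb128(value: int) -> bytes:
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7  # arithmetic shift, keeps the sign
        sign_bit = byte & 0x40
        if (value == 0 and not sign_bit) or (value == -1 and sign_bit):
            out.append(byte)
            return bytes(out)
        out.append(byte | 0x80)


def lower(kind: str, value: int = 0, signed: bool = False, size: int = 0) -> bytes:
    """Lower a simplified MCValue-like description to a DWARF expression."""
    if kind == "immediate":
        if signed:
            return bytes([DW_OP_consts]) + sleb128(value) + bytes([DW_OP_stack_value])
        return bytes([DW_OP_constu]) + uleb128(value) + bytes([DW_OP_stack_value])
    if kind == "undef":
        # Assumption: describe the value as `size` filler bytes.
        return bytes([DW_OP_implicit_value]) + uleb128(size) + b"\xaa" * size
    if kind == "none":
        return bytes([DW_OP_lit0, DW_OP_stack_value])
    # Any unhandled case: emit a no-op so the remaining info stays intact.
    return bytes([DW_OP_nop])
```

For example, `lower("immediate", 300)` produces `DW_OP_constu`, the ULEB128
encoding of 300, then `DW_OP_stack_value`.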