50 Commits

Author SHA1 Message Date
Luuk de Gram
2b3e6f680c
wasm-linker: ensure custom sections are parsed
Not all custom sections are represented by a symbol, which means the
section will not be parsed by the lazy parsing and therefore get garbage-
collected. This is problematic as it may contain debug information that
should not be garbage-collected. To resolve this, we manually create
local symbols for those sections and also ensure they do not get garbage-
collected.
2024-01-12 14:57:32 +01:00
Andrew Kelley
2047a6b82d fix remaining compile errors except one 2024-01-01 17:51:20 -07:00
Luuk de Gram
596d1cd5a8
wasm-linker: support --no-gc-sections
By default we garbage-collect sections for Wasm to reduce size, as well
as finish linking quicker (as we have fewer things to do). However,
when the user specifies `--no-gc-sections` we ensure all resolved symbols
get marked and therefore do not get garbage collected.
This is supported in both incremental-mode and traditional linking.
2023-11-28 16:40:26 +01:00
Luuk de Gram
8856ba7505
wasm-linker: parse symbols into atoms lazily
Rather than parsing every symbol into an atom, we now only parse them
into an atom when such atom is marked. This means garbage-collected
symbols will also not be parsed into atoms, and neither are discarded
symbols which have been resolved by other symbols. (Such as multiple
weak symbols).

This also introduces a binary search for finding the start index into
the list of relocations. This speeds up finding the corresponding
relocations tremendously as they're ordered ascended by address.

Lastly, we re-use the memory of atom's data as well as relocations
instead of duplicating it. This means we half the memory usage of
atom's data and relocations for linked object files. As we are
aware of decls and synthetic atoms, we free the memory of those
atoms indepedently of the atoms of object files to prevent double-frees.
2023-11-28 15:47:07 +01:00
Andrew Kelley
d5e21a4f1a std: remove meta.trait
In general, I don't like the idea of std.meta.trait, and so I am
providing some guidance by deleting the entire namespace from the
standard library and compiler codebase.

My main criticism is that it's overcomplicated machinery that bloats
compile times and is ultimately unnecessary given the existence of Zig's
strong type system and reference traces.

Users who want this can create a third party package that provides this
functionality.

closes #18051
2023-11-22 13:24:27 -05:00
mlugg
b355893438
compiler: correct unnecessary uses of 'var' 2023-11-19 11:11:49 +00:00
Andrew Kelley
3fc6fc6812 std.builtin.Endian: make the tags lower case
Let's take this breaking change opportunity to fix the style of this
enum.
2023-10-31 21:37:35 -04:00
Jacob Young
d890e81761 mem: fix ub in writeInt
Use inline to vastly simplify the exposed API.  This allows a
comptime-known endian parameter to be propogated, making extra functions
for a specific endianness completely unnecessary.
2023-10-31 21:37:35 -04:00
Andrew Kelley
accd5701c2 compiler: move struct types into InternPool proper
Structs were previously using `SegmentedList` to be given indexes, but
were not actually backed by the InternPool arrays.

After this, the only remaining uses of `SegmentedList` in the compiler
are `Module.Decl` and `Module.Namespace`. Once those last two are
migrated to become backed by InternPool arrays as well, we can introduce
state serialization via writing these arrays to disk all at once.

Unfortunately there are a lot of source code locations that touch the
struct type API, so this commit is still work-in-progress. Once I get it
compiling and passing the test suite, I can provide some interesting
data points such as how it affected the InternPool memory size and
performance comparison against master branch.

I also couldn't resist migrating over a bunch of alignment API over to
use the log2 Alignment type rather than a mismash of u32 and u64 byte
units with 0 meaning something implicitly different and special at every
location. Turns out you can do all the math you need directly on the
log2 representation of alignments.
2023-09-21 14:48:40 -07:00
Luuk de Gram
3fd6e93f4f
wasm-linker: prevent double-free on parse failure 2023-07-19 17:22:46 +02:00
mlugg
f26dda2117 all: migrate code to new cast builtin syntax
Most of this migration was performed automatically with `zig fmt`. There
were a few exceptions which I had to manually fix:

* `@alignCast` and `@addrSpaceCast` cannot be automatically rewritten
* `@truncate`'s fixup is incorrect for vectors
* Test cases are not formatted, and their error locations change
2023-06-24 16:56:39 -07:00
Eric Joldasov
50339f595a all: zig fmt and rename "@XToY" to "@YFromX"
Signed-off-by: Eric Joldasov <bratishkaerik@getgoogleoff.me>
2023-06-19 12:34:42 -07:00
r00ster91
2593156068 migration: std.math.{min, min3, max, max3} -> @min & @max 2023-06-16 13:44:09 -07:00
Luuk de Gram
9fce1df4cd
wasm-linker: implement runtime TLS relocations 2023-03-18 20:13:30 +01:00
Luuk de Gram
fb9d3cd50e
wasm-linker: feature verifiction for shared-mem
When the user enables shared-memory, we must ensure the linked objects
have the 'atomics' and 'bulk-memory' features allowed.
2023-03-18 20:13:29 +01:00
Luuk de Gram
09abd53da7
wasm-linker: refactor Limits and add flags
Rather than adding the flags "on-demand" during limits writing,
we now properly parse them and store the flags within the limits
itself. This also allows us to store whether we're using shared-
memory or not. Only when the correct flag is set will we set the
max within `Limits` or else we will leave it `undefined`.
2023-03-18 20:13:29 +01:00
Luuk de Gram
b0024c4884
wasm-linker: basic TLS support
Linker now parses segments with regards to TLS segments. If the name
represents a TLS segment but does not contain the TLS flag, we set it
manually as the object file is created using an older compiler (LLVM).

For now we panic when we find a TLS relocation and implement those
later.
2023-03-18 20:13:25 +01:00
Luuk de Gram
87738cad86
wasm-linker: store symbol's virtual address
For data symbols we will now store its virtual address. This means
we do no longer have to calculate it each time a relocation asks
for the address. This is now done for all data symbols only once
rather than every single relocation for that symbol.

This now also allows us directly store the virtual address of synthetic
symbols without having to create an atom for them. This means we also
don't need to have a "synthetic" segment any longer and do not emit
the synthetic symbols such as __heap_end and __heap_base into the final
binary.
2023-03-09 19:14:17 +01:00
Andrew Kelley
aeaef8c0ff update std lib and compiler sources to new for loop syntax 2023-02-18 19:17:21 -07:00
Luuk de Gram
46f54b23ae
link: make Wasm atoms fully owned by the linker 2023-02-01 19:10:56 +01:00
Luuk de Gram
dd85092982
wasm-linker: Fix relocations for alias'd atoms
When an atom has one or multiple aliasses, we we could not find the
target atom from the alias'd symbol. This is solved by ensuring that
we also insert each alias symbol in the symbol-atom map.
2022-12-18 16:37:00 +01:00
Andrew Kelley
ceb0a632cf std.mem.Allocator: allow shrink to fail
closes #13535
2022-11-29 23:30:38 -07:00
Luuk de Gram
7f508480f4 wasm-linker: convert relocation addend to i32
Addends in relocations are signed integers as theoretically it could
be a negative number. As Atom's offsets are relative to their parent
section, the relocation value should still result in a postive number.
For this reason, the final result is stored as an unsigned integer.

Also, rather than using `null` for relocations that do not support
addends. We set the value to 0 for those that do not support addends,
and have to call `addendIsPresent` to determine if an addend exists
or not. This means each Relocation costs 4 bytes less than before,
saving memory while linking.
2022-10-08 17:23:13 +02:00
Luuk de Gram
61f317e386
wasm-linker: rename self to descriptive name 2022-09-12 21:19:16 +02:00
Jakub Konka
56b96cd61b
Merge pull request #12772 from ziglang/coff-basic-imports
coff: implement enough of the incremental linker to pass behavior and incremental tests on Windows
2022-09-09 13:08:58 +02:00
Jakub Konka
678e07b924 macho+wasm: unify and clean up closing file handles 2022-09-07 22:42:59 +02:00
Luuk de Gram
a8d137d05a
wasm-linker: support incremental debug info
Although the wasm-linker previously already supported
debug information in incremental-mode, this was no longer
working as-is with the addition of supporting object-file-parsed
debug information. This commit implements the Zig-created debug information
structure from scratch which is a lot more robust and also allows
being linked with debug information from other object files.
2022-09-07 18:59:36 +02:00
Luuk de Gram
46c932a2c9
wasm-linker: perform debug relocations
This correctly performs a relocation for debug sections.
The result is that the wasm-linker can now correctly create
a binary from object files while preserving all debug information.
2022-09-07 18:53:16 +02:00
Luuk de Gram
c347751338
wasm-linker: write debug sections from objects
We now link relocatable debug sections with the correct
section symbol and then allocate and resolve the debug atoms
before writing them into the final binary.

Although this does perform the relocation, the actual relocations
are not done correctly yet.
2022-09-07 18:53:16 +02:00
Luuk de Gram
f060edb0f3
wasm-linker: create atoms from debug sections 2022-09-07 18:53:16 +02:00
Luuk de Gram
9a92f3d290
wasm/Object: parse debug sections into reloc data
Rather than storing the name of a debug section into the structure
`RelocatableData`, we use the `index` field as an offset into the
debug names table. This means we do not have to store an extra 16 bytes
for non-debug sections which can be massive for object files where each
data symbol has its own data section. The name of a debug section
can then be retrieved again when needed by using the offset and
then reading until the 0-delimiter.
2022-09-07 18:53:12 +02:00
Luuk de Gram
1544625df3
wasm/Object: parse using the correct file size
When an object file is being parsed from within an archive
file, we provide the object file size to ensure we do not
read past the object file. This is because follow up object
files can exist there, as well as an LF character to notate
the end of the file was reached. Such a character is invalid
within the object file.

This also fixes a bug in getting the function/global type
for defined globals/functions from object files as it was missing
the substraction with the import count of the respective type.
2022-08-20 14:50:11 +02:00
InKryption
a0d3a87ce1 std.fmt: require specifier for unwrapping ?T and E!T 2022-07-26 11:25:49 -07:00
Andrew Kelley
934573fc5d Revert "std.fmt: require specifier for unwrapping ?T and E!T."
This reverts commit 7cbd586ace46a8e8cebab660ebca3cfc049305d9.

This is causing a fail to build from source:

```
./lib/std/fmt.zig:492:17: error: cannot format optional without a specifier (i.e. {?} or {any})
                @compileError("cannot format optional without a specifier (i.e. {?} or {any})");
                ^
./src/link/MachO/Atom.zig:544:26: note: called from here
                log.debug("  RELA({s}) @ {x} => %{d} in object({d})", .{
                         ^
```

I looked at the code to fix it but none of those args are optionals.
2022-07-24 11:50:10 -07:00
InKryption
7cbd586ace
std.fmt: require specifier for unwrapping ?T and E!T.
Co-authored-by: Veikka Tuominen <git@vexu.eu>
2022-07-24 12:01:56 +03:00
Luuk de Gram
241180216f wasm-linker: Parse object file from the archive
Rather than finding the original object file, we seekTo to the
object file's position within the archive file, and from there open
a new file handle. This file handle is passed to the `Object` parser
which will create the object.
2022-06-24 08:12:17 +02:00
Luuk de Gram
c9f929a18b fix memory leaks 2022-06-24 08:12:17 +02:00
Luuk de Gram
8d03e4fc6b link: Implement API to get global symbol index 2022-06-24 08:12:17 +02:00
Veikka Tuominen
6ad510d832 update self hosted sources to language changes 2022-04-15 11:17:19 +03:00
Luuk de Gram
d66c61a2cf wasm-linker: Prevent overalignment for segments
Previously, the data segments were being aligned twice.
This caused us to overalign the segment and therefore allocate a much larger
size for each segment than was required. This fix ensures we align and set the size
just once, ensuring semantically correct binaries as well as smaller binaries.
2022-04-14 22:53:13 +02:00
Luuk de Gram
cf37101108 wasm-linker: Add function table indexes
When linking with an object file, verify if a relocation is a table index relocation.
If that's the case, add the relocation target to the function table.
2022-04-14 22:53:13 +02:00
Luuk de Gram
321a164269 wasm-linker: Fix memory leak
This fixes a memory leak when an object file contains one or more element sections which
then contains one or more function indexes. This commit ensures the slice of index functions
for each element section will be freed upon resource deallocation also.
2022-04-14 22:53:13 +02:00
Luuk de Gram
5a45fe2dba
wasm: Call generateSymbol for updateDecl
To unify the wasm backend with the other backends, we will now call `generateSymbol` to
lower a Decl into bytes. This means we also have to change some function signatures
to comply with the linker interface.

Since the general purpose generateSymbol is less featureful than wasm's, some tests are
temporarily disabled.
2022-03-06 19:38:50 +01:00
Luuk de Gram
f5a31cb0d6 wasm-linker: Intern globals, exports & imports
Symbols that have globals used to have their lookup key be the symbol name.
This key is now the offset into the string table.

Imports have both the module name (library name) and name (of the symbol), those strings are now
also being interned. This can save us up to 24bytes per import which have both their module name and name de-duplicated.
Module names are almost entirely the same for all imports, providing us with a big chance of saving us 12 bytes at least.

Just like imports, exports can also have a seperate name than the internal symbol name. Rather than storing the slice,
we now store the offset of this string instead.
2022-03-01 08:35:20 +01:00
Luuk de Gram
b1159ab7ae wasm-linker: Intern all symbol names
For all symbols read from object files as well as generated from Zig code
will now be interned and have their offset into the string table saved on the `Symbol` instead.

Besides interning, local symbols now also use a decl's fully qualified name.
When a decl/symbol is extern/to-be-imported, the name of the decl itself will be used for symbol resolving.
Similarly for symbols that will be exported, will have their 'export name' set.
2022-03-01 08:35:20 +01:00
Luuk de Gram
49f01c0a0c wasm-object: Use given allocator rather than arena
This is preliminary work for string interning in the wasm linker.
Using an arena would defeat the purpose of de-duplicating strings as we wouldn't be able to free memory
of duplicated strings.
This change also means we can simplify wasm binary parsing, by creating a general purpose parser that
parses the binary into its sections, but untyped. Doing this, allows us to re-use the base of that, for
object file, but also debug info parsing.
2022-03-01 08:35:20 +01:00
Luuk de Gram
ced958e8a8 wasm-linker: Simplify symbol names
No longer duplicate the symbol name and instead take the pointer from the decl itself.
Also fix 32bit build
2022-02-17 18:11:48 +01:00
Luuk de Gram
a4622501bd wasm-linker: Allocate atoms and handle imports
We now correctly allocate and create atoms for symbols from other object files.
Imports are now also resolved and appended when required.
Besides those changes, we now duplicate all symbol names, so we can correctly
generate unique names for unnamed constants.
TODO: String interning
2022-02-17 18:11:48 +01:00
Luuk de Gram
f1cc5f33e8 wasm-linker: Implement section merging
This implements the merging of all sections, to generate a valid wasm binary where all symbols
have been resolved and their respective sections have been merged into the final binary.
2022-02-17 18:11:48 +01:00
Luuk de Gram
e7be0bef43 wasm-linker: Add Object file parsing
This upstreams the object file parsing from zwld, bringing us closer to being able
to link stage2 code with object files/C-code as well as replacing lld with the self-hosted
linker once feature complete.
2022-02-17 18:11:48 +01:00