* AIR: use pl_op instead of ty_pl for wasm_memory_size. No need to
store the type because the type is always `u32`.
* AstGen: use coerced_ty for `@wasmMemorySize` and `@wasmMemoryGrow`
and do the coercions in Sema.
* Sema: use more accurate source locations for errors.
* Provide more information in the compiler error message.
* Codegen: use liveness data to avoid lowering unused
`@wasmMemorySize`.
* LLVM backend: add implementation
- I wasn't able to test it because we are hitting a linker error for
`-target wasm32-wasi -fLLVM`.
* C backend: use `zig_unimplemented()` instead of silently producing wrong
behavior for these builtins.
* behavior tests: branch only on stage2_arch to decide whether to include the
wasm.zig file. Ideally this would branch on `builtin.cpu.arch`, but that is
currently causing a compiler crash on some backends. (A usage sketch of the
builtins follows this list.)
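Not taken from the repository's behavior tests; a minimal usage sketch of the two builtins, assuming the signatures as of this change (index and size are `u32`, grow returns `i32` with -1 on failure):

```zig
const std = @import("std");
const builtin = @import("builtin");

test "wasm memory size and grow" {
    // Only meaningful on a wasm target; skip elsewhere.
    if (builtin.cpu.arch != .wasm32) return error.SkipZigTest;

    const prev = @wasmMemorySize(0);
    // Ask memory index 0 for one extra 64 KiB page.
    if (@wasmMemoryGrow(0, 1) == -1) return error.OutOfMemory;
    try std.testing.expect(@wasmMemorySize(0) == prev + 1);
}
```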
This implements the wasm builtins by lowering them to builtins that are supported by C compilers,
in this case Clang.
This also simplifies the `AIR` instructions, as they now use the payload field of `ty_pl` and `pl_op`
directly to store the index argument rather than storing it inside Extra. This saves us 4 bytes
per builtin call.
Symbols that are globals used to have the symbol name as their lookup key.
This key is now the offset into the string table.
Imports have both a module name (the library name) and a name (of the symbol); those strings are now
also interned. This can save us up to 24 bytes per import when both the module name and the symbol name are de-duplicated.
Module names are almost always the same across imports, so there is a good chance of saving at least 12 bytes per import.
Just like imports, exports can have a separate name from the internal symbol name. Rather than storing the slice,
we now store the offset of this string instead.
All symbols, whether read from object files or generated from Zig code,
are now interned and have their offset into the string table saved on the `Symbol` instead.
Besides interning, local symbols now also use a decl's fully qualified name.
When a decl/symbol is extern/to-be-imported, the name of the decl itself will be used for symbol resolution.
Similarly, symbols that will be exported have their 'export name' set.
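The core idea, as a minimal sketch rather than the linker's actual implementation (syntax follows the Zig version contemporary with this change): every unique name is stored once in a byte buffer and callers hold a `u32` offset instead of a slice.

```zig
const std = @import("std");

const StringTable = struct {
    gpa: std.mem.Allocator,
    bytes: std.ArrayList(u8),
    /// Maps an already-interned name to its offset in `bytes`.
    /// (A real implementation can avoid this extra key copy by hashing into `bytes` directly.)
    lookup: std.StringHashMap(u32),

    fn init(gpa: std.mem.Allocator) StringTable {
        return .{
            .gpa = gpa,
            .bytes = std.ArrayList(u8).init(gpa),
            .lookup = std.StringHashMap(u32).init(gpa),
        };
    }

    /// Returns the offset of `name`, storing it only if it was not seen before.
    fn intern(self: *StringTable, name: []const u8) !u32 {
        if (self.lookup.get(name)) |offset| return offset; // de-duplicated
        const offset = @intCast(u32, self.bytes.items.len);
        try self.bytes.appendSlice(name);
        try self.bytes.append(0); // null terminator
        try self.lookup.put(try self.gpa.dupe(u8, name), offset);
        return offset;
    }

    /// Recovers the name from an offset previously returned by `intern`.
    fn get(self: StringTable, offset: u32) []const u8 {
        return std.mem.sliceTo(self.bytes.items[offset..], 0);
    }
};
```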
This is preliminary work for string interning in the wasm linker.
Using an arena would defeat the purpose of de-duplicating strings, as we wouldn't be able to free the memory
of duplicated strings.
This change also means we can simplify wasm binary parsing by creating a general-purpose parser that
parses the binary into its sections but leaves them untyped. Doing this allows us to reuse that base
for object file parsing as well as debug info parsing.
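A sketch of what such an untyped first pass might look like (illustrative, not the actual parser): record each section's id, offset and size, and leave interpretation to a later, specialised pass.

```zig
const std = @import("std");

const Section = struct { id: u8, offset: usize, len: u32 };

/// Splits a wasm binary into raw, untyped sections.
fn parseSections(gpa: std.mem.Allocator, bytes: []const u8) ![]Section {
    var sections = std.ArrayList(Section).init(gpa);
    errdefer sections.deinit();

    var fbs = std.io.fixedBufferStream(bytes);
    const reader = fbs.reader();

    // 4-byte magic "\0asm" followed by a 4-byte version.
    var header: [8]u8 = undefined;
    try reader.readNoEof(&header);
    if (!std.mem.eql(u8, header[0..4], "\x00asm")) return error.BadMagic;

    while (fbs.pos < bytes.len) {
        const id = try reader.readByte();
        const len = try std.leb.readULEB128(u32, reader);
        const offset = fbs.pos;
        try sections.append(.{ .id = id, .offset = offset, .len = len });
        // Skip the payload; typed parsing (object file or debug info) happens later.
        try fbs.seekTo(offset + len);
    }
    return sections.toOwnedSlice();
}
```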
Unless the pointer is a pointer to a function, if the pointee type
has zero bits, we need to return `MCValue.none` as the `Decl` has
not been lowered to memory, and therefore any GOT reference would be
wrong.
* fix alignment issues for consts whose natural ABI alignment does not
match that of the `ldr` instruction on `aarch64` - solved by
preceding the `ldr` with an additional `add` instruction to form
the full address before dereferencing the pointer.
* redo selection of segment/section for decls and consts based on
combined type and value
Rather than ping-ponging between codegen and the linker to generate the symbols/atoms
for a local constant and its relocations, we now create all necessary objects within the linker.
This simplifies the code, as we can now simply call `lowerUnnamedConst` from anywhere in codegen,
allowing us to further improve lowering constants into .rodata so we do not have to sacrifice
lowering certain types such as `decl_ref`s whose type is a slice.
Otherwise, we risk collisions in the global symbol table. This is
also an opportunity to generalise and rewrite the symbol table
abstraction.
Also, improve the logs for the symbol table.
Prior to this change, the routine assumed it was called first,
before any symbol was created, thus precluding the possibility that, in the
incremental setting, we might have already pulled in a suitably defined
and exported symbol that could collide with and/or be replaced by the
symbol synthesised by the linker.
We now correctly implement exporting decls. This means it is possible to export
a decl under a different name than that of the decl doing the export.
This also sets the correct flags on the symbols, so when we emit a relocatable
object file, a linker can correctly resolve symbols and/or export the symbol to the host environment.
This commit also includes fixes to ensure relocations carry the offset other
linkers expect, rather than the one we use internally:
other linkers expect the offset relative to the section,
whereas internally we use an offset relative to the atom.
When generating a relocatable object file, we now emit a custom "reloc.CODE" and "reloc.DATA" section
which will contain the relocations for each section.
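A small sketch of the offset translation this implies (names illustrative): when emitting the relocation sections, the atom-relative offset is rebased onto the section by adding the atom's own offset within that section.

```zig
const std = @import("std");

/// Converts an atom-relative relocation offset into the section-relative
/// offset that other linkers (and the emitted "reloc.CODE"/"reloc.DATA"
/// sections) expect.
fn sectionRelativeOffset(atom_offset_in_section: u32, reloc_offset_in_atom: u32) u32 {
    return atom_offset_in_section + reloc_offset_in_atom;
}

comptime {
    // e.g. an atom placed at 0x40 within the code section, with a relocation
    // at atom offset 0x10, emits the relocation at section offset 0x50.
    std.debug.assert(sectionRelativeOffset(0x40, 0x10) == 0x50);
}
```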
Using a new symbol location -> Atom mapping, we can now easily find the corresponding `Atom` for a symbol.
This can be used to construct the symbol table, and also gives easier access to the target atom when performing
a relocation for a data symbol.
When creating a relocatable object file, we no longer perform the following actions:
- Merge data segments
- Calculate the stack size
- Apply relocations
We now also make the stack pointer symbol `undefined` for this use case, as well as add the symbol
as an import.
When creating a relocatable object file, emit the symbol table.
We do this by iterating over all atoms and finding their corresponding
symbols. This provides us with all the meta information, such as size and offset.
This data is required for defined data symbols.
When we emit an object file, the "Names" section does not have to be emitted: all symbol names
are already in the symbol table, so the names section would be redundant.
- Correctly get a discarded symbol by first checking whether it was discarded or not.
- Remove imports if extern symbols were resolved by an object file.
- Correctly relocate data symbols by ensuring the atom is from the correct file.
- Fix the `Names` section by using the resolved symbols, rather than the ones defined in Zig code.
We now correctly allocate and create atoms for symbols from other object files.
Imports are now also resolved and appended when required.
Besides those changes, we now duplicate all symbol names, so we can correctly
generate unique names for unnamed constants.
TODO: String interning
This implements the merging of all sections, to generate a valid wasm binary where all symbols
have been resolved and their respective sections have been merged into the final binary.
This upstreams the object file parsing from zwld, bringing us closer to being able
to link stage2 code with object files/C-code as well as replacing lld with the self-hosted
linker once feature complete.
* move DWARF in file if LINKEDIT spilled in dSYM
* update VM addresses for all segments
* introduce copyFileRangeOverlappedAll instead of File.copyRangeAll
since we make lots of overlapping writes in the MachO linker (a sketch of
the idea follows this list)
* update number of type abbrevs to match Elf linker
* update `DebugSymbols` to write symbol and string tables
at the end to match the `MachO` linker
* TODO: update segment vm addresses when growing segments in
the binary
* TODO: store DWARF relocations in linker's interned arena
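A hedged sketch of the idea behind `copyFileRangeOverlappedAll` (the name comes from the list above; the body is illustrative, not the actual implementation): when the source and destination ranges may overlap within the same file, stage the bytes through a heap buffer rather than copying range by range.

```zig
const std = @import("std");

fn copyFileRangeOverlappedAll(
    gpa: std.mem.Allocator,
    file: std.fs.File,
    in_offset: u64,
    out_offset: u64,
    len: usize,
) !void {
    // Reading everything up front makes overlapping source/destination ranges safe.
    const buf = try gpa.alloc(u8, len);
    defer gpa.free(buf);
    const amt = try file.preadAll(buf, in_offset);
    try file.pwriteAll(buf[0..amt], out_offset);
}
```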
Due to differences in where the output gets emitted in stage1 and stage2,
we were putting the symlink next to the binary rather than in the `zig-cache`
directory when building with stage2.
Match the changes required in the `Elf` linker, which enable lowering
of const slices on `MachO` targets.
Expand `Mir` instructions that require knowledge of the containing
atom: pass the symbol index into the linker's table from codegen
via Mir to the emitter, and then utilise it in the linker.
In `getDeclVAddr`, it may happen that the target `Decl` has not
been allocated space in virtual memory. In this case, we store a
relocation in the linker-global table which we iterate over
when flushing the module, filling in any missing addresses in the
final binary. Note that as an optimisation, if the address was already
resolved at the time of the call to `getDeclVAddr`, we skip relocating
this atom.
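A minimal sketch of that deferred-relocation bookkeeping (all names here are illustrative stand-ins, not the linker's actual types): record where the address needs to be patched, and patch it once the module is flushed and every target has a known virtual address.

```zig
const std = @import("std");

const DeclIndex = u32; // hypothetical handle identifying a target Decl

const PendingReloc = struct {
    target: DeclIndex,
    /// Byte offset inside the output where the 8-byte address must be written.
    offset: usize,
};

const DeferredRelocs = struct {
    list: std.ArrayListUnmanaged(PendingReloc) = .{},

    /// Called from `getDeclVAddr` when the target address is not yet known.
    fn add(self: *DeferredRelocs, gpa: std.mem.Allocator, reloc: PendingReloc) !void {
        try self.list.append(gpa, reloc);
    }

    /// Called when flushing the module; `addrs[d]` is the resolved address of Decl `d`.
    fn apply(self: DeferredRelocs, code: []u8, addrs: []const u64) void {
        for (self.list.items) |reloc| {
            std.mem.writeIntLittle(u64, code[reloc.offset..][0..8], addrs[reloc.target]);
        }
    }
};
```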
This commit also adds the glue code for lowering const slices in
the ARM backend.
This implements the `field_ptr` value for pointers. As the value only provides us with the index,
we must calculate the offset from the container type using said index (e.g. the offset of the struct field at index 2).
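Illustrative only (not the backend's code): for a container with C layout, the byte offset that `field_ptr` must add for a given field index can be checked against `@offsetOf`.

```zig
const std = @import("std");

const S = extern struct { a: u8, b: u16, c: u32 };

comptime {
    // Field index 2 ("c") lives at byte offset 4: one byte for `a`, one byte of
    // padding to align `b`, then two bytes for `b`.
    std.debug.assert(@offsetOf(S, "c") == 4);
}
```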
Besides this, small miscellaneous fixes/updates were done to get remaining behavior tests passing:
- We start the function table index at 1, so unresolved function pointers can be null-checked properly.
- Implement genTypedValue for floats up to f64.
- Fix zero-sized arguments by only creating `args` for non-zero-sized types.
- lowerConstant now works for all decl_ref's.
- lowerConstant properly lowers optional pointers, so `null` pointers are lowered to `0`.
We need to pad out the file to the required maximum size, equal to the
final section's offset plus that section's size. We only need to do
this when populating initial metadata and only when the section header
was updated.
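A minimal sketch of that rule under assumed names (not the actual linker code):

```zig
const std = @import("std");

/// Grows the file, if needed, so it covers the final section entirely.
fn padToFinalSection(file: std.fs.File, final_sect_offset: u64, final_sect_size: u64) !void {
    const required = final_sect_offset + final_sect_size;
    if ((try file.getEndPos()) < required) try file.setEndPos(required);
}
```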
This updates the stage2 test runner to print the passed, skipped and failed tests to stdout,
similar to the LLVM backend.
Another change is the start function, which is now more in line with stage1's.
The stage2 test infrastructure for wasm/wasi has been updated to reflect this as well.
* link: add a virtual function `lowerUnnamedConsts`, similar to
`updateFunc` or `updateDecl` which needs to be implemented by the
linker backend in order to be used with the `CodeGen` code
* elf: implement `lowerUnnamedConsts` specialization where we
lower unnamed constants to `.rodata` section. We keep track of the
atoms encompassing the lowered unnamed consts in a global table
indexed by parent `Decl`. When the `Decl` is updated or destroyed,
we clear the unnamed consts referenced within the `Decl` (see the sketch after this list).
* macho: implement `lowerUnnamedConsts` specialization where we
lower unnamed constants to `__TEXT,__const` section. We keep track of the
atoms encompassing the lowered unnamed consts in a global table
indexed by parent `Decl`. When the `Decl` is updated or destroyed,
we clear the unnamed consts referenced within the `Decl`.
* x64: change `MCValue.linker_sym_index` into two `MCValue`s: `.got_load` and
`.direct_load`. The former signifies to the emitter that it should
emit a GOT load relocation, while the latter that it should emit
a direct load (`SIGNED`) relocation.
* x64: lower `struct` instantiations
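A sketch of the per-`Decl` bookkeeping mentioned in the `elf`/`macho` items above; the handle types `DeclIndex` and `AtomIndex` are hypothetical stand-ins, not the compiler's actual types.

```zig
const std = @import("std");

const DeclIndex = u32; // hypothetical handle for a parent `Decl`
const AtomIndex = u32; // hypothetical handle for a lowered unnamed const's atom

const UnnamedConsts = struct {
    map: std.AutoHashMapUnmanaged(DeclIndex, std.ArrayListUnmanaged(AtomIndex)) = .{},

    /// Records that `atom` holds an unnamed const lowered for `parent`.
    fn track(self: *UnnamedConsts, gpa: std.mem.Allocator, parent: DeclIndex, atom: AtomIndex) !void {
        const gop = try self.map.getOrPut(gpa, parent);
        if (!gop.found_existing) gop.value_ptr.* = .{};
        try gop.value_ptr.append(gpa, atom);
    }

    /// Called when the parent `Decl` is updated or destroyed; a real linker
    /// would also release the atoms' space in the output section here.
    fn clear(self: *UnnamedConsts, gpa: std.mem.Allocator, parent: DeclIndex) void {
        if (self.map.getPtr(parent)) |list| {
            list.deinit(gpa);
            _ = self.map.remove(parent);
        }
    }
};
```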
In accordance with the requesting issue (#10750):
- `zig test` skips any tests that it cannot spawn, returning success
- `zig run` and `zig build` exit with failure, reporting the command that cannot be run
- `zig clang`, `zig ar`, etc. already punt directly to the appropriate clang/lld main(), even before this change
- Native `libc` Detection is not supported
Additionally, `exec()` and related Builder functions error at run-time, reporting the command that cannot be run
This implements lowering elem_ptr for decls and constants.
To generate the correct pointer, we perform a relocation by using the addend
that represents the offset. The offset is calculated by taking the element's size
and multiplying that by the index.
For constants this generates a single immediate instruction, and for decls
this generates a single pointer address.
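For example (illustrative arithmetic only): for an element type of `u32` and index 3, the addend is 4 * 3 = 12 bytes past the base pointer.

```zig
const std = @import("std");

comptime {
    // The addend used for `elem_ptr` is element size times index.
    const elem_size = @sizeOf(u32);
    const index = 3;
    std.debug.assert(elem_size * index == 12); // &array[3] == base + 12 for a [N]u32
}
```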
Clarify that `astgen.advanceSourceCursor` already increments the absolute
values of the line and column numbers; i.e., `GenZir.calcLine` is thus
not only obsolete but wrong by design.
Incidentally, this cleanup allows the `FnDecl` line numbers for DWARF use
to be specified correctly as relative values with respect to
the start of the parent `Decl`. This `Decl` in turn has its line number
information specified relative to its parent `Decl`, and so on, until
we reach the global scope.