693 Commits

Author SHA1 Message Date
Jakub Konka
c53423f8aa macho: don't cache unused link options
This will change incrementally once we support more linker options.
2021-08-13 10:46:53 +02:00
Jakub Konka
f49aa960a9 macho: use ArrayHashMap.popOrNull where applicable 2021-08-12 10:20:57 +02:00
Jakub Konka
f01366e8b3 macho: exclude tentative def before checking for collision
when resolving global symbols.
2021-08-12 10:20:57 +02:00
Jakub Konka
35403d41ce macho: use array hashmaps for quick lookups
as containers for unresolved and tentative definitions when resolving
symbols.
2021-08-12 10:20:57 +02:00
Jakub Konka
da57d6df32 macho: simplify symbol management and resolution
instead of globally storing unresolved and tentative defs,
store indices to actual symbols in the functions that are responsible
for symbol resolution.
2021-08-12 10:20:57 +02:00
Jakub Konka
5d548cc651 macho: move parsing logic for Object, Archive and Dylib into MachO
This way, the functionality is better segregated, and we finally do
not unnecessarily reparse dynamic libraries that were already visited
and parsed.
2021-08-11 19:38:00 +02:00
Jakub Konka
16bb5c05f1 macho: refactor stub parsing in Dylib 2021-08-11 19:38:00 +02:00
Jakub Konka
d95e8bc5f8 macho: simplify versioning logic for TAPI 2021-08-11 19:38:00 +02:00
Jakub Konka
8afe6210e9 macho: add TAPI v3 parser
This turns out needed to correctly support version back to macOS
10.14 (Mojave)
2021-08-11 19:38:00 +02:00
Jakub Konka
509fe33d10 macho: when targeting simulator, match host dylibs too
otherwise, linking may fail as some libc functions are provided by
the host when simulating a different OS such iPhoneOS.
2021-08-10 13:41:10 +02:00
Jakub Konka
9ab8d065b6 macho: deinit BuildVersion load command 2021-08-10 13:41:10 +02:00
Jakub Konka
2371a63bd4 macho: allow .simulator ABI when targeting Apple simulator env
For example, in order to run a binary on an iPhone Simulator,
you need to specify that explicitly as part of the target as
`aarch64-ios-simulator` rather than `aarch64-ios-gnu` or
`aarch64-ios` for short.
2021-08-10 13:41:10 +02:00
Jakub Konka
7007684984 macho: swap out VERSION_MIN for BUILD_VERSION
this makes the app successfully run on the iOS simluator!
2021-08-10 13:41:10 +02:00
Jakub Konka
ace9b3de64 macho: fix parsing target string when linking against tbds 2021-08-10 13:41:07 +02:00
Jakub Konka
049ff45430 macho: add basic support for building iOS binaries
* ensure we correctly transfer `-iwithsysroot` and
  `-iframeworkwithsysroot` flags with values from `build.zig` and that
  they are correctly transferred forward to `zig cc`
* try to look for `libSystem.tbd` in the provided syslibroot - one
  caveat that the user will have to specify library search paths too
2021-08-10 13:40:39 +02:00
Jakub Konka
f2bf1390a2 macho: fix linking of dylibs and frameworks
Previously, I have incorrectly assumed that with two-level namespace
we only need to link in dylibs/frameworks that actually export symbols
which are undefined in the linked image. Turns out, regardless of
whether we link with two-level namespace (default on macOS) or a
flat namespace (more common on other platforms), we always need to
put the dylibs/frameworks as specified by the user from the linker
line into the final linked image.
2021-08-10 08:13:07 +02:00
Ryan Liptak
d31352ee85 Update all usages of mem.split/mem.tokenize for generic version 2021-08-06 02:01:47 -07:00
Jakub Konka
41d7787b69 macho: remove obsolete pack/unpack dylib ordinal fns
Remove some unused debugging machinery such as full printing of the
symtab after symbol resolution. It was there only for the time of
rewriting the linker.
2021-08-02 19:49:32 +02:00
Jakub Konka
bf25650974 macho: refactor management of section ordinals
Instead of storing a two-way relation (seg,sect) <=> ordinal
we get the latter with `getIndex((seg, sect))`.
2021-08-02 19:49:32 +02:00
Jakub Konka
f3b328ee8c macho: refactor tracking of referenced dylibs
Now, index in the global referenced array hashmap is equivalent to
the dylib's ordinal in the final linked image.
2021-08-02 19:49:32 +02:00
Jakub Konka
d794f4cd2a macho: add runaway section id when sorting sections 2021-08-01 18:05:07 +02:00
Jakub Konka
0ce56f9305 macho: fix Trie and CodeSignature unit tests
after the cleanup.
2021-08-01 09:06:56 +02:00
Jakub Konka
58bc713c17 macho: make Trie accept allocator as a param
instead of storing it as a member of Trie struct.
2021-08-01 09:06:56 +02:00
Jakub Konka
d19fdf09ae macho: make CodeSignature accept allocator as param
instead storing it within the struct.
2021-08-01 09:06:56 +02:00
Jakub Konka
2e30bf23aa macho: cleanup extracting objects from archives 2021-08-01 09:06:56 +02:00
Jakub Konka
0b15ba8334 macho: don't allocate Dylib on the heap
instead, immediately transfer ownership to MachO struct. Also, revert
back to try-ok-fail parsing approach of objects, archives, and dylibs.
It seems easier to try and fail than check if the file *is* of a
certain type given that a dylib may be a stub and parsing yaml
twice in a row seems very wasteful.

Hint for the future: if we optimise yaml/TAPI parsing, this approach
may be rethought!
2021-08-01 09:06:56 +02:00
Jakub Konka
f023cdad7c macho: don't allocate Archives on the heap
instead, transfer ownership directly to MachO struct.
2021-08-01 09:06:56 +02:00
Jakub Konka
06396ddd7d macho: don't allocate Objects on the heap
instead, ownership is transferred to MachO. This makes Object
management align closer with data-oriented design.
2021-08-01 09:06:56 +02:00
Jakub Konka
e73777333d macho: don't store allocator in Dylib instance
instead pass it in as an arg to a function that requires it.
2021-08-01 09:06:56 +02:00
Jakub Konka
586a19e3db macho: don't store allocator in Archive
instead pass it in functions as an arg when required.
2021-08-01 09:06:56 +02:00
Jakub Konka
c30cc4dbbf macho: don't store allocator in Object
instead, pass it in functions that require it. Also, when parsing
relocs, make Object part of the context struct where we pass in
additional goodies such as `*MachO` or `*Allocator`.
2021-08-01 09:06:56 +02:00
Andrew Kelley
040c6eaaa0 stage2: garbage collect unused anon decls
After this change, the frontend and backend cooperate to keep track of
which Decls are actually emitted into the machine code. When any backend
sees a `decl_ref` Value, it must mark the corresponding Decl `alive`
field to true.

This prevents unused comptime data from spilling into the output object
files. For example, if you do an `inline for` loop, previously, any
intermediate value calculations would have gone into the object file.
Now they are garbage collected immediately after the owner Decl has its
machine code generated.

In the frontend, when it is time to send a Decl to the linker, if it has
not been marked "alive" then it is deleted instead.

Additional improvements:
 * Resolve type ABI layouts after successful semantic analysis of a
   Decl. This is needed so that the backend has access to struct fields.
 * Sema: fix incorrect logic in resolveMaybeUndefVal. It should return
   "not comptime known" instead of a compile error for global variables.
 * `Value.pointerDeref` now returns `null` in the case that the pointer
   deref cannot happen at compile-time. This is true for global
   variables, for example. Another example is if a comptime known
   pointer has a hard coded address value.
 * Binary arithmetic sets the requireRuntimeBlock source location to the
   lhs_src or rhs_src as appropriate instead of on the operator node.
 * Fix LLVM codegen for slice_elem_val which had the wrong logic for
   when the operand was not a pointer.

As noted in the comment in the implementation of deleteUnusedDecl, a
future improvement will be to rework the frontend/linker interface to
remove the frontend's responsibility of calling allocateDeclIndexes.

I discovered some issues with the plan9 linker backend that are related
to this, and worked around them for now.
2021-07-29 19:30:37 -07:00
Andrew Kelley
a5c6e51f03 stage2: more principled approach to comptime references
* AIR no longer has a `variables` array. Instead of the `varptr`
   instruction, Sema emits a constant with a `decl_ref`.
 * AIR no longer has a `ref` instruction. There is no longer any
   instruction that takes a value and returns a pointer to it. If this
   is desired, Sema must either create an anynomous Decl and return a
   constant `decl_ref`, or in the case of a runtime value, emit an
   `alloc` instruction, `store` the value to it, and then return the
   `alloc`.
 * The `ref_val` Value Tag is eliminated. `decl_ref` should be used
   instead. Also added is `eu_payload_ptr` which points to the payload
   of an error union, given an error union pointer.

In general, Sema should avoid calling `analyzeRef` if it can be helped.
For example in the case of field_val and elem_val, there should never be
a reason to create a temporary (alloc or decl). Recent previous commits
made progress along that front.

There is a new abstraction in Sema, which looks like this:

    var anon_decl = try block.startAnonDecl();
    defer anon_decl.deinit();
    // here 'anon_decl.arena()` may be used
    const decl = try anon_decl.finish(ty, val);
    // decl is typically now used with `decl_ref`.

This pattern is used to upgrade `ref_val` usages to `decl_ref` usages.

Additional improvements:

 * Sema: fix source location resolution for calling convention
   expression.
 * Sema: properly report "unable to resolve comptime value" for loads of
   global variables. There is now a set of functions which can be
   called if the callee wants to obtain the Value even if the tag is
   `variable` (indicating comptime-known address but runtime-known value).
 * Sema: `coerce` resolves builtin types before checking equality.
 * Sema: fix `u1_type` missing from `addType`, making this type have a
   slightly more efficient representation in AIR.
 * LLVM backend: fix `genTypedValue` for tags `decl_ref` and `variable`
   to properly do an LLVMConstBitCast.
 * Remove unused parameter from `Value.toEnum`.

After this commit, some test cases are no longer passing. This is due to
the more principled approach to comptime references causing more
anonymous decls to get sent to the linker for codegen. However, in all
these cases the decls are not actually referenced by the runtime machine
code. A future commit in this branch will implement garbage collection
of decls so that unused decls do not get sent to the linker for codegen.
This will make the tests go back to passing.
2021-07-29 15:59:51 -07:00
Andrew Kelley
dc88864c97 stage2: implement @boolToInt
This is the first commit in which some behavior tests are passing for
both stage1 and stage2.
2021-07-27 17:08:37 -07:00
Andrew Kelley
31a59c229c stage2: improvements towards zig test
* Add AIR instruction: struct_field_val
   - This is part of an effort to eliminate the AIR instruction `ref`.
   - It's implemented for C backend and LLVM backend so far.
 * Rename `resolvePossiblyUndefinedValue` to `resolveMaybeUndefVal` just
   to save some columns on long lines.
 * Sema: add `fieldVal` alongside `fieldPtr` (renamed from
   `namedFieldPtr`). This is part of an effort to eliminate the AIR
   instruction `ref`. The idea is to avoid unnecessary loads, stores,
   stack usage, and IR instructions, by paying a DRY cost.

LLVM backend improvements:

 * internal linkage vs exported linkage is implemented, along with
   aliases. There is an issue with incremental updates due to missing
   LLVM API for deleting aliases; see the relevant comment in this commit.
   - `updateDeclExports` is hooked up to the LLVM backend now.
 * Fix usage of `Type.tag() == .noreturn` rather than calling `isNoReturn()`.
 * Properly mark global variables as mutable/constant.
 * Fix llvm type generation of function pointers
 * Fix codegen for calls of function pointers
 * Implement llvm type generation of error unions and error sets.
 * Implement AIR instructions: addwrap, subwrap, mul, mulwrap, div,
   bit_and, bool_and, bit_or, bool_or, xor, struct_field_ptr,
   struct_field_val, unwrap_errunion_err, add for floats, sub for
   floats.

After this commit, `zig test` on a file with `test "example" {}`
correctly generates and executes a test binary. However the
`test_functions` slice is undefined and just happens to be going into
the .bss section, causing the length to be 0. The next step towards
`zig test` will be replacing the `test_functions` Decl Value with the
set of test function pointers, before it is sent to linker/codegen.
2021-07-26 19:27:49 -07:00
Jakub Konka
5533f77054 Merge remote-tracking branch 'origin/master' into zld-incremental-2 2021-07-23 17:06:19 +02:00
Jakub Konka
1beda818e1 macho: re-enable parsing sections into atoms
However, make it default only when building in release modes since
it's a prelude to advanced dead code stripping not very useful in
debug.
2021-07-23 16:55:19 +02:00
Andrew Kelley
7c25390c95 support -fcompiler-rt in conjunction with build-obj
When using `build-exe` or `build-lib -dynamic`, `-fcompiler-rt` means building
compiler-rt into a static library and then linking it into the executable.

When using `build-lib`, `-fcompiler-rt` means building compiler-rt into an
object file and then adding it into the static archive.

Before this commit, when using `build-obj`, zig would build compiler-rt
into an object file, and then on ELF, use `lld -r` to merge it into the
main object file. Other linker backends of LLD do not support `-r` to
merge objects, so this failed with error messages for those targets.

Now, `-fcompiler-rt` when used with `build-obj` acts as if the user puts
`_ = @import("compiler_rt");` inside their root source file. The symbols
of compiler-rt go into the same compilation unit as the root source file.

This is hooked up for stage1 only for now. Once stage2 is capable of
building compiler-rt, it should be hooked up there as well.
2021-07-22 19:51:32 -07:00
Andrew Kelley
a5fb28070f add -femit-llvm-bc CLI option and implement it
* Added doc comments for `std.Target.ObjectFormat` enum
 * `std.Target.oFileExt` is removed because it is incorrect for Plan-9
   targets. Instead, use `std.Target.ObjectFormat.fileExt` and pass a
   CPU architecture.
 * Added `Compilation.Directory.joinZ` for when a null byte is desired.
 * Improvements to `Compilation.create` logic for computing `use_llvm`
   and reporting errors in contradictory flags. `-femit-llvm-ir` and
   `-femit-llvm-bc` will now imply `-fLLVM`.
 * Fix compilation when passing `.bc` files on the command line.
 * Improvements to the stage2 LLVM backend:
   - cleaned up error messages and error reporting. Properly bubble up
     some errors rather than dumping to stderr; others turn into panics.
   - properly call ZigLLVMCreateTargetMachine and
     ZigLLVMTargetMachineEmitToFile and implement calculation of the
     respective parameters (cpu features, code model, abi name, lto,
     tsan, etc).
   - LLVM module verification only runs in debug builds of the compiler
   - use LLVMDumpModule rather than printToString because in the case
     that we incorrectly pass a null pointer to LLVM it may crash during
     dumping the module and having it partially printed is helpful in
     this case.
   - support -femit-asm, -fno-emit-bin, -femit-llvm-ir, -femit-llvm-bc
   - Support LLVM backend when used with Mach-O and WASM linkers.
2021-07-22 19:51:32 -07:00
Jakub Konka
a4feb97cdf macho: assign and cache section ordinals upon creation
then, when sorting sections within segments, clear and redo the
ordinals since we re-apply them to symbols anyway. It is vital
to have the ordinals consistent with parsing and resolving relocs
however.
2021-07-22 23:13:13 +02:00
Jakub Konka
4fd0cb7618 macho: sort nlists within object before filtering by type
Previously, we'd filter the nlists assuming they were correctly
ordered by type: local < extern defined < undefined within the
object's symbol table but this doesn't seem to be guaranteed,
therefore, we sort by type and address in one go, and filter
defined from undefined afterwards.
2021-07-22 16:02:31 +02:00
Jakub Konka
773863150a macho: fix incorrect prealloc in traditional path 2021-07-22 14:50:06 +02:00
Jakub Konka
ca90efe88e macho: fix memory leaks when emptying TextBlocks
This happens on every call to `TextBlock.empty` by the `Module`.
2021-07-22 14:05:12 +02:00
Jakub Konka
def1359187 Merge remote-tracking branch 'origin/master' into zld-incremental-2 2021-07-22 09:34:44 +02:00
Jakub Konka
d0edd37f69 macho: fix bug when freeing Decl
Take into account that an already freed Decl will no longer be
available as `decl.link.macho` causing a potential "inactive union field"
panic.
2021-07-21 23:38:20 +02:00
Andrew Kelley
36295d712f remove 'pe' object format
Portable Executable is an executable format, not an object format.
Everywhere in the entire zig codebase, we treated coff and pe as if they
were the same. Remove confusion by not including pe in the
std.Target.ObjectFormat enum.
2021-07-21 12:45:32 -07:00
Jakub Konka
845c906e6a macho: add relocations for GOT cells
in self-hosted compiler.
2021-07-21 17:58:05 +02:00
Jakub Konka
3bfde76cff macho: fix text block management
For the time being, until we rewrite how atoms are handled across
linkers, store two tables in the MachO linker: one for TextBlocks
directly created and managed by the linker, and one for TextBlocks
that were spawned by Module.Decl. This allows for correct memory
clean up after linking is done.
2021-07-21 15:46:57 +02:00
Jakub Konka
5276ce8e63 macho: use adapters to directly reference strtab
Thanks to this, we no longer need to do allocs per symbol name
landing in the symbol resolver, plus we do not need to actively
track if the string was already inserted into the string table.
2021-07-20 23:37:22 +02:00
Andrew Kelley
95756299af stage2: fix compile errors in LLVM backend 2021-07-20 12:19:16 -07:00