122 Commits

Author SHA1 Message Date
Veikka Tuominen
ef7282bab4 stage2 macho: workaround stage2 bugs 2022-04-15 16:05:27 +03:00
Andrew Kelley
2587474717 stage2: progress towards stage3
* The `@bitCast` workaround is removed in favor of `@ptrCast` properly
   doing element casting for slice element types. This required an
   enhancement both to stage1 and stage2.
 * stage1 incorrectly accepts `.{}` instead of `{}`. stage2 code that
   abused this is fixed.
 * Make some parameters comptime to support functions in switch
   expressions (as opposed to making them function pointers).
 * Avoid relying on local temporaries being mutable.
 * Workarounds for when stage1 and stage2 disagree on function pointer
   types.
 * Workaround recursive formatting bug with a `@panic("TODO")`.
 * Remove unreachable `else` prongs for some inferred error sets.

All in effort towards #89.
2022-04-14 10:12:45 -07:00
William Sengir
6de8b4bc3d std.dwarf: implement basic DWARF 5 parsing
DWARF 5 moves around some fields and adds a few new ones that can't be
parsed or ignored by our current DWARF 4 parser. This isn't a complete
implementation of DWARF 5, but this is enough to make stack traces
mostly work. Line numbers from C++ don't show up, but I know the info
is there. I think the answer is to iterate through .debug_line_str in
getLineNumberInfo, but I didn't want to fall into an even deeper rabbit
hole tonight.
2022-03-15 16:53:45 -04:00
Jakub Konka
f9f792ab70 zld: fix num nlist calc when there's no dynsymtab
Handle `__DATA,.rustc` section containing `rustc` metadata - this
is required to get crates like `serde_derive` link properly.
Note to self: this special section has to be copied __verbatim__
from the relocatable object file - this includes preserving its size
even though unpadded according the section's required alignment.
2022-01-13 20:02:11 +01:00
Jakub Konka
ab328aca33 macho: put LC_* consts in a typed enum(u32) LC
repeat for `PLATFORM_*` and `TOOL_*` sets
2021-12-15 08:59:20 +01:00
Jakub Konka
828f61e8df macho: move all helpers from commands.zig into std.macho
This way we will finally be able to share common parsing logic
between different Zig components and 3rd party packages.
2021-12-10 18:18:28 +01:00
Jakub Konka
81e7d8505c macho: move helper functions to libstd
Helper functions such as `commands.sectionName`, etc. should really
belong in `std.macho.section_64` extern struct.
2021-12-10 11:56:51 +01:00
Jakub Konka
c86f2402d0 macho: don't prealloc sections when stage1 2021-12-05 22:46:46 +01:00
Lee Cannon
85de022c56
allocgate: std Allocator interface refactor 2021-11-30 23:32:47 +00:00
Jakub Konka
86fe47235e macho: move nlist_64 type/flags helpers to std.macho 2021-11-30 13:59:33 +01:00
Andrew Kelley
902df103c6 std lib API deprecations for the upcoming 0.9.0 release
See #3811
2021-11-30 00:13:07 -07:00
Ryan Liptak
e97feb96e4 Replace ArrayList.init/ensureTotalCapacity pairs with initCapacity
Because ArrayList.initCapacity uses 'precise' capacity allocation, this should save memory on average, and definitely will save memory in cases where ArrayList is used where a regular allocated slice could have also be used.
2021-11-04 14:54:25 -04:00
Ryan Liptak
70ef9bc75c Fix ensureTotalCapacity calls that should be ensureUnusedCapacity calls
If these functions are called more than once, then the array list would no longer be guaranteed to have enough capacity during the appendAssumeCapacity calls. With ensureUnusedCapacity, they will always be guaranteed to have enough capacity regardless of how many times the function is called.
2021-11-01 15:08:41 -04:00
Jakub Konka
d0dceae736 macho: dump linker's state as JSON
Each element of the output JSON has the VM address of the generated
binary nondecreasing (some elements might occupy the same VM address
for example the atom and the relocation might coincide in the address
space).

The generated JSON can be inspected manually or via a preview tool
`zig-snapshots` that I am currently working on and will allow the user
to inspect interactively the state of the linker together with the
positioning of sections, symbols, atoms and relocations within each
snapshot state, and in the future, between snapshots too. This should
allow for quicker debugging of the linker which is nontrivial when
run in the incremental mode.

Note that the state will only be dumped if the compiler is built with
`-Dlink-snapshot` flag on, and then the compiler is passed `--debug-link-snapshot`
flag upon compiling a source/project.
2021-10-22 12:50:25 +02:00
Jakub Konka
fc302f00a9 macho: redo relocation handling and lazy bind globals
* apply late symbol resolution for globals - instead of resolving
  the exact location of a symbol in locals, globals or undefs,
  we postpone the exact resolution until we have a full picture
  for relocation resolution.
* fixup stubs to defined symbols - this is currently a hack rather
  than a final solution. I'll need to work out the details to make
  it more approachable. Currently, we preemptively create a stub
  for a lazy bound global and fix up stub offsets in stub helper
  routine if the global turns out to be undefined only. This is quite
  wasteful in terms of space as we create stub, stub helper and lazy ptr
  atoms but don't use them for defined globals.
* change log scope to .link for macho.
* remove redundant code paths from Object and Atom.
* drastically simplify the contents of Relocation struct (i.e., it is
  now a simple superset of macho.relocation_info), clean up relocation
  parsing and resolution logic.
2021-10-13 16:17:10 +02:00
Jakub Konka
d722f0cc62 macho: do not write temp and noname symbols to symtab
Remove currently obsolete AtomParser from Object.
2021-09-21 11:05:22 +02:00
Ryan Liptak
59f5053bed Update all ensureCapacity calls to the relevant non-deprecated version 2021-09-19 13:52:56 +02:00
Jakub Konka
983d6dcd9e macho: implement object relinking in stage2
* In watch mode, when changing the C source, we will trigger complete
  relinking of objects, dylibs and archives (atoms coming from the
  incremental updates stay put however). This means, we need to undo
  metadata populated when linking in objects, archives and dylibs.
* Remove unused splitting section into atoms bit. This optimisation
  will probably be best rewritten from scratch once self-hosted
  matures so parking the idea for now. Also, for easier management
  of atoms spawned from the Object file, keep the atoms subgraph as
  part of the Object file struct.
* Remove obsolete ref to static initializers in object struct.
* Implement handling of global symbol collision in updateDeclExports.
2021-09-16 12:38:47 +02:00
Jakub Konka
05763f43b3 macho: disable splitting sections into atoms in release
since we don't actually benefit from it just yet, and getting
it right for release and dead code stripping will require some more
thought put into it.
2021-09-14 10:28:58 +02:00
Jakub Konka
a38b636045 Merge remote-tracking branch 'origin/master' into zld-incr 2021-09-13 23:40:38 +02:00
Jakub Konka
4c36da1047 macho: fix incremental compilation 2021-09-13 17:00:36 +02:00
Jakub Konka
054fe96bcd macho: enable tracy in more places within the linker 2021-09-11 12:25:00 +02:00
Jakub Konka
6e0c3950b8 macho: rename blocks to atoms in Object.zig 2021-09-10 22:42:39 +02:00
Jakub Konka
aaacfc0d0a macho: init process of renaming TextBlock to Atom
Initially, internally within the linker.
2021-09-09 18:32:03 +02:00
Jakub Konka
1efdb137d1 macho: don't allocate atoms when parsing objects 2021-09-09 14:18:28 +02:00
Jakub Konka
e229202cb8 macho: store source section address of relocs in context
This is particularly relevant for x86_64 and C++ when relocating
StaticInit sections containing static initializers machine code.
Then, in case of SIGNED_X relocations, it is necessary to have the
full image of the VM address layout of the sections in the object
file as this is how the addend needs to be adjusted for non-extern
relocations.
2021-09-07 23:21:08 +02:00
Jakub Konka
6836cc473c macho: make sure that parsed bss atoms are zero-filled 2021-09-06 18:30:40 +02:00
Jakub Konka
5e64d9745b macho: fix noninclusion of data-in-code
Also, calculate non-extern, section offset based addends for SIGNED
and UNSIGNED relocations on x86_64 upfront as an offset wrt to the
target symbol representing position of the section/atom within the
final artifact.
2021-09-06 10:38:51 +02:00
Andrew Kelley
332eafeb7f stage2: first pass at implementing usingnamespace
Ran into a design flaw here which will need to get solved by having
AstGen annotate ZIR with which instructions are closed over.
2021-09-01 17:54:06 -07:00
Jakub Konka
7a99cd069a macho: clean up allocating atom logic
Instead of checking for stage1 at every callsite, move the logic
inside `allocateAtom`. This is fine since this logic will disappear
anyhow once I add expanding and shifting segments and sections.
2021-09-01 12:14:29 +02:00
Jakub Konka
2831d6e9b8 macho: add first pass at allocating parsed atoms in objects
This commit makes it possible to combine self-hosted with a pre-compiled
C object file, e.g.:

```
zig-out/bin/zig build-exe hello.zig add.o
```

where `add.o` is a pre-compiled C object file.
2021-08-30 15:43:20 +02:00
Jakub Konka
a14e98fcac macho: remove sorting sections and refactor atom parsing in objects 2021-08-27 20:32:11 +02:00
Jakub Konka
5d548cc651 macho: move parsing logic for Object, Archive and Dylib into MachO
This way, the functionality is better segregated, and we finally do
not unnecessarily reparse dynamic libraries that were already visited
and parsed.
2021-08-11 19:38:00 +02:00
Jakub Konka
ace9b3de64 macho: fix parsing target string when linking against tbds 2021-08-10 13:41:07 +02:00
Jakub Konka
bf25650974 macho: refactor management of section ordinals
Instead of storing a two-way relation (seg,sect) <=> ordinal
we get the latter with `getIndex((seg, sect))`.
2021-08-02 19:49:32 +02:00
Jakub Konka
0b15ba8334 macho: don't allocate Dylib on the heap
instead, immediately transfer ownership to MachO struct. Also, revert
back to try-ok-fail parsing approach of objects, archives, and dylibs.
It seems easier to try and fail than check if the file *is* of a
certain type given that a dylib may be a stub and parsing yaml
twice in a row seems very wasteful.

Hint for the future: if we optimise yaml/TAPI parsing, this approach
may be rethought!
2021-08-01 09:06:56 +02:00
Jakub Konka
f023cdad7c macho: don't allocate Archives on the heap
instead, transfer ownership directly to MachO struct.
2021-08-01 09:06:56 +02:00
Jakub Konka
06396ddd7d macho: don't allocate Objects on the heap
instead, ownership is transferred to MachO. This makes Object
management align closer with data-oriented design.
2021-08-01 09:06:56 +02:00
Jakub Konka
c30cc4dbbf macho: don't store allocator in Object
instead, pass it in functions that require it. Also, when parsing
relocs, make Object part of the context struct where we pass in
additional goodies such as `*MachO` or `*Allocator`.
2021-08-01 09:06:56 +02:00
Jakub Konka
1beda818e1 macho: re-enable parsing sections into atoms
However, make it default only when building in release modes since
it's a prelude to advanced dead code stripping not very useful in
debug.
2021-07-23 16:55:19 +02:00
Jakub Konka
a4feb97cdf macho: assign and cache section ordinals upon creation
then, when sorting sections within segments, clear and redo the
ordinals since we re-apply them to symbols anyway. It is vital
to have the ordinals consistent with parsing and resolving relocs
however.
2021-07-22 23:13:13 +02:00
Jakub Konka
4fd0cb7618 macho: sort nlists within object before filtering by type
Previously, we'd filter the nlists assuming they were correctly
ordered by type: local < extern defined < undefined within the
object's symbol table but this doesn't seem to be guaranteed,
therefore, we sort by type and address in one go, and filter
defined from undefined afterwards.
2021-07-22 16:02:31 +02:00
Jakub Konka
ca90efe88e macho: fix memory leaks when emptying TextBlocks
This happens on every call to `TextBlock.empty` by the `Module`.
2021-07-22 14:05:12 +02:00
Jakub Konka
3bfde76cff macho: fix text block management
For the time being, until we rewrite how atoms are handled across
linkers, store two tables in the MachO linker: one for TextBlocks
directly created and managed by the linker, and one for TextBlocks
that were spawned by Module.Decl. This allows for correct memory
clean up after linking is done.
2021-07-21 15:46:57 +02:00
Jakub Konka
a442b165f1 macho: add stub relocs when adding extern fn
in self-hosted.
2021-07-20 20:33:07 +02:00
Jakub Konka
f6d13e9d6f zld: move contents of Zld into MachO module 2021-07-18 17:48:00 +02:00
Jakub Konka
9f20a51555 zld: demote logging back to debug from warn 2021-07-17 18:33:47 +02:00
Jakub Konka
71384a383e zld: correctly set n_sect for sections as symbols 2021-07-17 11:29:40 +02:00
Jakub Konka
407745a5e9 zld: simplify and move Relocations into TextBlock
It makes sense to have them as a dependent type since they only ever
deal with TextBlocks. Simplify Relocations to rely on symbol indices
and symbol resolver rather than pointers.
2021-07-17 01:03:40 +02:00
Jakub Konka
54a403d4ff zld: replace parsed reloc with a simple wrapper around macho.relocation_info 2021-07-16 17:18:53 +02:00