No longer unconditionally append the `--export` flag for each exported
function. This was essentially a hack to prevent binary bloat caused by
compiler-rt symbols always being included in the final binary, as they
were exported and therefore not garbage-collected. This is no longer
needed as we now support setting the visibility of exports.
This essentially reverts 6d951aff7e32b1b0252d341e66517a9a9ee98a2d.
In #1622, when targeting WebAssembly, the `--allow-undefined` flag
started being passed to the linker unconditionally.
This is not always desirable.
First, this is error-prone. Code with references to unknown symbols
will link just fine, but then fail at run-time.
This behavior is inconsistent with all other targets.
For freestanding wasm applications, and applications that only use
WASI, undefined references are better reported at compile-time.
This behavior is also inconsistent with clang itself. Autoconf and
CMake scripts that check for function presence conclude that all tested
functions exist, but then the resulting application cannot run.
For example, this is one of the reasons compilation of Ruby 3.2.0
to WASI fails with zig cc, while it works out of the box with clang.
But all applications checking for symbol existence before compilation
are affected.
This reverts the behavior to the one Zig had before #1622, and
introduces an `import_symbols` flag to ignore undefined symbols,
assuming that the WebAssembly runtime will define them.
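The resulting policy can be sketched roughly as below. This is an
illustrative model only, not Zig's linker code: `resolve_undefined` and
its arguments are hypothetical names, and only the `import_symbols` flag
comes from this change.

```python
class UndefinedSymbolError(Exception):
    """Raised at link time when a referenced symbol has no definition."""


def resolve_undefined(referenced, defined, import_symbols=False):
    """Return the set of symbols that would become wasm imports.

    referenced: names the object files refer to
    defined: names that some linker input actually defines
    import_symbols: hypothetical stand-in for the new flag
    """
    undefined = sorted(set(referenced) - set(defined))
    if undefined and not import_symbols:
        # Default: report at link time, consistent with other targets.
        raise UndefinedSymbolError("undefined symbols: " + ", ".join(undefined))
    # Opt-in: leave them for the WebAssembly runtime to provide.
    return set(undefined)
```

With the default, a configure-style probe for a missing function fails at
link time instead of producing a module that cannot be instantiated.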
Otherwise, we were prematurely committing the `__LINKEDIT` segment load
command with an outdated size (i.e., without the code signature being
taken into account). This would lead to strict validation failures in
Apple tooling.
This merges the paths from flushModule and linkWithZld to a single
function that will write the entire WebAssembly module to the file.
This reduces the chance of mistakes as we do not have to duplicate
the logic. A similar action may be needed later for linkWithLLD.
When an atom has one or more aliases, we could not find the
target atom from the aliased symbol. This is solved by ensuring that
we also insert each alias symbol in the symbol-atom map.
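A minimal sketch of the idea, assuming a simplified layout where each atom
is a `(primary_symbol, aliases)` pair. The function name and the data
layout are hypothetical and do not mirror the linker's actual structures.

```python
def build_symbol_atom_map(atoms):
    """Map every symbol name, including aliases, to its owning atom index.

    atoms: list of (primary_symbol, [alias_symbols]) pairs (assumed layout)
    """
    symbol_atom = {}
    for atom_index, (primary, aliases) in enumerate(atoms):
        symbol_atom[primary] = atom_index
        for alias in aliases:
            # Without this step, a lookup through an alias finds no atom.
            symbol_atom[alias] = atom_index
    return symbol_atom
```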
Previously we used the relocation index to find the corresponding
symbol that represents the type. However, the index actually
represents the index into the list of types. We solved this by
first retrieving the original type, and then finding its location
in the new list of types. When the atom file is 'null', it means
the type originates from a Zig function pointer or a synthetic
function. In both cases, the final type index was already resolved
and therefore equals the relocation's index value.
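The lookup described above could look roughly like this. It is a sketch
under assumed data structures: `resolve_type_index`, `object_types` and
`merged_types` are hypothetical names, and types are modeled as plain
tuples.

```python
def resolve_type_index(reloc_index, atom_file, object_types, merged_types):
    """Translate a relocation's type index into the output module's type list.

    atom_file: index of the originating object file, or None
    object_types: per-object-file lists of function types (assumed layout)
    merged_types: deduplicated list of types in the output module
    """
    if atom_file is None:
        # Zig function pointer or synthetic function: already resolved.
        return reloc_index
    # The relocation index points into the object's own type list ...
    original_type = object_types[atom_file][reloc_index]
    # ... so find where that same type landed in the merged list.
    return merged_types.index(original_type)
```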
When parsing the table of contents containing the symbols and their
positions, we initially used the index within the map to retrieve
the offset. However, resizing the underlying array would invalidate
those indexes, which meant incorrect offsets were being stored for
symbols. We now use the current symbol index to index into the
symbol positions as well.
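To illustrate pairing names and positions by the running symbol index,
here is a sketch that assumes a GNU ar style symbol table (a big-endian
count, that many big-endian offsets, then NUL-terminated names). The
format choice and the `parse_toc` name are assumptions made for the
example, not taken from the change itself.

```python
import struct


def parse_toc(data):
    """Parse a symbol table into {symbol_name: [member_offsets]}."""
    (count,) = struct.unpack_from(">I", data, 0)
    positions = struct.unpack_from(f">{count}I", data, 4)
    names = data[4 + 4 * count:].split(b"\x00")[:count]
    toc = {}
    for symbol_index, name in enumerate(names):
        # Index the positions by the current symbol index, never by an
        # index derived from the map, which may change as the map grows.
        toc.setdefault(name.decode(), []).append(positions[symbol_index])
    return toc
```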
Given `main.go`:
    package main
    import _ "os/user"
    func main() {}
Compiling it to linux/arm64:
    $ CGO_CFLAGS='-O0' GOOS=linux GOARCH=arm64 CGO_ENABLED=1 CC="zig cc -target aarch64-linux-gnu.2.28" go build main.go
Results in this error:
    runtime/cgo(.text): unknown symbol memset in callarm64
    runtime/cgo(.text): unknown symbol memset in callarm64
    runtime/cgo(.text): relocation target memset not defined
Among the intermediate compilation files we can see this command:
    ld.lld -o _cgo_.o <...> /tmp/go-build206961058/b043/_x009.o <...> ~/.cache/zig/.../libcompiler_rt.a <...> ~/.cache/.../libc.so.6
`_x009.o` needs memset:
    $ readelf -Ws ./b043/_x009.o | grep memset
    22: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND memset
Both `libcompiler_rt.a` and `libc.so.6` provide it:
    $ readelf -Ws ~/.cache/zig/.../libcompiler_rt.a | grep memset
    870: 0000000000000000 318 FUNC WEAK DEFAULT 519 memset
    $ readelf -Ws ~/.cache/zig/.../libc.so.6 | grep -w memset
    476: 000000000001d34c 0 FUNC GLOBAL DEFAULT 7 memset@@GLIBC_2.2.5
Since `libcompiler_rt.a` comes before libc in the linker line, the
resulting `_cgo_.o` still links to a weak, unversioned memset:
    $ readelf -Ws ./b043/_cgo_.o | grep -w memset
    40: 000000000022c07c 160 FUNC WEAK DEFAULT 14 memset
    719: 000000000022c07c 160 FUNC WEAK DEFAULT 14 memset
Since the final linking step is done by Go's linker, which does not
know of `libcompiler_rt.a`, the link fails with the error message
above. However, the Go linker does recognize memset from glibc. If we
specify an `-lc` equivalent before `libcompiler_rt.a`, it will link
to memset from libc:
    $ readelf -Wa ./b043/_x009.o |grep memset
    14: 0000000000000000 0 FUNC GLOBAL DEFAULT UND memset@GLIBC_2.17 (2)
    157: 0000000000000000 0 FUNC GLOBAL DEFAULT UND memset@GLIBC_2.17
... and then `main.go` will compile+link successfully.
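The ordering effect can be pictured with a deliberately simplified model
in which the first input on the linker line that defines a symbol wins.
Real linkers also weigh binding strength, archive member extraction and
symbol versions, so treat `bind_symbol` (a hypothetical helper) as an
illustration of the observation above rather than of lld's algorithm.

```python
def bind_symbol(name, inputs):
    """Return the label of the first input, in command-line order, defining name.

    inputs: ordered list of (label, set_of_defined_symbols) pairs
    """
    for label, defined in inputs:
        if name in defined:
            return label
    return None
```

In this model, placing libc ahead of `libcompiler_rt.a` is what makes
`memset` resolve to glibc.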
Why doesn't the Go linker take memset from glibc? An educated guess: Go
determines whether to link with glibc from what the program asks (I
presume `.dynsym`). Since `memset` is no longer attributed to glibc, Go
skips linking to glibc altogether.
Bonus question: why is `-O0` necessary? Because when
optimizations are enabled (the default), the C compiler replaces
the `memset` function call with plain `stp` instructions (on aarch64).
By pulling out the parallel hashing setup from `CodeSignature.zig`,
we can now reuse it in different places across the MachO linker (for now;
I can totally see its usefulness beyond MachO, e.g., in COFF or ELF too).
The parallel hasher is generic over the actual hasher, such as Sha256 or MD5.
The implementation is kept as it was.
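As a rough picture of what "generic over the hasher" means, the sketch
below hashes fixed-size chunks concurrently with whichever algorithm is
requested. The `parallel_hash` name, the 4096-byte chunk size and the
thread pool are assumptions for illustration and do not reflect the Zig
implementation.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def parallel_hash(data, algorithm="sha256", chunk_size=4096, workers=4):
    """Hash fixed-size chunks of data concurrently; digests keep chunk order."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

    def digest(chunk):
        # The algorithm is a parameter, so one routine serves any hasher.
        return hashlib.new(algorithm, chunk).digest()

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order regardless of completion order.
        return list(pool.map(digest, chunks))
```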
For UUID calculation, depending on the linking mode:
* incremental - since it only supports debug mode, we don't bother with MD5
  hashing of the contents, and populate it with random data, but only once
  per sequence of in-place binary patches
* traditional - in debug, we use a random string (for speed); in release,
  we calculate the hash, using LLVM/LLD's trick: we calculate a series of
  MD5 hashes in parallel and then compute one final MD5 over those MD5
  hashes to generate the digest.
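The two modes above can be sketched as follows. `compute_uuid`, its
parameters and the chunk size are hypothetical, and the sketch returns the
raw 16 bytes without any further UUID formatting.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor


def compute_uuid(contents, mode, release=False, existing=None, chunk_size=1 << 20):
    """Return 16 bytes to use as the UUID for the given linking mode."""
    if mode == "incremental":
        # Random data, generated once and reused across in-place patches.
        return existing if existing is not None else os.urandom(16)
    if not release:
        # Traditional debug link: random bytes, for speed.
        return os.urandom(16)
    # Traditional release link: MD5 each chunk in parallel, then take one
    # final MD5 over the concatenated chunk digests.
    chunks = [contents[i:i + chunk_size] for i in range(0, len(contents), chunk_size)]
    with ThreadPoolExecutor() as pool:
        digests = list(pool.map(lambda c: hashlib.md5(c).digest(), chunks))
    return hashlib.md5(b"".join(digests)).digest()
```

Only the release path depends on the file contents, so two release links
of identical output produce the same UUID.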
When a data symbol is required to be exported, we instead generate
a global that will be exported. This global is immutable and contains
the address of the data symbol.
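In WebAssembly text format terms, the generated export looks roughly like
the output of this sketch. It assumes a 32-bit address space, and
`export_data_symbol_as_global` is a hypothetical helper, not a linker
function.

```python
def export_data_symbol_as_global(name, address):
    """Render an immutable i32 global, exported as name, that holds address."""
    # A global type without `mut` is immutable in the text format.
    return f'(global (export "{name}") i32 (i32.const {address}))'
```

A host can then read the exported global to learn where the data lives in
linear memory.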
When invoking the self-hosted linker using `-fno-LLD` while using the
LLVM backend or invoking it as a linker, we create a separate path.
This path will link the object file generated by LLVM and the
supplied object files just once, which lets us simplify the
implementation between incremental and regular linking.