Aside from adding comments to document the logic in `Cache.Manifest.hit`
better, this commit fixes two serious bugs.
The first, spotted by Andrew, is that when upgrading from a shared to an
exclusive lock on the manifest file, we do not seek it back to the
start. This is a simple fix.
The second is more subtle, and has to do with the computation of file
digests. Broadly speaking, the goal of the main loop in `hit` is to
iterate the files listed in the manifest file, and check if they've
changed, based on stat and a file hash. While doing this, the
`bin_digest` field of `std.Build.Cache.File`, which is initially
`undefined`, is populated for all files, either straight from the
manifest (if the stat matches) or recomputed from the file on-disk. This
file digest is then used to update `man.hash.hasher`, which is building
the final hash used as, for instance, the output directory name when the
compiler emits into the cache directory. When `hit` returns a cache
miss, it is expected that `man.hash.hasher` includes the digests of all
"initial files"; that is, those which have been already added with e.g.
`addFilePath`, but not those which will later be added with
`addFilePost` (even though the manifest file has told us about some such
files). Previously, `hit` was using the `unhit` function to do this in a
few cases. However, this is incorrect, because `hit` assumes that all
files already have their `bin_digest` field populated; this function is
only valid to call *after* `hit` returns. Instead, we need to actually
compute the hashes which haven't yet been populated. Even if this logic
has been working, there was still a bug here, because we called `unhit`
when upgrading from a shared to an exclusive lock, writing the
(potentially `undefined`) file digests, but the loop itself writes the
file digests *again*! All in all, the hashing logic here was actually
incredibly broken.
I've taken the opportunity to restructure this section of the code into
what I think is a more readable format. A new function,
`hitWithCurrentLock`, uses the open manifest file to try and find a
cache hit. It returns a tagged union which, in the miss case, tells the
caller (`hit`) how many files already have their hash populated. This
avoids redundant work recomputing the same hash multiple times in
situations where the lock needs upgrading. This also eliminates the
outer loop from `hit`, which was a little confusing because it iterated
no more than twice!
The bugs fixed here could manifest in several different ways depending
on how contended file locks were satisfied. Most notably, on a cache
miss, the Zig compiler might have written the compilation output to the
incorrect directory (because it incorrectly constructed a hash using
`undefined` or repeated file digests), resulting in all future hits on
this manifest causing `error.FileNotFound`. This is #23110. I have been
able to reproduce #23110 on `master`, and have not been able to after
this commit, so I am relatively sure this commit resolves that issue.
Resolves: #23110
* Accept -fsanitize-c=trap|full in addition to the existing form.
* Accept -f(no-)sanitize-trap=undefined in zig cc.
* Change type of std.Build.Module.sanitize_c to std.zig.SanitizeC.
* Add some missing Compilation.Config fields to the cache.
Closes#23216.
Compile log output is now separated based on the `AnalUnit` which
perfomred the `@compileLog` call, so that we can omit the output for
unreferenced ("dead") units. The units are also sorted when collecting
the `ErrorBundle`, so that compile logs are always printed in a
consistent order, like compile errors are. This is important not only
for incremental compilation, but also for parallel analysis.
Resolves: #23609
This lays the groundwork for #2879. This library will be built and linked when a
static libc is going to be linked into the compilation. Currently, that means
musl, wasi-libc, and MinGW-w64. As a demonstration, this commit removes the musl
C code for a few string functions and implements them in libzigc. This means
that those libzigc functions are now load-bearing for musl and wasi-libc.
Note that if a function has an implementation in compiler-rt already, libzigc
should not implement it. Instead, as we recently did for memcpy/memmove, we
should delete the libc copy and rely on the compiler-rt implementation.
I repurposed the existing "universal libc" code to do this. That code hadn't
seen development beyond basic string functions in years, and was only usable-ish
on freestanding. I think that if we want to seriously pursue the idea of Zig
providing a freestanding libc, we should do so only after defining clear goals
(and non-goals) for it. See also #22240 for a similar case.
LLD expects the library file name (minus extension) to be exactly libmingw32. By
calling it mingw32 previously, we prevented it from being detected as being in
LLD's list of libraries that are excluded from the MinGW-specific auto-export
mechanism.
b9d27ac252/lld/COFF/MinGW.cpp (L30-L56)
As a result, a DLL built for *-windows-gnu with Zig would export a bunch of
internal MinGW symbols. This sometimes worked out fine, but it could break at
link or run time when linking an EXE with a DLL, where both are targeting
*-windows-gnu and thus linking separate copies of mingw32.lib. In #23204, this
manifested as the linker getting confused about _gnu_exception_handler() because
it was incorrectly exported by the DLL while also being defined in the
mingw32.lib that was being linked into the EXE.
Closes#23204.
On updates with failed files, we should refrain from doing any semantic
analysis, or even touching codegen/link. That way, incremental
compilation state is untouched for when the user fixes the AstGen
errors.
Resolves: #23205
Clang's integrated Arm assembler doesn't understand -mabi yet, so this results
in "unused command line argument" warnings when building musl code and glibc
stubs, for example.
Windows is a ridiculous operating system designed by toddlers, and so
requires us to close all file handles in the `tmp/xxxxxxx` cache dir
before renaming it into `o/xxxxxxx`. We have a hack in place to handle
this for the main output file, but the MachO linker also outputs a file
with debug symbols, and we weren't closing it! This led to a fuckton of
CI failures when we enabled `.whole` cache mode by default for
self-hosted backends.
thanks jacob for figuring this out while i sat there
This reverts commit dea72d15da4fba909dc3ccb2e9dc5286372ac023, reversing
changes made to ab381933c87bcc744058d25a876cfdc0d23fc674.
The changeset does not work as advertised and does not have sufficient
test coverage.
Reopens#22822
Problem here is if zig is asked to create multiple static libraries, it
will build the runtime multiple times and then they will conflict.
Instead we want to build the runtime exactly once.
Unlike `compiler-rt`, `ubsan` uses the standard library quite a lot.
Using a similar approach to how `compiler-rt` is handled today, where it's
compiled into its own object and then linked would be sub-optimal as we'd
be introducing a lot of code bloat.
This approach always "imports" `ubsan` if the ZCU, if it exists. If it doesn't
such as the case where we're compiling only C code, then we have no choice other
than to compile it down to an object and link. There's still a tiny optimization
we can do in that case, which is when compiling to a static library, there's no
need to construct an archive with a single object. We'd only go back and parse out
ubsan from the archive later in the pipeline. So we compile it to an object instead
and link that to the static library.
TLDR;
- `zig build-exe foo.c` -> build `libubsan.a` and links
- `zig build-obj foo.c` -> doesn't build anything, just emits references to ubsan runtime
- `zig build-lib foo.c -static` -> build `ubsan.o` and link it
- `zig build-exe foo.zig bar.c` -> import `ubsan-rt` into the ZCU
- `zig build-obj foo.zig bar.c` -> import `ubsan-rt` into the ZCU
- `zig build-lib foo.zig bar.c` -> import `ubsan-rt` into the ZCU
This can also be extended to ELF later as it means roughly the same thing there.
This addresses the main issue in #21721 but as I don't have a macOS machine to
do further testing on, I can't confirm whether zig cc is able to pass the entire
cgo test suite after this commit. It can, however, cross-compile a basic program
that uses cgo to x86_64-macos-none which previously failed due to lack of -x
support. Unlike previously, the resulting symbol table does not contain local
symbols (such as C static functions).
I believe this satisfies the related donor bounty: https://ziglang.org/news/second-donor-bounty
- allow `-fsanitize-coverage-trace-pc-guard` to be used on its own without enabling the fuzzer.
(note that previouly, while the flag was only active when fuzzing, the fuzzer itself doesn't use it, and the code will not link as is.)
- add stub functions in the fuzzer to link with instrumented C code (previously fuzzed tests failed to link if they were calling into C):
while the zig compile unit uses a custom `EmitOptions.Coverage` with features disabled,
the C code is built calling into the clang driver with "-fsanitize=fuzzer-no-link" that automatically enables the default features.
(see de06978ebc/clang/lib/Driver/SanitizerArgs.cpp (L587))
- emit `-fsanitize-coverage=trace-pc-guard` instead of `-Xclang -fsanitize-coverage-trace-pc-guard` so that edge coverrage is enabled by clang driver. (previously, it was enabled only because the fuzzer was)
Functions like isMinGW() and isGnuLibC() have a good reason to exist: They look
at multiple components of the target. But functions like isWasm(), isDarwin(),
isGnu(), etc only exist to save 4-8 characters. I don't think this is a good
enough reason to keep them, especially given that:
* It's not immediately obvious to a reader whether target.isDarwin() means the
same thing as target.os.tag.isDarwin() precisely because isMinGW() and similar
functions *do* look at multiple components.
* It's not clear where we would draw the line. The logical conclusion before
this commit would be to also wrap Arch.isX86(), Os.Tag.isSolarish(),
Abi.isOpenHarmony(), etc... this obviously quickly gets out of hand.
* It's nice to just have a single correct way of doing something.
Clearing the analysis roots was very clever and all, but not actually
valid. We need to avoid *any* reference to the analysis errors if there
were any fatal files, and that includes sorting the errors!
Resolves: #22774
The changes from a few commits earlier, where semantic analysis no
longer occurs if any Zig files failed to lower to ZIR, mean `file`
dependencies are no longer necessary! However, we now need them for ZON
files, to be invalidated whenever a ZON file changes.
This came with a big cleanup to `Zcu.PerThread.updateFile` (formerly
`astGenFile`).
Also, change how the cache manifest works for files in the import table.
Instead of being added to the manifest when we call `semaFile` on them,
we iterate the import table after running the AstGen workers and add all
the files to the cache manifest then.
The downside is that this is a bit more eager to include files in the
manifest; in particular, files which are imported but not actually
referenced are now included in analysis. So, for instance, modifying any
standard library file will invalidate all Zig compilations using that
standard library, even if they don't use that file.
The original motivation here was simply that the old logic in `semaFile`
didn't translate nicely to ZON. However, it turns out to actually be
necessary for correctness. Because `@import("foo.zig")` is an
AstGen-level error if `foo.zig` does not exist, we need to invalidate
the cache when an imported but unreferenced file is removed to make sure
this error is triggered when it needs to be.
Resolves: #22746
This is mainly in preparation for integrating ZonGen into the pipeline
properly, although these names are better because `astGenFile` isn't
*necessarily* running AstGen; it may determine that the current ZIR is
up-to-date, or load cached ZIR.
Instead, `source`, `tree`, and `zir` should all be optional. This is
precisely what we're actually trying to model here; and `File` isn't
optimized for memory consumption or serializability anyway, so it's fine
to use a couple of extra bytes on actual optionals here.
This commit allows using ZON (Zig Object Notation) in a few ways.
* `@import` can be used to load ZON at comptime and convert it to a
normal Zig value. In this case, `@import` must have a result type.
* `std.zon.parse` can be used to parse ZON at runtime, akin to the
parsing logic in `std.json`.
* `std.zon.stringify` can be used to convert arbitrary data structures
to ZON at runtime, again akin to `std.json`.