Linking it by default means that we produce binaries that, effectively, only run
on systems which have the Windows SDK installed because ucrtbased.dll is not
redistributable, and the Windows SDK is what actually installs ucrtbased.dll
into %SYSTEM32%. The resulting binaries also can't run under Wine because Wine
does not provide ucrtbased.dll.
It is also inconsistent with our behavior for *-windows-gnu where we always link
ucrtbase.dll. See #23983, #24019, and #24053 for more details.
So just use ucrtbase.dll regardless of mode. With this change, we can also drop
the implicit definition of the _DEBUG macro in zig cc, which has in some cases
been problematic for users.
Users who want to opt into the old behavior can do so, both for *-windows-msvc
and *-windows-gnu, by explicitly passing -lucrtbased and -D_DEBUG. We might
consider adding a more ergonomic flag like -fdebug-crt to the zig build-* family
of commands in the future.
Closes#24052.
In a compiler built with debug extensions, pass `--debug-incremental` to
spawn the "incremental debug server". This is a TCP server exposing a
REPL which allows querying a bunch of compiler state, some of which is
stored only when that flag is passed. Eventually, this will probably
move into `std.zig.Server`/`std.zig.Client`, but this is easier to work
with right now. The easiest way to interact with the server is `telnet`.
This was an unintentional regression in 23c8175 which meant that
backwards-incompatible ZIR changes would have caused compiler crashes if
old caches were present.
This commit makes some big changes to how we track state for Zig source
files. In particular, it changes:
* How `File` tracks its path on-disk
* How AstGen discovers files
* How file-level errors are tracked
* How `builtin.zig` files and modules are created
The original motivation here was to address incremental compilation bugs
with the handling of files, such as #22696. To fix this, a few changes
are necessary.
Just like declarations may become unreferenced on an incremental update,
meaning we suppress analysis errors associated with them, it is also
possible for all imports of a file to be removed on an incremental
update, in which case file-level errors for that file should be
suppressed. As such, after AstGen, the compiler must traverse files
(starting from analysis roots) and discover the set of "live files" for
this update.
Additionally, the compiler's previous handling of retryable file errors
was not very good; the source location the error was reported as was
based only on the first discovered import of that file. This source
location also disappeared on future incremental updates. So, as a part
of the file traversal above, we also need to figure out the source
locations of imports which errors should be reported against.
Another observation I made is that the "file exists in multiple modules"
error was not implemented in a particularly good way (I get to say that
because I wrote it!). It was subject to races, where the order in which
different imports of a file were discovered affects both how errors are
printed, and which module the file is arbitrarily assigned, with the
latter in turn affecting which other files are considered for import.
The thing I realised here is that while the AstGen worker pool is
running, we cannot know for sure which module(s) a file is in; we could
always discover an import later which changes the answer.
So, here's how the AstGen workers have changed. We initially ensure that
`zcu.import_table` contains the root files for all modules in this Zcu,
even if we don't know any imports for them yet. Then, the AstGen
workers do not need to be aware of modules. Instead, they simply ignore
module imports, and only spin off more workers when they see a by-path
import.
During AstGen, we can't use module-root-relative paths, since we don't
know which modules files are in; but we don't want to unnecessarily use
absolute files either, because those are non-portable and can make
`error.NameTooLong` more likely. As such, I have introduced a new
abstraction, `Compilation.Path`. This type is a way of representing a
filesystem path which has a *canonical form*. The path is represented
relative to one of a few special directories: the lib directory, the
global cache directory, or the local cache directory. As a fallback, we
use absolute (or cwd-relative on WASI) paths. This is kind of similar to
`std.Build.Cache.Path` with a pre-defined list of possible
`std.Build.Cache.Directory`, but has stricter canonicalization rules
based on path resolution to make sure deduplicating files works
properly. A `Compilation.Path` can be trivially converted to a
`std.Build.Cache.Path` from a `Compilation`, but is smaller, has a
canonical form, and has a digest which will be consistent across
different compiler processes with the same lib and cache directories
(important when we serialize incremental compilation state in the
future). `Zcu.File` and `Zcu.EmbedFile` both contain a
`Compilation.Path`, which is used to access the file on-disk;
module-relative sub paths are used quite rarely (`EmbedFile` doesn't
even have one now for simplicity).
After the AstGen workers all complete, we know that any file which might
be imported is definitely in `import_table` and up-to-date. So, we
perform a single-threaded graph traversal; similar to what
`resolveReferences` plays for `AnalUnit`s, but for files instead. We
figure out which files are alive, and which module each file is in. If a
file turns out to be in multiple modules, we set a field on `Zcu` to
indicate this error. If a file is in a different module to a prior
update, we set a flag instructing `updateZirRefs` to invalidate all
dependencies on the file. This traversal also discovers "import errors";
these are errors associated with a specific `@import`. With Zig's
current design, there is only one possible error here: "import outside
of module root". This must be identified during this traversal instead
of during AstGen, because it depends on which module the file is in. I
tried also representing "module not found" errors in this same way, but
it turns out to be much more useful to report those in Sema, because of
use cases like optional dependencies where a module import is behind a
comptime-known build option.
For simplicity, `failed_files` now just maps to `?[]u8`, since the
source location is always the whole file. In fact, this allows removing
`LazySrcLoc.Offset.entire_file` completely, slightly simplifying some
error reporting logic. File-level errors are now directly built in the
`std.zig.ErrorBundle.Wip`. If the payload is not `null`, it is the
message for a retryable error (i.e. an error loading the source file),
and will be reported with a "file imported here" note pointing to the
import site discovered during the single-threaded file traversal.
The last piece of fallout here is how `Builtin` works. Rather than
constructing "builtin" modules when creating `Package.Module`s, they are
now constructed on-the-fly by `Zcu`. The map `Zcu.builtin_modules` maps
from digests to `*Package.Module`s. These digests are abstract hashes of
the `Builtin` value; i.e. all of the options which are placed into
"builtin.zig". During the file traversal, we populate `builtin_modules`
as needed, so that when we see this imports in Sema, we just grab the
relevant entry from this map. This eliminates a bunch of awkward state
tracking during construction of the module graph. It's also now clearer
exactly what options the builtin module has, since previously it
inherited some options arbitrarily from the first-created module with
that "builtin" module!
The user-visible effects of this commit are:
* retryable file errors are now consistently reported against the whole
file, with a note pointing to a live import of that file
* some theoretical bugs where imports are wrongly considered distinct
(when the import path moves out of the cwd and then back in) are fixed
* some consistency issues with how file-level errors are reported are
fixed; these errors will now always be printed in the same order
regardless of how the AstGen pass assigns file indices
* incremental updates do not print retryable file errors differently
between updates or depending on file structure/contents
* incremental updates support files changing modules
* incremental updates support files becoming unreferenced
Resolves: #22696
Inline calls which happened in the erroring `AnalUnit` still show as
error notes, because they tend to make very important context (e.g. to
see how comptime values propagate through them). However, "earlier"
inline calls are still useful to see to understand how something is
being referenced, so we should include them in the reference trace.
When `-freference-trace` is not passed, we want to show exactly one
reference trace. Previously, we set the reference trace root in `Sema`
iff there were no other failed analyses. However, this results in an
arbitrary error being the one with the reference trace after error
sorting. It is also incompatible with incremental compilation, where
some errors might be unreferenced. Instead, set the field on all
analysis errors, and decide in `Compilation.getAllErrorsAlloc` which
reference trace[s] to actually show.
If clang encountered bad imports, the depfile will not be generated. It
doesn't make sense to warn the user in this case. In fact,
`FileNotFound` is never worth warning about here; it just means that
the file we were deleting to save space isn't there in the first place!
If the missing file actually affected the compilation (e.g. another
process raced to delete it for some reason) we would already error in
the normal code path which reads these files, so we can safely omit the
warning in the `FileNotFound` case always, only warning when the file
might still exist.
To see what this fixes, create the following file...
```c
#include <nonexist>
```
...and run `zig build-obj` on it. Before this commit, you will get a
redundant warning; after this commit, that warning is gone.
It remains 1 everywhere else.
Also remove some code that allowed setting the libc++ ABI version on the
Compilation since there are no current plans to actually expose this in the CLI.
Aside from adding comments to document the logic in `Cache.Manifest.hit`
better, this commit fixes two serious bugs.
The first, spotted by Andrew, is that when upgrading from a shared to an
exclusive lock on the manifest file, we do not seek it back to the
start. This is a simple fix.
The second is more subtle, and has to do with the computation of file
digests. Broadly speaking, the goal of the main loop in `hit` is to
iterate the files listed in the manifest file, and check if they've
changed, based on stat and a file hash. While doing this, the
`bin_digest` field of `std.Build.Cache.File`, which is initially
`undefined`, is populated for all files, either straight from the
manifest (if the stat matches) or recomputed from the file on-disk. This
file digest is then used to update `man.hash.hasher`, which is building
the final hash used as, for instance, the output directory name when the
compiler emits into the cache directory. When `hit` returns a cache
miss, it is expected that `man.hash.hasher` includes the digests of all
"initial files"; that is, those which have been already added with e.g.
`addFilePath`, but not those which will later be added with
`addFilePost` (even though the manifest file has told us about some such
files). Previously, `hit` was using the `unhit` function to do this in a
few cases. However, this is incorrect, because `hit` assumes that all
files already have their `bin_digest` field populated; this function is
only valid to call *after* `hit` returns. Instead, we need to actually
compute the hashes which haven't yet been populated. Even if this logic
has been working, there was still a bug here, because we called `unhit`
when upgrading from a shared to an exclusive lock, writing the
(potentially `undefined`) file digests, but the loop itself writes the
file digests *again*! All in all, the hashing logic here was actually
incredibly broken.
I've taken the opportunity to restructure this section of the code into
what I think is a more readable format. A new function,
`hitWithCurrentLock`, uses the open manifest file to try and find a
cache hit. It returns a tagged union which, in the miss case, tells the
caller (`hit`) how many files already have their hash populated. This
avoids redundant work recomputing the same hash multiple times in
situations where the lock needs upgrading. This also eliminates the
outer loop from `hit`, which was a little confusing because it iterated
no more than twice!
The bugs fixed here could manifest in several different ways depending
on how contended file locks were satisfied. Most notably, on a cache
miss, the Zig compiler might have written the compilation output to the
incorrect directory (because it incorrectly constructed a hash using
`undefined` or repeated file digests), resulting in all future hits on
this manifest causing `error.FileNotFound`. This is #23110. I have been
able to reproduce #23110 on `master`, and have not been able to after
this commit, so I am relatively sure this commit resolves that issue.
Resolves: #23110
* Accept -fsanitize-c=trap|full in addition to the existing form.
* Accept -f(no-)sanitize-trap=undefined in zig cc.
* Change type of std.Build.Module.sanitize_c to std.zig.SanitizeC.
* Add some missing Compilation.Config fields to the cache.
Closes#23216.
Compile log output is now separated based on the `AnalUnit` which
perfomred the `@compileLog` call, so that we can omit the output for
unreferenced ("dead") units. The units are also sorted when collecting
the `ErrorBundle`, so that compile logs are always printed in a
consistent order, like compile errors are. This is important not only
for incremental compilation, but also for parallel analysis.
Resolves: #23609
This lays the groundwork for #2879. This library will be built and linked when a
static libc is going to be linked into the compilation. Currently, that means
musl, wasi-libc, and MinGW-w64. As a demonstration, this commit removes the musl
C code for a few string functions and implements them in libzigc. This means
that those libzigc functions are now load-bearing for musl and wasi-libc.
Note that if a function has an implementation in compiler-rt already, libzigc
should not implement it. Instead, as we recently did for memcpy/memmove, we
should delete the libc copy and rely on the compiler-rt implementation.
I repurposed the existing "universal libc" code to do this. That code hadn't
seen development beyond basic string functions in years, and was only usable-ish
on freestanding. I think that if we want to seriously pursue the idea of Zig
providing a freestanding libc, we should do so only after defining clear goals
(and non-goals) for it. See also #22240 for a similar case.
LLD expects the library file name (minus extension) to be exactly libmingw32. By
calling it mingw32 previously, we prevented it from being detected as being in
LLD's list of libraries that are excluded from the MinGW-specific auto-export
mechanism.
b9d27ac252/lld/COFF/MinGW.cpp (L30-L56)
As a result, a DLL built for *-windows-gnu with Zig would export a bunch of
internal MinGW symbols. This sometimes worked out fine, but it could break at
link or run time when linking an EXE with a DLL, where both are targeting
*-windows-gnu and thus linking separate copies of mingw32.lib. In #23204, this
manifested as the linker getting confused about _gnu_exception_handler() because
it was incorrectly exported by the DLL while also being defined in the
mingw32.lib that was being linked into the EXE.
Closes#23204.
On updates with failed files, we should refrain from doing any semantic
analysis, or even touching codegen/link. That way, incremental
compilation state is untouched for when the user fixes the AstGen
errors.
Resolves: #23205
Clang's integrated Arm assembler doesn't understand -mabi yet, so this results
in "unused command line argument" warnings when building musl code and glibc
stubs, for example.
Windows is a ridiculous operating system designed by toddlers, and so
requires us to close all file handles in the `tmp/xxxxxxx` cache dir
before renaming it into `o/xxxxxxx`. We have a hack in place to handle
this for the main output file, but the MachO linker also outputs a file
with debug symbols, and we weren't closing it! This led to a fuckton of
CI failures when we enabled `.whole` cache mode by default for
self-hosted backends.
thanks jacob for figuring this out while i sat there
This reverts commit dea72d15da4fba909dc3ccb2e9dc5286372ac023, reversing
changes made to ab381933c87bcc744058d25a876cfdc0d23fc674.
The changeset does not work as advertised and does not have sufficient
test coverage.
Reopens#22822
Problem here is if zig is asked to create multiple static libraries, it
will build the runtime multiple times and then they will conflict.
Instead we want to build the runtime exactly once.
Unlike `compiler-rt`, `ubsan` uses the standard library quite a lot.
Using a similar approach to how `compiler-rt` is handled today, where it's
compiled into its own object and then linked would be sub-optimal as we'd
be introducing a lot of code bloat.
This approach always "imports" `ubsan` if the ZCU, if it exists. If it doesn't
such as the case where we're compiling only C code, then we have no choice other
than to compile it down to an object and link. There's still a tiny optimization
we can do in that case, which is when compiling to a static library, there's no
need to construct an archive with a single object. We'd only go back and parse out
ubsan from the archive later in the pipeline. So we compile it to an object instead
and link that to the static library.
TLDR;
- `zig build-exe foo.c` -> build `libubsan.a` and links
- `zig build-obj foo.c` -> doesn't build anything, just emits references to ubsan runtime
- `zig build-lib foo.c -static` -> build `ubsan.o` and link it
- `zig build-exe foo.zig bar.c` -> import `ubsan-rt` into the ZCU
- `zig build-obj foo.zig bar.c` -> import `ubsan-rt` into the ZCU
- `zig build-lib foo.zig bar.c` -> import `ubsan-rt` into the ZCU