Aside from adding comments to document the logic in `Cache.Manifest.hit`
better, this commit fixes two serious bugs.
The first, spotted by Andrew, is that when upgrading from a shared to an
exclusive lock on the manifest file, we do not seek it back to the
start. This is a simple fix.
The second is more subtle, and has to do with the computation of file
digests. Broadly speaking, the goal of the main loop in `hit` is to
iterate the files listed in the manifest file, and check if they've
changed, based on stat and a file hash. While doing this, the
`bin_digest` field of `std.Build.Cache.File`, which is initially
`undefined`, is populated for all files, either straight from the
manifest (if the stat matches) or recomputed from the file on-disk. This
file digest is then used to update `man.hash.hasher`, which is building
the final hash used as, for instance, the output directory name when the
compiler emits into the cache directory. When `hit` returns a cache
miss, it is expected that `man.hash.hasher` includes the digests of all
"initial files"; that is, those which have been already added with e.g.
`addFilePath`, but not those which will later be added with
`addFilePost` (even though the manifest file has told us about some such
files). Previously, `hit` was using the `unhit` function to do this in a
few cases. However, this is incorrect, because `hit` assumes that all
files already have their `bin_digest` field populated; this function is
only valid to call *after* `hit` returns. Instead, we need to actually
compute the hashes which haven't yet been populated. Even if this logic
has been working, there was still a bug here, because we called `unhit`
when upgrading from a shared to an exclusive lock, writing the
(potentially `undefined`) file digests, but the loop itself writes the
file digests *again*! All in all, the hashing logic here was actually
incredibly broken.
I've taken the opportunity to restructure this section of the code into
what I think is a more readable format. A new function,
`hitWithCurrentLock`, uses the open manifest file to try and find a
cache hit. It returns a tagged union which, in the miss case, tells the
caller (`hit`) how many files already have their hash populated. This
avoids redundant work recomputing the same hash multiple times in
situations where the lock needs upgrading. This also eliminates the
outer loop from `hit`, which was a little confusing because it iterated
no more than twice!
The bugs fixed here could manifest in several different ways depending
on how contended file locks were satisfied. Most notably, on a cache
miss, the Zig compiler might have written the compilation output to the
incorrect directory (because it incorrectly constructed a hash using
`undefined` or repeated file digests), resulting in all future hits on
this manifest causing `error.FileNotFound`. This is #23110. I have been
able to reproduce #23110 on `master`, and have not been able to after
this commit, so I am relatively sure this commit resolves that issue.
Resolves: #23110
* Accept -fsanitize-c=trap|full in addition to the existing form.
* Accept -f(no-)sanitize-trap=undefined in zig cc.
* Change type of std.Build.Module.sanitize_c to std.zig.SanitizeC.
* Add some missing Compilation.Config fields to the cache.
Closes#23216.
This is fairly straightforward; the actual compiler changes are limited
to the CLI, since `Compilation` already supports this combination.
A new `std.Build` API is introduced to allow representing this. By
passing the `emit_object` option to `std.Build.addTest`, you get a
`Step.Compile` which emits an object file; you can then use that as you
would any other object, such as either installing it for external use,
or linking it into another step.
A standalone test is added to cover the build system API. It builds a
test into an object, and links it into a final executable, which it then
runs.
Using this build system mechanism prevents the build system from
noticing that you're running a `zig test`, so the build runner and test
runner do not communicate over stdio. However, that's okay, because the
real-world use cases for this feature don't want to do that anyway!
Resolves: #23374
This change fixes false-positive cache hits for run steps that get run
with different sets of environment variables due the the environment map
being excluded from the cache hash.
The code did one useless thing and two wrong things:
- ref counting was basically a noop
- last_dir_fd was chosen from the wrong index and also under the wrong
condition
This caused regular crashes on macOS which are now gone.
This reverts commit dea72d15da4fba909dc3ccb2e9dc5286372ac023, reversing
changes made to ab381933c87bcc744058d25a876cfdc0d23fc674.
The changeset does not work as advertised and does not have sufficient
test coverage.
Reopens#22822
Inheriting allow-deprecation from parent modules doesn't make too much
sense, so instead make them default to disallow unless otherwise
specified. This allows build system to avoid redundant
`-fno-allow-deprecated` args.
This makes the generated CLIs smaller, and makes zig1.wasm update not
needed.
Also represented `is_root` differently (moved to field of graph).
This can also be extended to ELF later as it means roughly the same thing there.
This addresses the main issue in #21721 but as I don't have a macOS machine to
do further testing on, I can't confirm whether zig cc is able to pass the entire
cgo test suite after this commit. It can, however, cross-compile a basic program
that uses cgo to x86_64-macos-none which previously failed due to lack of -x
support. Unlike previously, the resulting symbol table does not contain local
symbols (such as C static functions).
I believe this satisfies the related donor bounty: https://ziglang.org/news/second-donor-bounty
Functions like isMinGW() and isGnuLibC() have a good reason to exist: They look
at multiple components of the target. But functions like isWasm(), isDarwin(),
isGnu(), etc only exist to save 4-8 characters. I don't think this is a good
enough reason to keep them, especially given that:
* It's not immediately obvious to a reader whether target.isDarwin() means the
same thing as target.os.tag.isDarwin() precisely because isMinGW() and similar
functions *do* look at multiple components.
* It's not clear where we would draw the line. The logical conclusion before
this commit would be to also wrap Arch.isX86(), Os.Tag.isSolarish(),
Abi.isOpenHarmony(), etc... this obviously quickly gets out of hand.
* It's nice to just have a single correct way of doing something.
* fix merge conflicts
* rename the declarations
* reword documentation
* extract FixedBufferAllocator to separate file
* take advantage of locals
* remove the assertion about max alignment in Allocator API, leaving it
Allocator implementation defined
* fix non-inline function call in start logic
The GeneralPurposeAllocator implementation is totally broken because it
uses global state but I didn't address that in this commit.
heap.zig: define new default page sizes
heap.zig: add min/max_page_size and their options
lib/std/c: add miscellaneous declarations
heap.zig: add pageSize() and its options
switch to new page sizes, especially in GPA/stdlib
mem.zig: remove page_size
Currently -freference-trace only works when running from a terminal.
This is annoying if you're running in another environment or if you redirect the output.
But -freference-trace also works fine without the color, so change how the build runner is interpreting this option.
`std.Build.Step.Run` makes the very reasonable assumption that
`error.InvalidExe` will be reported on `spawn` if it will happen.
However, this property does not currently hold on POSIX targets. This
is, through a slightly convoluted series of events, partially
responsible for the sporadic `BrokenPipe` errors we've been seeing more
and more in CI runs.
Making `spawn` wait for the child to exec in the POSIX path introduces
a block of up to 400us. So, instead of doing that, we add a new API for
this particular case: `waitForSpawn`. This function is a nop on Windows,
but on POSIX it blocks until the child successfully (or otherwise) calls
`execvpe`, and reports the error if necessary. `std.Build.Step.Run`
calls this function, so that it can get `error.InvalidExe` when it wants
it.
I'm not convinced that this API is optimal. However, I think this entire
API needs to be either heavily refactored or straight-up redesigned
(related: #22504), so I'm not too worried about hitting the perfect API:
I'd rather just fix this bug for now, and figure out the long-term goal
a bit later.
The previous logic here was trying to assume that custom test runners
never used `std.zig.Server` to communicate with the build runner;
however, it was flawed, because modifying the `test_runner` field on
`Step.Compile` would not update this flag. That might have been
intentional (allowing a way for the user to specify a custom test runner
which *does* use the compiler server protocol), but if so, it was a
flawed API, since it was too easy to update one field without updating
the other.
Instead, bundle these two pieces of state into a new type
`std.Build.Step.Compile.TestRunner`. When passing a custom test runner,
you are now *provided* to specify whether it is a "simple" runner, or
whether it uses the compiler server protocol.
This is a breaking change, but is unlikely to affect many people, since
custom test runners are seldom used in the wild.
This was done by regex substitution with `sed`. I then manually went
over the entire diff and fixed any incorrect changes.
This diff also changes a lot of `callconv(.C)` to `callconv(.c)`, since
my regex happened to also trigger here. I opted to leave these changes
in, since they *are* a correct migration, even if they're not the one I
was trying to do!
The goals of this branch are to:
* compile faster when using the wasm linker and backend
* enable saving compiler state by directly copying in-memory linker
state to disk.
* more efficient compiler memory utilization
* introduce integer type safety to wasm linker code
* generate better WebAssembly code
* fully participate in incremental compilation
* do as much work as possible outside of flush(), while continuing to do
linker garbage collection.
* avoid unnecessary heap allocations
* avoid unnecessary indirect function calls
In order to accomplish this goals, this removes the ZigObject
abstraction, as well as Symbol and Atom. These abstractions resulted
in overly generic code, doing unnecessary work, and needless
complications that simply go away by creating a better in-memory data
model and emitting more things lazily.
For example, this makes wasm codegen emit MIR which is then lowered to
wasm code during linking, with optimal function indexes etc, or
relocations are emitted if outputting an object. Previously, this would
always emit relocations, which are fully unnecessary when emitting an
executable, and required all function calls to use the maximum size LEB
encoding.
This branch introduces the concept of the "prelink" phase which occurs
after all object files have been parsed, but before any Zcu updates are
sent to the linker. This allows the linker to fully parse all objects
into a compact memory model, which is guaranteed to be complete when Zcu
code is generated.
This commit is not a complete implementation of all these goals; it is
not even passing semantic analysis.
This commit amends `std.Build.ExecutableOptions` etc to have a new
field, `root_module`, which allows artifacts to be created whose root
module is an existing `*Module` rather than a freshly constructed one.
This API can be far more versatile, allowing construction of complex
module graphs before creating any compile steps, and therefore also
allowing easy reuse of modules.
The fields which correspond to module options, such as
`root_source_file`, are all considered deprecated. They may not be
populated at the same time as the `root_module` field. In the next
release cycle, these deprecated fields will be removed, and the
`root_module` field made non-optional.
At the expense of a slight special case in the build runner, we can make
the handling of dependencies between modules a little shorter and much
easier to follow.
When module and step graphs are being constructed during the "configure"
phase, we do not set up step dependencies triggered by modules. Instead,
after the configure phase, the build runner traverses the whole
step/module graph, starting from the root top-level steps, and
configures all step dependencies implied by modules. The "make" phase
then proceeds as normal. Also, the old `Module.dependencyIterator` logic
is replaced by two separate iterables. `Module.getGraph` takes the root
module of a compilation, and returns all modules in its graph; while
`Step.Compile.getCompileDependencies` takes a `*Step.Compile` and
returns all `*Step.Compile` it depends on, recursively, possibly
excluding dynamic libraries. The old `Module.dependencyIterator`
combined these two functions into one unintuitive iterator; they are now
separated, which in particular helps readability at the usage sites
which only need one or the other.