The last commit passed CI, so this final bump is just to allow for
deviation caused by different loads on the runner machines. With this
change, I don't expect any current unit test to ever time out, even when
CI is under extreme load.
Unfortunately, Windows' scheduler means that test timeouts get hit very
easily, because it seems the system can refuse to schedule a waiting
process for *upwards of 10 minutes*. We should look for a better
solution for this problem going forwards, but for now, just give Windows
a very high test timeout.
The 30 minute timeout set here is around the duration of a *full CI run*
on Windows, so it should be impossible to hit normally, but it means
that if a test gets stuck we'll at least get told (eventually).
The Wycheproof test suite is extensive, but takes a long time to
complete on CI.
Keep only the most relevant ones and take it as an opportunity to describe
what they are.
The remaining ones are still available for manual testing when required.
This test called `yield` 80,000 times, which is nothing on a system with
little load, but murder on a CI system. macOS' scheduler in particular
doesn't seem to deal with this very well. The `yield` calls also weren't
even necessarily doing what they were meant to: if the optimizer could
figure out that it doesn't clobber some memory, then it could happily
reorder around the `yield`s anyway!
The test has been simplified and made to work better, and the number of
yields have been reduced. The number of overall iterations has also been
reduced, because with the `yield` calls making races very likely, we
don't really need to run too many iterations to be confident that the
implementation is race-free.
For instance, when running a Zig test using the self-hosted aarch64
backend, this logic was previously expecting `std.zig.Server` to be
used, but the default test runner intentionally does not do this because
the backend is too immature to handle it. On 'master', this is causing
sporadic failures; on this branch, they became consistent failures.
The new `--error-style` option decides how build failures are printed.
The default mode "verbose" prints all context including the step graph
fragment and the failed command (if any). The alternative mode "minimal"
prints only the failed step itself, and does not print the failed
command. There are also "verbose_clear" and "minimal_clear" modes, which
have the distinction that the output is cleared (through ANSI escape
codes) between updates, preventing different updates from being confused
in the output. If `--error-style` is not specified, the environment
variable `ZIG_BUILD_ERROR_STYLE` is checked before falling back to the
default of "verbose"; this means the value can effectively be chosen
system-wide since it is generally a personal preference.
Also introduced is a `--multiline-errors` option which decides how to
print errors which span multiple lines. By default, non-initial lines
are indented to align with the first. Alternatively, a leading newline
can be printed to align everyting on the first column, or no special
treatment can be applied, resulting in misaligned output. Again, there
is an environment variable (`ZIG_BUILD_MULTILINE_ERRORS`) to specify a
preferred default if the option is not explicitly provided.
Resolves: #23472
This is a major refactor to `Step.Run` which adds new functionality,
primarily to the execution of Zig tests.
* All tests are run, even if a test crashes. This happens through the
same mechanism as timeouts where the test processes is repeatedly
respawned as needed.
* The build status output is more precise. For each unit test, it
differentiates pass, skip, fail, crash, and timeout. Memory leaks are
reported separately, as they do not indicate a test's "status", but
are rather an additional property (a test with leaks may still pass!).
* The number of memory leaks is tracked and reported, both per-test and
for a whole `Run` step.
* Reporting is made clearer when a step is failed solely due to error
logs (`std.log.err`) where every unit test passed.
For now, there is a flag to `zig build` called `--test-timeout-ms` which
accepts a value in milliseconds. If the execution time of any individual
unit test exceeds that number of milliseconds, the test is terminated
and marked as timed out.
In the future, we may want to increase the granularity of this feature
by allowing timeouts to be specified per-step or even per-test. However,
a global option is actually very useful. In particular, it can be used
in CI scripts to ensure that no individual unit test exceeds some
reasonable limit (e.g. 60 seconds) without having to assign limits to
every individual test step in the build script.
Also, individual unit test durations are now shown in the time report
web interface -- this was fairly trivial to add since we're timing tests
(to check for timeouts) anyway.
This commit makes progress on #19821, but does not close it, because
that proposal includes a more sophisticated mechanism for setting
timeouts.
Co-Authored-By: David Rubin <david@vortan.dev>
This is very likely full of wrong stuff. It's effectively just a copy of the
x86_64 file - needed because the former stopped using usize/isize. To be clear,
this is no more broken than the old situation was; this just makes the
brokenness explicit.
This is very likely full of wrong stuff. It's effectively just a copy of the
mips64 file - needed because the former stopped using usize/isize. To be clear,
this is no more broken than the old situation was; this just makes the
brokenness explicit.
This reverts commit fcfdf99122b17a928c144c3d70418b35e6b1620e.
This added too much load to the x86_64-linux machines, resulting in timeouts
pretty much across the board.
This is a rewrite of the BLAKE3 implementation, with vectorization.
On Apple Silicon, the new implementation is about twice as fast as the previous one.
With AVX2, it is more than 4 times faster.
With AVX512, it is more than 7.5x faster than the previous implementation (from 678 MB/s to 5086 MB/s).
The return address points to the call instruction on SPARC, so the actual return
address is 8 bytes after. This means that we shouldn't do the return address
adjustment that we normally do.
The FP would point to the register save area for the previous frame, while the
SP points to the register save area for the current frame. So use the latter.