While still preserving the guarantee about async() being assigned a unit
of concurrency (or immediately running the task), this change:
* retains the error from calling getCpuCount()
* spawns all threads in detached mode, using WaitGroup to join them
(see the sketch after this list)
* treats all workers the same regardless of whether they are processing
concurrent or async tasks. One thread pool does all the work, while
respecting async and concurrent limits.
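A minimal sketch of that spawning pattern (illustrative, not the
Io.Threaded code itself): detached threads have no handle to join, so a
`std.Thread.WaitGroup` serves as the join point instead.

```zig
const std = @import("std");

fn worker(wg: *std.Thread.WaitGroup) void {
    // Signal completion no matter how the task body exits.
    defer wg.finish();
    // ... task body ...
}

pub fn main() !void {
    var wg: std.Thread.WaitGroup = .{};
    for (0..4) |_| {
        wg.start();
        const thread = try std.Thread.spawn(.{}, worker, .{&wg});
        // Detached: the WaitGroup, not Thread.join, is the join point.
        thread.detach();
    }
    wg.wait();
}
```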
This is a reimplementation of Io.Threaded that fixes the issues
highlighted in the recent Zulip discussion. It's poorly tested, but it
does successfully run to completion the litmus test example that I
offered in the discussion.
This implementation has the following key design decisions:
- `t.cpu_count` is used as the threadpool size.
- `t.concurrency_limit` is used as the maximum number of
"burst, one-shot" threads that can be spawned by `io.concurrent` past
`t.cpu_count`.
- `t.available_thread_count` is the number of threads in the pool that
are not currently busy with work (the bookkeeping happens in the worker
function).
- `t.one_shot_thread_count` is the number of active threads that were
spawned by `io.concurrent` past `t.cpu_count`.
In this implementation:
- `io.async` first tries to decrement `t.available_thread_count`. If
there are no threads available, it tries to spawn a new one if possible;
otherwise it runs the task immediately.
- `io.concurrent` first tries to use a thread in the pool, the same as
`io.async`, but on failure (no available threads and pool size limit
reached) it tries to spawn a new one-shot thread. One-shot threads run
a different main function that executes a single task, decrements the
number of active one-shot threads, and then exits. (A sketch of this
dispatch policy follows the list.)
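Below is an illustrative sketch of this dispatch policy. The counter
names mirror the description above, but this is an assumption-laden
sketch, not the actual Io.Threaded code; `Pool`, `pickDispatch`, and
`spawned_thread_count` are invented for illustration.

```zig
const std = @import("std");

const Pool = struct {
    cpu_count: u32,
    concurrency_limit: u32,
    available_thread_count: std.atomic.Value(u32) = std.atomic.Value(u32).init(0),
    spawned_thread_count: std.atomic.Value(u32) = std.atomic.Value(u32).init(0),
    one_shot_thread_count: std.atomic.Value(u32) = std.atomic.Value(u32).init(0),

    const Dispatch = enum { pooled, one_shot, inline_run };

    /// Decide how to run a task, per the rules above.
    fn pickDispatch(p: *Pool, is_concurrent: bool) Dispatch {
        // Claim an idle pool thread if one exists.
        if (tryDecrement(&p.available_thread_count)) return .pooled;
        // Otherwise grow the pool, up to cpu_count threads.
        if (tryIncrementBelow(&p.spawned_thread_count, p.cpu_count)) return .pooled;
        // io.concurrent may additionally spawn a one-shot thread,
        // bounded by concurrency_limit.
        if (is_concurrent and
            tryIncrementBelow(&p.one_shot_thread_count, p.concurrency_limit))
        {
            return .one_shot;
        }
        // io.async falls back to running the task on the caller's thread.
        return .inline_run;
    }

    fn tryDecrement(c: *std.atomic.Value(u32)) bool {
        var n = c.load(.acquire);
        while (n > 0) {
            n = c.cmpxchgWeak(n, n - 1, .acq_rel, .acquire) orelse return true;
        }
        return false;
    }

    fn tryIncrementBelow(c: *std.atomic.Value(u32), limit: u32) bool {
        var n = c.load(.acquire);
        while (n < limit) {
            n = c.cmpxchgWeak(n, n + 1, .acq_rel, .acquire) orelse return true;
        }
        return false;
    }
};

test "falls back to running inline when all limits are reached" {
    var p: Pool = .{ .cpu_count = 0, .concurrency_limit = 0 };
    try std.testing.expectEqual(Pool.Dispatch.inline_run, p.pickDispatch(true));
}
```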
A relevant future improvement is to have one-shot threads stay alive
for a few seconds (and potentially pick up a new task) to amortize
spawning costs.
I would like a chance to review this before it lands, please. Feel free
to submit the work again without changes and I will make review
comments.
In the meantime, these reverts avoid intermittent CI failures and keep
bad patterns out of the standard library, where other users might copy
them.
Revert "std.crypto: improve KT documentation, use key_length for B3 key length (#25807)"
This reverts commit 4b593a6c24797484e68a668818736b0f6a8d81a2.
Revert "crypto - threaded K12: separate context computation from thread spawning (#25793)"
This reverts commit ee4df4ad3edad160fb737a1935cd86bc2f9cfbbe.
Revert "crypto.kt128: when using incremental hashing, use SIMD when possible (#25783)"
This reverts commit bf9082518c32ce7d53d011777bf8d8056472cbf9.
Revert "Add std.crypto.hash.sha3.{KT128,KT256} - RFC 9861. (#25593)"
This reverts commit 95c76b1b4aa7302966281c6b9b7f6cadea3cf7a6.
This is relevant to PIEs, which are notably enabled by default on macOS.
The build system needs to see only virtual addresses, that is, those
which do not have the slide applied; but the fuzzer itself naturally
sees relocated addresses (i.e. with the slide applied). We just need to
subtract the slide when we communicate addresses to the build system.
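As a sketch of that translation (the dyld call is the real macOS C API,
but this wiring is illustrative rather than the fuzzer's actual code):

```zig
// `intptr_t _dyld_get_image_vmaddr_slide(uint32_t)` from <mach-o/dyld.h>.
extern "c" fn _dyld_get_image_vmaddr_slide(image_index: u32) isize;

/// Convert a relocated (runtime) address back into the virtual address
/// the build system expects by subtracting the main image's slide.
fn unslide(runtime_addr: usize) usize {
    // Image index 0 is the main executable.
    const slide: usize = @intCast(_dyld_get_image_vmaddr_slide(0));
    // e.g. report(unslide(@returnAddress()));
    return runtime_addr - slide;
}
```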
Like ELF, we now have `std.debug.MachOFile` for the host-independent
parts, and `std.debug.SelfInfo.MachO` for logic requiring the file to
correspond to the running program.
Little-endian is what `std.zig.Server` expects, but the old logic just
sent the raw bytes of the struct, so they went out in native endianness
(causing a crash on big-endian targets).
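A sketch of the general fix, assuming a simplified `Header` rather than
the exact `std.zig.Server` message type: serialize each field explicitly
as little-endian instead of copying the struct's native-endian bytes.

```zig
const std = @import("std");

const Header = extern struct {
    tag: u32,
    bytes_len: u32,
};

fn serializeHeader(buf: *[8]u8, h: Header) void {
    // Explicit endianness, instead of @memcpy of the struct's bytes.
    std.mem.writeInt(u32, buf[0..4], h.tag, .little);
    std.mem.writeInt(u32, buf[4..8], h.bytes_len, .little);
}

test serializeHeader {
    var buf: [8]u8 = undefined;
    serializeHeader(&buf, .{ .tag = 1, .bytes_len = 2 });
    // The same bytes come out on little- and big-endian hosts alike.
    try std.testing.expectEqualSlices(u8, &.{ 1, 0, 0, 0, 2, 0, 0, 0 }, &buf);
}
```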
- Affects the following functions:
+ `std.fs.Dir.readLinkW`
+ `std.os.windows.ReadLink`
+ `std.os.windows.ntToWin32Namespace`
+ `std.posix.readlinkW`
+ `std.posix.readlinkatW`
Each of these functions (except `ntToWin32Namespace`) took WTF-16 as input and would output WTF-8, which made optimal buffer re-use difficult at callsites and could force an unnecessary WTF-16 <-> WTF-8 conversion as an intermediate step.
The functions have been updated to output WTF-16, and to allow the path and the output to re-use the same buffer (i.e. in-place modification), which can reduce the stack usage at callsites. For example, all of `std.fs.Dir.readLink`/`readLinkZ`/`std.posix.readlink`/`readlinkZ`/`readlinkat`/`readlinkatZ` have had their stack usage reduced by one PathSpace struct (64 KiB) when targeting Windows.
The new `ntToWin32Namespace` takes an output buffer and returns a slice from that instead of returning a PathSpace, which is necessary to make the above possible.
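The in-place idea, as a conceptual sketch only (the names and the
trivial "resolution" below are invented; the real functions query the
actual link target): the output slice aliases the buffer that held the
input path, so one path-sized buffer suffices.

```zig
const std = @import("std");

fn resolveInPlace(buf: []u16, path_len: usize) []u16 {
    _ = path_len; // a real implementation would read the input path
    // Stand-in for resolution: overwrite the input with a fixed
    // "target" to show that output and input share the buffer.
    const target = std.unicode.utf8ToUtf16LeStringLiteral("target");
    @memcpy(buf[0..target.len], target);
    return buf[0..target.len];
}

test resolveInPlace {
    var buf: [32]u16 = undefined;
    const path = std.unicode.utf8ToUtf16LeStringLiteral("some/link");
    @memcpy(buf[0..path.len], path);
    // One buffer serves as both the input path and the output target.
    const out = resolveInPlace(&buf, path.len);
    try std.testing.expectEqualSlices(
        u16,
        std.unicode.utf8ToUtf16LeStringLiteral("target"),
        out,
    );
}
```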
The reasoning in the comment deleted by this commit no longer applies, since that same benefit can be obtained by using OpenFile with `.filter = .any`.
Also removes a stray debug.print
The build runner was previously forcing child processes to have their
stderr colorization match the build runner by setting `CLICOLOR_FORCE`
or `NO_COLOR`. This is a nice idea in some cases (for instance, a
simple `Run` step which we just expect to exit with code 0 and whose
stderr is not programmatically inspected), but a bad one in others,
such as when there is a check on stderr or stderr is captured, in which
case forcing color on the child could cause checks to fail.
Instead, this commit adds a field to `std.Build.Step.Run` which
specifies how the build runner assigns the `CLICOLOR_FORCE` and
`NO_COLOR` environment variables. The
default behavior is to set `CLICOLOR_FORCE` if the build runner's output
is colorized and the step's stderr is not captured, and to set
`NO_COLOR` otherwise. Alternatively, colors can be always enabled,
always disabled, always match the build runner, or the environment
variables can be left untouched so they can be manually controlled
through `env_map`.
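A hypothetical `build.zig` fragment showing the intent; the field name
`color` and the variant `.disabled` are illustrative stand-ins, not
confirmed API:

```zig
const std = @import("std");

pub fn build(b: *std.Build) void {
    const exe = b.addExecutable(.{
        .name = "demo",
        .root_module = b.createModule(.{
            .root_source_file = b.path("main.zig"),
            .target = b.graph.host,
        }),
    });
    const run = b.addRunArtifact(exe);
    // This step checks stderr, so forcing color on the child could make
    // the check fail; the runner should set NO_COLOR instead.
    run.expectStdErrEqual("expected output\n");
    run.color = .disabled; // hypothetical field name
    b.step("check", "Run with color disabled").dependOn(&run.step);
}
```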
Notably, this fixes a failure when running `zig build test-cli` in a
TTY (or with colors explicitly enabled). GitHub CI hadn't caught this
because it does not request color, but Codeberg CI now does, and we were
seeing a failure in the `zig init` test because the actual output had
color escape codes in it due to 6d280dc.