mirror of
https://github.com/ziglang/zig.git
synced 2025-12-06 06:13:07 +00:00
The problem is that one may execute too many subprocesses concurrently that, together, exceed an RSS value that causes the OOM killer to kill something problematic such as the window manager. Or worse, nothing, and the system freezes. This is a real world problem. For example, when building LLVM, a simple `ninja install` will bring your system to its knees if you don't know that you should add `-DLLVM_PARALLEL_LINK_JOBS=1`.

In particular: compiling the zig std lib tests takes about 2G each, which at 16x at once (8 cores + hyperthreading) is using all 32GB of my RAM, causing the OOM killer to kill my window manager.

The idea here is that you can annotate steps that might use a high amount of system resources with an upper bound. So for example I could mark the std lib tests as having an upper bound peak RSS of 3 GiB.

Then the build system will do 2 things:

1. ulimit the child process, so that it will fail if it would exceed that memory limit.
2. Notice how much system RAM is available and avoid running too many concurrent jobs at once that would total more than that.

This implements (1) not with an operating system enforced limit, but by checking the maxrss after a child process exits. However it does implement (2) correctly.

The available memory used by the build system defaults to the total system memory, regardless of whether it is used by other processes at the time of spawning the build runner. This value can be overridden with the new --maxrss flag to `zig build`. This mechanism will ensure that the sum total of upper bound RSS memory of concurrent tasks will not exceed this value.

This system makes it so that project maintainers can annotate problematic subprocesses, avoiding bug reports from users, who can blissfully execute `zig build` without worrying about the project's internals. Nobody's computer crashes, and the build system uses as much parallelism as possible without risking OOM. Users do not need to unnecessarily resort to -j1 when the build system can figure this out for them.
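
To make the two behaviors concrete, here is a minimal sketch of the idea in Python. It is not the build runner's code, and every name in it (`Step`, `RssBudget`, `admit`, `exceeded_bound`) is hypothetical. It assumes each step carries a declared upper bound in bytes, that a step is only started while the sum of the bounds of running steps stays within the budget, and that the bound is checked against the observed peak RSS only after the child exits.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass
class Step:
    # Hypothetical stand-in for a build step.
    name: str
    max_rss: int = 0  # declared upper bound in bytes; 0 means "not annotated"


class RssBudget:
    """Tracks the sum of declared upper bounds of the steps currently running."""

    def __init__(self, total: int):
        self.total = total  # e.g. total system memory, or an override value
        self.claimed = 0

    def try_claim(self, step: Step) -> bool:
        # Admit the step only if the summed upper bounds stay within budget.
        # A step whose bound alone exceeds the budget is never admitted here;
        # how to report that case is left out of this sketch.
        if self.claimed + step.max_rss > self.total:
            return False
        self.claimed += step.max_rss
        return True

    def release(self, step: Step) -> None:
        # Called when the step finishes, freeing its share of the budget.
        self.claimed -= step.max_rss


def admit(pending, budget, max_jobs):
    """Start as many pending steps as fit; return (started, deferred)."""
    started, deferred = [], []
    for step in pending:
        if len(started) < max_jobs and budget.try_claim(step):
            started.append(step)
        else:
            deferred.append(step)
    return started, deferred


def exceeded_bound(step: Step, observed_peak_rss: int) -> bool:
    # Post-exit check: compare the child's reported peak RSS with the
    # annotation. Unannotated steps (max_rss == 0) are never flagged.
    return step.max_rss != 0 and observed_peak_rss > step.max_rss


if __name__ == "__main__":
    # The scenario from the text: 16 test steps, each annotated with 3 GiB,
    # on a machine with a 32 GiB budget.
    budget = RssBudget(32 * GIB)
    steps = [Step(f"std-test-{i}", 3 * GIB) for i in range(16)]
    started, deferred = admit(steps, budget, max_jobs=16)
    print(len(started), len(deferred))  # 10 6
```

With the numbers above, ten steps run at once (30 GiB of declared upper bound) and the remaining six wait until a running step releases its share, rather than all sixteen starting together.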