mirror of
https://github.com/ziglang/zig.git
synced 2025-12-06 14:23:09 +00:00
Aside from adding comments to document the logic in `Cache.Manifest.hit` better, this commit fixes two serious bugs. The first, spotted by Andrew, is that when upgrading from a shared to an exclusive lock on the manifest file, we do not seek it back to the start. This is a simple fix. The second is more subtle, and has to do with the computation of file digests. Broadly speaking, the goal of the main loop in `hit` is to iterate the files listed in the manifest file, and check if they've changed, based on stat and a file hash. While doing this, the `bin_digest` field of `std.Build.Cache.File`, which is initially `undefined`, is populated for all files, either straight from the manifest (if the stat matches) or recomputed from the file on-disk. This file digest is then used to update `man.hash.hasher`, which is building the final hash used as, for instance, the output directory name when the compiler emits into the cache directory. When `hit` returns a cache miss, it is expected that `man.hash.hasher` includes the digests of all "initial files"; that is, those which have been already added with e.g. `addFilePath`, but not those which will later be added with `addFilePost` (even though the manifest file has told us about some such files). Previously, `hit` was using the `unhit` function to do this in a few cases. However, this is incorrect, because `hit` assumes that all files already have their `bin_digest` field populated; this function is only valid to call *after* `hit` returns. Instead, we need to actually compute the hashes which haven't yet been populated. Even if this logic has been working, there was still a bug here, because we called `unhit` when upgrading from a shared to an exclusive lock, writing the (potentially `undefined`) file digests, but the loop itself writes the file digests *again*! All in all, the hashing logic here was actually incredibly broken. I've taken the opportunity to restructure this section of the code into what I think is a more readable format. A new function, `hitWithCurrentLock`, uses the open manifest file to try and find a cache hit. It returns a tagged union which, in the miss case, tells the caller (`hit`) how many files already have their hash populated. This avoids redundant work recomputing the same hash multiple times in situations where the lock needs upgrading. This also eliminates the outer loop from `hit`, which was a little confusing because it iterated no more than twice! The bugs fixed here could manifest in several different ways depending on how contended file locks were satisfied. Most notably, on a cache miss, the Zig compiler might have written the compilation output to the incorrect directory (because it incorrectly constructed a hash using `undefined` or repeated file digests), resulting in all future hits on this manifest causing `error.FileNotFound`. This is #23110. I have been able to reproduce #23110 on `master`, and have not been able to after this commit, so I am relatively sure this commit resolves that issue. Resolves: #23110