Closes #14298
This commit adds support for fetching dependencies over git+http(s)
using a minimal implementation of the Git protocols and formats relevant
to fetching repository data.
Git URLs can be specified in `build.zig.zon` as follows:
```zig
.xml = .{
    .url = "git+https://github.com/ianprime0509/zig-xml#7380d59d50f1cd8460fd748b5f6f179306679e2f",
    .hash = "122085c1e4045fa9cb69632ff771c56acdb6760f34ca5177e80f70b0b92cd80da3e9",
},
```
The fragment part of the URL may specify a commit ID (SHA1 hash), branch
name, or tag. It is an error to omit the fragment: if this happens, the
compiler will prompt the user to add it, using the commit ID of the HEAD
commit of the repository (that is, the latest commit of the default
branch):
```
Fetch Packages... xml... /var/home/ian/src/zig-gobject/build.zig.zon:6:20: error: url field is missing an explicit ref
            .url = "git+https://github.com/ianprime0509/zig-xml",
                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
note: try .url = "git+https://github.com/ianprime0509/zig-xml#dfdc044f3271641c7d428dc8ec8cd46423d8b8b6",
```
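As a rough illustration of the URL shape described above, the following sketch splits a dependency URL at the `#` into the repository URL and the optional ref. `SplitUrl` and `splitRef` are hypothetical names invented for this example; this is not the code the compiler uses.

```zig
const std = @import("std");

/// Hypothetical result type for this sketch only.
const SplitUrl = struct {
    repo: []const u8,
    ref: ?[]const u8,
};

/// Splits at the first '#'. A missing fragment yields `ref == null`,
/// which is the case the compiler reports as an error.
fn splitRef(url: []const u8) SplitUrl {
    if (std.mem.indexOfScalar(u8, url, '#')) |i| {
        return .{ .repo = url[0..i], .ref = url[i + 1 ..] };
    }
    return .{ .repo = url, .ref = null };
}

test "fragment carries the ref" {
    const with_ref = splitRef("git+https://github.com/ianprime0509/zig-xml#7380d59d50f1cd8460fd748b5f6f179306679e2f");
    try std.testing.expectEqualStrings("git+https://github.com/ianprime0509/zig-xml", with_ref.repo);
    try std.testing.expectEqualStrings("7380d59d50f1cd8460fd748b5f6f179306679e2f", with_ref.ref.?);

    const without_ref = splitRef("git+https://github.com/ianprime0509/zig-xml");
    try std.testing.expect(without_ref.ref == null);
}
```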
This implementation currently supports only version 2 of Git's wire
protocol (documented in
[protocol-v2](https://git-scm.com/docs/protocol-v2)), which was first
introduced in Git 2.19 (2018) and made the default in 2.26 (2020).
The wire protocol behaves similarly when used over other transports,
such as SSH and the "Git protocol" (git:// URLs), so it should be
reasonably straightforward to support fetching dependencies from such
URLs if the necessary transports are implemented (e.g. #14295).
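For readers unfamiliar with the wire protocol, here is a minimal sketch of its basic framing unit, the pkt-line: four hexadecimal digits giving the total length (prefix included), followed by the payload. `writePktLine` is a hypothetical helper written for this example and is not taken from the implementation in this commit; it also skips the maximum-length check a real encoder would need.

```zig
const std = @import("std");

/// Encodes `payload` as one pkt-line into `buf` and returns the written slice.
/// Hypothetical helper for illustration only.
fn writePktLine(buf: []u8, payload: []const u8) ![]u8 {
    // The length prefix counts its own four bytes as well as the payload.
    return std.fmt.bufPrint(buf, "{x:0>4}{s}", .{ payload.len + 4, payload });
}

test "pkt-line framing" {
    var buf: [64]u8 = undefined;
    // A protocol v2 request starts with a command line such as this one.
    const line = try writePktLine(&buf, "command=ls-refs\n");
    try std.testing.expectEqualStrings("0014command=ls-refs\n", line);
}
```

Because the framing is independent of the transport, the same encoding would apply over SSH or git:// once those transports exist.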
(Unmanaged)ArrayList.insert has the same inefficiency as the old insertSlice. With the new addManyAt, the solution is trivial.
Also improves the test "growing memory preserves contents". In the previous implementation, if any changes were made to the ArrayList memory growth policy (function growMemory), the list could end up with enough capacity to not trigger a memory growth, defeating the purpose of the test. The new implementation more robustly triggers a memory growth.
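The idea of expressing a single-element insert in terms of "open a gap, then fill it" can be sketched as follows. `TinyList` is a hypothetical fixed-capacity type written only for this example; it is not `std.ArrayList`, it leaves out allocation entirely, and its method bodies are assumptions rather than the standard library's code.

```zig
const std = @import("std");

/// Hypothetical fixed-capacity list, for illustration only.
const TinyList = struct {
    buf: [8]u8 = undefined,
    len: usize = 0,

    /// Opens a gap of `count` uninitialized elements at `index` and returns it.
    fn addManyAtAssumeCapacity(self: *TinyList, index: usize, count: usize) []u8 {
        std.debug.assert(index <= self.len);
        std.debug.assert(self.len + count <= self.buf.len);
        const old_len = self.len;
        self.len += count;
        // Source and destination overlap, so shift the tail from the back.
        std.mem.copyBackwards(u8, self.buf[index + count .. self.len], self.buf[index..old_len]);
        return self.buf[index..][0..count];
    }

    /// With the gap primitive available, insert is a one-liner.
    fn insertAssumeCapacity(self: *TinyList, index: usize, item: u8) void {
        self.addManyAtAssumeCapacity(index, 1)[0] = item;
    }
};

test "insert through the gap" {
    var list = TinyList{};
    list.insertAssumeCapacity(0, 'a');
    list.insertAssumeCapacity(1, 'c');
    list.insertAssumeCapacity(1, 'b');
    try std.testing.expectEqualStrings("abc", list.buf[0..list.len]);
}
```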
These are an order of magnitude quicker than the previous
implementations:
A relative comparison of each, measured by scanning a 1 GiB file:
Reading 1G (1.0000000009313226GiB)
std.mem.sliceTo: 281.232ms
vectorized.sliceTo: 24.769ms
strlen: 24.291ms
std.indexOfScalar: 229.016ms
vectorized.indexOfScalar: 24.685ms
memchr: 24.958ms
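To give a feel for where the speedup comes from, here is a minimal sketch of a vectorized scalar search: compare a whole block of bytes against the needle at once, then fall back to a scalar loop for the tail. The function name and the fixed 16-byte block width are assumptions made for this example (real code would choose a width per target), and the sketch assumes a Zig version with single-argument `@splat`.

```zig
const std = @import("std");

/// Hypothetical sketch; not the implementation measured above.
fn indexOfScalarVectorized(haystack: []const u8, needle: u8) ?usize {
    const lanes = 16; // assumed block width
    const Block = @Vector(lanes, u8);
    const needles: Block = @splat(needle);

    var i: usize = 0;
    while (i + lanes <= haystack.len) : (i += lanes) {
        const block: Block = haystack[i..][0..lanes].*;
        const matches = block == needles; // one comparison covers 16 bytes
        if (@reduce(.Or, matches)) {
            return i + std.simd.firstTrue(matches).?;
        }
    }
    // Scalar loop for the bytes that do not fill a whole block.
    while (i < haystack.len) : (i += 1) {
        if (haystack[i] == needle) return i;
    }
    return null;
}

test "finds matches in blocks and in the tail" {
    try std.testing.expect(indexOfScalarVectorized("hello, vectorized world", 'v').? == 7);
    try std.testing.expect(indexOfScalarVectorized("a" ** 40 ++ "b", 'b').? == 40);
    try std.testing.expect(indexOfScalarVectorized("abc", 'z') == null);
}
```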
* Move `computeBetterCapacity` to the bottom so that `pub` stuff shows
up first.
* Rename `computeBetterCapacity` to `growCapacity`. Every function
implicitly computes something; that word is always redundant in a
function name. "better" is vague. Better in what way? Instead we
describe what is actually happening. "grow".
* Improve doc comments to be very explicit about when element pointers
are invalidated or not.
* Rename `addManyAtIndex` to `addManyAt`. The parameter is named
`index`; that is enough.
* Extract some duplicated code into `addManyAtAssumeCapacity` and make
it `pub`.
* Since I audited every line of code for correctness, I changed the
style to my personal preference.
* Avoid a redundant `@memset` to `undefined` - memory allocation does
that already.
* Fix the comment that gave the wrong reason for not calling
`ensureTotalCapacity`.
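For context, a growth function of this kind might look like the sketch below: starting from the current capacity, keep growing by roughly half plus a small constant until the requested minimum is reached. The exact increment (`new / 2 + 8`) is an assumption chosen for illustration and may not match the policy in the standard library.

```zig
const std = @import("std");

/// Sketch of a capacity growth policy; the constants are assumptions.
fn growCapacity(current: usize, minimum: usize) usize {
    var new = current;
    while (true) {
        // Saturating add, so a huge capacity cannot overflow.
        new +|= new / 2 + 8;
        if (new >= minimum) return new;
    }
}

test "grown capacity always covers the request" {
    try std.testing.expect(growCapacity(0, 1) >= 1);
    try std.testing.expect(growCapacity(8, 100) >= 100);
    // Growth is strict even when the minimum is already satisfied.
    try std.testing.expect(growCapacity(16, 4) > 16);
}
```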
Includes a more robust implementation of replaceRange: if state changes
in the managed part of the code before an error is returned, the
ArrayListUnmanaged is updated to reflect it.
Co-authored-by: Andrew Kelley <andrew@ziglang.org>
This commit makes the following changes:
* Disallow file:/// URIs
* Allow only relative paths in the .path field of build.zig.zon
* Remove now-unneeded shlwapi dependency
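The first two rules can be sketched as a pair of checks. `checkUrl`, `checkPath`, and both error names are hypothetical and exist only in this example; the actual validation and diagnostics in the compiler may differ.

```zig
const std = @import("std");

/// Hypothetical check for the `.url` field: reject file:// URIs.
fn checkUrl(url: []const u8) !void {
    if (std.mem.startsWith(u8, url, "file://")) return error.FileUriNotAllowed;
}

/// Hypothetical check for the `.path` field: accept only relative paths.
fn checkPath(path: []const u8) !void {
    if (std.fs.path.isAbsolute(path)) return error.AbsolutePathNotAllowed;
}

test "only relative paths and non-file URLs pass" {
    try checkUrl("https://example.invalid/pkg.tar.gz");
    try std.testing.expectError(error.FileUriNotAllowed, checkUrl("file:///home/user/pkg"));

    try checkPath("../sibling-package");
    try std.testing.expectError(error.AbsolutePathNotAllowed, checkPath("/opt/pkg"));
}
```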
This commit introduces `--debug-incremental` so that we can start
playing around with incremental compilation while it is still being
developed, and before it is enabled by default.
Currently it saves InternPool data, and has TODO comments for the
remaining things. Deserialization is not implemented yet; it will
require some post-processing, such as building a string map from the
null-terminated string table bytes.
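The kind of post-processing mentioned here could look roughly like this sketch, which walks a buffer of null-terminated strings and maps each string to its byte offset. `buildStringMap`, the use of `std.StringHashMap`, and the `u32` offsets are all assumptions made for illustration; InternPool's real data structures are not shown in this commit message.

```zig
const std = @import("std");

/// Hypothetical helper: maps every null-terminated string in `bytes`
/// to the offset at which it starts. Keys borrow from `bytes`.
fn buildStringMap(allocator: std.mem.Allocator, bytes: []const u8) !std.StringHashMap(u32) {
    var map = std.StringHashMap(u32).init(allocator);
    errdefer map.deinit();

    var offset: usize = 0;
    while (offset < bytes.len) {
        const s = std.mem.sliceTo(bytes[offset..], 0);
        try map.put(s, @intCast(offset));
        offset += s.len + 1; // skip the string and its terminator
    }
    return map;
}

test "offsets are recovered from the string table" {
    var map = try buildStringMap(std.testing.allocator, "foo\x00bar\x00");
    defer map.deinit();
    try std.testing.expect(map.count() == 2);
    try std.testing.expect(map.get("foo").? == 0);
    try std.testing.expect(map.get("bar").? == 4);
}
```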
The saved compiler state is stored in a file called <root-name>.zcs
alongside <root-name>.o, <root-name>.pdb, <root-name>.exe, etc. When
using the Zig build system, these files all live in a zig-cache
directory.
For the self-hosted compiler, here is one data point on the performance
penalty of saving this data:
```
Benchmark 1 (3 runs): zig build-exe ...
  measurement         mean ± σ          min … max     outliers   delta
  wall_time          51.1s ± 354ms    50.7s … 51.4s   0 ( 0%)    0%
  peak_rss          3.91GB ± 354KB   3.91GB … 3.91GB  0 ( 0%)    0%
  cpu_cycles          212G ± 3.17G     210G … 216G    0 ( 0%)    0%
  instructions        274G ± 57.5M     274G … 275G    0 ( 0%)    0%
  cache_references   13.1G ± 97.6M    13.0G … 13.2G   0 ( 0%)    0%
  cache_misses       1.12G ± 24.6M    1.10G … 1.15G   0 ( 0%)    0%
  branch_misses      1.53G ± 1.46M    1.53G … 1.53G   0 ( 0%)    0%
Benchmark 2 (3 runs): zig build-exe ... --debug-incremental
  measurement         mean ± σ          min … max     outliers   delta
  wall_time          51.8s ± 271ms    51.5s … 52.1s   0 ( 0%)    + 1.3% ± 1.4%
  peak_rss          3.91GB ± 317KB   3.91GB … 3.91GB  0 ( 0%)    - 0.0% ± 0.0%
  cpu_cycles          213G ± 398M      212G … 213G    0 ( 0%)    + 0.3% ± 2.4%
  instructions        275G ± 79.1M     275G … 275G    0 ( 0%)    + 0.1% ± 0.1%
  cache_references   13.1G ± 26.9M    13.0G … 13.1G   0 ( 0%)    - 0.1% ± 1.2%
  cache_misses       1.12G ± 5.66M    1.11G … 1.12G   0 ( 0%)    - 0.6% ± 3.6%
  branch_misses      1.53G ± 1.75M    1.53G … 1.54G   0 ( 0%)    + 0.2% ± 0.2%
```
At the end of each compilation with `--debug-incremental`, we end up
with a 43 MiB `zig.zcs` file that contains all of the InternPool data
serialized.
Of course, it will necessarily be more expensive to save the state than
to not save the state. However, this data point shows just how cheap the
save state operation is, with all of the groundwork laid for using a
serialization-friendly in-memory data layout.