7983 Commits

Author SHA1 Message Date
Ryan Liptak
95f4c1532a Treap: do not set key to undefined in remove to allow re-use of removed nodes 2023-10-03 01:21:51 -07:00
Ryan Liptak
cf3572a66b GeneralPurposeAllocator: Considerably improve worst case performance
Before this commit, GeneralPurposeAllocator could run into incredibly degraded performance in scenarios where the bucket count for a particular size class grew to be large. For example, if exactly `slot_count` allocations of a single size class were performed and then all of them were freed except one, then the bucket for those allocations would have to be kept around indefinitely. If that pattern of allocation were done over and over, then the bucket list for that size class could grow incredibly large.

This allocation pattern has been seen in the wild: https://github.com/Vexu/arocc/issues/508#issuecomment-1738275688

In that case, the length of the bucket list for the `128` size class would grow to tens of thousands of buckets and cause Debug runtime to balloon to ~8 minutes whereas with the c_allocator the Debug runtime would be ~3 seconds.

To address this, there are three different changes happening here:

1. std.Treap is used instead of a doubly linked list for the lists of buckets. This takes the time complexity of searchBucket [used in resize and free] from O(n) to O(log n), but increases the time complexity of insert from O(1) to O(log n) [before, all new buckets would get added to the head of the list]. Note: Any data structure with O(log n) or better search/insert/delete would also work for this use-case.
2. If the 'current' bucket for a size class is full, the list of buckets is never traversed and instead a new bucket is allocated. Previously, traversing the bucket list could only find a non-full bucket in specific circumstances, and only because of a separate optimization that is no longer needed (before, after any resize/free, the affected bucket would be moved to the head of the bucket list to allow searchBucket to perform better on average). Now, the current_bucket for each size class only changes when either (1) the current bucket is emptied/freed, or (2) a new bucket is allocated (due to the current bucket being full or null). Because each bucket's alloc_cursor only moves forward (i.e. slots within a bucket are never re-used), we can therefore always know that any bucket besides the current_bucket will be full, so traversing the list in the hopes of finding an existing non-full bucket is entirely pointless.
3. Size + alignment information for small allocations has been moved into the Bucket data instead of keeping it in a separate HashMap. This offers an improvement over the HashMap since whenever we need to get/modify the length/alignment of an allocation it's extremely likely we will already have calculated any bucket-related information necessary to get the data.
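
A minimal sketch of changes 1 and 2, written in Python rather than Zig and for illustration only. `SizeClass`, `Bucket`, `SLOT_COUNT`, `SLOT_SIZE` and the integer "addresses" are hypothetical stand-ins. A sorted list searched with `bisect` stands in for std.Treap, so only the O(log n) lookup is modelled (the list insert itself is O(n)). None of this mirrors the actual GeneralPurposeAllocator code.

```python
import bisect

SLOT_COUNT = 4    # hypothetical number of slots per bucket
SLOT_SIZE = 128   # hypothetical size class


class Bucket:
    def __init__(self, base):
        self.base = base        # fake start address of the bucket
        self.alloc_cursor = 0   # only ever moves forward; slots are not re-used
        self.used = 0           # number of live slots


class SizeClass:
    def __init__(self):
        self.bases = []         # bucket base addresses, kept sorted
        self.buckets = {}       # base address -> Bucket
        self.current = None
        self.next_base = 0x1000

    def _new_bucket(self):
        bucket = Bucket(self.next_base)
        self.next_base += SLOT_COUNT * SLOT_SIZE
        bisect.insort(self.bases, bucket.base)
        self.buckets[bucket.base] = bucket
        return bucket

    def alloc(self):
        # Change 2: never walk the bucket collection. A full (or missing)
        # current bucket always means a fresh bucket.
        if self.current is None or self.current.alloc_cursor == SLOT_COUNT:
            self.current = self._new_bucket()
        bucket = self.current
        addr = bucket.base + bucket.alloc_cursor * SLOT_SIZE
        bucket.alloc_cursor += 1
        bucket.used += 1
        return addr

    def search_bucket(self, addr):
        # Change 1: ordered lookup by address instead of a linear list walk.
        i = bisect.bisect_right(self.bases, addr) - 1
        if i < 0:
            return None
        bucket = self.buckets[self.bases[i]]
        if addr < bucket.base + SLOT_COUNT * SLOT_SIZE:
            return bucket
        return None

    def free(self, addr):
        bucket = self.search_bucket(addr)
        assert bucket is not None, "address does not belong to this size class"
        bucket.used -= 1
        if bucket.used == 0:
            # An emptied bucket is released; the current bucket is reset.
            del self.bases[bisect.bisect_left(self.bases, bucket.base)]
            del self.buckets[bucket.base]
            if bucket is self.current:
                self.current = None
```

Allocating `SLOT_COUNT` slots and freeing all but one leaves that bucket alive, which is the degenerate pattern described above; with an ordered lookup, the cost of `search_bucket` grows only logarithmically with the number of such leftover buckets.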

The first change is the most relevant and accounts for most of the benefit here. Also note that the overall functionality of GeneralPurposeAllocator is unchanged.

In the degraded `arocc` case, these changes bring Debug performance from ~8 minutes to ~20 seconds.

Benchmark 1: test-master.bat
  Time (mean ± σ):     481.263 s ±  5.440 s    [User: 479.159 s, System: 1.937 s]
  Range (min … max):   477.416 s … 485.109 s    2 runs

Benchmark 2: test-optim-treap.bat
  Time (mean ± σ):     19.639 s ±  0.037 s    [User: 18.183 s, System: 1.452 s]
  Range (min … max):   19.613 s … 19.665 s    2 runs

Summary
  'test-optim-treap.bat' ran
   24.51 ± 0.28 times faster than 'test-master.bat'

Note: Much of the time taken on Windows in this particular case is related to gathering stack traces. With `.stack_trace_frames = 0` the runtime goes down to 6.7 seconds, which is a little more than 2.5x slower compared to when the c_allocator is used.

These changes may or may not introduce a slight performance regression in the average case:

Here are the standard library tests on Windows in Debug mode:

Benchmark 1 (10 runs): std-tests-master.exe
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          16.0s  ± 30.8ms    15.9s  … 16.1s           1 (10%)        0%
  peak_rss           42.8MB ± 8.24KB    42.8MB … 42.8MB          0 ( 0%)        0%
Benchmark 2 (10 runs): std-tests-optim-treap.exe
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          16.2s  ± 37.6ms    16.1s  … 16.3s           0 ( 0%)        💩+  1.3% ±  0.2%
  peak_rss           42.8MB ± 5.18KB    42.8MB … 42.8MB          0 ( 0%)          +  0.1% ±  0.0%

And on Linux:

Benchmark 1: ./test-master
  Time (mean ± σ):     16.091 s ±  0.088 s    [User: 15.856 s, System: 0.453 s]
  Range (min … max):   15.870 s … 16.166 s    10 runs
 
Benchmark 2: ./test-optim-treap
  Time (mean ± σ):     16.028 s ±  0.325 s    [User: 15.755 s, System: 0.492 s]
  Range (min … max):   15.735 s … 16.709 s    10 runs
 
Summary
  './test-optim-treap' ran
    1.00 ± 0.02 times faster than './test-master'
2023-10-03 01:21:51 -07:00
xdBronch
c9c3ee704c correctly detect apple a15 and a16 chips 2023-10-03 00:36:59 -07:00
Ryan Liptak
da7ecfb2de Treap: Add InorderIterator 2023-10-02 21:11:14 -07:00
Andrew Kelley
21181181bf zig fetch: enhanced error reporting
* Package: use std.tar diagnostics to give detailed error messages
* std.tar: add diagnostic for unsupported file type
2023-10-02 17:02:25 -07:00
Andrew Kelley
ef9966c985 introduce the 'zig fetch' command + symlink support
zig fetch [options] <url>
zig fetch [options] <path>

Fetches a package which is found at <url> or <path> into the global
cache directory, printing the package hash to stdout.

Closes #16972
Related to #14280

Additionally, this commit:

* Adds uncompressed .tar support to package fetching
* Introduces symlink support to package fetching
2023-10-02 17:02:25 -07:00
Andrew Kelley
309c53295f std.fs: give readLink an explicit error set 2023-10-02 17:02:24 -07:00
Andrew Kelley
a5144d19b7 std.tar: support symlinks
closes #16678
2023-10-02 17:02:24 -07:00
Carl Åstholm
412d863ba5 std.Build: expose -idirafter to the build system 2023-10-02 16:22:07 -07:00
Ryan Zezeski
dd026588d0 illumos: fix dynamic linker path 2023-10-02 16:37:37 -06:00
Ryan Zezeski
42ad3e265c illumos does not have versions
The 5.11 in uname is not something that is ever updated. There is no
versioning of the illumos system in general. Illumos prefers to rely
on feature detection.

I can't say what Solaris does these days as I do not work at Oracle;
so I left it alone.
2023-10-02 16:23:17 -06:00
Stephen Gregoratto
285970982a Add illumos OS tag
- Adds `illumos` to the `Target.Os.Tag` enum. A new function,
  `isSolarish` has been added that returns true if the tag is either
  Solaris or Illumos. This matches the naming convention found in Rust's
  `libc` crate[1].
- Add the tag wherever `.solaris` is being checked against.
- Check for the C pre-processor macro `__illumos__` in CMake to set the
  proper target tuple. Illumos distros patch their compilers to have
  this in the "built-in" set (verified with `echo | cc -dM -E -`).

  Alternatively you could check the output of `uname -o`.
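
As a rough illustration of the predicate described in the first bullet, here is a sketch in Python rather than Zig. `OsTag` and `is_solarish` are hypothetical names that only mimic the shape of `Target.Os.Tag` and `isSolarish`; the real enum has many more members.

```python
from enum import Enum, auto


class OsTag(Enum):
    # Hypothetical, heavily trimmed stand-in for Target.Os.Tag.
    linux = auto()
    solaris = auto()
    illumos = auto()

    def is_solarish(self) -> bool:
        # True for both members of the Solaris family.
        return self in (OsTag.solaris, OsTag.illumos)
```

Call sites that previously compared against `.solaris` alone would use the predicate so that both tags take the same code path.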

Right now, both Solaris and Illumos import from `c/solaris.zig`. In the
future it may be worth putting the shared ABI bits in a base file, and
mixing that in with specific `c/solaris.zig`/`c/illumos.zig` files.

[1]: 6e02a329a2/src/unix/solarish
2023-10-02 15:31:49 -06:00
Veikka Tuominen
63bd2bff12 Sema: add @errorCast which works for both error sets and error unions
Closes #17343
2023-10-01 17:00:01 +03:00
Jay Petacat
d8bfbbbf25 std.mem.zeroes: Zero out entire extern union, including padding
Fixes #17258
2023-10-01 02:39:05 -07:00
Andrew Kelley
376242e586
Merge pull request #17161 from tiehuis/vectorize-index-of-scalar
std.mem: add vectorized indexOfScalarPos and indexOfSentinel
2023-10-01 00:07:57 -07:00
Lucas Santos
303181901b Improve (Unmanaged)ArrayList.insert
(Unmanaged)ArrayList.insert has the same inefficiency as the old insertSlice. With the new addManyAt, the solution is trivial.
Also improves the test "growing memory preserves contents". In the previous implementation, if any changes were made to the ArrayList memory growth policy (function growMemory), the list could end up with enough capacity to not trigger a memory growth, defeating the purpose of the test. The new implementation more robustly triggers a memory growth.
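
A rough Python model of the idea, under the assumption that the inefficiency is the following: when an insert forces the buffer to grow, growing copies every element and a second pass then shifts the tail again. `insert_two_pass`, `insert_around_gap` and the move counters are hypothetical and exist only for illustration; this is not the std.ArrayList code.

```python
def insert_two_pass(items, index, value, new_capacity):
    """Grow first (copying everything), then shift the tail to open a slot."""
    buf = [None] * new_capacity
    moves = 0
    # Pass 1: growing copies every element to the same position.
    for i, item in enumerate(items):
        buf[i] = item
        moves += 1
    # Pass 2: shift the tail right by one, so tail elements move twice.
    for i in range(len(items), index, -1):
        buf[i] = buf[i - 1]
        moves += 1
    buf[index] = value
    return buf, moves


def insert_around_gap(items, index, value, new_capacity):
    """Copy prefix and tail straight to their final positions around the gap."""
    buf = [None] * new_capacity
    moves = 0
    for i in range(index):
        buf[i] = items[i]
        moves += 1
    for i in range(index, len(items)):
        buf[i + 1] = items[i]
        moves += 1
    buf[index] = value
    return buf, moves
```

Both functions assume `new_capacity > len(items)` and produce the same buffer; the second moves each existing element exactly once.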
2023-09-30 16:17:22 -07:00
Ryan Zezeski
54ad5f31c6 solaris: hard-code ABI and dynamic linker
Solaris/illumos is multi-lib, so you can't rely on an arbitrary
executable to give you the correct dynamic linker. Besides, it's
always the same path.
2023-09-30 11:38:56 -06:00
Ryan Zezeski
68bcd7ddd4 solaris: load CA certs file 2023-09-30 11:38:56 -06:00
Ryan Zezeski
c17ebdca6a solaris: fix path component max 2023-09-30 11:38:56 -06:00
Ryan Zezeski
f0724229d6 solaris: add missing registers 2023-09-30 11:38:56 -06:00
Marc Tiehuis
08635f08a9 fix indexOfSentinel alignment for types larger than 1 byte 2023-09-30 22:15:47 +13:00
Marc Tiehuis
5b5da0ef8c std.mem: check backend vector support for indexOfSentinel/indexOfScalarPos 2023-09-30 21:22:12 +13:00
Marc Tiehuis
cd766513fe std.mem: add vectorized indexOfScalarPos and indexOfSentinel
These are an order of magnitude quicker than the previous
implementations:

A relative comparison of each, measured by scanning a 1G file:

    Reading 1G (1.0000000009313226GiB)

             std.mem.sliceTo: 281.232ms
          vectorized.sliceTo: 24.769ms
                      strlen: 24.291ms

           std.indexOfScalar: 229.016ms
    vectorized.indexOfScalar: 24.685ms
                      memchr: 24.958ms
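
To illustrate the general idea of testing several bytes per step, here is a small Python sketch. It deliberately swaps in a different technique: a SWAR ("SIMD within a register") zero-byte test on 8-byte words, not the SIMD vector code this commit adds. `index_of_scalar` is a hypothetical helper, and Python is used only for readability, not speed.

```python
ONES = 0x0101010101010101
HIGH = 0x8080808080808080


def index_of_scalar(data: bytes, value: int):
    """Return the index of the first byte equal to `value`, or None."""
    pattern = value * ONES  # `value` (0..255) repeated in every byte lane
    i = 0
    while i + 8 <= len(data):
        # After the XOR, lanes that matched `value` become zero.
        word = int.from_bytes(data[i:i + 8], "little") ^ pattern
        # Classic zero-byte test: nonzero iff at least one lane is zero.
        if (word - ONES) & ~word & HIGH:
            return i + data[i:i + 8].index(value)
        i += 8
    # Scalar loop for the tail shorter than one word.
    for j in range(i, len(data)):
        if data[j] == value:
            return j
    return None
```

Searching for `0` gives the sentinel case, for example `index_of_scalar(buf, 0)` on a NUL-terminated buffer.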
2023-09-30 21:19:43 +13:00
Andrew Kelley
101df768a0
Merge pull request #17312 from LucasSantos91/master
Fix inefficiency with ArrayList.insertSlice
2023-09-29 18:15:24 -07:00
Krzysztof Wolicki
19a82ffdba
Add include_extensions to InstallDir Options (#17300)
closes #16687
2023-09-29 18:50:37 -04:00
Andrew Kelley
9013970861 std.ArrayList: fixups for the previous commit
 * Move `computeBetterCapacity` to the bottom so that `pub` stuff shows
   up first.
 * Rename `computeBetterCapacity` to `growCapacity`. Every function
   implicitly computes something; that word is always redundant in a
   function name. "better" is vague. Better in what way? Instead we
   describe what is actually happening. "grow".
 * Improve doc comments to be very explicit about when element pointers
   are invalidated or not.
 * Rename `addManyAtIndex` to `addManyAt`. The parameter is named
   `index`; that is enough.
 * Extract some duplicated code into `addManyAtAssumeCapacity` and make
   it `pub`.
 * Since I audited every line of code for correctness, I changed the
   style to my personal preference.
 * Avoid a redundant `@memset` to `undefined` - memory allocation does
   that already.
 * Fixed comment giving the wrong reason for not calling
   `ensureTotalCapacity`.
2023-09-29 13:42:38 -07:00
Lucas Santos
9d765b5ab5 std.ArrayList: insertSlice avoids extra memcpy
Includes a more robust implementation of replaceRange, which updates the
ArrayListUnmanaged if state changes in the managed part of the code
before returning an error.

Co-authored-by: Andrew Kelley <andrew@ziglang.org>
2023-09-29 12:52:40 -07:00
Krzysztof Wolicki
e919fbea9f
Step.Run: fix assert of the wrong value (#17303)
closes #16866
2023-09-29 14:14:42 -04:00
Adam Goertz
2f0e5b00b0 Allow only relative paths.
This commit makes the following changes:
* Disallow file:/// URIs
* Allow only relative paths in the .path field of build.zig.zon
* Remove now-unneeded shlwapi dependency
2023-09-29 00:32:43 -07:00
Adam Goertz
b3cad98534 Support file:/// URIs and relative paths 2023-09-29 00:32:43 -07:00
Philipp Lühmann
ed19ebc360
std.math.big.int.Const.order 0 == -0 (#17299) 2023-09-29 18:09:47 +13:00
Luis Cáceres
acac685621
std.Build.ConfigHeader: override include guard option for blank style (#17310)
This commit adds an option to allow for overriding the default header guard that is generated from the output
file path.
2023-09-28 19:30:42 -04:00
Andrew Kelley
077994abb6
Merge pull request #17318 from gh-fork-dump/linux-5.6-cachestat
Update Linux syscalls for 6.5 and add a wrapper for `cachestat`
2023-09-28 16:27:28 -07:00
Christofer Nolander
c4ad6be002
Allow empty enum to be used in EnumSet/EnumMap
Moves the check for empty fields before any access to those fields.
2023-09-28 21:48:39 +00:00
Karl Seguin
599641357c
std.mem: use for loop instead of while in indexOf* to reduce bound checking 2023-09-28 15:40:08 +00:00
Jonathan Marler
e0ef61d46d simplify ContainerDeclarations grammar rule
Noticed this grammar rule could be simplified using a repeating sequence
rather than recursion.
2023-09-28 14:18:54 +03:00
Emil Lerch
fcca3cd1a3
std.http: introduce options to http client to allow for raw uris
Addresses #17015 by introducing a new startWithOptions. The only option currently is a flag
to use the provided URI as is, without modification when passed to the server. Normally, this
is not needed nor desired. However, some REST APIs may have requirements that cannot be satisfied
with the default handling.
2023-09-28 14:16:39 +03:00
Jonathan Marler
18f1db134c docs: remove unnecessary nesting in grammar
noticed this extra level of nesting in the Decl grammar that looks
unnecessary.
2023-09-28 13:58:14 +03:00
Stephen Gregoratto
bc0bf4e97a Linux: Add cachestat wrapper.
Can be tested using this program I whipped up:
https://gist.github.com/The-King-of-Toasters/aee448f5975c50f735fd1946794574f7
2023-09-28 18:58:05 +10:00
Stephen Gregoratto
5f456b2b97 Update Linux syscalls for kernel 6.5
The latest addition is `cachestat`, which provides more detailed
information for paged files.
2023-09-28 18:58:05 +10:00
antlilja
8191199951 fmt: add rewrite from @fabs to @abs 2023-09-27 11:24:45 -07:00
antlilja
bcf4a13913 Remove @fabs, fabs and absCast/Int from std lib
Replaces occurrences of @fabs, absCast and absInt with new @abs builtin.
Also removes the std.math.fabs alias from math.zig.
2023-09-27 11:24:28 -07:00
Kai Jellinghaus
d1e39b6914 Add new fields to io_sqring_offsets & io_cqring_offsets
`user_addr`s were introduced in `03d89a2` ([github link](03d89a2de2)), which was shipped in v6.5
`flags` was introduced even earlier
2023-09-26 18:16:36 -07:00
Jay Petacat
37398ed2a5 std: Reactivate skipped tests w.r.t. llvm/llvm-project#55522 2023-09-27 01:37:25 +03:00
LinuxUserGD
ceaae42e90 Add '--compress-debug-sections=zstd' 2023-09-26 14:18:01 -07:00
Chris Burgess
1c726bcb32
std.http: add identity to content encodings (#16493)
Some servers will respond with the identity encoding, meaning no
encoding, especially when responding to range-get requests. Adding the
identity encoding stops the header parser from failing when it
encounters this.
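
A minimal sketch of the behaviour described above, in Python rather than Zig. `DECODERS` and `decode_body` are hypothetical names and this is not the std.http parser; it only shows `identity` being accepted as a recognised no-op encoding instead of being rejected as unknown.

```python
import gzip
import zlib

# Hypothetical table of supported Content-Encoding tokens.
DECODERS = {
    "identity": lambda body: body,  # "no encoding": pass the body through
    "gzip": gzip.decompress,
    "deflate": zlib.decompress,
}


def decode_body(content_encoding: str, body: bytes) -> bytes:
    """Decode `body` according to a single Content-Encoding token."""
    token = content_encoding.strip().lower()
    try:
        decoder = DECODERS[token]
    except KeyError:
        raise ValueError(f"unsupported content encoding: {token!r}")
    return decoder(body)
```

Without the `identity` entry, a response carrying `Content-Encoding: identity` would hit the error path even though its body needs no decoding.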
2023-09-26 17:16:40 -04:00
Andrew Kelley
28ac9f8b70
Merge pull request #17253 from ziglang/MultiArrayList-0bit-struct
std.MultiArrayList: add test coverage for 0-bit structs
2023-09-25 02:33:32 -07:00
Jan Weidner
c8ba5839f7
std.http.Client: add note about resource management 2023-09-25 12:17:11 +03:00
Andrew Kelley
cc69315f03 std.MultiArrayList: add test coverage for 0-bit structs
closes #10618
solved by #17172
2023-09-24 15:49:56 -07:00
Andrew Kelley
df5f0517b3
Merge pull request #17205 from mlugg/rls-ref
compiler: preserve result type information through address-of operator
2023-09-24 15:19:48 -07:00