90 Commits

Author SHA1 Message Date
Hampus Fröjdholm
d526a2cf95 gpa: Add never_unmap and retain_metadata test 2024-05-21 19:09:52 +02:00
Hampus Fröjdholm
8a57e09b15 gpa: Fix GeneralPurposeAllocator crash when deallocating metadata 2024-05-21 19:09:52 +02:00
Hampus Fröjdholm
762e2a4b52 gpa: Fix GeneralPurposeAllocator double free stack traces
The wrong `size_class` was used when fetching stack traces from empty
buckets. The `size_class` would always be the maximum value after
exhausting the search of active buckets rather than the actual
`size_class` of the allocation.
2024-05-18 11:46:37 +02:00
Hampus Fröjdholm
61f1b2db70 gpa: Add helper to calculate size class of empty buckets
Empty buckets have their `alloc_cursor` set to `slot_count` to allow the
size class to be calculated later. This happens deep within the free
function.

This adds a helper and a test to verify that the size class of empty
buckets is indeed recoverable.
2024-05-18 11:43:42 +02:00
mlugg
8944935499 std: eliminate some uses of usingnamespace
This eliminates some simple usages of `usingnamespace` in the standard
library. This construct may in future be removed from the language, and
is generally an inappropriate way to formulate code. It is also
problematic for incremental compilation, which may not initially support
projects using it.

I wasn't entirely sure what the appropriate namespacing for the types in
`std.os.uefi.tables` would be, so I ofted to preserve the current
namespacing, meaning this is not a breaking change. It's possible some
of the moved types should instead be namespaced under `BootServices`
etc, but this can be a future enhancement.
2024-02-01 20:30:42 +00:00
mlugg
51595d6b75
lib: correct unnecessary uses of 'var' 2023-11-19 09:55:07 +00:00
Jacob Young
8f69e977f1 x86_64: implement 128-bit builtins
* `@clz`
 * `@ctz`
 * `@popCount`
 * `@byteSwap`
 * `@bitReverse`
 * various encodings used by std
2023-10-23 22:42:18 -04:00
Jacob Young
27fe945a00 Revert "Revert "Merge pull request #17637 from jacobly0/x86_64-test-std""
This reverts commit 6f0198cadbe29294f2bf3153a27beebd64377566.
2023-10-22 15:46:43 -04:00
Andrew Kelley
6f0198cadb Revert "Merge pull request #17637 from jacobly0/x86_64-test-std"
This reverts commit 0c99ba1eab63865592bb084feb271cd4e4b0357e, reversing
changes made to 5f92b070bf284f1493b1b5d433dd3adde2f46727.

This caused a CI failure when it landed in master branch due to a
128-bit `@byteSwap` in std.mem.
2023-10-22 12:16:35 -07:00
Jacob Young
32e85d44eb x86_64: disable failing tests, enable test-std testing 2023-10-21 10:55:41 -04:00
Ryan Liptak
ec0f76c599 GeneralPurposeAllocator.searchBucket: check current bucket before searching the list
Follow up to #17383. This is a minor optimization that only matters when a small allocation is resized/free'd soon after it is allocated.

The only real difference I was able to observe with this was via a synthetic benchmark that allocates a full bucket and then frees all but one of the slots, over and over in a loop:

Debug build:

Benchmark 1 (9 runs): gpa-degen-master.exe
  measurement          mean ± σ            min … max           outliers         delta
  wall_time           575ms ± 5.19ms     569ms …  583ms          0 ( 0%)        0%
  peak_rss           43.8MB ± 1.37KB    43.8MB … 43.8MB          1 (11%)        0%
Benchmark 2 (10 runs): gpa-degen-search-cur.exe
  measurement          mean ± σ            min … max           outliers         delta
  wall_time           532ms ± 5.55ms     520ms …  539ms          0 ( 0%)        -  7.5% ±  0.9%
  peak_rss           43.8MB ± 65.2KB    43.8MB … 44.0MB          1 (10%)          +  0.0% ±  0.1%

ReleaseFast build:

Benchmark 1 (129 runs): gpa-degen-master-release.exe
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          38.9ms ± 1.12ms    36.7ms … 42.4ms          8 ( 6%)        0%
  peak_rss           23.2MB ± 2.39KB    23.2MB … 23.2MB          0 ( 0%)        0%
Benchmark 2 (151 runs): gpa-degen-search-cur-release.exe
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          33.2ms ±  999us    31.9ms … 36.3ms         20 (13%)        - 14.7% ±  0.6%
  peak_rss           23.2MB ± 2.26KB    23.2MB … 23.2MB          0 ( 0%)          +  0.0% ±  0.0%
2023-10-04 02:55:54 -07:00
Ryan Liptak
95f4c1532a Treap: do not set key to undefined in remove to allow re-use of removed nodes 2023-10-03 01:21:51 -07:00
Ryan Liptak
cf3572a66b GeneralPurposeAllocator: Considerably improve worst case performance
Before this commit, GeneralPurposeAllocator could run into incredibly degraded performance in scenarios where the bucket count for a particular size class grew to be large. For example, if exactly `slot_count` allocations of a single size class were performed and then all of them were freed except one, then the bucket for those allocations would have to be kept around indefinitely. If that pattern of allocation were done over and over, then the bucket list for that size class could grow incredibly large.

This allocation pattern has been seen in the wild: https://github.com/Vexu/arocc/issues/508#issuecomment-1738275688

In that case, the length of the bucket list for the `128` size class would grow to tens of thousands of buckets and cause Debug runtime to balloon to ~8 minutes whereas with the c_allocator the Debug runtime would be ~3 seconds.

To address this, there are three different changes happening here:

1. std.Treap is used instead of a doubly linked list for the lists of buckets. This takes the time complexity of searchBucket [used in resize and free] from O(n) to O(log n), but increases the time complexity of insert from O(1) to O(log n) [before, all new buckets would get added to the head of the list]. Note: Any data structure with O(log n) or better search/insert/delete would also work for this use-case.
2. If the 'current' bucket for a size class is full, the list of buckets is never traversed and instead a new bucket is allocated. Previously, traversing the bucket list could only find a non-full bucket in specific circumstances, and only because of a separate optimization that is no longer needed (before, after any resize/free, the affected bucket would be moved to the head of the bucket list to allow searchBucket to perform better on average). Now, the current_bucket for each size class only changes when either (1) the current bucket is emptied/freed, or (2) a new bucket is allocated (due to the current bucket being full or null). Because each bucket's alloc_cursor only moves forward (i.e. slots within a bucket are never re-used), we can therefore always know that any bucket besides the current_bucket will be full, so traversing the list in the hopes of finding an existing non-full bucket is entirely pointless.
3. Size + alignment information for small allocations has been moved into the Bucket data instead of keeping it in a separate HashMap. This offers an improvement over the HashMap since whenever we need to get/modify the length/alignment of an allocation it's extremely likely we will already have calculated any bucket-related information necessary to get the data.

The first change is the most relevant and accounts for most of the benefit here. Also note that the overall functionality of GeneralPurposeAllocator is unchanged.

In the degraded `arocc` case, these changes bring Debug performance from ~8 minutes to ~20 seconds.

Benchmark 1: test-master.bat
  Time (mean ± σ):     481.263 s ±  5.440 s    [User: 479.159 s, System: 1.937 s]
  Range (min … max):   477.416 s … 485.109 s    2 runs

Benchmark 2: test-optim-treap.bat
  Time (mean ± σ):     19.639 s ±  0.037 s    [User: 18.183 s, System: 1.452 s]
  Range (min … max):   19.613 s … 19.665 s    2 runs

Summary
  'test-optim-treap.bat' ran
   24.51 ± 0.28 times faster than 'test-master.bat'

Note: Much of the time taken on Windows in this particular case is related to gathering stack traces. With `.stack_trace_frames = 0` the runtime goes down to 6.7 seconds, which is a little more than 2.5x slower compared to when the c_allocator is used.

These changes may or mat not introduce a slight performance regression in the average case:

Here's the standard library tests on Windows in Debug mode:

Benchmark 1 (10 runs): std-tests-master.exe
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          16.0s  ± 30.8ms    15.9s  … 16.1s           1 (10%)        0%
  peak_rss           42.8MB ± 8.24KB    42.8MB … 42.8MB          0 ( 0%)        0%
Benchmark 2 (10 runs): std-tests-optim-treap.exe
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          16.2s  ± 37.6ms    16.1s  … 16.3s           0 ( 0%)        💩+  1.3% ±  0.2%
  peak_rss           42.8MB ± 5.18KB    42.8MB … 42.8MB          0 ( 0%)          +  0.1% ±  0.0%

And on Linux:

Benchmark 1: ./test-master
  Time (mean ± σ):     16.091 s ±  0.088 s    [User: 15.856 s, System: 0.453 s]
  Range (min … max):   15.870 s … 16.166 s    10 runs
 
Benchmark 2: ./test-optim-treap
  Time (mean ± σ):     16.028 s ±  0.325 s    [User: 15.755 s, System: 0.492 s]
  Range (min … max):   15.735 s … 16.709 s    10 runs
 
Summary
  './test-optim-treap' ran
    1.00 ± 0.02 times faster than './test-master'
2023-10-03 01:21:51 -07:00
Gregory Anders
cab9da35bd std: enable FailingAllocator to fail on resize
Now that allocator.resize() is allowed to fail, programs may wish to
test code paths that handle resize() failure. The simplest way to do
this now is to replace the vtable of the testing allocator with one
that uses Allocator.noResize for the 'resize' function pointer.

An alternative way to support this testing capability is to augment the
FailingAllocator (which is already useful for testing allocation failure
scenarios) to intentionally fail on calls to resize(). To do this, add a
'resize_fail_index' parameter to the FailingAllocator that causes
resize() to fail after the given number of calls.
2023-09-06 19:06:32 +03:00
Gregory Mullen
f74e10cd47 Update default stack frames for general_purpose_allocator.zig
Created from a conversation with  @andrewrk on irc: Memory leaks when using ArrayList can be inconvenient to debug when the stack frame size is 4 because the entirety of the printed frame is within zig stdlib, and not in the users calling stack. Increasing this to 6 for Debug builds, gives 2 frames of user code. I increased the frame size for tests as well by the equivalent factor, but I'm unconvinced that's actually desirable.
2023-08-21 11:22:22 -07:00
Philipp Lühmann
de227ace14 std: fix doc comment of GPA deinit
This was missed in #15269
2023-07-03 01:14:20 -07:00
mlugg
f26dda2117 all: migrate code to new cast builtin syntax
Most of this migration was performed automatically with `zig fmt`. There
were a few exceptions which I had to manually fix:

* `@alignCast` and `@addrSpaceCast` cannot be automatically rewritten
* `@truncate`'s fixup is incorrect for vectors
* Test cases are not formatted, and their error locations change
2023-06-24 16:56:39 -07:00
Eric Joldasov
50339f595a all: zig fmt and rename "@XToY" to "@YFromX"
Signed-off-by: Eric Joldasov <bratishkaerik@getgoogleoff.me>
2023-06-19 12:34:42 -07:00
Motiejus Jakštys
d41111d7ef mem: rename align*Generic to mem.align*
Anecdote 1: The generic version is way more popular than the non-generic
one in Zig codebase:

     git grep -w alignForward | wc -l
    56
     git grep -w alignForwardGeneric | wc -l
    149

     git grep -w alignBackward | wc -l
    6
     git grep -w alignBackwardGeneric | wc -l
    15

Anecdote 2: In my project (turbonss) that does much arithmetic and
alignment I exclusively use the Generic functions.

Anecdote 3: we used only the Generic versions in the Macho Man's linker
workshop.
2023-06-17 12:49:13 -07:00
Linus Groh
94e30a756e std: fix a bunch of typos
The majority of these are in comments, some in doc comments which might
affect the generated documentation, and a few in parameter names -
nothing that should be breaking, however.
2023-04-30 18:16:04 -07:00
Andrew Kelley
6261c13731 update codebase to use @memset and @memcpy 2023-04-28 13:24:43 -07:00
Andrew Kelley
401b7f6f53 zig fmt 2023-04-25 11:23:41 -07:00
Andrew Kelley
a5c910adb6 change semantics of @memcpy and @memset
Now they use slices or array pointers with any element type instead of
requiring byte pointers.

This is a breaking enhancement to the language.

The safety check for overlapping pointers will be implemented in a
future commit.

closes #14040
2023-04-25 11:23:40 -07:00
Borja Clemente
bd801dc489
std: GPA deinit return an enum instead of a bool 2023-04-22 14:09:44 +03:00
Ganesan Rajagopal
49b56f88b9
GPA: Catch invalid frees
* GPA: Catch invalid frees

Fix #14791: Catch cases where an invalid slice is passed to free().
This was silently ignored before but now logs an error. This change
uses a AutoHashMap to keep track of the sizes which seems to be an
overkill but seems like the easiest way to catch these errors.

* GPA: Add wrong alignment checks to free/resize

Implement @Inkryption's suggestion to catch free/resize with the wrong
alignment. I also changed the naming to match large allocations.
2023-04-04 13:11:25 +03:00
Andrew Kelley
5236842a9d std.heap.GeneralPurposeAllocator: add doc comment for deinit 2023-02-27 22:04:29 -07:00
Andrew Kelley
aeaef8c0ff update std lib and compiler sources to new for loop syntax 2023-02-18 19:17:21 -07:00
Andrew Kelley
3c2a43fdcc Revert "std: check types of pointers passed to allocator functions"
This reverts commit abc9530a88d24350481d9264edcde300f293929a.

This patch implies that the idiomatic Zig way of handling anytype
parameter is to write a bunch of boilerplate instead of directly
accessing type information and relying on the compiler to be useful.

I don't want it to be this way.

It is the compiler's job to make useful error messages when the wrong
field of a type info result is accessed, and it is the zig programmer's
job to understand what it means when a compile error points at the field
access of `@typeInfo` (along with the relevant callsites).

One thing that might be useful would be having the compiler be aware of
module boundaries and highlighting the boundaries of them. The first
reference note after crossing a module boundary is likely the most
interesting one.
2023-02-12 05:59:28 -07:00
Leo Constantinides
abc9530a88
std: check types of pointers passed to allocator functions 2023-02-12 00:04:27 +00:00
Andrew Kelley
ceb0a632cf std.mem.Allocator: allow shrink to fail
closes #13535
2022-11-29 23:30:38 -07:00
Nick Cernis
8a5818535b
Make invalidFmtError public and use in place of compileErrors for bad format strings (#13526)
* Export invalidFmtErr

To allow consistent use of "invalid format string" compile error
response for badly formatted format strings.

See https://github.com/ziglang/zig/pull/13489#issuecomment-1311759340.

* Replace format compile errors with invalidFmtErr

- Provides more consistent compile errors.
- Gives user info about the type of the badly formated value.

* Rename invalidFmtErr as invalidFmtError

For consistency. Zig seems to use “Error” more often than “Err”.

* std: add invalid format string checks to remaining custom formatters

* pass reference-trace to comp when building build file; fix checkobjectstep
2022-11-12 21:03:24 +02:00
Andrew Kelley
3f3003097c std.heap.PageAllocator: add check for large allocation
Instead of making the memory alignment functions more complicated, I
added more API documentation for their existing semantics.

closes #12118
closes #12135
2022-10-30 16:10:20 -07:00
Andrew Kelley
f16855b9d7 remove pointless discards 2022-09-12 18:13:24 -07:00
Ryan Liptak
22720981ea Move sys_can_stack_trace from GPA to std.debug so that it can be re-used as needed 2022-06-25 21:27:56 -07:00
protty
963ac60918
std.Thread: Mutex and Condition improvements (#11497)
* Thread: minor cleanups

* Thread: rewrite Mutex

* Thread: introduce Futex.Deadline

* Thread: Condition rewrite + cleanup

* Mutex: optimize lock fast path

* Condition: more docs

* Thread: more mutex + condition docs

* Thread: remove broken Condition test

* Thread: zig fmt

* address review comments + fix Thread.DummyMutex in GPA

* Atomic: disable bitRmw x86 inline asm for stage2

* GPA: typo mutex_init

* Thread: remove noalias on stuff

* Thread: comment typos + clarifications
2022-04-23 19:35:56 -05:00
Veikka Tuominen
12f3c461a4 Sema: implement zirSwitchCaptureElse for error sets 2022-03-19 15:49:27 +02:00
Veikka Tuominen
c9b6f1bf90 std: enable default panic handler for stage2 LLVM on Linux 2022-03-19 14:05:57 +02:00
Andrew Kelley
92a09eb1e4 std.heap.GeneralPurposeAllocator: use var for mutable locals
Required to be compatible with new language semantics.
2022-03-16 13:31:16 -07:00
Veikka Tuominen
2682b41da5 make gpa.deinit work with stage2 2022-02-28 13:09:14 -07:00
Kenta Iwasaki
5c7f2ab011 stage1: deal with BPF not supporting @returnAddress()
Make `@returnAddress()` return for the BPF target, as the BPF target for
the time being does not support probing for the return address. Stack
traces for the general purpose allocator for the BPF target is also set
to not be captured.
2021-12-19 23:22:05 -08:00
Andrew Kelley
70dcdcb73d std: remove double free in GPA
Merge conflict between
02a1f838e67edc4107df19ba194a925958f1c669
and
885c73f3438d108c3cbb1afd75e3fee2f4bc88c0
2021-12-01 15:19:29 -07:00
Matthew Borkowski
02a1f838e6 gpa: fix leak in freeLarge and memory limit accounting in resize and resizeLarge 2021-12-01 13:34:53 -08:00
Lee Cannon
885c73f343
allocgate: actually free memory in gpa 2021-12-01 09:44:19 +00:00
Lee Cannon
066eaa5e9c
allocgate: change resize to return optional instead of error 2021-11-30 23:45:01 +00:00
Lee Cannon
f68cda738a
allocgate: split free out from resize 2021-11-30 23:32:48 +00:00
Lee Cannon
23866b1f81
allocgate: update code to use new interface 2021-11-30 23:32:48 +00:00
Lee Cannon
02e5e0ba1f
allocgate: apply missed changes 2021-11-30 23:32:48 +00:00
Lee Cannon
9377f32c08
allocgate: utilize a *const vtable field 2021-11-30 23:32:48 +00:00
Lee Cannon
1093b09a98
allocgate: renamed getAllocator function to allocator 2021-11-30 23:32:47 +00:00
Lee Cannon
85de022c56
allocgate: std Allocator interface refactor 2021-11-30 23:32:47 +00:00