167 Commits

Author SHA1 Message Date
Andrew Kelley
1a6d87d699 std.heap.ThreadSafeAllocator: update to new Allocator API 2025-02-06 14:23:23 -08:00
Andrew Kelley
36e9b0f026 std.mem.Allocator: keep the undefined memset
Reversal of the earlier decision: the Allocator interface is the correct
place for the memset to undefined, because it lets Allocator
implementations bypass the interface and use a backing allocator
directly, skipping the performance penalty of memsetting an entire
allocation, which may be very large or may contain valuable zeroes.
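
A minimal illustration of the behavior this preserves (a sketch, not the actual interface internals):

```zig
const std = @import("std");

// The interface-level free poisons the memory before dispatching to the
// vtable, so an Allocator *implementation* that talks to its backing
// allocator directly never pays for a memset over, say, a huge mapping
// that still holds valuable zeroes.
test "interface free sets memory to undefined" {
    const gpa = std.testing.allocator;
    const buf = try gpa.alloc(u8, 16);
    gpa.free(buf); // the memset to undefined happens here, in the interface
}
```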

closes #4298
2025-02-06 14:23:23 -08:00
Andrew Kelley
601f632c27 std.heap.GeneralPurposeAllocator: fix large alloc accounting
when mremap relocates an allocation
2025-02-06 14:23:23 -08:00
Andrew Kelley
f1717777a2 std.heap: delete LoggingAllocator and friends
I don't think these belong in std, at least not in their current form.

If someone wants to add these back I'd like to review the patch before
it lands.

Reverts 629e2e784495dd8ac91493fa7bb11e1772698e42
2025-02-06 14:23:23 -08:00
Andrew Kelley
0d8166be3f std: update to new Allocator API 2025-02-06 14:23:23 -08:00
Andrew Kelley
a4d4e086c5 introduce std.posix.mremap and use it
in std.heap.page_allocator
2025-02-06 14:23:23 -08:00
Andrew Kelley
7eeef5fb2b std.mem.Allocator: introduce remap function to the interface
This one changes the size of an allocation, allowing it to be relocated.
However, the implementation will still return `null` if it would be
equivalent to

    new = alloc
    memcpy(new, old)
    free(old)

Mainly this prepares for taking advantage of `mremap` which I thought
would be a bigger deal but apparently is only available on Linux. Still,
we should use it on Linux.
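
A caller-side sketch (assuming `remap` keeps a resize-like signature and returns the possibly-moved slice, or `null` when the caller should fall back):

```zig
const std = @import("std");

// Sketch: grow a buffer, preferring an in-interface remap (which may
// relocate, e.g. via mremap on Linux) over a manual alloc+memcpy+free.
fn growBuffer(gpa: std.mem.Allocator, old: []u8, new_len: usize) ![]u8 {
    if (gpa.remap(old, new_len)) |relocated| return relocated;
    // null: remap would be no better than doing the move ourselves.
    const new = try gpa.alloc(u8, new_len);
    @memcpy(new[0..old.len], old);
    gpa.free(old);
    return new;
}
```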
2025-02-06 14:23:23 -08:00
Andrew Kelley
dd2fa4f75d std.heap.GeneralPurposeAllocator: runtime-known page size
no longer causes compilation failure.

This also addresses the problem of high map count causing OOM by
choosing a page size of 2MiB for most targets when the page_size_max is
smaller than this number.
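
In other words (a sketch; `gpa_chunk_size` is a hypothetical name):

```zig
const std = @import("std");

// Use at least 2 MiB per mapping so the total number of memory
// mappings stays low even on small-page targets.
const gpa_chunk_size: usize = @max(std.heap.page_size_max, 2 * 1024 * 1024);
```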
2025-02-06 14:23:23 -08:00
Andrew Kelley
b23662feeb std.heap.WasmAllocator: use @splat syntax
preferred over array multiplication where possible.
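
An illustrative before/after (array `@splat` resolves via the result type):

```zig
const zeroed_new: [16]u8 = @splat(0);      // preferred
const zeroed_old: [16]u8 = [1]u8{0} ** 16; // array multiplication
```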
2025-02-06 14:23:23 -08:00
Andrew Kelley
91f41bdc70 std.heap.PageAllocator: restore high alignment functionality
This allocator now supports alignments greater than page size, with the
same implementation as it used before.

This is a partial revert of ceb0a632cfd6a4eada6bd27bf6a3754e95dcac86.

It looks like VirtualAlloc2 has better solutions to this problem,
including features such as MEM_RESERVE_PLACEHOLDER and MEM_LARGE_PAGES.
This possibility can be investigated as a follow-up task.
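
The restored approach is the classic over-map-and-trim technique; a simplified POSIX-flavored sketch (slack unmapping elided, assumes alignment >= page size):

```zig
const std = @import("std");

// Sketch: to satisfy an alignment greater than the page size, map extra
// bytes, then align the base address forward; the slack on both ends
// would be returned with munmap (omitted here for brevity).
fn mapAligned(len: usize, alignment: usize) ![*]u8 {
    const overalloc = len + alignment - std.heap.pageSize();
    const mapping = try std.posix.mmap(
        null,
        overalloc,
        std.posix.PROT.READ | std.posix.PROT.WRITE,
        .{ .TYPE = .PRIVATE, .ANONYMOUS = true },
        -1,
        0,
    );
    const addr = std.mem.alignForward(usize, @intFromPtr(mapping.ptr), alignment);
    return @ptrFromInt(addr);
}
```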
2025-02-06 14:23:23 -08:00
Andrew Kelley
4913de3c88 GeneralPurposeAllocator: minimal fix
This keeps the implementation matching master branch; however, it
introduces a compile error that applications can work around by
explicitly setting page_size_max and page_size_min to match their
machine's settings, in the case that those values are not already
equal.

I plan to rework this allocator in a follow-up enhancement with the goal
of reducing total active memory mappings.
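
One plausible shape of the workaround (assuming the page-size bounds are plumbed through `std.Options`, and using 16 KiB purely as an example):

```zig
const std = @import("std");

// Hypothetical override: pin both bounds to the machine's fixed page
// size so they are equal and comptime-known, avoiding the compile error.
pub const std_options: std.Options = .{
    .page_size_min = 16 * 1024,
    .page_size_max = 16 * 1024,
};
```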
2025-02-06 14:23:23 -08:00
Andrew Kelley
95a0474dc6 revert GPA to before this branch 2025-02-06 14:23:23 -08:00
Andrew Kelley
284de7d957 adjust runtime page size APIs
* fix merge conflicts
* rename the declarations
* reword documentation
* extract FixedBufferAllocator to separate file
* take advantage of locals
* remove the assertion about max alignment in Allocator API, leaving it
  Allocator implementation defined
* fix non-inline function call in start logic

The GeneralPurposeAllocator implementation is totally broken because it
uses global state, but I didn't address that in this commit.
2025-02-06 14:23:23 -08:00
Archbirdplus
439667be04 runtime page size detection
heap.zig: define new default page sizes
heap.zig: add min/max_page_size and their options
lib/std/c: add miscellaneous declarations
heap.zig: add pageSize() and its options
switch to new page sizes, especially in GPA/stdlib
mem.zig: remove page_size
2025-02-06 14:23:23 -08:00
mlugg
dc5c827847 std.heap.GeneralPurposeAllocator: disable some tests on wasm32-wasi
The ZON PR (#20271) is causing these tests to inexplicably fail. It
doesn't seem like that PR is what's breaking GPA, so these tests are now
disabled. This is tracked by #22731.
2025-02-03 09:17:52 +00:00
Andrew Kelley
fecdc53a48 delete std.heap.WasmPageAllocator
This allocator has no purpose since it cannot truly fulfill the role of
page allocation, and std.heap.wasm_allocator is better both in terms of
performance and code size.

This commit redefines `std.heap.page_allocator` to be less strict:

"On operating systems that support memory mapping, this allocator makes
a syscall directly for every allocation and free. Otherwise, it falls
back to the preferred singleton for the target. Thread-safe."

This now matches how it was actually being implemented, and matches its
use sites - which are mainly as the backing allocator for
`std.heap.ArenaAllocator`.
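
For example, the typical use site looks like:

```zig
const std = @import("std");

test "page_allocator backing an arena" {
    // One syscall per allocation makes page_allocator a poor fit for many
    // small allocations, but a fine backing store for an arena.
    var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
    defer arena.deinit();
    _ = try arena.allocator().alloc(u8, 64);
}
```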
2025-01-29 21:10:20 -08:00
John Benediktsson
884b1423a4 std.heap.memory_pool: make preheat() usable after init() 2025-01-28 00:06:54 +01:00
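A usage sketch (method names as described; exact signatures may differ by version):

```zig
const std = @import("std");

test "preheat after init" {
    var pool = std.heap.MemoryPool(u64).init(std.testing.allocator);
    defer pool.deinit();
    try pool.preheat(8); // previously only possible via initPreheated()
    const item = try pool.create();
    pool.destroy(item);
}
```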
Krzysztof Wolicki
04c182274c Fix index calculation in WasmPageAllocator 2024-10-12 22:53:02 +02:00
Krzysztof Wolicki
b1eaed6c8d Remove packed_int_array usage from WasmPageAllocator and BigInt 2024-10-12 12:55:35 +02:00
mlugg
0b9fccf508 std: deprecate some incorrect default initializations
In favour of newly-added decls, which can be used via decl literals.
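
An illustrative shape of the pattern (hypothetical `Config` type):

```zig
const std = @import("std");

const Config = struct {
    limit: usize,
    // A named decl replaces a possibly-wrong field default and can be
    // used as a decl literal at the call site.
    pub const default: Config = .{ .limit = 256 };
};

test "decl literal" {
    const c: Config = .default;
    try std.testing.expectEqual(@as(usize, 256), c.limit);
}
```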
2024-09-01 17:34:07 +01:00
mlugg
4330c40596 std: avoid field/decl name conflicts
Most of these changes seem like improvements. The PDB thing had a TODO
saying it used to crash; I anticipate it works now, but we'll see what
CI does.

The `std.os.uefi` field renames are a notable breaking change.
2024-08-29 20:39:11 +01:00
mlugg
0fe3fd01dd std: update std.builtin.Type fields to follow naming conventions
The compiler actually doesn't need any functional changes for this: Sema
does reification based on the tag indices of `std.builtin.Type` already!
So, no zig1.wasm update is necessary.

This change is necessary to disallow name clashes between fields and
decls on a type, which is a prerequisite of #9938.
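
For example:

```zig
const std = @import("std");

test "lower-case std.builtin.Type field names" {
    const info = @typeInfo(struct {});
    // Previously: info == .Struct
    try std.testing.expect(info == .@"struct");
}
```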
2024-08-28 08:39:59 +01:00
mlugg
6808ce27bd compiler,lib,test,langref: migrate @setCold to @branchHint 2024-08-27 00:44:35 +01:00
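The migration is mechanical; an illustrative before/after:

```zig
fn coldPath() noreturn {
    @branchHint(.cold); // previously: @setCold(true);
    @panic("cold path taken");
}
```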
Andrew Kelley
33c7984183 add std.testing.random_seed
closes #17609
2024-07-23 11:43:12 -07:00
Hampus Fröjdholm
d526a2cf95 gpa: Add never_unmap and retain_metadata test 2024-05-21 19:09:52 +02:00
Hampus Fröjdholm
8a57e09b15 gpa: Fix GeneralPurposeAllocator crash when deallocating metadata 2024-05-21 19:09:52 +02:00
Hampus Fröjdholm
762e2a4b52 gpa: Fix GeneralPurposeAllocator double free stack traces
The wrong `size_class` was used when fetching stack traces from empty
buckets. The `size_class` would always be the maximum value after
exhausting the search of active buckets rather than the actual
`size_class` of the allocation.
2024-05-18 11:46:37 +02:00
Hampus Fröjdholm
61f1b2db70 gpa: Add helper to calculate size class of empty buckets
Empty buckets have their `alloc_cursor` set to `slot_count` to allow the
size class to be calculated later. This happens deep within the free
function.

This adds a helper and a test to verify that the size class of empty
buckets is indeed recoverable.
2024-05-18 11:43:42 +02:00
Lucas Santos
f71f27bcb0 Avoid unnecessary operation in PageAllocator.
There's no need to call `alignForward` before `VirtualAlloc`.
From [MSDN](https://learn.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-virtualalloc):
```
If the lpAddress parameter is NULL, this value is rounded up to the next page boundary
```
2024-05-10 22:51:52 +03:00
Jacob Young
a7282d0910 WasmAllocator: fix safety panic during OOM 2024-03-23 11:32:37 +01:00
Andrew Kelley
cd62005f19 extract std.posix from std.os
closes #5019
2024-03-19 11:45:09 -07:00
Tristan Ross
6067d39522 std.builtin: make atomic order fields lowercase 2024-03-11 07:09:10 -07:00
Ryan Liptak
16b3d1004e Remove redundant test name prefixes now that test names are fully qualified
Follow-up to #19079, which made test names fully qualified.

This fixes tests that had now-redundant information in their test names. For example here's a fully qualified test name before the changes in this commit:

"priority_queue.test.std.PriorityQueue: shrinkAndFree"

and the same test's name after the changes in this commit:

"priority_queue.test.shrinkAndFree"
2024-02-26 15:18:31 -08:00
e4m2
8d56e472c9 Replace std.rand references with std.Random 2024-02-08 15:21:35 +01:00
Andrew Kelley
9f3165540e std.os.linux.MAP: use a packed struct
Introduces type safety to this constant. Eliminates one use of
`usingnamespace`.
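
An abbreviated sketch of the new shape (the real struct has more fields and per-architecture layout):

```zig
// Bits 0-3 hold the mapping type; individual flags become bools, so
// call sites read `.{ .TYPE = .PRIVATE, .ANONYMOUS = true }` instead of
// `MAP_PRIVATE | MAP_ANONYMOUS`.
pub const MAP = packed struct(u32) {
    TYPE: enum(u4) { SHARED = 0x01, PRIVATE = 0x02, SHARED_VALIDATE = 0x03 },
    FIXED: bool = false,
    ANONYMOUS: bool = false,
    _: u26 = 0,
};
```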
2024-02-06 21:12:11 -07:00
mlugg
8944935499 std: eliminate some uses of usingnamespace
This eliminates some simple usages of `usingnamespace` in the standard
library. This construct may in future be removed from the language, and
is generally an inappropriate way to formulate code. It is also
problematic for incremental compilation, which may not initially support
projects using it.

I wasn't entirely sure what the appropriate namespacing for the types in
`std.os.uefi.tables` would be, so I opted to preserve the current
namespacing, meaning this is not a breaking change. It's possible some
of the moved types should instead be namespaced under `BootServices`
etc, but this can be a future enhancement.
2024-02-01 20:30:42 +00:00
Michael Pfaff
478c89b46f std.heap: Use @alignOf(T) rather than 0 if not manually overridden for alignment of MemoryPool items 2023-11-21 13:23:53 +00:00
mlugg
51595d6b75 lib: correct unnecessary uses of 'var' 2023-11-19 09:55:07 +00:00
Kai Jellinghaus
084a7cf028 Use ArenaAllocator.reset in MemoryPool 2023-11-01 14:45:35 +02:00
Andrew Kelley
3fc6fc6812 std.builtin.Endian: make the tags lower case
Let's take this breaking change opportunity to fix the style of this
enum.
2023-10-31 21:37:35 -04:00
Jacob Young
8f69e977f1 x86_64: implement 128-bit builtins
* `@clz`
* `@ctz`
* `@popCount`
* `@byteSwap`
* `@bitReverse`
* various encodings used by std
2023-10-23 22:42:18 -04:00
Jacob Young
27fe945a00 Revert "Revert "Merge pull request #17637 from jacobly0/x86_64-test-std""
This reverts commit 6f0198cadbe29294f2bf3153a27beebd64377566.
2023-10-22 15:46:43 -04:00
Andrew Kelley
6f0198cadb Revert "Merge pull request #17637 from jacobly0/x86_64-test-std"
This reverts commit 0c99ba1eab63865592bb084feb271cd4e4b0357e, reversing
changes made to 5f92b070bf284f1493b1b5d433dd3adde2f46727.

This caused a CI failure when it landed in master branch due to a
128-bit `@byteSwap` in std.mem.
2023-10-22 12:16:35 -07:00
Jacob Young
32e85d44eb x86_64: disable failing tests, enable test-std testing 2023-10-21 10:55:41 -04:00
Johan Jansson
a1e0b9979a std.heap.ArenaAllocator: fix doc comment typo
Fixes #17537
2023-10-15 21:20:48 +03:00
Ryan Liptak
ec0f76c599 GeneralPurposeAllocator.searchBucket: check current bucket before searching the list
Follow-up to #17383. This is a minor optimization that only matters when a small allocation is resized or freed soon after it is allocated.

The only real difference I was able to observe with this was via a synthetic benchmark that allocates a full bucket and then frees all but one of the slots, over and over in a loop:

Debug build:

Benchmark 1 (9 runs): gpa-degen-master.exe
  measurement          mean ± σ            min … max           outliers         delta
  wall_time           575ms ± 5.19ms     569ms …  583ms          0 ( 0%)        0%
  peak_rss           43.8MB ± 1.37KB    43.8MB … 43.8MB          1 (11%)        0%
Benchmark 2 (10 runs): gpa-degen-search-cur.exe
  measurement          mean ± σ            min … max           outliers         delta
  wall_time           532ms ± 5.55ms     520ms …  539ms          0 ( 0%)        -  7.5% ±  0.9%
  peak_rss           43.8MB ± 65.2KB    43.8MB … 44.0MB          1 (10%)          +  0.0% ±  0.1%

ReleaseFast build:

Benchmark 1 (129 runs): gpa-degen-master-release.exe
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          38.9ms ± 1.12ms    36.7ms … 42.4ms          8 ( 6%)        0%
  peak_rss           23.2MB ± 2.39KB    23.2MB … 23.2MB          0 ( 0%)        0%
Benchmark 2 (151 runs): gpa-degen-search-cur-release.exe
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          33.2ms ±  999us    31.9ms … 36.3ms         20 (13%)        - 14.7% ±  0.6%
  peak_rss           23.2MB ± 2.26KB    23.2MB … 23.2MB          0 ( 0%)          +  0.0% ±  0.0%
2023-10-04 02:55:54 -07:00
Ryan Liptak
95f4c1532a Treap: do not set key to undefined in remove to allow re-use of removed nodes 2023-10-03 01:21:51 -07:00
Ryan Liptak
cf3572a66b GeneralPurposeAllocator: Considerably improve worst case performance
Before this commit, GeneralPurposeAllocator could run into incredibly degraded performance in scenarios where the bucket count for a particular size class grew to be large. For example, if exactly `slot_count` allocations of a single size class were performed and then all of them were freed except one, then the bucket for those allocations would have to be kept around indefinitely. If that pattern of allocation were done over and over, then the bucket list for that size class could grow incredibly large.

This allocation pattern has been seen in the wild: https://github.com/Vexu/arocc/issues/508#issuecomment-1738275688

In that case, the length of the bucket list for the `128` size class would grow to tens of thousands of buckets and cause Debug runtime to balloon to ~8 minutes whereas with the c_allocator the Debug runtime would be ~3 seconds.

To address this, there are three different changes happening here:

1. std.Treap is used instead of a doubly linked list for the lists of buckets. This takes the time complexity of searchBucket [used in resize and free] from O(n) to O(log n), but increases the time complexity of insert from O(1) to O(log n) [before, all new buckets would get added to the head of the list]. Note: Any data structure with O(log n) or better search/insert/delete would also work for this use-case. (A minimal std.Treap usage sketch follows this list.)
2. If the 'current' bucket for a size class is full, the list of buckets is never traversed and instead a new bucket is allocated. Previously, traversing the bucket list could only find a non-full bucket in specific circumstances, and only because of a separate optimization that is no longer needed (before, after any resize/free, the affected bucket would be moved to the head of the bucket list to allow searchBucket to perform better on average). Now, the current_bucket for each size class only changes when either (1) the current bucket is emptied/freed, or (2) a new bucket is allocated (due to the current bucket being full or null). Because each bucket's alloc_cursor only moves forward (i.e. slots within a bucket are never re-used), we can therefore always know that any bucket besides the current_bucket will be full, so traversing the list in the hopes of finding an existing non-full bucket is entirely pointless.
3. Size + alignment information for small allocations has been moved into the Bucket data instead of keeping it in a separate HashMap. This offers an improvement over the HashMap since whenever we need to get/modify the length/alignment of an allocation it's extremely likely we will already have calculated any bucket-related information necessary to get the data.
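
As mentioned in item 1, here is a minimal std.Treap usage sketch (plain integer keys, not the GPA's actual bucket bookkeeping):

```zig
const std = @import("std");

const BucketTreap = std.Treap(usize, std.math.order);

test "treap insert and search" {
    var treap: BucketTreap = .{};
    var node: BucketTreap.Node = undefined;

    var entry = treap.getEntryFor(0x1000); // O(log n) search for a key
    entry.set(&node); // O(log n) insert at the found position

    try std.testing.expect(treap.getEntryFor(0x1000).node == &node);
}
```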

The first change is the most relevant and accounts for most of the benefit here. Also note that the overall functionality of GeneralPurposeAllocator is unchanged.

In the degraded `arocc` case, these changes bring Debug performance from ~8 minutes to ~20 seconds.

Benchmark 1: test-master.bat
  Time (mean ± σ):     481.263 s ±  5.440 s    [User: 479.159 s, System: 1.937 s]
  Range (min … max):   477.416 s … 485.109 s    2 runs

Benchmark 2: test-optim-treap.bat
  Time (mean ± σ):     19.639 s ±  0.037 s    [User: 18.183 s, System: 1.452 s]
  Range (min … max):   19.613 s … 19.665 s    2 runs

Summary
  'test-optim-treap.bat' ran
   24.51 ± 0.28 times faster than 'test-master.bat'

Note: Much of the time taken on Windows in this particular case is related to gathering stack traces. With `.stack_trace_frames = 0` the runtime goes down to 6.7 seconds, which is a little more than 2.5x slower compared to when the c_allocator is used.

These changes may or may not introduce a slight performance regression in the average case:

Here's the standard library tests on Windows in Debug mode:

Benchmark 1 (10 runs): std-tests-master.exe
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          16.0s  ± 30.8ms    15.9s  … 16.1s           1 (10%)        0%
  peak_rss           42.8MB ± 8.24KB    42.8MB … 42.8MB          0 ( 0%)        0%
Benchmark 2 (10 runs): std-tests-optim-treap.exe
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          16.2s  ± 37.6ms    16.1s  … 16.3s           0 ( 0%)        💩+  1.3% ±  0.2%
  peak_rss           42.8MB ± 5.18KB    42.8MB … 42.8MB          0 ( 0%)          +  0.1% ±  0.0%

And on Linux:

Benchmark 1: ./test-master
  Time (mean ± σ):     16.091 s ±  0.088 s    [User: 15.856 s, System: 0.453 s]
  Range (min … max):   15.870 s … 16.166 s    10 runs
 
Benchmark 2: ./test-optim-treap
  Time (mean ± σ):     16.028 s ±  0.325 s    [User: 15.755 s, System: 0.492 s]
  Range (min … max):   15.735 s … 16.709 s    10 runs
 
Summary
  './test-optim-treap' ran
    1.00 ± 0.02 times faster than './test-master'
2023-10-03 01:21:51 -07:00
Gregory Anders
cab9da35bd std: enable FailingAllocator to fail on resize
Now that allocator.resize() is allowed to fail, programs may wish to
test code paths that handle resize() failure. The simplest way to do
this now is to replace the vtable of the testing allocator with one
that uses Allocator.noResize for the 'resize' function pointer.

An alternative way to support this testing capability is to augment the
FailingAllocator (which is already useful for testing allocation failure
scenarios) to intentionally fail on calls to resize(). To do this, add a
'resize_fail_index' parameter to the FailingAllocator that causes
resize() to fail after the given number of calls.
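
A testing sketch (assuming a config-struct `init` for FailingAllocator; exact plumbing may differ by version):

```zig
const std = @import("std");

test "handle resize failure" {
    var failing = std.testing.FailingAllocator.init(std.testing.allocator, .{
        .resize_fail_index = 0, // fail the very first resize() call
    });
    const gpa = failing.allocator();

    const buf = try gpa.alloc(u8, 8);
    defer gpa.free(buf);
    // resize() reports failure; the caller must fall back gracefully.
    try std.testing.expect(!gpa.resize(buf, 16));
}
```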
2023-09-06 19:06:32 +03:00
Gregory Mullen
f74e10cd47 Update default stack frames for general_purpose_allocator.zig
Created from a conversation with @andrewrk on IRC: memory leaks when using ArrayList can be inconvenient to debug when the stack frame size is 4, because the entirety of the printed frame is within the Zig stdlib and not in the user's calling stack. Increasing this to 6 for Debug builds gives 2 frames of user code. I increased the frame size for tests as well by the equivalent factor, but I'm unconvinced that's actually desirable.
2023-08-21 11:22:22 -07:00