62 Commits

Author SHA1 Message Date
riverbl
a0732117d0 HashMap: add removeByPtr 2022-01-24 20:29:05 +02:00
djg
4731a6e5d5
std: hash_map: optimize isFree/isTombstone (#10562)
- Add an `Metadata.isFree` helper method.
- Implement `Metadata.isTombstone` and `Metadata.isFree` with `@bitCast` then comparing to a constant. I assume `@bitCast`-then-compare is faster than the old method because it only involves one comparison, and doesn't require bitmasking.
- Summary of benchmarked changes (`gotta-go-fast`, run locally, compared to master):
  - 3/4 of the hash map benchmarks used ~10% fewer cycles
  - The last one (project Euler) shows 4% fewer cycles.
2022-01-10 23:54:45 -05:00
Andrew Kelley
6d04de706a Revert "std: optimize hash_map probe loop condition"
This reverts commit 11803a3a569205d640c7ec0b0aedba83f47a6e64.

Observations from the performance dashboard:
 * strictly worse in terms of CPU instructions
 * slightly worse wall time (but this can be noisy)
 * sometimes better, sometimes worse for branch predictions

Given that the commit was introducing complexity for optimization's
sake, these performance changes do not seem worth it.
2021-12-17 16:56:43 -07:00
sentientwaffle
11803a3a56 std: optimize hash_map probe loop condition
See https://github.com/ziglang/zig/pull/10337 for context.

In #10337 the `available` tracking fix necessitated an additional condition on the probe loop in both `getOrPut` and `getIndex` to prevent an infinite loop. Previously, this condition was implicit thanks to the guaranteed presence of a free slot.

The new condition hurts the `HashMap` benchmarks (https://github.com/ziglang/zig/pull/10337#issuecomment-996432758).

This commit removes that extra condition on the loop. Instead, when probing, first check whether the "home" slot is the target key — if so, return it. Otherwise, save the home slot's metadata to the stack and temporarily "free" the slot (but don't touch its value). Then continue with the original loop. Once again, the loop will be implicitly broken by the new "free" slot. The original metadata is restored before the function returns.

`getOrPut` has one additional gotcha — if the home slot is a tombstone and `getOrPut` misses, then the home slot is is written with the new key; that is, its original metadata (the tombstone) is not restored.

Other changes:

- Test hash map misses.
- Test using `getOrPutAssumeCapacity` to get keys at the end (along with `get`).
2021-12-17 15:21:41 -08:00
sentientwaffle
ef0566df78 std: count hash_map tombstones as available
When entries are inserted and removed into a hash map at an equivalent rate (maintaining a mostly-consistent total count of entries), it should never need to be resized. But `HashMapUnmanaged.available` does not presently count tombstoned slots as "available", so this put/remove pattern eventually panics (assertion failure) when `available` reaches `0`.

The solution implemented here is to count tombstoned slots as "available". Another approach (which hashbrown (b3eaf32e60/src/raw/mod.rs (L1455-L1542)) takes) would be to rehash all entries in place when there are too many tombstones. This is more complex but avoids an `O(n)` bad case when the hash map is full of many tombstones.
2021-12-16 19:11:53 -08:00
Lee Cannon
85de022c56
allocgate: std Allocator interface refactor 2021-11-30 23:32:47 +00:00
Andrew Kelley
902df103c6 std lib API deprecations for the upcoming 0.9.0 release
See #3811
2021-11-30 00:13:07 -07:00
Ominitay
c1a5ff34f3 std.rand: Refactor Random interface
These changes have been made to resolve issue #10037. The `Random`
interface was implemented in such a way that causes significant slowdown
when calling the `fill` function of the rng used.

The `Random` interface is no longer stored in a field of the rng, and is
instead returned by the child function `random()` of the rng. This
avoids the performance issues caused by the interface.
2021-10-27 16:07:48 -04:00
Ryan Liptak
59f5053bed Update all ensureCapacity calls to the relevant non-deprecated version 2021-09-19 13:52:56 +02:00
FnControlOption
fdf1918b39 std.hash_map: add StringIndexAdapter and StringIndexContext 2021-09-03 06:50:27 -07:00
Andrew Kelley
c05a20fc8c std: reorganization that allows new usingnamespace semantics
The proposal #9629 is now accepted, usingnamespace stays but no longer
puts identifiers in scope.
2021-09-01 17:54:06 -07:00
fn ⌃ ⌥
b25e58b0ac
std.hash_map: add getKey methods (#9607) 2021-08-31 00:32:34 -04:00
Andrew Kelley
d29871977f remove redundant license headers from zig standard library
We already have a LICENSE file that covers the Zig Standard Library. We
no longer need to remind everyone that the license is MIT in every single
file.

Previously this was introduced to clarify the situation for a fork of
Zig that made Zig's LICENSE file harder to find, and replaced it with
their own license that required annual payments to their company.
However that fork now appears to be dead. So there is no need to
reinforce the copyright notice in every single file.
2021-08-24 12:25:09 -07:00
Andrew Kelley
47f2463b5c std.HashMap: fix getPtrAdapted. AstGen: fix fn param iteration
There was a bug in stage2 regarding iteration of function parameter AST.
This resulted in a false negative "unused parameter" compile error,
which, when fixed, revealed a bug in the std lib HashMap implementation.
2021-08-05 23:17:29 -07:00
Isaac Freund
03156e5899 std/hash_map: fix ensureUnusedCapacity() over-allocating
Currently this function adds the desired unused capactiy to the current
total capacity instead of the current used capactiy.
2021-07-12 11:31:36 -07:00
Andrew Kelley
298a65ff4b std.HashMap: add ensureUnusedCapacity and ensureTotalCapacity
and deprecated ensureCapacity. This matches the pattern set by ArrayList
and ArrayHashMap already.
2021-07-07 00:38:10 -07:00
Jacob G-W
9fffffb07b fix code broken from previous commit 2021-06-21 17:03:03 -07:00
Jacob G-W
641ecc260f std, src, doc, test: remove unused variables 2021-06-21 17:03:03 -07:00
hadroncfy
1f29b75f08
HashMap.getOrPutAssumeCapacityAdapted should set key to undefined (#9138)
* std.hash_map.HashMap: getOrPutAssumeCapacityAdapted should set key to undefined

* add test for std.hash_map.HashMap.getOrPutAdapted
2021-06-18 08:52:30 +03:00
Andrew Kelley
138afd5cbf zig fmt 2021-06-10 20:13:43 -07:00
Martin Wickham
6953c8544b Fix return type of HashMap.getAdapted 2021-06-03 17:58:06 -05:00
Martin Wickham
fc9430f567 Breaking hash map changes for 0.8.0
- hash/eql functions moved into a Context object
- *Context functions pass an explicit context
- *Adapted functions pass specialized keys and contexts
- new getPtr() function returns a pointer to value
- remove functions renamed to fetchRemove
- new remove functions return bool
- removeAssertDiscard deleted, use assert(remove(...)) instead
- Keys and values are stored in separate arrays
- Entry is now {*K, *V}, the new KV is {K, V}
- BufSet/BufMap functions renamed to match other set/map types
- fixed iterating-while-modifying bug in src/link/C.zig
2021-06-03 17:02:16 -05:00
Andrew Kelley
597082adf4 Merge remote-tracking branch 'origin/master' into stage2-whole-file-astgen
Conflicts:
 * build.zig
 * src/Compilation.zig
 * src/codegen/spirv/spec.zig
 * src/link/SpirV.zig
 * test/stage2/darwin.zig
   - this one might be problematic; start.zig looks for `main` in the
     root source file, not `_main`. Not sure why there is an underscore
     there in master branch.
2021-05-15 21:44:38 -07:00
Sahnvour
2d4d4baa42 std.hash_map: use 7 bits of metadata instead of 6
we only effectively need 1 control bit to represent 2 special states for
the metadata (free and tombstone)
this should reduce the number of actual element equality tests, but since
it's very low already, the impact is negligible
2021-05-14 15:15:55 -04:00
Andrew Kelley
5619ce2406 Merge remote-tracking branch 'origin/master' into stage2-whole-file-astgen
Conflicts:
 * doc/langref.html.in
 * lib/std/enums.zig
 * lib/std/fmt.zig
 * lib/std/hash/auto_hash.zig
 * lib/std/math.zig
 * lib/std/mem.zig
 * lib/std/meta.zig
 * test/behavior/alignof.zig
 * test/behavior/bitcast.zig
 * test/behavior/bugs/1421.zig
 * test/behavior/cast.zig
 * test/behavior/ptrcast.zig
 * test/behavior/type_info.zig
 * test/behavior/vector.zig

Master branch added `try` to a bunch of testing function calls, and some
lines also had changed how to refer to the native architecture and other
`@import("builtin")` stuff.
2021-05-08 14:45:21 -07:00
Veikka Tuominen
fd77f2cfed std: update usage of std.testing 2021-05-08 15:15:30 +03:00
Andrew Kelley
429cd2b5dd std: change @import("builtin") to std.builtin 2021-04-15 19:06:39 -07:00
Meghan Denny
ab43f2376e lib/std: remove empty init from HashMapUnmanaged 2021-04-10 12:49:02 -07:00
Veikka Tuominen
0a7be71bc2
stage2 cbe: non pointer optionals 2021-03-08 00:33:56 +02:00
daurnimator
d4af35b3fe HashMap.put returns !void, not a !bool 2021-02-27 13:11:47 +02:00
Julius Putra Tanu Setiaji
2b3b355a23 Add compileError message for StringHashMap in AutoHashMap 2021-01-07 23:51:53 -08:00
Frank Denis
6c2e0c2046 Year++ 2020-12-31 15:45:24 -08:00
Vexu
7e30e83900
small fixes and zig fmt 2020-12-09 13:54:26 +02:00
Vexu
ae6f3291c0
std: fix HashMap.clearRetainingCapacity 2020-11-11 14:05:43 +02:00
Vexu
f70160f89c
std: fix HashMap.putAssumeCapacity 2020-11-11 13:57:08 +02:00
Zachary Meadows
edc40157eb Switch type of HashMap's count from usize to u32 (#6262) 2020-09-09 00:33:14 -04:00
Sahnvour
575fbd5e35 hash_map: rename to ArrayHashMap and add new HashMap implementation 2020-09-02 00:17:50 +02:00
Andrew Kelley
4a69b11e74 add license header to all std lib files
add SPDX license identifier
copyright ownership is zig contributors
2020-08-20 16:07:04 -04:00
Andrew Kelley
1a3f250f19 .debug_line incremental compilation initial support
Supports writing the first function. Still TODO is:
 * handling the .debug_line header growing too large
 * adding a new file to an existing compilation
 * adding an additional function to an existing file
 * handling incremental updates
 * adding the main IR debug ops for IR instructions

There are also issues to work out:
 * readelf --debug-dump=rawline is saying there is no .debug_str section
   even though there is
 * readelf --debug-dump=decodedline is saying the file index 0 is bad
   and reporting some other kind of corruption.
2020-08-02 19:25:26 -07:00
Sahnvour
f67ce1e35f make use of hasUniqueRepresentation to speed up hashing facilities, fastpath in getAutoHashFn is particularly important for hashmap performance
gives a 1.18x speedup on gotta-go-fast hashmap bench
2020-07-26 23:04:33 +02:00
daurnimator
4b48266bb7
std: add StringHashMapUnmanaged 2020-07-13 00:34:02 +10:00
Josh Wolfe
eddc68ad94 remove stray allocator parameter 2020-07-10 06:27:07 +00:00
Andrew Kelley
7bd0500589 Merge remote-tracking branch 'origin/master' into register-allocation 2020-07-08 20:46:06 -07:00
Vexu
485231deae
fix HashMap.clone() 2020-07-06 16:51:53 +03:00
Andrew Kelley
ad2ed457dd std: expose unmanaged hash maps
These are useful when you have many of them in memory, and already have
the allocator stored elsewhere.
2020-07-06 06:10:03 +00:00
Andrew Kelley
3a89f214aa update more HashMap API usage 2020-07-05 21:11:42 +00:00
Andrew Kelley
3c8b13d998 std hash map: do the pow2 improvement again
it's a noticeable speedup
2020-07-05 21:11:42 +00:00
Andrew Kelley
632acffcbd update std lib to new hash map API 2020-07-05 21:11:42 +00:00
Andrew Kelley
b3b6ccba50 reimplement std.HashMap
* breaking changes to the API. Some of the weird decisions from before
   are changed to what would be more expected.
   - `get` returns `?V`, use `getEntry` for the old API.
   - `put` returns `!void`, use `fetchPut` for the old API.
 * HashMap now has a comptime parameter of whether to store hashes with
   entries. AutoHashMap has heuristics on whether to set this parameter.
   For example, for integers, it is false, since equality checking is
   cheap, but for strings, it is true, since equality checking is
   probably expensive.
 * The implementation has a separate array for entry_index /
   distance_from_start_index. Entries no longer has holes; it is an
   ArrayList, and iteration is simpler and more cache coherent.
   This is inspired by Python's new dictionaries.
 * HashMap is separated into an "unmanaged" and a "managed" API. The
   unmanaged API is where the actual implementation is; the managed API
   wraps it and provides a more convenient API, storing the allocator.
 * Memory usage: When there are less than or equal to 8 entries, HashMap
   now incurs only a single pointer-size integer as overhead, opposed to
   using an ArrayList.
 * Since the entries array is separate from the indexes array, the holes
   in the indexes array take up less room than the holes in the entries
   array otherwise would. However the entries array also allocates
   additional capacity for appending into the array.
 * HashMap now maintains insertion order. Deletion performs a "swap
   remove". It's now possible to modify the HashMap while iterating.
2020-07-05 21:11:42 +00:00
Andrew Kelley
bae0c9b554 std.HashMap: allow ensureCapacity with a zero parameter 2020-06-02 14:41:45 -04:00