26129 Commits

Author SHA1 Message Date
Andrew Kelley
8ebebbd134 std.macho: remove alignment from LoadCommandIterator 2023-10-03 14:55:17 -07:00
Andrew Kelley
d8540dd708 std: add type-erased reader; base GenericReader on it
The idea here is to avoid code bloat by having only one actual io.Reader
implementation, which is type erased, and then implement a GenericReader
that preserves type information on top of that as thin glue code.

The strategy here is for that glue code to `@errSetCast` the result of
the type-erased reader functions, however, while trying to do that I
ran into #17343.
2023-10-03 14:55:17 -07:00
Krzysztof Wolicki
4df7f7c86a Add -fno-stack-protector to flags when building libzigcpp
This allows both debug and release builds to link to it
without forcing release builds to link to libssp
2023-10-03 11:21:21 -07:00
Andrew Kelley
87d09edf2d
Merge pull request #17352 from kcbanner/extern_union_comptime_memory
sema: Support reinterpreting extern/packed unions at comptime via field access
2023-10-03 11:20:08 -07:00
ocrap7
c933a7c58c llvm: remove extra copy of wrapped payloads 2023-10-03 11:16:11 -07:00
Andrew Kelley
5c3393dc2e
Merge pull request #17375 from xxxbxxx/packed-struct
codegen: fix field offsets in packed structs
2023-10-03 11:13:40 -07:00
Andrew Kelley
7733894761
Merge pull request #17341 from rzezeski/illumos-updates
Illumos/Solaris updates
2023-10-03 11:04:41 -07:00
Andrew Kelley
47f08605bd
Merge pull request #17383 from squeek502/gpa-optim-treap
GeneralPurposeAllocator: Considerably improve worst case performance
2023-10-03 10:58:48 -07:00
Ian Johnson
6734d2117e Add behavior test for empty tuple type
Closes #16412
2023-10-03 16:01:08 +03:00
Andrew Kelley
df4853a627
Merge pull request #17363 from ziglang/tar-symlinks
introduce the `zig fetch` subcommand and symlink support in zig packages
2023-10-03 03:33:26 -07:00
Frank Denis
4930094e62 valgrind.memcheck: fix makeMem*()
The `makeMem*()` functions crashed under valgrind in Debug and
ReleaseSafe modes.

The reason being that `doMemCheckClientRequestExpr()` returns `0`
when not running under Valgrind, and `maxInt(usize)` when running
under Valgrind.

Thus, `@as(i1, @intCast(maxInt(usize)))` always fails and these
functions crashed before returning.

That being said, what these functions used to return was quite
unexpected: `0` on error and `-1` on success (=running under valgrind).
That doesn't match any Zig nor C conventions.

But that return value doesn't seem to be very useful. Either we are
running under Valgrind or we are not. There's no point in checking this
for every single call. Applications are likely to always discard it.

So, just return a `void` instead.

Also avoid function comments that start with `Similarly, ...` because
that doesn't refer to anything in the context of autodoc or in IDEs.
2023-10-03 02:51:01 -07:00
Ryan Liptak
95f4c1532a Treap: do not set key to undefined in remove to allow re-use of removed nodes 2023-10-03 01:21:51 -07:00
Ryan Liptak
cf3572a66b GeneralPurposeAllocator: Considerably improve worst case performance
Before this commit, GeneralPurposeAllocator could run into incredibly degraded performance in scenarios where the bucket count for a particular size class grew to be large. For example, if exactly `slot_count` allocations of a single size class were performed and then all of them were freed except one, then the bucket for those allocations would have to be kept around indefinitely. If that pattern of allocation were done over and over, then the bucket list for that size class could grow incredibly large.

This allocation pattern has been seen in the wild: https://github.com/Vexu/arocc/issues/508#issuecomment-1738275688

In that case, the length of the bucket list for the `128` size class would grow to tens of thousands of buckets and cause Debug runtime to balloon to ~8 minutes whereas with the c_allocator the Debug runtime would be ~3 seconds.

To address this, there are three different changes happening here:

1. std.Treap is used instead of a doubly linked list for the lists of buckets. This takes the time complexity of searchBucket [used in resize and free] from O(n) to O(log n), but increases the time complexity of insert from O(1) to O(log n) [before, all new buckets would get added to the head of the list]. Note: Any data structure with O(log n) or better search/insert/delete would also work for this use-case.
2. If the 'current' bucket for a size class is full, the list of buckets is never traversed and instead a new bucket is allocated. Previously, traversing the bucket list could only find a non-full bucket in specific circumstances, and only because of a separate optimization that is no longer needed (before, after any resize/free, the affected bucket would be moved to the head of the bucket list to allow searchBucket to perform better on average). Now, the current_bucket for each size class only changes when either (1) the current bucket is emptied/freed, or (2) a new bucket is allocated (due to the current bucket being full or null). Because each bucket's alloc_cursor only moves forward (i.e. slots within a bucket are never re-used), we can therefore always know that any bucket besides the current_bucket will be full, so traversing the list in the hopes of finding an existing non-full bucket is entirely pointless.
3. Size + alignment information for small allocations has been moved into the Bucket data instead of keeping it in a separate HashMap. This offers an improvement over the HashMap since whenever we need to get/modify the length/alignment of an allocation it's extremely likely we will already have calculated any bucket-related information necessary to get the data.

The first change is the most relevant and accounts for most of the benefit here. Also note that the overall functionality of GeneralPurposeAllocator is unchanged.

In the degraded `arocc` case, these changes bring Debug performance from ~8 minutes to ~20 seconds.

Benchmark 1: test-master.bat
  Time (mean ± σ):     481.263 s ±  5.440 s    [User: 479.159 s, System: 1.937 s]
  Range (min … max):   477.416 s … 485.109 s    2 runs

Benchmark 2: test-optim-treap.bat
  Time (mean ± σ):     19.639 s ±  0.037 s    [User: 18.183 s, System: 1.452 s]
  Range (min … max):   19.613 s … 19.665 s    2 runs

Summary
  'test-optim-treap.bat' ran
   24.51 ± 0.28 times faster than 'test-master.bat'

Note: Much of the time taken on Windows in this particular case is related to gathering stack traces. With `.stack_trace_frames = 0` the runtime goes down to 6.7 seconds, which is a little more than 2.5x slower compared to when the c_allocator is used.

These changes may or mat not introduce a slight performance regression in the average case:

Here's the standard library tests on Windows in Debug mode:

Benchmark 1 (10 runs): std-tests-master.exe
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          16.0s  ± 30.8ms    15.9s  … 16.1s           1 (10%)        0%
  peak_rss           42.8MB ± 8.24KB    42.8MB … 42.8MB          0 ( 0%)        0%
Benchmark 2 (10 runs): std-tests-optim-treap.exe
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          16.2s  ± 37.6ms    16.1s  … 16.3s           0 ( 0%)        💩+  1.3% ±  0.2%
  peak_rss           42.8MB ± 5.18KB    42.8MB … 42.8MB          0 ( 0%)          +  0.1% ±  0.0%

And on Linux:

Benchmark 1: ./test-master
  Time (mean ± σ):     16.091 s ±  0.088 s    [User: 15.856 s, System: 0.453 s]
  Range (min … max):   15.870 s … 16.166 s    10 runs
 
Benchmark 2: ./test-optim-treap
  Time (mean ± σ):     16.028 s ±  0.325 s    [User: 15.755 s, System: 0.492 s]
  Range (min … max):   15.735 s … 16.709 s    10 runs
 
Summary
  './test-optim-treap' ran
    1.00 ± 0.02 times faster than './test-master'
2023-10-03 01:21:51 -07:00
Veikka Tuominen
0bdbd3e235 Sema: fix issues in @errorCast with error unions 2023-10-03 00:45:48 -07:00
xdBronch
c9c3ee704c correctly detect apple a15 and a16 chips 2023-10-03 00:36:59 -07:00
Xavier Bouchoux
405705cb76 codegen: fix byte-aligned field offsets in unaligned nested packed structs 2023-10-03 05:34:19 +00:00
Xavier Bouchoux
62d178e91a codegen: fix field offsets in packed structs
* add nested packed struct/union behavior tests
 * use ptr_info.packed_offset rather than trying to duplicate the logic from Sema.structFieldPtrByIndex()
 * use the container_ptr_info.packed_offset to account for non-aligned nested structs.
 * dedup type.packedStructFieldBitOffset() and module.structPackedFieldBitOffset()
2023-10-03 06:39:20 +02:00
Ryan Liptak
da7ecfb2de Treap: Add InorderIterator 2023-10-02 21:11:14 -07:00
Ian Johnson
573a13f8be Support symlinks for git+http(s) dependencies 2023-10-02 18:14:57 -07:00
kcbanner
1b8a50ea5e union: skip failing tests on ppc 2023-10-02 20:39:02 -04:00
Andrew Kelley
21181181bf zig fetch: enhanced error reporting
* Package: use std.tar diagnostics to give detailed error messages
* std.tar: add diagnostic for unsupported file type
2023-10-02 17:02:25 -07:00
Andrew Kelley
ef9966c985 introduce the 'zig fetch' command + symlink support
zig fetch [options] <url>
zig fetch [options] <path>

Fetches a package which is found at <url> or <path> into the global
cache directory, printing the package hash to stdout.

Closes #16972
Related to #14280

Additionally, this commit:

* Adds uncompressed .tar support to package fetching
* Introduces symlink support to package fetching
2023-10-02 17:02:25 -07:00
Andrew Kelley
309c53295f std.fs: give readLink an explicit error set 2023-10-02 17:02:24 -07:00
Andrew Kelley
a4352982b3 compiler: extract package hashing logic to separate file
There are no functional changes in this commit.
2023-10-02 17:02:24 -07:00
Andrew Kelley
a5144d19b7 std.tar: support symlinks
closes #16678
2023-10-02 17:02:24 -07:00
Carl Ã…stholm
412d863ba5 std.Build: expose -idirafter to the build system 2023-10-02 16:22:07 -07:00
Ryan Zezeski
dd026588d0 illumos: fix dynamic linker path 2023-10-02 16:37:37 -06:00
Ryan Zezeski
42ad3e265c illumos does not have versions
The 5.11 in uname is not something that is ever updated. There is no
versioning of the illumos system in general. Illumos prefers to rely
on feature detection.

I can't say what Solaris does these days as I do not work at Oracle;
so I left it alone.
2023-10-02 16:23:17 -06:00
Stephen Gregoratto
285970982a Add illumos OS tag
- Adds `illumos` to the `Target.Os.Tag` enum. A new function,
  `isSolarish` has been added that returns true if the tag is either
  Solaris or Illumos. This matches the naming convention found in Rust's
  `libc` crate[1].
- Add the tag wherever `.solaris` is being checked against.
- Check for the C pre-processor macro `__illumos__` in CMake to set the
  proper target tuple. Illumos distros patch their compilers to have
  this in the "built-in" set (verified with `echo | cc -dM -E -`).

  Alternatively you could check the output of `uname -o`.

Right now, both Solaris and Illumos import from `c/solaris.zig`. In the
future it may be worth putting the shared ABI bits in a base file, and
mixing that in with specific `c/solaris.zig`/`c/illumos.zig` files.

[1]: 6e02a329a2/src/unix/solarish
2023-10-02 15:31:49 -06:00
Stephen Gregoratto
51fa7ef1c4 solaris: set correct target tuple in CMake 2023-10-02 15:31:32 -06:00
kcbanner
fb33bc99e1 sema: handle big-endian when bitcasting between different-sized union fields
Updated the tests to also run at runtime, and moved them to union.zig
2023-10-02 13:28:13 -04:00
kcbanner
d657b6c0e2 sema: support reinterpreting extern/packed unions at comptime via field access
My previous change for reading / writing to unions at comptime did not handle
union field read/writes correctly in all cases. Previously, if a field was
written to a union, it would overwrite the entire value. This is problematic
when a field of a larger size is subsequently read, because the value would not
be long enough, causing a panic.

Additionally, the writing behaviour itself was incorrect. Writing to a field of
a packed or extern union should only overwrite the bits corresponding to that
field, allowing for memory reintepretation via field writes / reads.

I addressed these problems as follows:

Add the concept of a "backing type" for extern / packed unions
(`Type.unionBackingType`).  For extern unions, this is a `u8` array, for packed
unions it's an integer matching the `bitSize` of the union. Whenever union
memory is read at comptime, it's read as this type.

When union memory is written at comptime, the tag may still be known. If so, the
memory is written using the tagged type. If the tag is unknown (because this
union had previously been read from memory), it's simply written back out as the
backing type.

I added `write_packed` to the `reinterpret` field of
`ComptimePtrMutationKit`. This causes writes of the operand to be packed - which
is necessary when writing to a field of a packed union. Without this, writing a
value to a `u1` field would overwrite the entire byte it occupied.

The final case to address was reading a different (potentially larger) field
from a union when it was written with a known tag. To handle this, a new kind of
bitcast was introduced (`bitCastUnionFieldVal`) which supports reading a larger
field by using a backing buffer that has the unwritten bits set to
undefined. The reason to support this (vs always just writing the union as it's
backing type), is that no reads to larger fields ever occur at comptime, it
would be strictly worse to have spent time writing the full backing type.
2023-10-02 13:15:28 -04:00
Andrew Kelley
53775b0999 CLI: fix -fno-clang
Aro/Clang detection logic treated `-fno-clang` the same as `-fclang`.
2023-10-01 21:37:02 -07:00
Veikka Tuominen
fc4d53e2ea
Merge pull request #17221 from Vexu/aro-translate-c
Aro translate-c
2023-10-02 07:08:53 +03:00
Jacob Young
0f1652dc60
Merge pull request #17262 from jacobly0/x86_64
x86_64: support operations that are implemented in compiler_rt
2023-10-01 20:45:42 -04:00
kcbanner
62a0fbdaef air_print: fix panic when printing .abs 2023-10-01 15:08:50 -07:00
Veikka Tuominen
5792570197 add Aro sources as a dependency
ref: 5688dbccfb58216468267a0f46b96bed7013715a
2023-10-01 23:51:54 +03:00
Veikka Tuominen
47050fbb7d aro translate-c: update to cast builtin changes 2023-10-01 23:51:54 +03:00
Veikka Tuominen
7ec729b3ae aro-translate-c: move shared types to a common namespace 2023-10-01 23:51:54 +03:00
Veikka Tuominen
31ecf75311 aro-translate-c: translate enums 2023-10-01 23:51:54 +03:00
Veikka Tuominen
fef94da958 add compiler flag for selecting C frontend 2023-10-01 23:51:54 +03:00
Jacob Young
da335f0ee4 x86_64: implement float @sqrt builtin 2023-10-01 15:09:52 -04:00
Jacob Young
fbe5bf469e x86_64: implement float arithmetic builtins 2023-10-01 15:09:52 -04:00
Jacob Young
1eb023908d x86_64: implement float round builtins 2023-10-01 15:09:52 -04:00
Jacob Young
c3042cbe12 x86_64: add missing caller preserved regs
All allocatable registers have to be either callee preserved or caller
preserved.
2023-10-01 15:09:52 -04:00
Jacob Young
8470652f10 x86_64: implement float compare and cast builtins 2023-10-01 15:09:52 -04:00
Jacob Young
6d5cbdb863 behavior: cleanup floatop tests 2023-10-01 15:09:52 -04:00
Jacob Young
3bd1b9e15f x86_64: implement and test unary float builtins 2023-10-01 15:09:52 -04:00
Jakub Konka
af40bce08a x86_64: emit R_X86_64_GOT32 for non-PIC GOT references 2023-10-01 21:09:35 +02:00
Andrew Kelley
8e1421f19e
Merge pull request #17346 from Vexu/errSetCast
Sema: implement `@errSetCast` for error unions
2023-10-01 12:00:17 -07:00