161 Commits

Author SHA1 Message Date
Andrew Kelley
41dd2beaac compiler-rt: math functions reorg
* unify the logic for exporting math functions from compiler-rt,
   with the appropriate suffixes and prefixes.
   - add all missing f128 and f80 exports. Functions with missing
     implementations call other functions and have TODO comments.
   - also add f16 functions
 * move math functions from freestanding libc to compiler-rt (#7265)
 * enable all the f128 and f80 code in the stage2 compiler and behavior
   tests (#11161).
 * update std lib to use builtins rather than `std.math`.
2022-04-27 12:20:44 -07:00
Cody Tapscott
96d86e3465 compiler_rt: Fix rounding edge case for __mulxf3 2022-04-25 17:21:09 -07:00
Cody Tapscott
d930e015c7 compiler_rt: Implement __divxf3 for f80 2022-04-25 17:21:09 -07:00
Cody Tapscott
6c0114e044 compiler_rt: Implement fmodx for f80 2022-04-25 17:21:05 -07:00
Cody Tapscott
b2950866b1 compiler_rt: Fix rounding/NaN handling for f80 add/sub
There were a few minor bugs in the rounding behavior and Inf/NaN
handling for the f80 __addxf3 and __subtf3 functions.

This change updates the original generic implementation to correctly
handle f80 floats, including the explicit integer bit.
2022-04-18 21:45:46 -07:00
Cody Tapscott
d760cae2b1 compiler_rt: implement __mulxf3 for f80 2022-04-18 20:46:03 -07:00
Koakuma
33956b8e55 compiler_rt: atomics: clr -> clrb 2022-04-15 19:48:25 +07:00
Koakuma
fac2a2e754 compiler_rt: atomics: Formatting change for flag definition 2022-04-15 19:44:46 +07:00
Koakuma
5b283fba77 compiler_rt: atomics: Add Leon CAS instruction check for SPARC atomics 2022-04-15 19:40:36 +07:00
Koakuma
6aa89115f9 ompiler_rt: atomics: Split long lines and add comment on constants 2022-04-15 19:31:55 +07:00
Koakuma
274e2a1ef1 compiler_rt: atomics: Add TAS lock support for SPARC
Some SPARC CPUs (particularly old and/or embedded ones) only has atomic
TAS instruction available (`ldstub`). This adds support for emitting
that instruction in the spinlock.
2022-04-15 08:09:57 +07:00
Cody Tapscott
319555a669 Add floatFractionalBits to replace floatMantissaDigits 2022-04-12 12:33:16 -07:00
Cody Tapscott
04dd43934a Skip some floatXiYf tests on non-x86 platforms
These need to be skipped because of a bug with `@floatToInt`
on stage1:  https://github.com/ziglang/zig/issues/11408
2022-04-12 10:25:29 -07:00
Cody Tapscott
b5d5685a4e compiler_rt: Implement floatXiYf/fixXfYi, incl f80
This change:
 - Adds  generic implementation of the float -> integer conversion
   functions floatXiYf, including support for f80
 - Updates the existing implementation of integer -> float conversion
   fixXiYf to support f16 and f80
 - Fixes the handling of the explicit integer bit in `__trunctfxf2`
 - Combines the test cases for fixXfYi/floatXiYf into a single file
 - Renames `fmodl` to `fmodq`, since it operates on 128-bit floats

The new implementation for floatXiYf has been benchmarked, and generally
provides equal or better performance versus the current implementations:

Throughput (MiB/s) - Before
     |    u32   |    i32   |    u64   |    i64   |   u128   |   i128   |
-----|----------|----------|----------|----------|----------|----------|
 f16 |     none |     none |     none |     none |     none |     none |
 f32 |  2231.67 |  2001.19 |  1745.66 |  1405.77 |  2173.99 |  1874.63 |
 f64 |  1407.17 |  1055.83 |  2911.68 |  2437.21 |  1676.05 |  1476.67 |
 f80 |     none |     none |     none |     none |     none |     none |
f128 |   327.56 |   321.25 |   645.92 |   654.52 |  1153.56 |  1096.27 |

Throughput (MiB/s) - After
     |    u32   |    i32   |    u64   |    i64   |   u128   |   i128   |
-----|----------|----------|----------|----------|----------|----------|
 f16 |  1407.61 |  1637.25 |  3555.03 |  2594.56 |  3680.60 |  3063.34 |
 f32 |  2101.36 |  2122.62 |  3225.46 |  3123.86 |  2860.05 |  1985.21 |
 f64 |  1395.57 |  1314.87 |  2409.24 |  2196.30 |  2384.95 |  1908.15 |
 f80 |   475.53 |   457.92 |   884.50 |   812.12 |  1475.27 |  1382.16 |
f128 |   359.60 |   350.91 |   723.08 |   706.80 |  1296.42 |  1198.87 |
2022-04-12 10:25:26 -07:00
viri
9b5c02022f
compiler-rt(divtf3): fix remark, add more tests 2022-04-08 20:43:27 -06:00
viri
e46c612503
use math/float.zig everywhere 2022-04-07 05:04:38 -06:00
Meghan
b73cf97c93
replace other uses of std.meta.Vector with @Vector (#11346) 2022-03-30 14:12:14 -04:00
Jan Philipp Hafer
5d89955543 compiler_rt: specify goals, organize README and compiler_rt.zig
* goals
  - zig as linker for object files generated by other compilers
  - zig-specific runtime features for eventual standardisation

* changes
  - missing routines are marked with `missing`
  - structure inspired by libgcc docs, but improved order and wording
  - rename misspelled functions
  - reorder and rephrase compiler_rt.zig to reflect documentation
  - potential decimal float or fixed-point arithmetic support:
    * 'Decimal float library routines' ca. 120 functions
    * 'Fixed-point fractional library routines' ca. 300 functions

thanks to @Vexu for multiple reviews and @scheibo for review
2022-02-23 16:38:51 -05:00
Mateusz Radomski
b5f8fb85e6
Implement f128 @rem 2022-02-13 15:37:38 +02:00
Andrew Kelley
a024aff932 make f80 less hacky; lower as u80 on non-x86
Get rid of `std.math.F80Repr`. Instead of trying to match the memory
layout of f80, we treat it as a value, same as the other floating point
types. The functions `make_f80` and `break_f80` are introduced to
compose an f80 value out of its parts, and the inverse operation.

stage2 LLVM backend: fix pointer to zero length array tripping LLVM
assertion. It now checks for when the element type is a zero-bit type
and lowers such thing the same way that pointers to other zero-bit types
are lowered.

Both stage1 and stage2 LLVM backends are adjusted so that f80 is lowered
as x86_fp80 on x86_64 and i386 architectures, and identical to a u80 on
others. LLVM constants are lowered in a less hacky way now that #10860
is fixed, by using the expression `(exp << 64) | fraction` using llvm
constants.

Sema is improved to handle c_longdouble by recursively handling it
correctly for whatever the float bit width is. In both stage1 and
stage2.
2022-02-12 11:18:23 +01:00
Jan Philipp Hafer
fc59a04061 compiler_rt: add subo
- approach by Hacker's Delight with wrapping subtraction
- performance expected to be similar to addo
- tests with all relevant combinations of min,max with -1,0,+1 and all
  combinations of sequences +-1,2,4..,max
2022-02-08 02:14:29 -05:00
matu3ba
3db130ff3d
compiler_rt: add addo (#10824)
- approach by Hacker's Delight with wrapping addition
- ca. 1.10x perf over the standard approach on my laptop
- tests with all combinations of min,max with -1,0,+1 and combinations of
  sequences +-1,2,4..,max
2022-02-07 13:27:21 -05:00
Andrew Kelley
d4805472c3 compiler_rt: addXf3: add coercion to @clz
We're going to remove the first parameter from this function in the
future. Stage2 already ignores the first parameter. So we put an `@as`
in here to make it work for both.
2022-02-06 20:06:00 -07:00
Andrew Kelley
8dcb1eba60
Merge pull request #10738 from Vexu/f80
Add compiler-rt functions for f80
2022-02-05 20:57:32 -05:00
Jan Philipp Hafer
01d48e55a5 compiler_rt: optimize mulo
- use usize to decide if register size is big enough to store
  multiplication result or if division is necessary
- multiplication routine with check of integer bounds
- wrapping multipliation and division routine from Hacker's Delight
2022-02-05 01:35:46 -05:00
Veikka Tuominen
6a736f0c8c compiler-rt: add add/sub for f80 2022-02-04 22:38:13 +02:00
Veikka Tuominen
9bbd3ab257 compiler-rt: add comparison functions for f80 2022-02-04 22:22:43 +02:00
Veikka Tuominen
72cef17b1a compiler-rt: add trunc functions for f80 2022-02-04 22:18:44 +02:00
Veikka Tuominen
5c4ef1a64c compiler-rt: add extend functions for f80 2022-02-04 22:16:07 +02:00
John Schmidt
9ee67b967b stage2: avoid inferred struct in os_version_check.zig
Before this commit, compiling an empty main with Stage 2 on macOS x86_64 results in

```
../stage2/bin/zig build-exe -ODebug -fLLVM empty_main.zig
error: sub-compilation of compiler_rt failed
    [...]/zig/stage2/lib/zig/std/special/compiler_rt/os_version_check.zig:26:10: error: TODO: Sema.zirStructInit for runtime-known struct values
```

By assigning the value to a variable we can sidestep the issue for now.
2022-01-26 00:48:05 -05:00
Meghan
c08b190c69
lint: duplicate import (#10519) 2022-01-07 00:06:06 -05:00
Andrew Kelley
5e086b2b4c compiler-rt: small refactor in atomics 2022-01-02 17:58:54 -07:00
Andrew Kelley
b4d6e85a33 Sema: implement peer type resolution of signed and unsigned ints
This allows stage2 to build more of compiler-rt.

I also changed `-%` to `-` for comptime ints in the div and mul
implementations of compiler-rt. This is clearer code and also happens to
work around a bug in stage2.
2022-01-02 14:11:37 -07:00
Andrew Kelley
6b14c58f63 compiler-rt: simplify implementations
This improves readability as well as compatibility with stage2. Most of
compiler-rt is now enabled for stage2 with just a few functions disabled
(until stage2 passes more behavior tests).
2022-01-02 13:16:17 -07:00
Andrew Kelley
be5130ec53 compiler_rt: move more functions to the stage2 section
also move more already-passing behavior tests to the passing section.
2021-12-29 00:39:25 -07:00
Jan Philipp Hafer
17046674a7 compiler_rt: add __negvsi2, __negvdi2, __negvti2
- neg can only overflow, if a == MIN
- case `-0` is properly handled by hardware, so overflow check by comparing
  `a == MIN` is sufficient
- tests: MIN, MIN+1, MIN+4, -42, -7, -1, 0, 1, 7..

See #1290
2021-12-27 14:35:45 -08:00
Jan Philipp Hafer
405ff911da compiler_rt: add __absvsi2, __absvdi2, __absvti2
- abs can only overflow, if a == MIN
- comparing the sign change from wrapping addition is branchless
- tests: MIN, MIN+1,..MIN+4, -42, -7, -1, 0, 1, 7..

See #1290
2021-12-26 13:21:18 -08:00
Isaac Freund
9f9f215305
stage1, stage2: rename c_void to anyopaque (#10316)
zig fmt now replaces c_void with anyopaque to make updating
code easy.
2021-12-19 00:24:45 -05:00
Andrew Kelley
3532abe0c6 compiler_rt: reorganize in a way that stage2 understands
Before this commit, stage2 behavior tests are regressed; it cannot build
compiler-rt.
2021-12-15 14:23:28 -07:00
Jan Philipp Hafer
20328e976f compiler_rt: add __cmpXi2 and __ucmpXi2
- adds __cmpsi2, __cmpdi2, __cmpti2
- adds __ucmpsi2, __ucmpdi2, __ucmpti2
- use 2 if statements with 2 temporaries and a constant
- tests: MIN, MIN+1, MIN/2, -1, 0, 1, MAX/2, MAX-1, MAX if applicable

See #1290
2021-12-14 14:21:30 -08:00
Jan Philipp Hafer
0550198c98 compiler_rt: simplify popcount "magic constants"
- magic constants are nicer to construct ie with
  (~@as(unsigned type, 0) / 3) == 0x55...55
- thanks to Stefan Kanthak for the idea
2021-12-14 14:19:51 -08:00
Jan Philipp Hafer
eb1e75b2b8 compiler_rt: refactor __mulodi2 and __muloti2 to get __mulosi2
- use comptime instead of 2 identical implementations
- tests: port missing tests and link to archived llvm-mirror release 80

See #1290
2021-12-14 14:16:24 -08:00
Jan Philipp Hafer
c56663dee8 compiler_rt: add __negsi2, __negdi2, __negti2
- use negXi2.zig to prevent confusion with negXf2.zig
- used for size optimized builds and machines without carry instruction
- tests: special cases 0, -INT_MIN
  * use divTrunc range and shift with constant offsets

See #1290
2021-12-14 14:14:31 -08:00
Jan Philipp Hafer
efdb94486b compiler_rt: add __bswapsi2, __bswapdi2 and __bswapti2
- each byte gets masked, shifted and combined
- use boring masks instead of comptime for readability
- tests: bit patterns with reverse operation, if applicable

See #1290
2021-12-11 01:43:37 -08:00
matu3ba
282e2c714f
compiler_rt: add __ffssi2, __ffsdi2 and __ffsti2 (#10268)
See #1290
2021-12-04 16:23:33 -05:00
Jan Philipp Hafer
e4c053f047 compiler_rt: add __paritysi2, __paritydi2, __parityti2
- use Bit Twiddling Hacks: Compute parity in parallel
- test cases derived from popcount.zig
- tests: compare naive approach 10_000 times with random numbers created
  from naive seed 42
- compiler_rt.zig: sort by LLVM builtin order and add comments to improve structure

See #1290
2021-12-01 13:35:19 -08:00
Jan Philipp Hafer
f2608df0fb compiler_rt: add __ctzsi2, __ctzdi2 and __ctzti2
- structure derived from count0bits.zig
- test cases derived from clzsi2_test.zig and
  cross-checked via short helper program

See #1290
2021-11-30 18:31:16 -08:00
Andrew Kelley
902df103c6 std lib API deprecations for the upcoming 0.9.0 release
See #3811
2021-11-30 00:13:07 -07:00
Jan Philipp Hafer
1ea650bb75 compiler_rt: add __popcountsi2, __popcountdi2 and __popcountti2
- apply simpler approach than LLVM for __popcountdi2
  taken from The Art of Computer Programming and generalized
- rename popcountdi2.zig to popcount.zig
- test cases derived from popcountdi2_test.zig
- tests: compare naive approach 10_000 times with
  random numbers created from naive seed 42

    See #1290
2021-11-29 12:50:25 -08:00
Stephen Gutekanst
b613210140
compiler_rt: implement __isPlatformVersionAtLeast (Objective-C @available expressions) for Darwin (#10232) 2021-11-29 14:54:23 -05:00