111 Commits

Author SHA1 Message Date
Marc Tiehuis
2947a2faab export __floatuntikf for PowerPC
PPC __floatuntikf is equivalent to X86 __floatuntitf.
2022-05-03 20:12:29 +12:00
Andrew Kelley
51f8ce1825 compiler-rt: fix powerpc f128 suffix 2022-04-27 18:26:47 -07:00
Andrew Kelley
1ac21cdec5 compiler-rt: avoid symbol conflicts
Weak aliases don't work on Windows, so we avoid exporting the `l` alias
on this platform for functions we know will collide.
2022-04-27 18:14:44 -07:00
Andrew Kelley
41dd2beaac compiler-rt: math functions reorg
* unify the logic for exporting math functions from compiler-rt,
   with the appropriate suffixes and prefixes.
   - add all missing f128 and f80 exports. Functions with missing
     implementations call other functions and have TODO comments.
   - also add f16 functions
 * move math functions from freestanding libc to compiler-rt (#7265)
 * enable all the f128 and f80 code in the stage2 compiler and behavior
   tests (#11161).
 * update std lib to use builtins rather than `std.math`.
2022-04-27 12:20:44 -07:00
Cody Tapscott
28c05b0a53 compiler_rt: Bypass fmodx impl. on stage2
This function is codegen'd incorrectly in stage2, since it fails to
generate the correct soft-float operations. This will be fixed once
issue #11161 is implemented
2022-04-25 19:38:59 -07:00
Cody Tapscott
d930e015c7 compiler_rt: Implement __divxf3 for f80 2022-04-25 17:21:09 -07:00
Cody Tapscott
6c0114e044 compiler_rt: Implement fmodx for f80 2022-04-25 17:21:05 -07:00
Cody Tapscott
d760cae2b1 compiler_rt: implement __mulxf3 for f80 2022-04-18 20:46:03 -07:00
Cody Tapscott
1c1cfe1533 Skip @rem/@mod tests on stage2, due to missing fmodl implementation 2022-04-12 10:25:29 -07:00
Cody Tapscott
b5d5685a4e compiler_rt: Implement floatXiYf/fixXfYi, incl f80
This change:
 - Adds  generic implementation of the float -> integer conversion
   functions floatXiYf, including support for f80
 - Updates the existing implementation of integer -> float conversion
   fixXiYf to support f16 and f80
 - Fixes the handling of the explicit integer bit in `__trunctfxf2`
 - Combines the test cases for fixXfYi/floatXiYf into a single file
 - Renames `fmodl` to `fmodq`, since it operates on 128-bit floats

The new implementation for floatXiYf has been benchmarked, and generally
provides equal or better performance versus the current implementations:

Throughput (MiB/s) - Before
     |    u32   |    i32   |    u64   |    i64   |   u128   |   i128   |
-----|----------|----------|----------|----------|----------|----------|
 f16 |     none |     none |     none |     none |     none |     none |
 f32 |  2231.67 |  2001.19 |  1745.66 |  1405.77 |  2173.99 |  1874.63 |
 f64 |  1407.17 |  1055.83 |  2911.68 |  2437.21 |  1676.05 |  1476.67 |
 f80 |     none |     none |     none |     none |     none |     none |
f128 |   327.56 |   321.25 |   645.92 |   654.52 |  1153.56 |  1096.27 |

Throughput (MiB/s) - After
     |    u32   |    i32   |    u64   |    i64   |   u128   |   i128   |
-----|----------|----------|----------|----------|----------|----------|
 f16 |  1407.61 |  1637.25 |  3555.03 |  2594.56 |  3680.60 |  3063.34 |
 f32 |  2101.36 |  2122.62 |  3225.46 |  3123.86 |  2860.05 |  1985.21 |
 f64 |  1395.57 |  1314.87 |  2409.24 |  2196.30 |  2384.95 |  1908.15 |
 f80 |   475.53 |   457.92 |   884.50 |   812.12 |  1475.27 |  1382.16 |
f128 |   359.60 |   350.91 |   723.08 |   706.80 |  1296.42 |  1198.87 |
2022-04-12 10:25:26 -07:00
jagt
b7f4045184 add compiler_rt ceilf/ceil/ceill
this should fix stage1 build error with msvc 2019
2022-03-20 13:29:23 -07:00
Andrew Kelley
2c434cddd6 AstGen: add missing coercion for const locals
A const local which had its init expression write to the result pointer,
but then gets elided to directly initialize, was missing the coercion to
the type annotation.
2022-03-15 16:41:10 -07:00
Andrew Kelley
3c1ebf9556 compiler_rt: avoid redundant exports when testing 2022-03-06 19:58:01 -07:00
Andrew Kelley
4c17b93f0a compiler_rt: additional powerpc-specific alias
Apparently LLVM lowers f128 fma to `fmaf128` instead of `fmal`.
2022-03-06 16:34:44 -07:00
Andrew Kelley
c68d9773df compiler-rt: make __fmax and fmaq aliases of fmal
on targets where that is the case.
2022-03-06 16:11:39 -07:00
Andrew Kelley
71b8760d3b stage2: rework @mulAdd
* mul_add AIR instruction: use `pl_op` instead of `ty_pl`. The type is
   always the same as the operand; no need to waste bytes redundantly
   storing the type.
 * AstGen: use coerced_ty for all the operands except for one which we
   use to communicate the type.
 * Sema: use the correct source location for requireRuntimeBlock in
   handling of `@mulAdd`.
 * native backends: handle liveness even for the functions that are
   TODO.
 * C backend: implement `@mulAdd`. It lowers to libc calls.
 * LLVM backend: make `@mulAdd` handle all float types.
   - improved fptrunc and fpext to handle f80 with compiler-rt calls.
 * Value.mulAdd: handle all float types and use the `@mulAdd` builtin.
 * behavior tests: revert the changes to testing `@mulAdd`. These
   changes broke the test coverage, making it only tested at
   compile-time.

Improved f80 support:
 * std.math.fma handles f80
 * move fma functions from freestanding libc to compiler-rt
   - add __fmax and fmal
   - make __fmax and fmaq only exported when they don't alias fmal.
   - make their linkage weak just like the rest of compiler-rt symbols.
 * removed `longDoubleIsF128` and replaced it with `longDoubleIs` which
   takes a type as a parameter. The implementation is now more accurate
   and handles more targets. Similarly, in stage2 the function
   CTypes.sizeInBits is more accurate for long double for more targets.
2022-03-06 16:11:39 -07:00
Andrew Kelley
67ba4c5679 stage2: add all functions to freestanding libc
Looks like all these functions are at least compiling successfully. I
haven't tried to run their test suites yet.

The one exception is `clone` which is crashing the compiler due to the
inline assembly. Still, this is progress!
2022-03-03 01:24:26 -07:00
Jan Philipp Hafer
5d89955543 compiler_rt: specify goals, organize README and compiler_rt.zig
* goals
  - zig as linker for object files generated by other compilers
  - zig-specific runtime features for eventual standardisation

* changes
  - missing routines are marked with `missing`
  - structure inspired by libgcc docs, but improved order and wording
  - rename misspelled functions
  - reorder and rephrase compiler_rt.zig to reflect documentation
  - potential decimal float or fixed-point arithmetic support:
    * 'Decimal float library routines' ca. 120 functions
    * 'Fixed-point fractional library routines' ca. 300 functions

thanks to @Vexu for multiple reviews and @scheibo for review
2022-02-23 16:38:51 -05:00
Mateusz Radomski
b5f8fb85e6
Implement f128 @rem 2022-02-13 15:37:38 +02:00
Jan Philipp Hafer
fc59a04061 compiler_rt: add subo
- approach by Hacker's Delight with wrapping subtraction
- performance expected to be similar to addo
- tests with all relevant combinations of min,max with -1,0,+1 and all
  combinations of sequences +-1,2,4..,max
2022-02-08 02:14:29 -05:00
matu3ba
3db130ff3d
compiler_rt: add addo (#10824)
- approach by Hacker's Delight with wrapping addition
- ca. 1.10x perf over the standard approach on my laptop
- tests with all combinations of min,max with -1,0,+1 and combinations of
  sequences +-1,2,4..,max
2022-02-07 13:27:21 -05:00
Veikka Tuominen
6a736f0c8c compiler-rt: add add/sub for f80 2022-02-04 22:38:13 +02:00
Veikka Tuominen
9bbd3ab257 compiler-rt: add comparison functions for f80 2022-02-04 22:22:43 +02:00
Veikka Tuominen
72cef17b1a compiler-rt: add trunc functions for f80 2022-02-04 22:18:44 +02:00
Veikka Tuominen
5c4ef1a64c compiler-rt: add extend functions for f80 2022-02-04 22:16:07 +02:00
Andrew Kelley
d819663543 disable some broken stuff for stage2 llvm backend on aarch64 2022-01-21 14:42:58 -07:00
Andrew Kelley
4d05f2ae5f remove zig_is_stage2 from @import("builtin")
Instead use the standarized option for communicating the
zig compiler backend at comptime, which is `zig_backend`. This was
introduced in commit 1c24ef0d0b09a12a1fe98056f2fc04de78a82df3.
2022-01-17 21:55:49 -07:00
Andrew Kelley
9bf2bda683 compiler_rt: one less exception for stage2 2022-01-13 00:32:48 -07:00
Andrew Kelley
b4d6e85a33 Sema: implement peer type resolution of signed and unsigned ints
This allows stage2 to build more of compiler-rt.

I also changed `-%` to `-` for comptime ints in the div and mul
implementations of compiler-rt. This is clearer code and also happens to
work around a bug in stage2.
2022-01-02 14:11:37 -07:00
Andrew Kelley
6b14c58f63 compiler-rt: simplify implementations
This improves readability as well as compatibility with stage2. Most of
compiler-rt is now enabled for stage2 with just a few functions disabled
(until stage2 passes more behavior tests).
2022-01-02 13:16:17 -07:00
Andrew Kelley
be5130ec53 compiler_rt: move more functions to the stage2 section
also move more already-passing behavior tests to the passing section.
2021-12-29 00:39:25 -07:00
Jan Philipp Hafer
17046674a7 compiler_rt: add __negvsi2, __negvdi2, __negvti2
- neg can only overflow, if a == MIN
- case `-0` is properly handled by hardware, so overflow check by comparing
  `a == MIN` is sufficient
- tests: MIN, MIN+1, MIN+4, -42, -7, -1, 0, 1, 7..

See #1290
2021-12-27 14:35:45 -08:00
Jan Philipp Hafer
405ff911da compiler_rt: add __absvsi2, __absvdi2, __absvti2
- abs can only overflow, if a == MIN
- comparing the sign change from wrapping addition is branchless
- tests: MIN, MIN+1,..MIN+4, -42, -7, -1, 0, 1, 7..

See #1290
2021-12-26 13:21:18 -08:00
Andrew Kelley
3532abe0c6 compiler_rt: reorganize in a way that stage2 understands
Before this commit, stage2 behavior tests are regressed; it cannot build
compiler-rt.
2021-12-15 14:23:28 -07:00
Jan Philipp Hafer
20328e976f compiler_rt: add __cmpXi2 and __ucmpXi2
- adds __cmpsi2, __cmpdi2, __cmpti2
- adds __ucmpsi2, __ucmpdi2, __ucmpti2
- use 2 if statements with 2 temporaries and a constant
- tests: MIN, MIN+1, MIN/2, -1, 0, 1, MAX/2, MAX-1, MAX if applicable

See #1290
2021-12-14 14:21:30 -08:00
Jan Philipp Hafer
eb1e75b2b8 compiler_rt: refactor __mulodi2 and __muloti2 to get __mulosi2
- use comptime instead of 2 identical implementations
- tests: port missing tests and link to archived llvm-mirror release 80

See #1290
2021-12-14 14:16:24 -08:00
Jan Philipp Hafer
c56663dee8 compiler_rt: add __negsi2, __negdi2, __negti2
- use negXi2.zig to prevent confusion with negXf2.zig
- used for size optimized builds and machines without carry instruction
- tests: special cases 0, -INT_MIN
  * use divTrunc range and shift with constant offsets

See #1290
2021-12-14 14:14:31 -08:00
Jan Philipp Hafer
efdb94486b compiler_rt: add __bswapsi2, __bswapdi2 and __bswapti2
- each byte gets masked, shifted and combined
- use boring masks instead of comptime for readability
- tests: bit patterns with reverse operation, if applicable

See #1290
2021-12-11 01:43:37 -08:00
matu3ba
282e2c714f
compiler_rt: add __ffssi2, __ffsdi2 and __ffsti2 (#10268)
See #1290
2021-12-04 16:23:33 -05:00
Jan Philipp Hafer
e4c053f047 compiler_rt: add __paritysi2, __paritydi2, __parityti2
- use Bit Twiddling Hacks: Compute parity in parallel
- test cases derived from popcount.zig
- tests: compare naive approach 10_000 times with random numbers created
  from naive seed 42
- compiler_rt.zig: sort by LLVM builtin order and add comments to improve structure

See #1290
2021-12-01 13:35:19 -08:00
Jan Philipp Hafer
f2608df0fb compiler_rt: add __ctzsi2, __ctzdi2 and __ctzti2
- structure derived from count0bits.zig
- test cases derived from clzsi2_test.zig and
  cross-checked via short helper program

See #1290
2021-11-30 18:31:16 -08:00
Jan Philipp Hafer
1ea650bb75 compiler_rt: add __popcountsi2, __popcountdi2 and __popcountti2
- apply simpler approach than LLVM for __popcountdi2
  taken from The Art of Computer Programming and generalized
- rename popcountdi2.zig to popcount.zig
- test cases derived from popcountdi2_test.zig
- tests: compare naive approach 10_000 times with
  random numbers created from naive seed 42

    See #1290
2021-11-29 12:50:25 -08:00
Stephen Gutekanst
b613210140
compiler_rt: implement __isPlatformVersionAtLeast (Objective-C @available expressions) for Darwin (#10232) 2021-11-29 14:54:23 -05:00
J.C. Moyer
939f51aebb compiler_rt: export floorf, floor, and floorl
This fixes a stage1 linker error when compiling with VS2022 on Windows.
2021-11-21 16:44:21 -05:00
Andrew Kelley
069c83d58c stage2: change @bitCast to always be by-value
After a discussion about language specs, this seems like the best way to
go, because it's simpler to reason about both for humans and compilers.

The `bitcast_result_ptr` ZIR instruction is no longer needed.

This commit also implements writing enums, arrays, and vectors to
virtual memory at compile-time.

This unlocked some more of compiler-rt being able to build, which
in turn unlocks saturating arithmetic behavior tests.

There was also a memory leak in the comptime closure system which is now
fixed.
2021-10-22 15:35:35 -07:00
Andrew Kelley
7f70c27e9d stage2: more division support
AIR:
 * div is renamed to div_trunc.
 * Add div_float, div_floor, div_exact.
   - Implemented in Sema and LLVM codegen. C backend has a stub.

Improvements to std.math.big.Int:
 * Add `eqZero` function to `Mutable`.
 * Fix incorrect results for `divFloor`.

Compiler-rt:
 * Add muloti4 to the stage2 section.
2021-10-21 19:05:26 -07:00
Andrew Kelley
b0f80ef0d5 stage2: remove use of builtin.stage2_arch workaround
The LLVM backend no longer needs this hack! However, the other backends
still do. So there are still some traces of this workaround in use for now.
2021-10-13 18:43:43 -07:00
Andrew Kelley
01ad6c0b02 freestanding libc: don't rely on compiler_rt symbols we don't have yet
Previous commit made fmal depend on __extendxftf2 and __trunctfxf2 but
we don't have implementations of those yet.
2021-10-05 18:19:31 -07:00
Andrew Kelley
4ee91bb8a8 stage1: work around LLVM's buggy fma lowering
* move fmaq from freestanding libc to compiler_rt, unconditionally
   exported weak_odr.
 * stage1: add fmaf, fmal, fmaq as symbols that compiler-rt might
   generate calls to.
 * stage1: lower `@mulAdd` directly to a call to `fmaq` instead of to
   the LLVM intrinsic because LLVM will lower it to `fmal` even when the
   target's `long double` is not equivalent to `f128`.

This commit is intended to fix the test suite which is failing on the
previous commit.
2021-10-05 14:22:38 -07:00
Andrew Kelley
6115cf2240 migrate from std.Target.current to @import("builtin").target
closes #9388
closes #9321
2021-10-04 23:48:55 -07:00