Jan Philipp Hafer 5d89955543 compiler_rt: specify goals, organize README and compiler_rt.zig
* goals
  - zig as linker for object files generated by other compilers
  - zig-specific runtime features for eventual standardisation

* changes
  - missing routines are marked with `missing`
  - structure inspired by libgcc docs, but improved order and wording
  - rename misspelled functions
  - reorder and rephrase compiler_rt.zig to reflect documentation
  - potential decimal float or fixed-point arithmetic support:
    * 'Decimal float library routines' ca. 120 functions
    * 'Fixed-point fractional library routines' ca. 300 functions

thanks to @Vexu for multiple reviews and @scheibo for review
2022-02-23 16:38:51 -05:00


If the target hardware lacks basic or specialized arithmetic functionality, compiler-rt
provides it in software.
One such example is 64-bit integer multiplication on 32-bit x86.
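
To make the example concrete, here is a minimal C sketch of how a 64-bit multiplication can be assembled from 32-bit pieces. The name `mul64_from_32` is hypothetical and this is not the compiler-rt source; it only illustrates the kind of work a routine such as `__muldi3` has to do on a 32-bit target.

```c
#include <stdint.h>

/* Hypothetical helper (not the compiler-rt implementation): multiply two
   64-bit values using 32-bit halves, keeping the low 64 bits of the result. */
uint64_t mul64_from_32(uint64_t a, uint64_t b) {
    uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
    uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

    /* low x low needs the full 64-bit partial product */
    uint64_t lo = (uint64_t)a_lo * b_lo;

    /* the cross terms only affect the upper 32 bits, so 32-bit wrapping
       arithmetic is enough; a_hi * b_hi falls entirely outside 64 bits */
    uint32_t cross = a_lo * b_hi + a_hi * b_lo;

    return lo + ((uint64_t)cross << 32);
}
```

The result wraps modulo 2^64, which matches ordinary unsigned multiplication in C.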
Goals:
1. zig as linker for object files produced by other compilers
   => `function compatibility` to compiler-rt and libgcc for same-named functions
   * compatibility conflict between compiler-rt and libgcc: prefer compiler-rt
2. `symbol-level compatibility` is low priority compared to calls emitted by llvm
   * symbol-level compatibility with libgcc: even lower priority
3. add zig-specific language runtime features, see #7265
   * example: arbitrary bit width integer arithmetic
   * lower to calls to those functions, e.g. for multiplying two i12345 numbers together
   * proper naming + documentation for standardizing (allow other languages to follow our example)
Current status (tracking libgcc documentation):
- Integer library routines => almost implemented
- Soft float library routines => only f80 routines missing
- Decimal float library routines => unimplemented (~120 functions)
- Fixed-point fractional library routines => unimplemented (~300 functions)
- Exception handling routines => unclear whether supported (~32+x undocumented functions)
- Miscellaneous routines => unclear whether supported (cache control and stack function)
- No zig-specific language runtime features in compiler-rt yet
This library is built automatically as needed for the compilation target and
then statically linked, so it is a transparent dependency for the
programmer.
For details see `../compiler_rt.zig`.
The routines in this folder are listed below.
Routines are annotated as `type source routine // description`, with `routine`
being the name used in the aforementioned `compiler_rt.zig`.
`dev` means deviating from compiler_rt, `port` means ported, `source` is the
information source for the implementation, and `none` means unimplemented.
Some examples for the naming convention are:
- dev source name_routine, name_routine2 various implementations for performance, simplicity etc
- port llvm compiler-rt library routines from [LLVM](http://compiler-rt.llvm.org/)
  * LLVM emits library calls to compiler-rt, if the hardware lacks functionality
- port musl libc routines from [musl](https://musl.libc.org/)
If the library or information source is uncommon, use the entry `other` for `source`.
Please do not break the search by inserting entries in a format other than `impl space source`.
Bugs should be solved by first trying to reproduce the bug upstream, if possible.
* If the bug exists upstream, get it fixed upstream and port the fix downstream to Zig.
* If the bug only exists in Zig, use the corresponding C code and debug
  both implementations side by side to figure out what is wrong.
## Integer library routines
#### Integer Bit operations
- dev HackersDelight __clzsi2 // count leading zeros
- dev HackersDelight __clzdi2 // count leading zeros
- dev HackersDelight __clzti2 // count leading zeros
- dev HackersDelight __ctzsi2 // count trailing zeros
- dev HackersDelight __ctzdi2 // count trailing zeros
- dev HackersDelight __ctzti2 // count trailing zeros
- dev __ctzsi2 __ffssi2 // find least significant 1 bit
- dev __ctzsi2 __ffsdi2 // find least significant 1 bit
- dev __ctzsi2 __ffsti2 // find least significant 1 bit
- dev BitTwiddlingHacks __paritysi2 // bit parity
- dev BitTwiddlingHacks __paritydi2 // bit parity
- dev BitTwiddlingHacks __parityti2 // bit parity
- dev TAOCP __popcountsi2 // bit population
- dev TAOCP __popcountdi2 // bit population
- dev TAOCP __popcountti2 // bit population
- dev other __bswapsi2 // a byteswapped
- dev other __bswapdi2 // a byteswapped
- dev other __bswapti2 // a byteswapped
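
The bit operations above can be sketched in portable C as follows. The function names (`clz32`, `ctz32`, and so on) are hypothetical 32-bit stand-ins, and returning 32 for a zero input is an assumption of this sketch; the techniques are the usual textbook ones and are not copied from the Zig sources.

```c
#include <stdint.h>

/* Count leading zeros by binary search over the upper bits. */
int clz32(uint32_t a) {
    if (a == 0) return 32;                 /* assumption: defined result for 0 */
    int n = 0;
    if ((a & 0xFFFF0000u) == 0) { n += 16; a <<= 16; }
    if ((a & 0xFF000000u) == 0) { n += 8;  a <<= 8; }
    if ((a & 0xF0000000u) == 0) { n += 4;  a <<= 4; }
    if ((a & 0xC0000000u) == 0) { n += 2;  a <<= 2; }
    if ((a & 0x80000000u) == 0) { n += 1; }
    return n;
}

/* Count trailing zeros: isolate the lowest set bit, then locate it. */
int ctz32(uint32_t a) {
    if (a == 0) return 32;                 /* assumption: defined result for 0 */
    return 31 - clz32(a & (0u - a));
}

/* Find first set: 1-based index of the least significant 1 bit, 0 if none.
   Built on ctz, as the list above indicates for the __ffs* routines. */
int ffs32(uint32_t a) { return a == 0 ? 0 : ctz32(a) + 1; }

/* Parity: xor-fold down to 4 bits, then look up in a 16-entry bit table. */
int parity32(uint32_t a) {
    a ^= a >> 16;
    a ^= a >> 8;
    a ^= a >> 4;
    return (int)((0x6996u >> (a & 0xFu)) & 1u);
}

/* Population count: sum bits in parallel in 2-, 4- and 8-bit fields. */
int popcount32(uint32_t a) {
    a = a - ((a >> 1) & 0x55555555u);
    a = (a & 0x33333333u) + ((a >> 2) & 0x33333333u);
    a = (a + (a >> 4)) & 0x0F0F0F0Fu;
    return (int)((a * 0x01010101u) >> 24);
}

/* Byte swap: reverse the order of the four bytes. */
uint32_t bswap32(uint32_t a) {
    return (a >> 24) | ((a >> 8) & 0x0000FF00u) |
           ((a << 8) & 0x00FF0000u) | (a << 24);
}
```
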
#### Integer Comparison
- port llvm __cmpsi2 // (a<b)=>output=0, (a==b)=>output=1, (a>b)=>output=2
- port llvm __cmpdi2
- port llvm __cmpti2
- port llvm __ucmpsi2 // (a<b)=>output=0, (a==b)=>output=1, (a>b)=>output=2
- port llvm __ucmpdi2
- port llvm __ucmpti2
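
The 0/1/2 result convention of the comparison routines can be illustrated with a short C sketch. `cmp64` and `ucmp64` are hypothetical names used only for this example.

```c
#include <stdint.h>

/* Signed three-way compare: 0 if a < b, 1 if a == b, 2 if a > b. */
int cmp64(int64_t a, int64_t b) {
    return (a > b) - (a < b) + 1;
}

/* Unsigned three-way compare with the same 0/1/2 convention. */
int ucmp64(uint64_t a, uint64_t b) {
    return (a > b) - (a < b) + 1;
}
```

The same bit pattern can compare differently in the two variants: `cmp64(-1, 1)` yields 0, while `ucmp64(0xFFFFFFFFFFFFFFFF, 1)` yields 2.
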
#### Integer Arithmetic
- none none __ashlsi3 // a << b unused in llvm, missing (e.g. used by rl78)
- port llvm __ashldi3 // a << b
- port llvm __ashlti3 // a << b
- none none __ashrsi3 // a >> b arithmetic (sign fill) missing (e.g. used by rl78)
- port llvm __ashrdi3 // a >> b arithmetic (sign fill)
- port llvm __ashrti3 // a >> b arithmetic (sign fill)
- none none __lshrsi3 // a >> b logical (zero fill) missing (e.g. used by rl78)
- port llvm __lshrdi3 // a >> b logical (zero fill)
- port llvm __lshrti3 // a >> b logical (zero fill)
- port llvm __negdi2 // -a symbol-level compatibility: libgcc
- port llvm __negti2 // -a unnecessary: unused in backends
- port llvm __mulsi3 // a * b signed
- port llvm __muldi3 // a * b signed
- port llvm __multi3 // a * b signed
- port llvm __divsi3 // a / b signed
- port llvm __divdi3 // a / b signed
- port llvm __divti3 // a / b signed
- port llvm __udivsi3 // a / b unsigned
- port llvm __udivdi3 // a / b unsigned
- port llvm __udivti3 // a / b unsigned
- port llvm __modsi3 // a % b signed
- port llvm __moddi3 // a % b signed
- port llvm __modti3 // a % b signed
- port llvm __umodsi3 // a % b unsigned
- port llvm __umoddi3 // a % b unsigned
- port llvm __umodti3 // a % b unsigned
- port llvm __udivmoddi4 // a / b, rem.* = a % b unsigned
- port llvm __udivmodti4 // a / b, rem.* = a % b unsigned
- port llvm __udivmodsi4 // a / b, rem.* = a % b unsigned
- port llvm __divmodsi4 // a / b, rem.* = a % b signed, ARM
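
As an illustration of the `a / b, rem.* = a % b` contract of the divmod routines, here is a minimal C sketch using plain shift-and-subtract division. `udivmod32` and `divmod32` are hypothetical names; the real routines use faster algorithms, and division by zero is assumed never to happen here.

```c
#include <stddef.h>
#include <stdint.h>

/* Unsigned divide with remainder, one quotient bit per iteration.
   Assumption: b != 0 (the caller is responsible for that). */
uint32_t udivmod32(uint32_t a, uint32_t b, uint32_t *rem) {
    uint32_t q = 0, r = 0;
    for (int i = 31; i >= 0; i--) {
        r = (r << 1) | ((a >> i) & 1u);    /* bring down the next bit of a */
        if (r >= b) {
            r -= b;
            q |= 1u << i;
        }
    }
    if (rem != NULL) *rem = r;
    return q;
}

/* Signed variant: quotient truncates toward zero, remainder takes the
   sign of the dividend. The unsigned-to-signed conversions rely on the
   usual two's complement behaviour of mainstream compilers. */
int32_t divmod32(int32_t a, int32_t b, int32_t *rem) {
    uint32_t ua = a < 0 ? 0u - (uint32_t)a : (uint32_t)a;
    uint32_t ub = b < 0 ? 0u - (uint32_t)b : (uint32_t)b;
    uint32_t ur;
    uint32_t uq = udivmod32(ua, ub, &ur);
    if (rem != NULL) *rem = (int32_t)(a < 0 ? 0u - ur : ur);
    return (int32_t)((a < 0) != (b < 0) ? 0u - uq : uq);
}
```
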
#### Integer Arithmetic with trapping overflow
- dev BitTwiddlingHacks __absvsi2 // abs(a)
- dev BitTwiddlingHacks __absvdi2 // abs(a)
- dev BitTwiddlingHacks __absvti2 // abs(a)
- port llvm __negvsi2 // -a symbol-level compatibility: libgcc
- port llvm __negvdi2 // -a unnecessary: unused in backends
- port llvm __negvti2 // -a
- TODO upstream __addvsi3..__mulvti3 once testing of panics works
- dev HackersDelight __addvsi3 // a + b
- dev HackersDelight __addvdi3 // a + b
- dev HackersDelight __addvti3 // a + b
- dev HackersDelight __subvsi3 // a - b
- dev HackersDelight __subvdi3 // a - b
- dev HackersDelight __subvti3 // a - b
- dev HackersDelight __mulvsi3 // a * b
- dev HackersDelight __mulvdi3 // a * b
- dev HackersDelight __mulvti3 // a * b
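
A trapping routine checks for overflow before computing and aborts instead of returning a wrapped value. The C sketch below shows the idea for 32-bit operands; the names are hypothetical, and using `abort()` as the trap is an assumption of this example (the Zig routines panic instead).

```c
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if a + b does not fit in int32_t, else 0. The checks are
   arranged so that they themselves cannot overflow. */
int add_overflows32(int32_t a, int32_t b) {
    return (b > 0 && a > INT32_MAX - b) || (b < 0 && a < INT32_MIN - b);
}

/* a + b, trapping on overflow. */
int32_t addv32(int32_t a, int32_t b) {
    if (add_overflows32(a, b)) abort();
    return a + b;
}

/* -a, trapping on overflow: only INT32_MIN has no negation. */
int32_t negv32(int32_t a) {
    if (a == INT32_MIN) abort();
    return -a;
}

/* abs(a), trapping on overflow for the same single input. */
int32_t absv32(int32_t a) {
    return a < 0 ? negv32(a) : a;
}
```
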
#### Integer Arithmetic which reports whether overflow occurred (would be faster without pointer)
- dev HackersDelight __addosi4 // a + b, overflow=>ov.*=1 else 0
- dev HackersDelight __addodi4 // (completeness + performance, llvm does not use them)
- dev HackersDelight __addoti4 //
- dev HackersDelight __subosi4 // a - b, overflow=>ov.*=1 else 0
- dev HackersDelight __subodi4 // (completeness + performance, llvm does not use them)
- dev HackersDelight __suboti4 //
- dev HackersDelight __mulosi4 // a * b, overflow=>ov.*=1 else 0
- dev HackersDelight __mulodi4 // (required by llvm)
- dev HackersDelight __muloti4 //
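
The `overflow=>ov.*=1 else 0` convention can be sketched in C like this. `addo32` and `mulo32` are hypothetical 32-bit examples; the sign-bit test in `addo32` is the common bit trick for detecting signed addition overflow, and the widening multiply in `mulo32` is just the simplest approach for this width, not necessarily what the Zig code does.

```c
#include <stdint.h>

/* a + b with wrapping result; *ov is set to 1 on overflow, else 0.
   Overflow happens exactly when a and b share a sign and the sum does not.
   The final conversion relies on two's complement wrapping. */
int32_t addo32(int32_t a, int32_t b, int *ov) {
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t sum = ua + ub;
    *ov = (int)((~(ua ^ ub) & (ua ^ sum)) >> 31);
    return (int32_t)sum;
}

/* a * b with wrapping result; *ov is set to 1 on overflow, else 0.
   Computes the exact product in 64 bits and checks the range. */
int32_t mulo32(int32_t a, int32_t b, int *ov) {
    int64_t wide = (int64_t)a * (int64_t)b;
    *ov = (wide < INT32_MIN || wide > INT32_MAX);
    return (int32_t)(uint32_t)wide;
}
```
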
## Float library routines
#### Float Conversion
- todo todo __extendsfdf2 // extend a f32 => f64
- todo todo __extendsftf2 // extend a f32 => f128
- dev llvm __extendsfxf2 // extend a f32 => f80
- todo todo __extenddftf2 // extend a f64 => f128
- dev llvm __extenddfxf2 // extend a f64 => f80
- todo todo __truncdfsf2 // truncate a to narrower mode of return type, rounding towards zero
- todo todo __trunctfdf2 //
- todo todo __trunctfsf2 //
- dev llvm __truncxfsf2 //
- dev llvm __truncxfdf2 //
- todo todo __fixsfsi // convert a to i32, rounding towards zero
- todo todo __fixdfsi //
- todo todo __fixtfsi //
- none none __fixxfsi // missing
- todo todo __fixsfdi // convert a to i64, rounding towards zero
- todo todo __fixdfdi //
- todo todo __fixtfdi //
- none none __fixxfdi // missing
- todo todo __fixsfti // convert a to i128, rounding towards zero
- todo todo __fixdfti //
- todo todo __fixtfti //
- none none __fixxfti // missing
- __fixunssfsi // convert to u32, rounding towards zero. negative values become 0.
- __fixunsdfsi //
- __fixunstfsi //
- __fixunsxfsi // missing
- __fixunssfdi // convert to u64, rounding towards zero. negative values become 0.
- __fixunsdfdi //
- __fixunstfdi //
- __fixunsxfdi // missing
- __fixunssfti // convert to u128, rounding towards zero. negative values become 0.
- __fixunsdfti //
- __fixunstfti //
- __fixunsxfti // missing
- __floatsisf // convert i32 to floating point
- __floatsidf //
- __floatsitf //
- __floatsixf // missing
- __floatdisf // convert i64 to floating point
- __floatdidf //
- __floatditf //
- __floatdixf // missing
- __floattisf // convert i128 to floating point
- __floattidf //
- __floattixf // missing
- __floatunsisf // convert u32 to floating point
- __floatunsidf //
- __floatunsitf //
- __floatunsixf // missing
- __floatundisf // convert u64 to floating point
- __floatundidf //
- __floatunditf //
- __floatundixf // missing
- __floatuntisf // convert u128 to floating point
- __floatuntidf //
- __floatuntitf //
- __floatuntixf // missing
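
The float-to-integer conversions round toward zero, and the unsigned variants map negative inputs to 0. A minimal C sketch for the f32 case, working directly on the IEEE-754 bit pattern, might look as follows. The function names are hypothetical, and saturating on out-of-range, infinite or NaN inputs is an assumption of this sketch rather than documented behaviour.

```c
#include <stdint.h>
#include <string.h>

/* f32 -> i32, rounding toward zero. */
int32_t fix_f32_to_i32(float a) {
    uint32_t bits;
    memcpy(&bits, &a, sizeof bits);
    int negative = (int)(bits >> 31);
    int exponent = (int)((bits >> 23) & 0xFFu) - 127;
    uint32_t mantissa = (bits & 0x007FFFFFu) | 0x00800000u; /* implicit 1 */

    if (exponent < 0) return 0;            /* |a| < 1, including zero */
    if (exponent >= 31)                    /* assumption: saturate */
        return negative ? INT32_MIN : INT32_MAX;

    /* shift the 24-bit mantissa so that its binary point lands at bit 0;
       bits shifted out to the right are the discarded fraction */
    uint32_t magnitude = exponent >= 23 ? mantissa << (exponent - 23)
                                        : mantissa >> (23 - exponent);
    return negative ? -(int32_t)magnitude : (int32_t)magnitude;
}

/* f32 -> u32, rounding toward zero; negative values become 0. */
uint32_t fixuns_f32_to_u32(float a) {
    uint32_t bits;
    memcpy(&bits, &a, sizeof bits);
    int exponent = (int)((bits >> 23) & 0xFFu) - 127;
    uint32_t mantissa = (bits & 0x007FFFFFu) | 0x00800000u;

    if ((bits >> 31) != 0 || exponent < 0) return 0;
    if (exponent >= 32) return UINT32_MAX; /* assumption: saturate */
    return exponent >= 23 ? mantissa << (exponent - 23)
                          : mantissa >> (23 - exponent);
}
```
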
#### Float Comparison
- __cmpsf2 // return (a<b)=>-1,(a==b)=>0,(a>b)=>1,NaN=>1 don't rely on this
- __cmpdf2 // exported from __lesf2, __ledf2, __letf2 (below)
- __cmptf2 //
- __unordsf2 // (input==NaN) => out!=0 else out=0,
- __unorddf2 // __only reliable for (input!=NaN)__
- __unordtf2 //
- __eqsf2 // (a!=NaN) and (b!=NaN) and (a==b) => output=0
- __eqdf2 //
- __eqtf2 //
- __nesf2 // (a==NaN) or (b==NaN) or (a!=b) => output!=0
- __nedf2 //
- __netf2 //
- __gesf2 // (a!=NaN) and (b!=NaN) and (a>=b) => output>=0
- __gedf2 //
- __getf2 //
- __ltsf2 // (a!=NaN) and (b!=NaN) and (a<b) => output<0
- __ltdf2 //
- __lttf2 //
- __lesf2 // (a!=NaN) and (b!=NaN) and (a<=b) => output<=0
- __ledf2 //
- __letf2 //
- __gtsf2 // (a!=NaN) and (b!=NaN) and (a>b) => output>0
- __gtdf2 //
- __gttf2 //
- __gtdf2 //
- __gttf2 //
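
The comparison routines return an integer that the caller tests against zero, and the NaN result is chosen so that the caller's test fails. The C sketch below illustrates this with hypothetical names; it relies on the native comparison operators for brevity, whereas a real soft-float routine would inspect the bit patterns.

```c
#include <math.h>

/* Non-zero if either operand is NaN, else 0. */
int unord_f32(float a, float b) {
    return isnan(a) || isnan(b);
}

/* "less or equal" flavour: callers test result <= 0 (or < 0, == 0),
   so an unordered comparison returns 1 to make those tests fail. */
int le_f32(float a, float b) {
    if (unord_f32(a, b)) return 1;
    return (a > b) - (a < b);
}

/* "greater or equal" flavour: callers test result >= 0 (or > 0),
   so an unordered comparison returns -1 to make those tests fail. */
int ge_f32(float a, float b) {
    if (unord_f32(a, b)) return -1;
    return (a > b) - (a < b);
}
```
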
#### Float Arithmetic
- __addsf3 // a + b f32
- __adddf3 // a + b f64
- __addtf3 // a + b f128
- __addxf3 // a + b f80
- __aeabi_fadd // a + b f32 ARM: AAPCS
- __aeabi_dadd // a + b f64 ARM: AAPCS
- __subsf3 // a - b
- __subdf3 // a - b
- __subtf3 // a - b
- __subxf3 // a - b f80
- __aeabi_fsub // a - b f32 ARM: AAPCS
- __aeabi_dsub // a - b f64 ARM: AAPCS
- __mulsf3 // a * b
- __muldf3 // a * b
- __multf3 // a * b
- __mulxf3 // a * b missing
- __divsf3 // a / b
- __divdf3 // a / b
- __divtf3 // a / b
- __divxf3 // a / b missing
- __negsf2 // -a symbol-level compatibility: libgcc uses this for the rl78
- __negdf2 // -a unnecessary: can be lowered directly to a xor
- __negtf2 // -a
- __negxf2 // -a
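
The note that negation can be lowered directly to a xor can be shown in a few lines of C. `neg_f32` is a hypothetical name; the sketch assumes the IEEE-754 binary32 layout, where the sign is the top bit.

```c
#include <stdint.h>
#include <string.h>

/* Negate a float by flipping its sign bit; no arithmetic is involved,
   so this also flips the sign of zeros, infinities and NaNs. */
float neg_f32(float a) {
    uint32_t bits;
    memcpy(&bits, &a, sizeof bits);
    bits ^= 0x80000000u;
    memcpy(&a, &bits, sizeof a);
    return a;
}
```
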
#### Floating point raised to integer power
- __powisf2 // unclear, if supported a ^ b
- __powidf2 //
- __powitf2 //
- __powixf2 //
- __mulsc3 // unsupported (a+ib) * (c+id)
- __muldc3 //
- __multc3 //
- __mulxc3 //
- __divsc3 // unsupported (a+ib) / (c+id)
- __divdc3 //
- __divtc3 //
- __divxc3 //
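
For the power routines listed above, `a ^ b` with an integer exponent is commonly computed by repeated squaring. The C sketch below uses the hypothetical name `powi_f32` and is an assumption about a reasonable approach, not a statement about how the routines are or will be implemented here.

```c
/* Raise a to the integer power b by exponentiation by squaring.
   Negative exponents yield the reciprocal of the positive power. */
float powi_f32(float a, int b) {
    /* take the magnitude in unsigned arithmetic so INT_MIN is safe */
    unsigned int n = b < 0 ? 0u - (unsigned int)b : (unsigned int)b;
    float result = 1.0f;
    while (n != 0) {
        if (n & 1u) result *= a;   /* this bit of the exponent is set */
        a *= a;                    /* a, a^2, a^4, a^8, ... */
        n >>= 1;
    }
    return b < 0 ? 1.0f / result : result;
}
```

This needs a number of multiplications proportional to the bit length of `b` rather than to `b` itself.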