zig/lib/std/fmt/parse_float/convert_slow.zig
Marc Tiehuis 2085a4af56 add new float-parser based on eisel-lemire algorithm
The previous float-parsing method was lacking in a lot of areas. This
commit introduces to std a state-of-the-art implementation that is both
accurate and fast.

The code is derived from the working repo https://github.com/tiehuis/zig-parsefloat.
That repo includes more test-cases and performance numbers than are present
in this commit.

* Accuracy

The primary testing regime has been using test-data found at
https://github.com/tiehuis/parse-number-fxx-test-data. This is a fork of
upstream with support for f128 test-cases added. This data has been
verified against other independent implementations and represents
accurate round-to-even IEEE-754 floating point semantics.

* Performance

Compared to the existing parseFloat implementation there is a ~5-10x
performance improvement using the above corpus (f128 parsing is excluded
from the measurements below).

** Old

    $ time ./test_all_fxx_data
    3520298/5296694 succeeded (1776396 fail)

    ________________________________________________________
    Executed in   28.68 secs    fish           external
       usr time   28.48 secs    0.00 micros   28.48 secs
       sys time    0.08 secs  694.00 micros    0.08 secs

** This Implementation

    $ time ./test_all_fxx_data
    5296693/5296694 succeeded (1 fail)

    ________________________________________________________
    Executed in    4.54 secs    fish           external
       usr time    4.37 secs  515.00 micros    4.37 secs
       sys time    0.10 secs  171.00 micros    0.10 secs

Further performance numbers can be seen using the
https://github.com/tiehuis/simple_fastfloat_benchmark/ repository, which
compares against some other well-known string-to-float conversion
functions. A breakdown can be found here:

0d9f020f1a/PERFORMANCE.md (commit-b15406a0d2e18b50a4b62fceb5a6a3bb60ca5706)

In summary, we are within 20% of the C++ reference implementation and
have ~600-700MB/s throughput on an Intel i5-6500 at 3.5GHz.

* F128 Support

Finally, f128 is now completely supported with full accuracy. This does
use a slower path, which could be improved in future.

* Behavioural Changes

There are a few behavioural changes to note.

 - `parseHexFloat` is now redundant; hex floats are now supported directly
   in `parseFloat`.
 - We implement round-to-even in all parsing routines, as specified by
   IEEE-754. Previous code used different rounding mechanisms (standard
   parsing used round-to-zero, hex parsing appeared to use round-up), so
   there may be subtle differences in results.

Closes #2207.
Fixes #11169.
2022-05-03 16:46:40 +12:00


const std = @import("std");
const math = std.math;
const common = @import("common.zig");
const BiasedFp = common.BiasedFp;
const Decimal = @import("decimal.zig").Decimal;
const mantissaType = common.mantissaType;
const max_shift = 60;
const num_powers = 19;
const powers = [_]u8{ 0, 3, 6, 9, 13, 16, 19, 23, 26, 29, 33, 36, 39, 43, 46, 49, 53, 56, 59 };
pub fn getShift(n: usize) usize {
    return if (n < num_powers) powers[n] else max_shift;
}
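
// Illustrative check (a sketch, not part of the original commit). The table
// entries appear to equal floor(n * log2(10)), i.e. the largest power of two
// that does not exceed 10^n, and getShift clamps to max_shift once n runs
// past the end of the table. The test name is hypothetical.
test "getShift table lookup and clamp (illustrative)" {
    try std.testing.expect(getShift(0) == 0);
    try std.testing.expect(getShift(1) == 3); // 2^3 = 8 <= 10
    try std.testing.expect(getShift(4) == 13); // 2^13 = 8192 <= 10^4
    try std.testing.expect(getShift(18) == 59); // last table entry
    try std.testing.expect(getShift(19) == max_shift); // past the table
    try std.testing.expect(getShift(1000) == max_shift);
}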
/// Parse the significant digits and biased, binary exponent of a float.
///
/// This is a fallback algorithm that uses a big-integer representation
/// of the float, and is therefore considerably slower than the approximate
/// fast paths. However, it always determines how to round the
/// significant digits to the nearest machine float, allowing
/// us to handle near half-way cases.
///
/// Near half-way cases lie at or very close to the midpoint between two
/// consecutive machine floats.
/// For example, the float `16777217.0` has a binary representation of
/// `100000000000000000000000 1`. Rounding to a single-precision float,
/// the trailing `1` is truncated. Using round-nearest, tie-even, any
/// value above `16777217.0` must be rounded up to `16777218.0`, while
/// any value below or equal to `16777217.0` must be rounded down
/// to `16777216.0`. These near-halfway conversions therefore may require
/// a large number of digits to unambiguously determine how to round.
///
/// The algorithms described here are based on "Processing Long Numbers Quickly",
/// available here: <https://arxiv.org/pdf/2101.11408.pdf#section.11>.
pub fn convertSlow(comptime T: type, s: []const u8) BiasedFp(T) {
    const MantissaT = mantissaType(T);
    const min_exponent = -(1 << (math.floatExponentBits(T) - 1)) + 1;
    const infinite_power = (1 << math.floatExponentBits(T)) - 1;
    const mantissa_explicit_bits = math.floatMantissaBits(T);

    var d = Decimal(T).parse(s); // no need to recheck underscores
    if (d.num_digits == 0 or d.decimal_point < Decimal(T).min_exponent) {
        return BiasedFp(T).zero();
    } else if (d.decimal_point >= Decimal(T).max_exponent) {
        return BiasedFp(T).inf(T);
    }

    var exp2: i32 = 0;
    // Shift right toward [1/2 .. 1)
    while (d.decimal_point > 0) {
        const n = @intCast(usize, d.decimal_point);
        const shift = getShift(n);
        d.rightShift(shift);
        if (d.decimal_point < -Decimal(T).decimal_point_range) {
            return BiasedFp(T).zero();
        }
        exp2 += @intCast(i32, shift);
    }
    // Shift left toward [1/2 .. 1)
    while (d.decimal_point <= 0) {
        const shift = blk: {
            if (d.decimal_point == 0) {
                break :blk switch (d.digits[0]) {
                    // Leading digit >= 5: already in range, leave the loop.
                    5...9 => break,
                    0, 1 => @as(usize, 2),
                    else => 1,
                };
            } else {
                const n = @intCast(usize, -d.decimal_point);
                break :blk getShift(n);
            }
        };
        d.leftShift(shift);
        if (d.decimal_point > Decimal(T).decimal_point_range) {
            return BiasedFp(T).inf(T);
        }
        exp2 -= @intCast(i32, shift);
    }
    // We are now in the range [1/2 .. 1) but the binary format uses [1 .. 2)
    exp2 -= 1;
    while (min_exponent + 1 > exp2) {
        var n = @intCast(usize, (min_exponent + 1) - exp2);
        if (n > max_shift) {
            n = max_shift;
        }
        d.rightShift(n);
        exp2 += @intCast(i32, n);
    }
    if (exp2 - min_exponent >= infinite_power) {
        return BiasedFp(T).inf(T);
    }

    // Shift the decimal to the hidden bit, and then round the value
    // to get the high mantissa+1 bits.
    d.leftShift(mantissa_explicit_bits + 1);
    var mantissa = d.round();
    if (mantissa >= (@as(MantissaT, 1) << (mantissa_explicit_bits + 1))) {
        // Rounding up overflowed to the carry bit, need to
        // shift back to the hidden bit.
        d.rightShift(1);
        exp2 += 1;
        mantissa = d.round();
        if ((exp2 - min_exponent) >= infinite_power) {
            return BiasedFp(T).inf(T);
        }
    }
    var power2 = exp2 - min_exponent;
    if (mantissa < (@as(MantissaT, 1) << mantissa_explicit_bits)) {
        power2 -= 1;
    }

    // Zero out all the bits above the explicit mantissa bits.
    mantissa &= (@as(MantissaT, 1) << mantissa_explicit_bits) - 1;
    return .{ .f = mantissa, .e = power2 };
}
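
// Illustrative sketch (not part of the original commit) of the half-way
// example from the doc comment above. It assumes that `Decimal(T).round`
// rounds exact ties to even, which this file does not show, and that the
// result's `.f` holds the explicit mantissa bits and `.e` the biased
// exponent, as in the return statement of `convertSlow`. The test name is
// hypothetical.
test "convertSlow rounds exact half-way inputs to even (illustrative)" {
    // 16777217 = 2^24 + 1 lies exactly between the f32 neighbours 16777216
    // and 16777218. The tie goes to the even mantissa, 16777216 = 2^24,
    // so the explicit mantissa is 0 and the biased exponent is 24 + 127.
    const down = convertSlow(f32, "16777217");
    try std.testing.expect(down.f == 0);
    try std.testing.expect(down.e == 151);

    // 16777219 lies exactly between 16777218 and 16777220. Here the even
    // neighbour is the upper one, 16777220 = 2^24 + 4, which stores the
    // explicit mantissa 4 / 2 = 2 (f32 spacing is 2 at this magnitude).
    const up = convertSlow(f32, "16777219");
    try std.testing.expect(up.f == 2);
    try std.testing.expect(up.e == 151);
}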