Underlying fix should have been d7b029995c.
u128 limb sizes are still not fully tested as we are missing compiler-rt
support (__divei4, __modei4 on x86_64). There should be no Zig blockers, so
the assertion has been removed.
- Fix zig_clz_u128 not respecting the bits argument. This was crashing the compiler-rt addxf3 tests with the cbe.
- Instead of rendering a negation for negative 128 bit int literals, render the literal as two's complement. This allows
rendering int representations of floats correctly (specifically f80).
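The two fixes above can be sketched as follows. This is a Python illustration of the arithmetic only, not the generated C: the function names are hypothetical and are not taken from zig.h.

```python
MASK128 = (1 << 128) - 1


def twos_complement_u128(value):
    """Return the unsigned 128-bit pattern that represents a signed value.

    Rendering this pattern directly avoids emitting a negation of the
    magnitude, which matters when the bits are really a float's encoding.
    """
    assert -(1 << 127) <= value < (1 << 127)
    return value & MASK128


def clz(value, bits):
    """Count leading zeros of `value` viewed as a `bits`-wide unsigned int.

    The result depends on `bits`: the same value has fewer leading zeros
    in an 80-bit integer than in a 128-bit one.
    """
    assert 0 <= value < (1 << bits)
    return bits - value.bit_length()
```

A clz that ignores `bits` would report 127 for the value 1 even when the integer is only 80 bits wide, where the expected answer is 79.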
* Export invalidFmtErr
To allow consistent use of "invalid format string" compile error
response for badly formatted format strings.
See https://github.com/ziglang/zig/pull/13489#issuecomment-1311759340.
* Replace format compile errors with invalidFmtErr
- Provides more consistent compile errors.
- Gives user info about the type of the badly formatted value.
* Rename invalidFmtErr as invalidFmtError
For consistency. Zig seems to use “Error” more often than “Err”.
* std: add invalid format string checks to remaining custom formatters
* pass reference-trace to comp when building build file; fix checkobjectstep
Packed memory has a well-defined layout that doesn't require
conversion from an integer to read from. Let's use it :-)
This change means that for bitcasting to/from a packed value that
is N layers deep, we no longer have to create N temporary big-ints
and perform N copies.
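A rough sketch of the idea, in Python: a nested packed field lives at a fixed bit offset (the sum of the offsets of its enclosing fields), so it can be read straight from the bytes that cover it. The helper name and the little-endian bit order are assumptions made for illustration; this is not the compiler's code.

```python
def read_packed_field(buf, bit_offset, bit_count):
    """Read an unsigned field of `bit_count` bits starting at `bit_offset`.

    Only the bytes that overlap the field are touched, no matter how
    deeply the field is nested in the packed layout.
    """
    first = bit_offset // 8
    last = (bit_offset + bit_count + 7) // 8
    window = int.from_bytes(buf[first:last], "little")
    return (window >> (bit_offset % 8)) & ((1 << bit_count) - 1)


# Hypothetical layout: outer { a: u3, inner { x: u5, y: u7 }, b: u1 }
# with a = 5, x = 17, y = 100, b = 1 packed into 16 bits.
packed = (5 | (17 << 3) | (100 << 8) | (1 << 15)).to_bytes(2, "little")
inner_offset = 3  # offset of `inner` within `outer`
y = read_packed_field(packed, inner_offset + 5, 7)  # `y` is 5 bits into `inner`
```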
Other miscellaneous improvements:
- Adds support for casting to packed enums and vectors
- Fixes bitcasting to/from vectors outside of a packed struct
- Adds a fast path for bitcasting <= u/i64
- Fixes bug when bitcasting f80 which would clear following fields
This also changes the bitcast memory layout of exotic integers on
big-endian systems to match what's empirically observed on our targets.
Technically, this layout is not guaranteed by LLVM so we should probably
ban bitcasts that reveal these padding bits, but for now this is an
improvement.
Previously, you might obtain `-lLLVM-15` from the CMake configuration,
but we might not be able to locate the library if it's not in your
system library path.
Ideally this duplicated code could be factored out into a function, but
there doesn't seem to be any way in the Zig type system to represent an
argument to a function called at comptime that is only needed if it is
comptime-known. Instead, we document what is going on in an adjacent
comment in case it gets copy-pasted into new methods in the future.
When trying to allocate memory for functions like `Managed.init` and
`Managed.set` on the stack, a comptime-known allocation size is desired.
The doc comments for these functions indicate that `calcLimbLen` can be
used to determine how many limbs to allocate, but if `value` is not
comptime-known, then neither is `calcLimbLen(value)`. However, an upper
bound on the allocation size is still computable at comptime in this
case, so this note documents an expression that can be used, rather than
trying to add it to every doc comment that mentions `calcLimbLen`.
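To illustrate why such a bound exists (this is not the expression from the note itself): the limb count needed for a particular value depends on the value, but the bit width of its type does not, so a bound derived from the type is fixed ahead of time. The helper names and the 64-bit limb size below are assumptions.

```python
def limbs_for_value(value, limb_bits=64):
    """Limbs needed to hold the magnitude of one particular value."""
    return max(1, (abs(value).bit_length() + limb_bits - 1) // limb_bits)


def limbs_upper_bound(type_bits, limb_bits=64):
    """Limbs that suffice for every value of an integer type `type_bits` wide."""
    return max(1, (type_bits + limb_bits - 1) // limb_bits)
```

For example, any i128 value fits in `limbs_upper_bound(128)` limbs, including the minimum value, whose magnitude is 2^127.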
Many of the Managed methods accepted by-val parameters which could
reference Limb slices that became invalid memory after any
ensureCapacity calls. Now, Managed methods accept `*const Managed`
parameters so that if the function allows aliasing and the
ensure-capacity call resizes the Limb slice, it also affects the
aliased parameters, avoiding use-after-free bugs.
This is a breaking change that reduces the requirement for callsites to
manually make the ensure-capacity changes prior to calling many of the
Managed methods.
Closes #11897
Fixes #11353
The renderer treats comments and doc comments differently since doc
comments are parsed into the Ast. This commit adds a check after getting
the text for the doc comment and trims whitespace at the end before
rendering.
The `a = 0,` in the test is here to avoid a ParseError while parsing the
test.
Big-int functions were updated to respect the provided abi_size, rather
than inferring a potentially incorrect abi_size implicitly.
In combination with the convention that any required padding bits are
added on the MSB end, this means that exotic integers can potentially
have a well-defined memory layout.
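A small Python sketch of that convention, assuming an unsigned value and zero padding bits (the commit states only that padding goes on the MSB end): the value is widened to abi_size bytes and then stored in the target's byte order. The function name is hypothetical.

```python
def write_exotic_int(value, bits, abi_size, byteorder):
    """Store a `bits`-wide unsigned value in exactly `abi_size` bytes.

    Padding occupies the most significant bits, so its byte position
    depends on `byteorder` ("little" or "big").
    """
    assert bits <= abi_size * 8
    assert 0 <= value < (1 << bits)
    return value.to_bytes(abi_size, byteorder)


# A u24 with an abi_size of 4 bytes.
little = write_exotic_int(0xABCDEF, 24, 4, "little")
big = write_exotic_int(0xABCDEF, 24, 4, "big")
```

With this layout the padding byte is the last byte on a little-endian target and the first byte on a big-endian one.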
This adds a new path which avoids the udivmod routines generated by
compiler_rt when the divisor is less than half the max usize value. Two
half-limb divisions are performed instead, which ensures that
non-emulated division instructions are actually used. This does not
improve the udivmod code, which should still be reviewed independently
of this issue.
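The half-limb path can be sketched in Python as below. It assumes 64-bit limbs stored least significant first and a divisor that fits in 32 bits; the names are hypothetical and this is not the std.math.big implementation.

```python
LIMB_BITS = 64
HALF_BITS = LIMB_BITS // 2
HALF_MASK = (1 << HALF_BITS) - 1


def to_limbs(n):
    """Split a non-negative int into 64-bit limbs, least significant first."""
    limbs = []
    while True:
        limbs.append(n & ((1 << LIMB_BITS) - 1))
        n >>= LIMB_BITS
        if n == 0:
            return limbs


def from_limbs(limbs):
    """Inverse of to_limbs."""
    return sum(limb << (LIMB_BITS * i) for i, limb in enumerate(limbs))


def div_by_half_limb(limbs, divisor):
    """Divide a limb array by a divisor that fits in half a limb.

    The running remainder is always < divisor < 2**32, so each
    intermediate numerator (rem << 32 | half) fits in one 64-bit limb
    and never needs a double-width division.
    """
    assert 0 < divisor <= HALF_MASK
    quotient = [0] * len(limbs)
    rem = 0
    for i in reversed(range(len(limbs))):
        hi = limbs[i] >> HALF_BITS
        lo = limbs[i] & HALF_MASK
        q_hi, rem = divmod((rem << HALF_BITS) | hi, divisor)
        q_lo, rem = divmod((rem << HALF_BITS) | lo, divisor)
        quotient[i] = (q_hi << HALF_BITS) | q_lo
    return quotient, rem
```

Repeated division by a small base is the core of decimal string conversion, which fits the toString improvement this commit reports.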
Notably this improves the performance of the toString implementation of
non-power-of-two bases considerably.
Division performance is improved ~1000% based on some coarse testing.
The following test code is used to provide a rough comparison between
the old vs. new method.
```
const std = @import("std");
const Managed = std.math.big.int.Managed;
const allocator = std.heap.c_allocator;

fn fib(a: *Managed, n: usize) !void {
    var b = try Managed.initSet(allocator, 1);
    defer b.deinit();
    var c = try Managed.init(allocator);
    defer c.deinit();

    var i: usize = 0;
    while (i < n) : (i += 1) {
        try c.add(a.toConst(), b.toConst());
        a.swap(&b);
        b.swap(&c);
    }
}

pub fn main() !void {
    var a = try Managed.initSet(allocator, 0);
    defer a.deinit();

    try fib(&a, 1_000_000);

    // Note: Next two lines (and printed digit count) omitted on no-print version.
    const as = try a.toString(allocator, 10, .lower);
    defer allocator.free(as);

    std.debug.print("fib: digit count: {}, limb count: {}\n", .{ as.len, a.limbs.len });
}
```
```
==> time.no-print <==
limb count: 10849
________________________________________________________
Executed in 10.60 secs fish external
usr time 10.44 secs 0.00 millis 10.44 secs
sys time 0.02 secs 1.12 millis 0.02 secs
==> time.old <==
fib: digit count: 208988, limb count: 10849
________________________________________________________
Executed in 22.78 secs fish external
usr time 22.43 secs 1.01 millis 22.43 secs
sys time 0.03 secs 0.13 millis 0.03 secs
==> time.optimized <==
fib: digit count: 208988, limb count: 10849
________________________________________________________
Executed in 11.59 secs fish external
usr time 11.56 secs 1.03 millis 11.56 secs
sys time 0.03 secs 0.12 millis 0.03 secs
```
Perf data for non-optimized and optimized, verifying no udivmod is
generated by the new code.
```
$ perf report -i perf.data.old --stdio
- Total Lost Samples: 0
-
- Samples: 90K of event 'cycles:u'
- Event count (approx.): 71603695208
-
- Overhead Command Shared Object Symbol
- ........ ....... ................ ...........................................
-
52.97% t t [.] compiler_rt.udivmod.udivmod
45.97% t t [.] std.math.big.int.Mutable.addCarry
0.83% t t [.] main
0.08% t libc-2.33.so [.] __memmove_avx_unaligned_erms
0.08% t t [.] __udivti3
0.03% t [unknown] [k] 0xffffffff9a0010a7
0.02% t t [.] std.math.big.int.Managed.ensureCapacity
0.01% t libc-2.33.so [.] _int_malloc
0.00% t libc-2.33.so [.] __malloc_usable_size
0.00% t libc-2.33.so [.] _int_free
0.00% t t [.] 0x0000000000004a80
0.00% t t [.] std.heap.CAllocator.resize
0.00% t libc-2.33.so [.] _mid_memalign
0.00% t libc-2.33.so [.] sysmalloc
0.00% t libc-2.33.so [.] __posix_memalign
0.00% t t [.] std.heap.CAllocator.alloc
0.00% t ld-2.33.so [.] do_lookup_x
$ perf report -i perf.data.optimized --stdio
- Total Lost Samples: 0
-
- Samples: 46K of event 'cycles:u'
- Event count (approx.): 36790112336
-
- Overhead Command Shared Object Symbol
- ........ ....... ................ ...........................................
-
79.98% t t [.] std.math.big.int.Mutable.addCarry
15.14% t t [.] main
4.58% t t [.] std.math.big.int.Managed.ensureCapacity
0.21% t libc-2.33.so [.] __memmove_avx_unaligned_erms
0.05% t [unknown] [k] 0xffffffff9a0010a7
0.02% t libc-2.33.so [.] _int_malloc
0.01% t t [.] std.heap.CAllocator.alloc
0.01% t libc-2.33.so [.] __malloc_usable_size
0.00% t libc-2.33.so [.] systrim.constprop.0
0.00% t libc-2.33.so [.] _mid_memalign
0.00% t t [.] 0x0000000000000c7d
0.00% t libc-2.33.so [.] malloc
0.00% t ld-2.33.so [.] check_match
```
Closes #10630.
AIR:
* div is renamed to div_trunc.
* Add div_float, div_floor, div_exact.
- Implemented in Sema and LLVM codegen. C backend has a stub.
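For reference, the difference between the variants on integers can be sketched in Python. This assumes the instructions follow the usual truncating, flooring, and exact definitions, as Zig's @divTrunc, @divFloor, and @divExact builtins do; the Python function names simply mirror the AIR names.

```python
def div_trunc(a, b):
    """Round the quotient toward zero."""
    q = abs(a) // abs(b)
    return -q if (a < 0) != (b < 0) else q


def div_floor(a, b):
    """Round the quotient toward negative infinity."""
    return a // b


def div_exact(a, b):
    """Divide when the caller guarantees there is no remainder."""
    q, r = divmod(a, b)
    assert r == 0, "div_exact requires a remainder of zero"
    return q
```

The variants only disagree when the operands have opposite signs and the division is inexact: -7 divided by 2 truncates to -3 and floors to -4.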
Improvements to std.math.big.Int:
* Add `eqZero` function to `Mutable`.
* Fix incorrect results for `divFloor`.
Compiler-rt:
* Add muloti4 to the stage2 section.