Currently, copy_file_range always uses at least two syscalls:
1. As many as it needs to do the initial copy (always one during my
   testing).
2. A final one, issued when the offset equals the size of the file.
The second syscall exists only to detect the terminating condition.
However, because we already do a stat for other reasons, we know the
size of the file and can skip that final syscall, as sketched below.
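A sketch of the resulting loop shape (illustrative, not the actual
std.fs code; `copyAll` and its parameters are invented here):

const std = @import("std");

// `size` comes from the stat we already performed: the loop stops
// once `offset` reaches it, so no final zero-byte copy_file_range is
// issued just to observe EOF.
fn copyAll(src_fd: std.os.fd_t, dst_fd: std.os.fd_t, size: u64) !void {
    var offset: u64 = 0;
    while (offset < size) {
        const n = try std.os.copy_file_range(
            src_fd, offset, dst_fd, offset,
            @intCast(usize, size - offset), 0,
        );
        if (n == 0) break; // file shrank underneath us; treat as EOF
        offset += n;
    }
}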
Sparse files: since copy_file_range expands the holes in sparse files,
I conclude that this layer was never intended to work with sparse
files. In other words, this commit does not make things worse for
sparse file society.
Test program
------------
const std = @import("std");

pub fn main() !void {
    const arg1 = std.mem.span(std.os.argv[1]);
    const arg2 = std.mem.span(std.os.argv[2]);
    try std.fs.cwd().copyFile(arg1, std.fs.cwd(), arg2, .{});
}
Test output (current master)
----------------------------
Observe two `copy_file_range` syscalls: one with 209 bytes, one with
zero:
$ zig build-exe cp.zig
$ strace ./cp ./cp.zig ./cp2.zig |& grep copy_file_range
copy_file_range(3, [0], 5, [0], 4294967295, 0) = 209
copy_file_range(3, [209], 5, [209], 4294967295, 0) = 0
$
Test output (this diff)
-----------------------
Observe a single `copy_file_range` syscall with 209 bytes:
$ /code/zig/build/zig build-exe cp.zig
$ strace ./cp ./cp.zig ./cp2.zig |& grep copy_file_range
copy_file_range(3, [0], 5, [0], 4294967295, 0) = 209
$
I did not fully wire it up in main.zig when I originally implemented
`-z nocopyreloc` in #11679 (440f5249f1a). Finish it.
If we strictly followed the rules, we would bump the cache hash
version, since the field was technically only added now. But since
nobody has complained thus far, I don't think many users care much
about it, and we can omit the bump.
Same change as [68e26a2ceea85a1] "std: check for overflow in
writeCurrentStackTrace".
On arm64 macOS, the address of the last frame is 0x0 rather than a
positive value such as 0x1 on x86_64 macOS; therefore, we overflow an
integer when trying to subtract 1 while printing the stack trace. This
patch fixes it by checking for this condition before subtracting 1
(see the sketch below).
Same behaviour on i386-windows-msvc.
Note that we do not need to signal the `StackIterator` about this, as
it will correctly detect this condition on the subsequent iteration
and return `null`, thus terminating the loop.
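A sketch of the guard (the helper name is mine; the real change lives
in `writeCurrentStackTrace`):

// Subtracting 1 turns a return address into a call-site address, but
// a last-frame address of 0x0 would overflow, so check it first.
fn callSiteAddress(return_address: usize) usize {
    return if (return_address == 0) 0 else return_address - 1;
}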
* Sema: implement linksection on functions
* Implement function linksection in Sema.
* Don't clobber function linksection/align/addrspace in Sema.
* Fix copy-paste typo in tests.
* Add a bunch of missing test_step.dependOn.
* Fix checkInSymtab match.
Closes #12546
* Fix for: DefaultRwLock accumulates write-waiters, eventually fails to write lock #13163
* Comment out debug.print at the end of the last test.
* Code formatting
* Review changes:
  - Use an equality test after lock/unlock rather than peeking into
    internals. However, this is still implementation-specific and only
    done for DefaultRwLock.
  - Add a num_reads maximum to ensure that reader threads stop if
    writer threads are starved.
  - Use relaxed orderings for the read atomic counter.
  - Don't check at the end for non-zero read ops, since the reader
    threads may run only once if they are starved.
* More review changes:
  - Monotonic is sufficient for incrementing the reads counter (see
    the sketch below).
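A minimal illustration of that last point (the names here are mine,
not the test's):

const std = @import("std");

// The reads counter is only a statistic: it carries no happens-before
// edge, so Monotonic ordering suffices for the increment.
var num_reads = std.atomic.Atomic(usize).init(0);

fn noteRead() void {
    _ = num_reads.fetchAdd(1, .Monotonic);
}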
It is not yet determined whether the Zig language will settle on
text-based string concatenation for inline assembly, as Zig 0.9.1
allows, and as this commit allows, or whether it will introduce a new
assembly syntax more integrated with the rest of the language. Until
this decision is made, this commit relaxes the restriction that was
preventing inline assembly expressions from using comptime expressions
for the assembly source code.
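For example (x86_64-linux; `getpid` and the syscall number are chosen
purely for illustration), the assembly template below is a comptime
concatenation rather than a plain string literal:

fn getpid() usize {
    // "sys" ++ "call" is a comptime string expression; this commit
    // allows it as the assembly source.
    return asm volatile ("sys" ++ "call"
        : [ret] "={rax}" (-> usize),
        : [number] "{rax}" (@as(usize, 39)), // SYS_getpid
        : "rcx", "r11"
    );
}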
Commit f14cc75 accidentally added a const while grepping for
assignments to `std.builtin.Type.StructField.default_value`. However,
when looking into it further, I noticed that even though this
default_value field is emitted into the .data section, the value it
points to is actually emitted into the .rodata section, so it seems
correct to use const here.
- For ALU operations, src should be allowed to be an explicit Reg.
- Expose AluOp and JmpOp as public types. This makes code generation
  using BPF as a backend easier, as AluOp and JmpOp can be used
  directly as part of an IR (see the sketch below).
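A hypothetical sketch of that usage (`IrInst` is invented, and the
nesting of `AluOp`/`JmpOp` under `Insn` is my reading of
std.os.linux.BPF; check the module for the exact path):

const std = @import("std");
const BPF = std.os.linux.BPF;

// Reusing the backend's opcode enums directly in an IR avoids a
// parallel enum plus a translation table.
const IrInst = union(enum) {
    alu: struct { op: BPF.Insn.AluOp, dst: u8, src: u8 },
    jmp: struct { op: BPF.Insn.JmpOp, target: u32 },
};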
`Self` isn't a very good name to describe what it does. This commit
renames the type to `CodeGen` and the parameter to `func`, as we're
generating code for a function. With this change, the backend's coding
style is in line with the self-hosted Wasm linker.
When we return an operand directly as a result, we must call
`reuseOperand`. This commit ensures that is done for all currently
implemented AIR instructions.
Rather than accepting a canonical branch and a target branch, we now
directly merge a branch into the parent branch. This is possible
because there is no overlap and we have infinitely many registers
available. This makes merging a lot simpler.
When determining the type of a local (read: register), we would
previously also subtract the stack locals. However, these locals live
in the same `locals` list, meaning the type of the local we were
retrieving was off by two. This could cause a validation error when we
reuse a local of a different type.
Upon a branch, we only allow freeing locals that were allocated within
the same branch in which they die. This ensures that when two or more
branches target the same operand, we do not try to free it more than
once. This does not, however, yet implement freeing the local upon
branch merging.
When an operand is reused, its reference count is increased; when an
operand dies, its reference count is decreased. When the count reaches
0, the local is virtually freed, meaning it can be reused for a new
local. By reference counting the locals, we can ensure that when we
free a local, no local is reused while it still has references
pointing to it. This prevents miscompilations. The compiler will also
panic if we free a local more often than we reference it, introducing
extra safety to ensure the counts match up.
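A sketch of the mechanism (not the actual backend code; `LocalPool`
and its fields are illustrative):

const std = @import("std");

const Local = struct { refs: u32 = 1 };

const LocalPool = struct {
    locals: std.ArrayList(Local),
    free: std.ArrayList(u32),

    // Reusing an operand bumps the count, so the local cannot be
    // handed out again while something still references it.
    fn reuse(self: *LocalPool, index: u32) void {
        self.locals.items[index].refs += 1;
    }

    // An operand death drops the count; at zero the local joins the
    // free list. Freeing more often than referencing is a bug, hence
    // the panic.
    fn release(self: *LocalPool, index: u32) !void {
        const local = &self.locals.items[index];
        if (local.refs == 0) @panic("local freed more often than referenced");
        local.refs -= 1;
        if (local.refs == 0) try self.free.append(index);
    }
};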
This hooks reuse of locals into liveness analysis: when an operand
dies and is a local, it is automatically freed so it can be reused
when a new local is required. As a result, fewer locals need to be
allocated. Having fewer locals means a smaller binary size, as well as
faster compilation when the module is loaded by the runtime.
* When a field starts at some bit offset within a byte, you need to
  load starting from that byte and shift, not starting from the next
  byte, so a rounded-down divide is required here, not a rounded-up
  one (see the worked example below).
* Remove a paragraph from the docs that no longer relates to anything.
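A worked example of the first point (numbers chosen for illustration):

const std = @import("std");

test "load offset for a field at a bit offset" {
    // A field starting at bit offset 12 begins in byte 1 (12 / 8 == 1,
    // rounded down); we load from there and shift right by 12 % 8 == 4.
    const bit_offset: usize = 12;
    try std.testing.expectEqual(@as(usize, 1), bit_offset / 8);
    // A rounded-up divide would start at byte 2 instead, dropping the
    // field's low four bits.
    try std.testing.expectEqual(@as(usize, 2), (bit_offset + 7) / 8);
}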
Closes #12363
When we want a runtime pointer to a zero-bit value we use an undef
pointer, but what if we want a runtime pointer to a comptime-only value?
Normally, if `T` is a comptime-only type such as `*const comptime_int`,
then `*const T` would also be a comptime-only type, so anything
referencing a comptime-only value is usually also comptime-only, and
therefore not emitted to the executable.
However, what if instead we have a `*const anyopaque` pointing to a
comptime-only value? Certainly, `*const anyopaque` is a runtime type,
and so we need some runtime value to store, even when it happens to be
pointing to a comptime-only value. In this case we want to do the same
thing as we do when pointing to a zero-bit value, so we use
`hasRuntimeBits`, rather than the variant that ignores comptime-only
types, to handle both cases.
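For illustration (a minimal sketch of my own, not the test case from
the issue):

test "runtime pointer to a comptime-only default value" {
    const S = struct { x: comptime_int = 42 };
    // StructField.default_value has type ?*const anyopaque, yet here
    // it points at a comptime-only value. Forcing it into a runtime
    // `var` means the compiler must emit some pointer value for it,
    // just as it does for pointers to zero-bit values.
    var ptr: ?*const anyopaque = @typeInfo(S).Struct.fields[0].default_value;
    _ = ptr;
}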
Closes #12025