26 Commits

Author SHA1 Message Date
Igor Anić
d645114f7e add deflate implemented from first principles
Zig deflate compression/decompression implementation. It supports compression and decompression of gzip, zlib and raw deflate format.

Fixes #18062.

This PR replaces current compress/gzip and compress/zlib packages. Deflate package is renamed to flate. Flate is common name for deflate/inflate where deflate is compression and inflate decompression.

There are breaking change. Methods signatures are changed because of removal of the allocator, and I also unified API for all three namespaces (flate, gzip, zlib).

Currently I put old packages under v1 namespace they are still available as compress/v1/gzip, compress/v1/zlib, compress/v1/deflate. Idea is to give users of the current API little time to postpone analyzing what they had to change. Although that rises question when it is safe to remove that v1 namespace.

Here is current API in the compress package:

```Zig
// deflate
    fn compressor(allocator, writer, options) !Compressor(@TypeOf(writer))
    fn Compressor(comptime WriterType) type

    fn decompressor(allocator, reader, null) !Decompressor(@TypeOf(reader))
    fn Decompressor(comptime ReaderType: type) type

// gzip
    fn compress(allocator, writer, options) !Compress(@TypeOf(writer))
    fn Compress(comptime WriterType: type) type

    fn decompress(allocator, reader) !Decompress(@TypeOf(reader))
    fn Decompress(comptime ReaderType: type) type

// zlib
    fn compressStream(allocator, writer, options) !CompressStream(@TypeOf(writer))
    fn CompressStream(comptime WriterType: type) type

    fn decompressStream(allocator, reader) !DecompressStream(@TypeOf(reader))
    fn DecompressStream(comptime ReaderType: type) type

// xz
   fn decompress(allocator: Allocator, reader: anytype) !Decompress(@TypeOf(reader))
   fn Decompress(comptime ReaderType: type) type

// lzma
    fn decompress(allocator, reader) !Decompress(@TypeOf(reader))
    fn Decompress(comptime ReaderType: type) type

// lzma2
    fn decompress(allocator, reader, writer !void

// zstandard:
    fn DecompressStream(ReaderType, options) type
    fn decompressStream(allocator, reader) DecompressStream(@TypeOf(reader), .{})
    struct decompress
```

The proposed naming convention:
 - Compressor/Decompressor for functions which return type, like Reader/Writer/GeneralPurposeAllocator
 - compressor/compressor for functions which are initializers for that type, like reader/writer/allocator
 - compress/decompress for one shot operations, accepts reader/writer pair, like read/write/alloc

```Zig
/// Compress from reader and write compressed data to the writer.
fn compress(reader: anytype, writer: anytype, options: Options) !void

/// Create Compressor which outputs the writer.
fn compressor(writer: anytype, options: Options) !Compressor(@TypeOf(writer))

/// Compressor type
fn Compressor(comptime WriterType: type) type

/// Decompress from reader and write plain data to the writer.
fn decompress(reader: anytype, writer: anytype) !void

/// Create Decompressor which reads from reader.
fn decompressor(reader: anytype) Decompressor(@TypeOf(reader)

/// Decompressor type
fn Decompressor(comptime ReaderType: type) type

```

Comparing this implementation with the one we currently have in Zig's standard library (std).
Std is roughly 1.2-1.4 times slower in decompression, and 1.1-1.2 times slower in compression. Compressed sizes are pretty much same in both cases.
More resutls in [this](https://github.com/ianic/flate) repo.

This library uses static allocations for all structures, doesn't require allocator. That makes sense especially for deflate where all structures, internal buffers are allocated to the full size. Little less for inflate where we std version uses less memory by not preallocating to theoretical max size array which are usually not fully used.

For deflate this library allocates 395K while std 779K.
For inflate this library allocates 74.5K while std around 36K.

Inflate difference is because we here use 64K history instead of 32K in std.

If merged existing usage of compress gzip/zlib/deflate need some changes. Here is example with necessary changes in comments:

```Zig

const std = @import("std");

// To get this file:
// wget -nc -O war_and_peace.txt https://www.gutenberg.org/ebooks/2600.txt.utf-8
const data = @embedFile("war_and_peace.txt");

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer std.debug.assert(gpa.deinit() == .ok);
    const allocator = gpa.allocator();

    try oldDeflate(allocator);
    try new(std.compress.flate, allocator);

    try oldZlib(allocator);
    try new(std.compress.zlib, allocator);

    try oldGzip(allocator);
    try new(std.compress.gzip, allocator);
}

pub fn new(comptime pkg: type, allocator: std.mem.Allocator) !void {
    var buf = std.ArrayList(u8).init(allocator);
    defer buf.deinit();

    // Compressor
    var cmp = try pkg.compressor(buf.writer(), .{});
    _ = try cmp.write(data);
    try cmp.finish();

    var fbs = std.io.fixedBufferStream(buf.items);
    // Decompressor
    var dcp = pkg.decompressor(fbs.reader());

    const plain = try dcp.reader().readAllAlloc(allocator, std.math.maxInt(usize));
    defer allocator.free(plain);
    try std.testing.expectEqualSlices(u8, data, plain);
}

pub fn oldDeflate(allocator: std.mem.Allocator) !void {
    const deflate = std.compress.v1.deflate;

    // Compressor
    var buf = std.ArrayList(u8).init(allocator);
    defer buf.deinit();
    // Remove allocator
    // Rename deflate -> flate
    var cmp = try deflate.compressor(allocator, buf.writer(), .{});
    _ = try cmp.write(data);
    try cmp.close(); // Rename to finish
    cmp.deinit(); // Remove

    // Decompressor
    var fbs = std.io.fixedBufferStream(buf.items);
    // Remove allocator and last param
    // Rename deflate -> flate
    // Remove try
    var dcp = try deflate.decompressor(allocator, fbs.reader(), null);
    defer dcp.deinit(); // Remove

    const plain = try dcp.reader().readAllAlloc(allocator, std.math.maxInt(usize));
    defer allocator.free(plain);
    try std.testing.expectEqualSlices(u8, data, plain);
}

pub fn oldZlib(allocator: std.mem.Allocator) !void {
    const zlib = std.compress.v1.zlib;

    var buf = std.ArrayList(u8).init(allocator);
    defer buf.deinit();

    // Compressor
    // Rename compressStream => compressor
    // Remove allocator
    var cmp = try zlib.compressStream(allocator, buf.writer(), .{});
    _ = try cmp.write(data);
    try cmp.finish();
    cmp.deinit(); // Remove

    var fbs = std.io.fixedBufferStream(buf.items);
    // Decompressor
    // decompressStream => decompressor
    // Remove allocator
    // Remove try
    var dcp = try zlib.decompressStream(allocator, fbs.reader());
    defer dcp.deinit(); // Remove

    const plain = try dcp.reader().readAllAlloc(allocator, std.math.maxInt(usize));
    defer allocator.free(plain);
    try std.testing.expectEqualSlices(u8, data, plain);
}

pub fn oldGzip(allocator: std.mem.Allocator) !void {
    const gzip = std.compress.v1.gzip;

    var buf = std.ArrayList(u8).init(allocator);
    defer buf.deinit();

    // Compressor
    // Rename compress => compressor
    // Remove allocator
    var cmp = try gzip.compress(allocator, buf.writer(), .{});
    _ = try cmp.write(data);
    try cmp.close(); // Rename to finisho
    cmp.deinit(); // Remove

    var fbs = std.io.fixedBufferStream(buf.items);
    // Decompressor
    // Rename decompress => decompressor
    // Remove allocator
    // Remove try
    var dcp = try gzip.decompress(allocator, fbs.reader());
    defer dcp.deinit(); // Remove

    const plain = try dcp.reader().readAllAlloc(allocator, std.math.maxInt(usize));
    defer allocator.free(plain);
    try std.testing.expectEqualSlices(u8, data, plain);
}

```
2024-02-14 18:28:20 +01:00
jacwil
68ea1121fc objcopy ofmt=hex iterates through segments instead of sections 2024-01-22 21:29:21 -08:00
Xavier Bouchoux
f7b82ed416 objcopy: fix typo
fixes aecc15391a4c37a4504d29cb7e3179990b180773
the usual last 'harmless' cosmetic ajustement before commit strikes again...
2023-08-15 02:52:38 -07:00
Xavier Bouchoux
aecc15391a objcpy: preserve the mode when striping an elf file.
+ clean some @as()
2023-08-12 09:54:20 +02:00
mlugg
f26dda2117 all: migrate code to new cast builtin syntax
Most of this migration was performed automatically with `zig fmt`. There
were a few exceptions which I had to manually fix:

* `@alignCast` and `@addrSpaceCast` cannot be automatically rewritten
* `@truncate`'s fixup is incorrect for vectors
* Test cases are not formatted, and their error locations change
2023-06-24 16:56:39 -07:00
d18g
991e00c270
objcopy.zig allow outputting zero length sections (#16121)
Co-authored-by: Andrew Kelley <andrew@ziglang.org>
2023-06-22 01:55:34 -07:00
d18g
939e4d81e1 Update objcopy.zig
Fix `--only-section=` handling
2023-06-20 13:12:19 -07:00
Andrew Kelley
a72d634b73
Merge pull request #16046 from BratishkaErik/issue-6128
Renaming `@xtoy` to `@YfromX`
2023-06-19 22:36:24 -07:00
Eric Joldasov
50339f595a all: zig fmt and rename "@XToY" to "@YFromX"
Signed-off-by: Eric Joldasov <bratishkaerik@getgoogleoff.me>
2023-06-19 12:34:42 -07:00
Xavier Bouchoux
c39191a086 objcopy: add support for --compress-debug-sections 2023-06-19 08:51:03 +02:00
Xavier Bouchoux
5eacddce99 objcopy: support some more elf file variants
"SHT_NOBITS" sections can be easily moved around in the file, they can be anywhere since there is no data...

and remove logic about the categorization of symbol tables:
  I don't understand or remember why it was needed,
  and, in my tests, it works fine without...

It previouly failed to strip an exe linked with `mold`
2023-06-19 07:29:39 +02:00
Motiejus Jakštys
d41111d7ef mem: rename align*Generic to mem.align*
Anecdote 1: The generic version is way more popular than the non-generic
one in Zig codebase:

     git grep -w alignForward | wc -l
    56
     git grep -w alignForwardGeneric | wc -l
    149

     git grep -w alignBackward | wc -l
    6
     git grep -w alignBackwardGeneric | wc -l
    15

Anecdote 2: In my project (turbonss) that does much arithmetic and
alignment I exclusively use the Generic functions.

Anecdote 3: we used only the Generic versions in the Macho Man's linker
workshop.
2023-06-17 12:49:13 -07:00
Ali Chraghi
3db3cf7790 std.sort: add pdqsort and heapsort 2023-05-23 17:55:59 -07:00
Andrew Kelley
42973d73e6 compiler: use @memcpy instead of std.mem.copy 2023-04-28 13:24:43 -07:00
Andrew Kelley
6261c13731 update codebase to use @memset and @memcpy 2023-04-28 13:24:43 -07:00
Xavier Bouchoux
e1cf2c8346 objcopy: cleanups and extract helper functions to reduce bloat
parts of the code are independant of Elf32/Elf64 variant, so avoid
generating the code twice by putting them outside of the generic struct.
2023-03-20 08:39:23 +01:00
Xavier Bouchoux
9ea404f03d objcopy: add support for stripping 32bits ELF files 2023-03-20 08:39:23 +01:00
Xavier Bouchoux
b014ffc74f objcopy: use defined data for padding ELF sections
Seems the right thing to do, but not strictly necessary.
2023-03-20 08:39:23 +01:00
Xavier Bouchoux
7ab48163ee objcopy: add support for --add-gnu-debuglink and --only-keep-debug
as documented at
https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html

It is now equivalent to do

```
zig objcopy --only-keep-debug bar foo.debug
zig objcopy -g bar foo.tmp
zig objcopy --add-gnu-debuglink=foo.debug foo.tmp foo
rm foo.tmp
```

or

```
zig objcopy --only-keep-debug bar foo foo.debug
zig objcopy -g --add-gnu-debuglink=foo.debug bar foo
```

or

```
zig objcopy -g --extract-to=foo.debug bar foo
```
2023-03-20 08:39:23 +01:00
Xavier Bouchoux
60d033d1f3 objcopy: fix compilation on 32-bit systems 2023-03-20 08:39:23 +01:00
Xavier Bouchoux
3a700d60ba objcopy: add some support for --strip-debug and --strip-all
Support for the use case:
`zig objcopy --strip-all program stripped --extract-to stripped.dbg`
to separate the debug & symbols sections out to a companion file, with a corresponding a .gnu_debuglink.

note: this is "a minimal effort implementation"
 It doesn't support all possibile elf files: there may be some sections type that need fixups, the program header may need fix up, ...
 It was written for a specific use case (strip debug info to a sperate file, for linux 64-bits executables built with `zig` or `zig c++` )
 It doesn't support 32-bit files, or file with non-native endianess.
2023-03-20 08:39:23 +01:00
Andrew Kelley
ede5dcffea make the build runner and test runner talk to each other
std.Build.addTest creates a CompileStep as before, however, this kind of
step no longer actually runs the unit tests. Instead it only compiles
it, and one must additionally create a RunStep from the CompileStep in
order to actually run the tests.

RunStep gains integration with the default test runner, which now
supports the standard --listen=- argument in order to communicate over
stdin and stdout. It also reports test statistics; how many passed,
failed, and leaked, as well as directly associating the relevant stderr
with the particular test name that failed.

This separation of CompileStep and RunStep means that
`CompileStep.Kind.test_exe` is no longer needed, and therefore has been
removed in this commit.

 * build runner: show unit test statistics in build summary
 * added Step.writeManifest since many steps want to treat it as a
   warning and emit the same message if it fails.
 * RunStep: fixed error message that prints the failed command printing
   the original argv and not the adjusted argv in case an interpreter
   was used.
 * RunStep: fixed not passing the command line arguments to the
   interpreter.
 * move src/Server.zig to std.zig.Server so that the default test runner
   can use it.
 * the simpler test runner function which is used by work-in-progress
   backends now no longer prints to stderr, which is necessary in order
   for the build runner to not print the stderr as a warning message.
2023-03-15 10:48:14 -07:00
Andrew Kelley
a333bb91ff zig objcopy: support the compiler protocol
This commit extracts out server code into src/Server.zig and uses it
both in the main CLI as well as `zig objcopy`.

std.Build.ObjCopyStep now adds `--listen=-` to the CLI for `zig objcopy`
and observes the protocol for progress and other kinds of integrations.

This fixes the last two test failures of this branch when I run
`zig build test` locally.
2023-03-15 10:48:14 -07:00
Andrew Kelley
aeaef8c0ff update std lib and compiler sources to new for loop syntax 2023-02-18 19:17:21 -07:00
Eckhart Köppen
c2d37224c8 zig objcopy: Fix corrupted hex output
Do not return stack local data for hex record payload data.
2023-01-11 21:15:20 +02:00
Andrew Kelley
9be5323e93 add zig objcopy subcommand
This commit moves the logic from `std.build.InstallRawStep` into `zig
objcopy`. The options here are limited, but we can add features as
needed.

closes #9261

New issues can be opened for specific objcopy flag support.
2022-12-13 15:37:52 -05:00