46 Commits

Author SHA1 Message Date
Frank Denis
d5585bc650
Implement threaded BLAKE3 (#25587)
Allows BLAKE3 to be computed using multiple threads.
2025-11-01 07:40:03 +01:00
Frank Denis
6669885aa2
Faster BLAKE3 implementation (#25574)
This is a rewrite of the BLAKE3 implementation, with vectorization.

On Apple Silicon, the new implementation is about twice as fast as the previous one.

With AVX2, it is more than 4 times faster.

With AVX512, it is more than 7.5x faster than the previous implementation (from 678 MB/s to 5086 MB/s).
2025-10-15 14:03:56 +02:00
Andrew Kelley
57dbc9e74a std.Io: delete GenericWriter 2025-08-28 18:30:57 -07:00
Andrew Kelley
e7b18a7ce6 std.crypto: remove inline from most functions
To quote the language reference,

It is generally better to let the compiler decide when to inline a
function, except for these scenarios:

* To change how many stack frames are in the call stack, for debugging
  purposes.
* To force comptime-ness of the arguments to propagate to the return
  value of the function, as in the above example.
* Real world performance measurements demand it. Don't guess!

Note that inline actually restricts what the compiler is allowed to do.
This can harm binary size, compilation speed, and even runtime
performance.

`zig run lib/std/crypto/benchmark.zig -OReleaseFast`
[-before-] vs {+after+}

              md5:        [-990-]        {+998+} MiB/s
             sha1:       [-1144-]       {+1140+} MiB/s
           sha256:       [-2267-]       {+2275+} MiB/s
           sha512:        [-762-]        {+767+} MiB/s
         sha3-256:        [-680-]        {+683+} MiB/s
         sha3-512:        [-362-]        {+363+} MiB/s
        shake-128:        [-835-]        {+839+} MiB/s
        shake-256:        [-680-]        {+681+} MiB/s
   turboshake-128:       [-1567-]       {+1570+} MiB/s
   turboshake-256:       [-1276-]       {+1282+} MiB/s
          blake2s:        [-778-]        {+789+} MiB/s
          blake2b:       [-1071-]       {+1086+} MiB/s
           blake3:       [-1148-]       {+1137+} MiB/s
            ghash:      [-10044-]      {+10033+} MiB/s
          polyval:       [-9726-]      {+10033+} MiB/s
         poly1305:       [-2486-]       {+2703+} MiB/s
         hmac-md5:        [-991-]        {+998+} MiB/s
        hmac-sha1:       [-1134-]       {+1137+} MiB/s
      hmac-sha256:       [-2265-]       {+2288+} MiB/s
      hmac-sha512:        [-765-]        {+764+} MiB/s
      siphash-2-4:       [-4410-]       {+4438+} MiB/s
      siphash-1-3:       [-7144-]       {+7225+} MiB/s
   siphash128-2-4:       [-4397-]       {+4449+} MiB/s
   siphash128-1-3:       [-7281-]       {+7374+} MiB/s
  aegis-128x4 mac:      [-73385-]      {+74523+} MiB/s
  aegis-256x4 mac:      [-30160-]      {+30539+} MiB/s
  aegis-128x2 mac:      [-66662-]      {+67267+} MiB/s
  aegis-256x2 mac:      [-16812-]      {+16806+} MiB/s
   aegis-128l mac:      [-33876-]      {+34055+} MiB/s
    aegis-256 mac:       [-8993-]       {+9087+} MiB/s
         aes-cmac:       2036 MiB/s
           x25519:      [-20670-]      {+16844+} exchanges/s
          ed25519:      [-29763-]      {+29576+} signatures/s
       ecdsa-p256:       [-4762-]       {+4900+} signatures/s
       ecdsa-p384:       [-1465-]       {+1500+} signatures/s
  ecdsa-secp256k1:       [-5643-]       {+5769+} signatures/s
          ed25519:      [-21926-]      {+21721+} verifications/s
          ed25519:      [-51200-]      {+50880+} verifications/s (batch)
 chacha20Poly1305:       [-1189-]       {+1109+} MiB/s
xchacha20Poly1305:       [-1196-]       {+1107+} MiB/s
 xchacha8Poly1305:       [-1466-]       {+1555+} MiB/s
 xsalsa20Poly1305:        [-660-]        {+620+} MiB/s
      aegis-128x4:      [-76389-]      {+78181+} MiB/s
      aegis-128x2:      [-53946-]      {+53495+} MiB/s
       aegis-128l:      [-27219-]      {+25621+} MiB/s
      aegis-256x4:      [-49351-]      {+49542+} MiB/s
      aegis-256x2:      [-32390-]      {+32366+} MiB/s
        aegis-256:       [-8881-]       {+8944+} MiB/s
       aes128-gcm:       [-6095-]       {+6205+} MiB/s
       aes256-gcm:       [-5306-]       {+5427+} MiB/s
       aes128-ocb:       [-8529-]      {+13974+} MiB/s
       aes256-ocb:       [-7241-]       {+9442+} MiB/s
        isapa128a:        [-204-]        {+214+} MiB/s
    aes128-single:  [-133857882-]  {+134170944+} ops/s
    aes256-single:   [-96306962-]   {+96408639+} ops/s
         aes128-8: [-1083210101-] {+1073727253+} ops/s
         aes256-8:  [-762042466-]  {+767091778+} ops/s
           bcrypt:      0.009 s/ops
           scrypt:      [-0.018-]      {+0.017+} s/ops
           argon2:      [-0.037-]      {+0.060+} s/ops
      kyber512d00:     [-206057-]     {+205779+} encaps/s
      kyber768d00:     [-156074-]     {+150711+} encaps/s
     kyber1024d00:     [-116626-]     {+115469+} encaps/s
      kyber512d00:     [-181149-]     {+182046+} decaps/s
      kyber768d00:     [-136965-]     {+135676+} decaps/s
     kyber1024d00:     [-101307-]     {+100643+} decaps/s
      kyber512d00:     [-123624-]     {+123375+} keygen/s
      kyber768d00:      [-69465-]      {+70828+} keygen/s
     kyber1024d00:      [-43117-]      {+43208+} keygen/s
2025-07-13 18:26:13 +02:00
Andrew Kelley
9f27d770a1 std.io: deprecated Reader/Writer; introduce new API 2025-07-07 22:43:51 -07:00
Jacob Young
4fcc750ba5 x86_64: implement more shuffles 2024-02-25 11:22:10 +01:00
Jacob Young
2fdc9e6ae8 x86_64: implement @shuffle 2024-02-25 11:22:10 +01:00
mlugg
51595d6b75
lib: correct unnecessary uses of 'var' 2023-11-19 09:55:07 +00:00
Andrew Kelley
3fc6fc6812 std.builtin.Endian: make the tags lower case
Let's take this breaking change opportunity to fix the style of this
enum.
2023-10-31 21:37:35 -04:00
Jacob Young
d890e81761 mem: fix ub in writeInt
Use inline to vastly simplify the exposed API.  This allows a
comptime-known endian parameter to be propogated, making extra functions
for a specific endianness completely unnecessary.
2023-10-31 21:37:35 -04:00
Jacob Young
fe93332ba2 x86_64: implement enough to pass unicode tests
* implement vector comparison
 * implement reduce for bool vectors
 * fix `@memcpy` bug
 * enable passing std tests
2023-10-23 22:42:18 -04:00
Jacob Young
27fe945a00 Revert "Revert "Merge pull request #17637 from jacobly0/x86_64-test-std""
This reverts commit 6f0198cadbe29294f2bf3153a27beebd64377566.
2023-10-22 15:46:43 -04:00
Andrew Kelley
6f0198cadb Revert "Merge pull request #17637 from jacobly0/x86_64-test-std"
This reverts commit 0c99ba1eab63865592bb084feb271cd4e4b0357e, reversing
changes made to 5f92b070bf284f1493b1b5d433dd3adde2f46727.

This caused a CI failure when it landed in master branch due to a
128-bit `@byteSwap` in std.mem.
2023-10-22 12:16:35 -07:00
Jacob Young
2e6e39a700 x86_64: fix bugs and disable erroring tests 2023-10-21 10:55:41 -04:00
mlugg
f26dda2117 all: migrate code to new cast builtin syntax
Most of this migration was performed automatically with `zig fmt`. There
were a few exceptions which I had to manually fix:

* `@alignCast` and `@addrSpaceCast` cannot be automatically rewritten
* `@truncate`'s fixup is incorrect for vectors
* Test cases are not formatted, and their error locations change
2023-06-24 16:56:39 -07:00
r00ster91
2593156068 migration: std.math.{min, min3, max, max3} -> @min & @max 2023-06-16 13:44:09 -07:00
Andrew Kelley
6261c13731 update codebase to use @memset and @memcpy 2023-04-28 13:24:43 -07:00
Jacob Young
2770159606 std: reenable vectorized code with the C backend 2023-03-06 08:09:32 -05:00
Andrew Kelley
aeaef8c0ff update std lib and compiler sources to new for loop syntax 2023-02-18 19:17:21 -07:00
Andrew Kelley
f0530385b5 update existing behavior tests and std lib to new for loop semantics 2023-02-18 19:17:21 -07:00
Jacob Young
93d60d0de7 std: avoid vector usage with the C backend
Vectors are not yet implemented in the C backend, so no reason to
prevent code using the standard library from compiling in the meantime.
2022-11-01 20:38:37 -04:00
Veikka Tuominen
6d44c0a16c std: update tests to stage2 semantics 2022-06-03 20:21:20 +03:00
Andrew Kelley
7be340e3cc std.crypto.blake3: use @Vector instead of std.meta.Vector 2022-03-27 14:53:40 -07:00
Meghan
c84147f90d
std: add writer methods on all crypto.hash types (#10168) 2021-11-20 01:37:17 -08:00
Andrew Kelley
6115cf2240 migrate from std.Target.current to @import("builtin").target
closes #9388
closes #9321
2021-10-04 23:48:55 -07:00
Ali Chraghi
db181b173f
Update hash & crypto benchmarks run comment (#9790)
* sync function arguments name with other same functions
2021-09-19 23:03:18 -07:00
Andrew Kelley
d29871977f remove redundant license headers from zig standard library
We already have a LICENSE file that covers the Zig Standard Library. We
no longer need to remind everyone that the license is MIT in every single
file.

Previously this was introduced to clarify the situation for a fork of
Zig that made Zig's LICENSE file harder to find, and replaced it with
their own license that required annual payments to their company.
However that fork now appears to be dead. So there is no need to
reinforce the copyright notice in every single file.
2021-08-24 12:25:09 -07:00
Jacob G-W
9fffffb07b fix code broken from previous commit 2021-06-21 17:03:03 -07:00
Isaac Freund
5b850d5c92
Run zig fmt on src/ and lib/std/
This replaces callconv(.Inline) with the more idiomatic inline keyword.
2021-05-20 17:14:18 +02:00
Veikka Tuominen
fd77f2cfed std: update usage of std.testing 2021-05-08 15:15:30 +03:00
LemonBoy
057bf1afc9 std: Add more error checking in hexToBytes
Prevent the function from turning into an endless loop that may or may
not perform OOB accesses.
2021-02-21 12:19:03 +02:00
Tadeo Kondrak
5dfe0e7e8f
Convert inline fn to callconv(.Inline) everywhere 2021-02-10 20:06:12 -07:00
Frank Denis
6c2e0c2046 Year++ 2020-12-31 15:45:24 -08:00
Frank Denis
4417206230 Now that they support vectors, use math.rot{l,r} 2020-11-05 17:19:48 -05:00
Frank Denis
72064eba23 std/crypto: vectorize BLAKE3
Gives a ~40% speedup on x86_64.

However, the generic code remains faster on aarch64.

This is still processing only one block at a time for now.

I'm pretty confident that processing more blocks per round
will eventually give a substantial performance improvement on
all platforms with vector units.
2020-10-25 21:13:14 -04:00
Frank Denis
fa17447090 std/crypto: make the whole APIs more consistent
- use `PascalCase` for all types. So, AES256GCM is now Aes256Gcm.
- consistently use `_length` instead of mixing `_size` and `_length` for the
constants we expose
- Use `minimum_key_length` when it represents an actual minimum length.
Otherwise, use `key_length`.
- Require output buffers (for ciphertexts, macs, hashes) to be of the right
size, not at least of that size in some functions, and the exact size elsewhere.
- Use a `_bits` suffix instead of `_length` when a size is represented as a
number of bits to avoid confusion.
- Functions returning a constant-sized slice are now defined as a slice instead
of a pointer + a runtime assertion. This is the case for most hash functions.
- Use `camelCase` for all functions instead of `snake_case`.

No functional changes, but these are breaking API changes.
2020-10-17 18:53:08 -04:00
Frank Denis
fc55cd458a Hash functions now accept an option set
- This avoids having multiple `init()` functions for every combination
of optional parameters
- The API is consistent across all hash functions
- New options can be added later without breaking existing applications.
  For example, this is going to come in handy if we implement parallelization
  for BLAKE2 and BLAKE3.
- We don't have a mix of snake_case and camelCase functions any more, at
least in the public crypto API

Support for BLAKE2 salt and personalization (more commonly called context)
parameters have been implemented by the way to illustrate this.
2020-08-21 00:51:14 +02:00
Frank Denis
446597bd3c Remove the reset() function from hash functions
Justification:
- reset() is unnecessary; states that have to be reused can be copied
- reset() is error-prone. Copying a previous state prevents forgetting
  struct members.
- reset() forces implementation to store sensitive data (key, initial state)
  in memory even when they are not needed.
- reset() is confusing as it has a different meaning elsewhere in Zig.
2020-08-20 23:02:10 +02:00
Frank Denis
f92a5d7944 Repair crypto/benchmark; add BLAKE2b256
Some MACs have a 64-bit output
2020-08-20 23:02:10 +02:00
Andrew Kelley
4a69b11e74 add license header to all std lib files
add SPDX license identifier
copyright ownership is zig contributors
2020-08-20 16:07:04 -04:00
Vexu
85fd484f07
std: fix blake3 assignment to constant 2020-05-04 14:45:36 +03:00
Jay Petacat
cb2c14e03f blake3: Workaround issue #4373 with named types 2020-02-02 18:44:50 -05:00
Jay Petacat
923e567c6d blake3: Replace &arr with arr[0..] for slice args 2020-02-02 14:59:36 -05:00
Jay Petacat
b143fc0d32 blake3: Name and const pointer refinements 2020-02-02 14:42:57 -05:00
Jay Petacat
d098e212ad blake3: Convert *const [n]u8 types to [n]u8
I do not see many cases of constant pointers to arrays in the stdlib.
In fact, this makes the code run a little faster, probably because Zig
automatically converts to pointers where it makes sense.
2020-02-02 14:08:10 -05:00
Jay Petacat
4b86c1e3bb crypto: Add BLAKE3 hashing algorithm
This is a translation of the [official reference implementation][1] with
few other changes. The bad news is that the reference implementation is
designed for simplicity and not speed, so there's a lot of room for
performance improvement. The good news is that, according to the crypto
benchmark, the implementation is still fast relative to the other
hashing algorithms:

```
         md5: 430 MiB/s
        sha1: 386 MiB/s
      sha256: 191 MiB/s
      sha512: 275 MiB/s
    sha3-256: 233 MiB/s
    sha3-512: 137 MiB/s
     blake2s: 464 MiB/s
     blake2b: 526 MiB/s
      blake3: 576 MiB/s
    poly1305: 1479 MiB/s
    hmac-md5: 653 MiB/s
   hmac-sha1: 553 MiB/s
 hmac-sha256: 222 MiB/s
      x25519: 8685 exchanges/s
```

[1]: https://github.com/BLAKE3-team/BLAKE3
2020-02-01 23:03:23 -05:00