Intel keeps changing the latency & throughput of the aes* and clmul
instructions every time they release a new model.
Adjust `optimal_parallel_blocks` accordingly, keeping 8 as a safe
default for unknown data.
It seems that Apple has finally got rid of the 32bit versions of
`fstat` and `fstatat`, and instead, only 64bit versions are available
on BigSur and Apple Silicon.
The tweak in this commit is required to make Zig stage1 compile on
BigSur + aarch64.
std.os.accept already wants to allow null, which matches `man 3p accept`:
> address Either a null pointer, or a pointer to a sockaddr structure
> where the address of the connecting socket shall be re‐
> turned.
>
> address_len Either a null pointer, if address is a null pointer, or a
> pointer to a socklen_t object which on input specifies the
> length of the supplied sockaddr structure, and on output
> specifies the length of the stored address.
Fixes ziglang#6832.
Gives a ~40% speedup on x86_64.
However, the generic code remains faster on aarch64.
This is still processing only one block at a time for now.
I'm pretty confident that processing more blocks per round
will eventually give a substantial performance improvement on
all platforms with vector units.
The bcrypt function intentionally requires quite a lot of CPU cycles
to complete.
In addition to that, not having its full state constantly in the
CPU L1 cache causes a massive performance drop.
These properties slow down brute-force attacks against low-entropy
inputs (typically passwords), and GPU-based attacks get little
to no advantages over CPUs.
The NaCl constructions are available in pretty much all programming
languages, making them a solid choice for applications that require
interoperability.
Go includes them in the standard library, JavaScript has the popular
tweetnacl.js module, and reimplementations and ports of TweetNaCl
have been made everywhere.
Zig has almost everything that NaCl has at this point, the main
missing component being the Salsa20 cipher, on top on which NaCl's
secretboxes, boxes, and sealedboxes can be implemented.
So, here they are!
And clean the X25519 API up a little bit by the way.