Introduces std.zig.ErrorBundle, which is a trivially serializable set
of compilation errors. This is in the standard library so that both
the compiler and the build runner can use it. The idea is they will
use it to communicate compilation errors over a binary protocol.
The binary encoding of ErrorBundle is a bit problematic - I got a little
too aggressive with compaction. I need to change it in a follow-up
commit to use some indirection in the error message list, otherwise
iteration is too unergonomic. In fact it's so problematic right now that
the logic in getAllErrorsAlloc() actually fails to produce a viable
ErrorBundle, because it puts SourceLocation data in between the root-level
ErrorMessage data.
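To give a rough idea of the shape (a hedged sketch, not the exact field
layout): everything lives in flat arrays of integers and string bytes,
referenced by index, so the whole bundle can be written and read without
any pointer fixups.

    const std = @import("std");

    // Hedged sketch of the idea, not the exact ErrorBundle layout: error
    // messages and source locations are encoded as integers in `extra`,
    // and all strings live in `string_bytes`, referenced by byte offset.
    // Serializing the bundle then amounts to writing out the two arrays.
    const Bundle = struct {
        string_bytes: []const u8,
        extra: []const u32,

        const MessageIndex = u32; // index into `extra`
        const StringIndex = u32; // byte offset into `string_bytes`

        fn nullTerminatedString(b: Bundle, index: StringIndex) [:0]const u8 {
            const slice = b.string_bytes[index..];
            const end = std.mem.indexOfScalar(u8, slice, 0).?;
            return slice[0..end :0];
        }
    };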
This commit has a simplification - redundant logic for rendering AST
errors to stderr has been removed in favor of moving the logic for
lowering AST errors into AstGen. So even if we get parse errors, the
errors will get lowered into ZIR before being reported. I believe this
will be useful when working on --autofix. Either way, some redundant
brittle logic was happily deleted.
In Compilation, updateSubCompilation() is improved to properly perform
error reporting when a sub-compilation object fails. It no longer dumps
directly to stderr; instead it populates an ErrorBundle object, which
gets added to the parent one during getAllErrorsAlloc().
The package fetching code likewise no longer dumps directly to stderr; it
now populates an ErrorBundle object, which gets properly reported at the
CLI layer of abstraction.
- improve fn prototypes of process_vm_writev
- make the memory writable in the ELF file
- force the linker to always append the function
- write updates with process_vm_writev
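A rough sketch of the last point, using the raw syscall so the example is
self-contained (the IoVec struct and the error handling are local to this
sketch, not the std wrappers touched by this change):

    const std = @import("std");
    const linux = std.os.linux;

    // Illustrative only: push new bytes into a running child process at a
    // known address via the process_vm_writev(2) syscall.
    const IoVec = extern struct { base: usize, len: usize };

    fn writeRemote(pid: i32, remote_addr: usize, bytes: []const u8) !void {
        const local = [1]IoVec{.{ .base = @intFromPtr(bytes.ptr), .len = bytes.len }};
        const remote = [1]IoVec{.{ .base = remote_addr, .len = bytes.len }};
        const rc = linux.syscall6(
            .process_vm_writev,
            @as(usize, @intCast(pid)),
            @intFromPtr(&local),
            local.len,
            @intFromPtr(&remote),
            remote.len,
            0, // flags
        );
        // An errno is encoded as a small negative value in the returned usize.
        if (@as(isize, @bitCast(rc)) < 0) return error.ProcessVmWriteFailed;
    }

A real implementation would also check the returned byte count for partial
writes.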
With this commit, the build runner now communicates progress towards
completion of the step graph to the terminal, while also printing the
stderr of child processes as soon as possible, without clobbering each
other, and without clobbering the CLI progress output.
API users can take advantage of this to freely write to a terminal that
has an ongoing progress display, similar to what Ninja does when a
warning or error message is printed while compiling C/C++ objects.
* Step.init() now takes an options struct
* Step.init() now captures a small stack trace and stores it in the
Step so that it can be accessed when printing user-friendly debugging
information, including the lines of code that created the step in
question.
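A minimal sketch of the second point (illustrative, not the actual
std.Build.Step code): capture a few return addresses when the step is
created, so they can be symbolized later.

    const std = @import("std");

    const StepDebugInfo = struct {
        addresses: [8]usize = [1]usize{0} ** 8,

        fn capture(first_ret_addr: usize) StepDebugInfo {
            var info = StepDebugInfo{};
            var trace = std.builtin.StackTrace{
                .index = 0,
                .instruction_addresses = &info.addresses,
            };
            std.debug.captureStackTrace(first_ret_addr, &trace);
            return info;
        }
    };

A step's init function can call StepDebugInfo.capture(@returnAddress()) so
that the captured trace starts at the build script line that created the
step.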
Child process output is now collected instead of being dumped directly to
stderr. This prevents processes running simultaneously from racing their
stderr against each other.
For now failures are only reported at the end, but an improvement would be
to report a failed step as soon as it occurs.
After sorting the step stack so that dependencies can be popped before
their dependants are popped, there is still a situation left to handle
correctly:
Example:

    A depends on:
        B
        C
    D depends on:
        E
        F

They will be ordered like this: A B C D E F
If there are 6+ cores, then all of them will be evaluated at once,
incorrectly evaluating A and D before their dependencies.
Starting evaluation of F and then E is correct, but waiting until they
are done is not correct because it should start working on B and C as
well.
This commit solves the problem by computing dependants in the dependency
loop checking logic, and then having workers queue up their dependants
when they finish their own work.
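A rough sketch of that idea (not the actual build runner code): each step
tracks how many of its dependencies are still pending, and the worker that
finishes the last pending dependency queues the dependant.

    // Illustrative only. In the real runner the counter updates need to be
    // atomic (or lock-protected), since several workers may finish
    // dependencies of the same step concurrently.
    const Step = struct {
        pending_deps: usize,
        dependants: []*Step,
    };

    fn onStepFinished(finished: *Step, startWorking: *const fn (*Step) void) void {
        for (finished.dependants) |dependant| {
            dependant.pending_deps -= 1;
            if (dependant.pending_deps == 0) {
                // Its last dependency just completed; hand it to a worker.
                startWorking(dependant);
            }
        }
    }

In the example above, B, C, E, and F start immediately, while A and D each
start only once both of their dependencies have finished.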
Implementation of the IND-CCA2 post-quantum secure key encapsulation
mechanism (KEM) CRYSTALS-Kyber, as submitted to the third round of the
NIST Post-Quantum Cryptography project (v3.02/"draft00") and selected for
standardisation.
Co-authored-by: Frank Denis <124872+jedisct1@users.noreply.github.com>
* Add configurable side channels mitigations; enable them on soft AES
Our software AES implementation doesn't have any mitigations against
side channels.
Go's generic implementation is not protected at all either, and even
OpenSSL only has minimal mitigations.
Full mitigations against cache-based attacks (bitslicing, fixslicing)
come at a huge performance cost, making AES-based primitives pretty
much useless for many applications. They also don't offer any
protection against other classes of side channel attacks.
In practice, partially protected or even unprotected implementations
are not as bad as that sounds. Exploiting these side channels requires
an attacker that is able to submit many plaintexts/ciphertexts and
perform accurate measurements. Noisy measurements can still be
exploited, but require a significant number of attempts. Whether this
is exploitable or not depends on the platform, the application, and the
attacker's proximity.
So, some libraries made the choice of minimal mitigations and some
use better mitigations in spite of the performance hit. It's a
tradeoff (security vs. performance), and there's no one-size-fits-all
implementation.
What applies to AES applies to other cryptographic primitives.
For example, RSA signatures are very sensitive to fault attacks,
regardless of whether they use the CRT or not. A mitigation is to verify
every produced signature. That also comes with a performance cost.
Whether to do it or not depends on whether fault attacks are part of
the threat model.
Thanks to Zig's comptime, we can try to address these different
requirements.
This PR adds a `side_channels_protection` global that can later
be complemented with `fault_attacks_protection` and possibly other
knobs.
It can have 4 different values:
- `none`: which doesn't enable additional mitigations.
"Additional", because mitigations that don't have a big performance
cost are always kept. For example, checking authentication tags
will still be done in constant time.
- `basic`: which enables mitigations protecting against attacks in
a common scenario, where an attacker doesn't have physical access to
the device, cannot run arbitrary code on the same thread, and cannot
conduct brute-force attacks without being throttled.
- `medium`: which enables additional mitigations, offering practical
protection in a shared environment.
- `full`: which enables all the mitigations we have.
The tradeoff is that the more mitigations we enable, the bigger the
performance hit will be. But this lets applications choose what's
best for their use case.
`medium` is the default.
Currently, this only affects software AES, but that setting can
later be used by other primitives.
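Roughly, the knob can be thought of like this (a sketch only; how the
setting is actually exposed to applications is not shown here and may
differ):

    // The values mirror the levels described above.
    pub const SideChannelsProtection = enum { none, basic, medium, full };

    // Assumed to be overridable by the application; `medium` is the default.
    pub const side_channels_protection: SideChannelsProtection = .medium;

    // Because the level is known at compile time, primitives can select an
    // implementation with zero runtime overhead in the `none` case.
    const use_hardened_lookups = side_channels_protection != .none;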
For AES, our implementation is a traditional table-based one, with four
tables of 32-bit entries and an sbox.
Lookups in those tables have been replaced by function calls. These
functions can add a configurable noise level, making cache-based
attacks more difficult to conduct.
In the `none` mitigation level, the behavior is exactly the same
as before. Performance also remains the same.
In other levels, we compress the T tables into a single one, and
read data from multiple cache lines (all of them in `full` mode),
for all bytes in parallel. More precise measurements and way more
attempts become necessary in order to find correlations.
In addition, we use distinct copies of the sbox for key expansion
and encryption, so that they don't share the same L1 cache entries.
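A sketch of the multi-cache-line read (illustrative, not the actual
std.crypto code): every cache line of the table is touched, and the value
from the line that actually holds the wanted entry is selected with a
mask, so which line is accessed does not depend on the secret index.

    fn protectedLookup(comptime T: type, table: *const [256]T, index: u8) T {
        const per_line = 64 / @sizeOf(T); // table entries per 64-byte cache line
        var result: T = 0;
        var line_start: usize = 0;
        while (line_start < table.len) : (line_start += per_line) {
            // Unconditionally read one entry from this cache line.
            const value = table[line_start + index % per_line];
            // All-ones mask when this line holds table[index], zero otherwise.
            const hit: u1 = @intFromBool(index / per_line == line_start / per_line);
            result |= value & (@as(T, 0) -% hit);
        }
        return result;
    }

In `full` mode every line is read like this; lower levels can read fewer
lines per lookup to trade some protection for speed.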
The best known attacks target the first two AES rounds, or the last
one.
While future attacks may improve on this, AES achieves full
diffusion after 4 rounds. So, we can relax the mitigations after
that. This is what this implementation does, re-enabling the mitigations
for the last two rounds.
In `full` mode, all the rounds are protected.
The protection assumes that the offset of a lookup within a cache line
remains secret. The CacheBleed attack showed that this assumption can be
broken, but that requires an attacker able to abuse hyperthreading and
run code on the same core as the encryption, which is rarely a
practical scenario.
Still, the current AES API allows us to transparently switch to
using fixslicing/bitslicing later when the `full` mitigation level
is enabled.
* Software AES: use little-endian representation.
Virtually all platforms are little-endian these days, so optimizing
for big-endian CPUs doesn't make sense any more.
Apple M1/M2 have an EOR3 instruction that can XOR three operands at
once, and LLVM knows how to take advantage of it.
However, two EOR instructions can't be automatically combined into an
EOR3 if one of them is in an assembly block.
That simple change speeds up ciphers that do an AES round immediately
followed by an XOR operation on Apple Silicon.
Before:
aegis-128l mac: 12534 MiB/s
aegis-256 mac: 6722 MiB/s
aegis-128l: 10634 MiB/s
aegis-256: 6133 MiB/s
aes128-gcm: 3890 MiB/s
aes256-gcm: 3122 MiB/s
aes128-ocb: 2832 MiB/s
aes256-ocb: 2057 MiB/s
After:
aegis-128l mac: 15667 MiB/s
aegis-256 mac: 8240 MiB/s
aegis-128l: 12656 MiB/s
aegis-256: 7214 MiB/s
aes128-gcm: 3976 MiB/s
aes256-gcm: 3202 MiB/s
aes128-ocb: 2835 MiB/s
aes256-ocb: 2118 MiB/s
Today I found out that posix_spawn is trash. It's actually implemented
on top of fork/exec inside of libc (or libSystem in the case of macOS).
So, anything posix_spawn can do, we can do better. In particular, what
we can do better is handle spawning of child processes that are
potentially foreign binaries. If you try to spawn a wasm binary, for
example, posix_spawn does the following:
* Goes ahead and creates a child process.
* The child process writes "foo.wasm: foo.wasm: cannot execute binary file"
to stderr (yes, it prints the filename twice).
* The child process then exits with code 126.
This behavior is indistinguishable from the binary being successfully
spawned, and then printing to stderr, and exiting with a failure -
something that is an extremely common occurrence.
Meanwhile, using the lower-level fork/exec will simply return the ENOEXEC
code from the execve syscall (which is mapped to the Zig error.InvalidExe).
The posix_spawn behavior means the zig build runner can't tell the
difference between a failure to run a foreign binary, and a binary that
did run, but failed in some other fashion. This is unacceptable, because
attempting to execve is the proper way to support things like Rosetta.
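A minimal sketch of that mapping (illustrative, not the actual
child-process code), assuming the errno of the failed execve has been
reported back to the parent, e.g. over a pipe:

    const std = @import("std");

    // Sketch only: turn the errno from a failed execve into a named error,
    // so the caller can distinguish "not runnable here" from a child that
    // ran and failed on its own.
    fn mapExecveErrno(errno: std.posix.E) error{ InvalidExe, Unexpected } {
        return switch (errno) {
            .NOEXEC => error.InvalidExe, // foreign or malformed binary
            else => error.Unexpected,
        };
    }

With posix_spawn, the exec failure happens inside the child and is only
visible as stderr output plus exit code 126, as described above.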
The TurboSHAKE paper just got published:
https://eprint.iacr.org/2023/342.pdf
and, unlike the previous K12 paper, it suggests 0x1F instead of 0x01
as the default value for the domain separation byte "D".
* Fix SHA3 with streaming
Leftover bytes should be added to the buffer, not to the state
(or, always to the state; we can and probably should eventually get
rid of the buffer). A sketch of the buffering pattern follows below.
Fixes #14851
* Add a test for SHA-3 with streaming
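A generic sketch of the buffering pattern the fix restores (illustrative;
this is not the actual std.crypto.sha3 code, and the Keccak state handling
is omitted):

    const Hasher = struct {
        const rate = 136; // rate in bytes for SHA3-256, for illustration

        buf: [rate]u8 = undefined,
        buf_len: usize = 0,

        // XOR a full rate-sized block into the Keccak state and permute
        // (omitted in this sketch).
        fn absorbBlock(self: *Hasher, block: *const [rate]u8) void {
            _ = self;
            _ = block;
        }

        fn update(self: *Hasher, data: []const u8) void {
            var bytes = data;
            // First, top up a previously buffered partial block.
            if (self.buf_len > 0) {
                const take = @min(rate - self.buf_len, bytes.len);
                @memcpy(self.buf[self.buf_len..][0..take], bytes[0..take]);
                self.buf_len += take;
                bytes = bytes[take..];
                if (self.buf_len == rate) {
                    self.absorbBlock(&self.buf);
                    self.buf_len = 0;
                }
            }
            // Absorb complete blocks directly.
            while (bytes.len >= rate) {
                self.absorbBlock(bytes[0..rate]);
                bytes = bytes[rate..];
            }
            // Leftover bytes go to the buffer, not to the state.
            @memcpy(self.buf[self.buf_len..][0..bytes.len], bytes);
            self.buf_len += bytes.len;
        }
    };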
The POSIX man page for mmap lists EMFILE, the Linux man page ENFILE.
Also posix says "The mmap() function adds an extra reference to the file
associated with the file descriptor fildes which is not removed by a
subsequent close() on that file descriptor. This reference is removed
when there are no more mappings to the file."
It sounds counter-intuitive that a process limit, but no system limit,
can be exceeded.
As far as I understand, fildes is only used for file-descriptor-backed
mmaps.
On Linux, when interacting with the virtual file system, writing an
invalid value to a file makes the OS return errno 22 (EINVAL).
Instead of triggering an unreachable, this change now returns a
newly introduced error.InvalidArgument.
On Linux, when writing to various files in the virtual file system,
for example under /sys/fs/cgroup, writing an invalid value to a file
yields errno 16 (EBUSY).
This change allows these specific cases to be caught instead of
being lumped together in UnexpectedError.
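Both changes amount to handling another case in the errno switch of the
write path; roughly (the DeviceBusy name for the errno 16 case is a
placeholder, not necessarily the name std ends up using):

    const std = @import("std");

    fn handleWriteErrno(e: std.posix.E) error{ InvalidArgument, DeviceBusy, Unexpected }!void {
        switch (e) {
            .SUCCESS => return,
            .INVAL => return error.InvalidArgument, // was: unreachable
            .BUSY => return error.DeviceBusy, // was lumped into UnexpectedError
            else => return error.Unexpected,
        }
    }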
Previously, this API had pid, to be used on POSIX systems, and handle,
to be used on Windows.
This commit unifies the API, defining an Id type that is either the pid
or the HANDLE depending on the target OS.
This commit also prepares for the future by allowing one to import the
type as `std.process.Child`, which is the fully qualified namespace that
I intend to migrate to.
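A minimal sketch of the unified type (only the overall shape is implied
by this commit):

    const builtin = @import("builtin");
    const std = @import("std");

    // One Id type: a process handle on Windows, a pid everywhere else, so
    // callers no longer have to branch on the OS themselves.
    pub const Id = if (builtin.os.tag == .windows)
        std.os.windows.HANDLE
    else
        std.posix.pid_t;

Code that previously had to pick between the pid and handle fields can
then hold a single Id value.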