Ascon is the family of cryptographic constructions standardized by NIST
for lightweight cryptography.
The Zig standard library already included the Ascon permutation itself,
but higher-level constructions built on top of it were intentionally
postponed until NIST released the final specification.
That specification has now been published as NIST SP 800-232:
https://csrc.nist.gov/pubs/sp/800/232/final
With this publication, we can now confidently include these constructions
in the standard library.
The Zig standard library lacked schemes that resist nonce reuse.
AES-SIV and AES-GCM-SIV are the standard options for this.
AES-GCM-SIV can be very useful when Zig is used to target embedded
systems, and AES-SIV is especially useful for key wrapping.
Also take it as an opportunity to add a bunch of test vectors to
modes.ctr and make sure it works with block ciphers whose size is
not 16.
Basically everything that has a direct replacement or no uses left.
Notable omissions:
- std.ArrayHashMap: Too much fallout, needs a separate cleanup.
- std.debug.runtime_safety: Too much fallout.
- std.heap.GeneralPurposeAllocator: Lots of references to it remain, not
a simple find and replace as "debug allocator" is not equivalent to
"general purpose allocator".
- std.io.Reader: Is being reworked at the moment.
- std.unicode.utf8Decode(): No replacement, needs a new API first.
- Manifest backwards compat options: Removal would break test data used
by TestFetchBuilder.
- panic handler needs to be a namespace: Many tests still rely on it
being a function, needs a separate cleanup.
std.crypto: add constant-time codecs
Add constant-time hex/base64 codecs designed to process cryptographic
secrets, adapted from libsodium's implementations.
Introduce a `crypto.codecs` namespace for crypto-related encoders and
decoders. Move ASN.1 codecs to this namespace.
This will also naturally accommodate the proposed PEM codecs.
* std.crypto.aes: introduce AES block vectors
Modern Intel CPUs with the VAES extension can handle more than a
single AES block per instruction.
So can some ARM and RISC-V CPUs. Software implementations with
bitslicing can also greatly benefit from this.
Implement low-level operations on AES block vectors, and the
parallel AEGIS variants on top of them.
AMD Zen4:
aegis-128x4: 73225 MiB/s
aegis-128x2: 51571 MiB/s
aegis-128l: 25806 MiB/s
aegis-256x4: 46742 MiB/s
aegis-256x2: 30227 MiB/s
aegis-256: 8436 MiB/s
aes128-gcm: 5926 MiB/s
aes256-gcm: 5085 MiB/s
AES-GCM, and anything based on AES-CTR are also going to benefit
from this later.
* Make AEGIS-MAC twice a fast
std.crypto has quite a few instances of breaking naming conventions.
This is the beginning of an effort to address that.
Deprecates `std.crypto.utils`.
Add module for mapping ASN1 types to Zig types. See
`asn1.Tag.fromZig` for the mapping. Add DER encoder and decoder.
See `asn1/test.zig` for example usage of every ASN1 type.
This implementation allows ASN1 tags to be overriden with `asn1_tag`
and `asn1_tags`:
```zig
const MyContainer = (enum | union | struct) {
field: u32,
pub const asn1_tag = asn1.Tag.init(...);
// This specifies a tag's class, and if explicit, additional encoding
// rules.
pub const asn1_tags = .{
.field = asn1.FieldTag.explicit(0, .context_specific),
};
};
```
Despite having an enum tag type, ASN1 frequently uses OIDs as enum
values. This is supported via an `pub const oids` field.
```zig
const MyEnum = enum {
a,
pub const oids = asn1.Oid.StaticMap(MyEnum).initComptime(.{
.a = "1.2.3.4",
});
};
```
Futhermore, a container may choose to implement encoding and decoding
however it deems fit. This allows for derived fields since Zig has a far
more powerful type system than ASN1.
```zig
// ASN1 has no standard way of tagging unions.
const MyContainer = union(enum) {
derived: PowerfulZigType,
const WeakAsn1Type = ...;
pub fn encodeDer(self: MyContainer, encoder: *der.Encoder) !void {
try encoder.any(WeakAsn1Type{...});
}
pub fn decodeDer(decoder: *der.Decoder) !MyContainer {
const weak_asn1_type = try decoder.any(WeakAsn1Type);
return .{ .derived = PowerfulZigType{...} };
}
};
```
An unfortunate side-effect is that decoding and encoding cannot have
complete complete error sets unless we limit what errors users may
return. Luckily, PKI ASN1 types are NOT recursive so the inferred
error set should be sufficient.
Finally, other encodings are possible, but this patch only implements
a buffered DER encoder and decoder.
In an effort to keep the changeset minimal this PR does not actually
use the DER parser for stdlib PKI, but a tested example of how it may
be used for Certificate is available
[here.](https://github.com/clickingbuttons/asn1/blob/69c5709d/src/Certificate.zig)
Closes#19775.
A lot of these "shorthand" doc comments were redundant, low quality
filler content. Better to let the actual modules speak for themselves
with top level doc comments rather than trying to document their
aliases.
ML-KEM is the Kyber post-quantum secure key encapsulation mechanism,
as being standardized by NIST.
Too bad, they decided to rename it; the "Kyber" name was so much
better!
This implements the current draft (NIST FIPS-203), which is already
being deployed even though the specification is not finalized.
This reverts commit 0c99ba1eab63865592bb084feb271cd4e4b0357e, reversing
changes made to 5f92b070bf284f1493b1b5d433dd3adde2f46727.
This caused a CI failure when it landed in master branch due to a
128-bit `@byteSwap` in std.mem.
A minimal set of simple, safe functions for Montgomery arithmetic,
designed for cryptographic primitives.
Also update the current RSA cert validation to use it, getting rid
of the FixedBuffer hack and the previous limitations.
Make the check of the RSA public key a little bit more strict by
the way.
These are great permutations, and there's nothing wrong with them
from a practical security perspective.
However, both were competing in the NIST lightweight crypto
competition.
Gimli didn't pass the 3rd selection round, and is not much used
in the wild besides Zig and libhydrogen. It will never be
standardized and is unlikely to get more traction in the future.
Xoodyak, that Xoodoo is the permutation of, was a finalist.
It has a lot of advantages and *might* be standardized without NIST.
But this is too early to tell, and too risky to commit to it
in a standard library.
For lightweight crypto, Ascon is the one that we know NIST will
standardize and that we can safely rely on from a usage perspective.
Switch to a traditional ChaCha-based CSPRNG, with an Ascon-based one
as an option for constrained systems.
Add a RNG benchmark by the way.
Gimli and Xoodoo served us well. Their code will be maintained,
but outside the standard library.
Implementation of the IND-CCA2 post-quantum secure key encapsulation
mechanism (KEM) CRYSTALS-Kyber, as submitted to the third round of the NIST
Post-Quantum Cryptography (v3.02/"draft00"), and selected for standardisation.
Co-authored-by: Frank Denis <124872+jedisct1@users.noreply.github.com>
* Add configurable side channels mitigations; enable them on soft AES
Our software AES implementation doesn't have any mitigations against
side channels.
Go's generic implementation is not protected at all either, and even
OpenSSL only has minimal mitigations.
Full mitigations against cache-based attacks (bitslicing, fixslicing)
come at a huge performance cost, making AES-based primitives pretty
much useless for many applications. They also don't offer any
protection against other classes of side channel attacks.
In practice, partially protected, or even unprotected implementations
are not as bad as it sounds. Exploiting these side channels requires
an attacker that is able to submit many plaintexts/ciphertexts and
perform accurate measurements. Noisy measurements can still be
exploited, but require a significant amount of attempts. Wether this
is exploitable or not depends on the platform, application and the
attacker's proximity.
So, some libraries made the choice of minimal mitigations and some
use better mitigations in spite of the performance hit. It's a
tradeoff (security vs performance), and there's no one-size-fits all
implementation.
What applies to AES applies to other cryptographic primitives.
For example, RSA signatures are very sensible to fault attacks,
regardless of them using the CRT or not. A mitigation is to verify
every produced signature. That also comes with a performance cost.
Wether to do it or not depends on wether fault attacks are part of
the threat model or not.
Thanks to Zig's comptime, we can try to address these different
requirements.
This PR adds a `side_channels_protection` global, that can later
be complemented with `fault_attacks_protection` and possibly other
knobs.
It can have 4 different values:
- `none`: which doesn't enable additional mitigations.
"Additional", because it only disables mitigations that don't have
a big performance cost. For example, checking authentication tags
will still be done in constant time.
- `basic`: which enables mitigations protecting against attacks in
a common scenario, where an attacker doesn't have physical access to
the device, cannot run arbitrary code on the same thread, and cannot
conduct brute-force attacks without being throttled.
- `medium`: which enables additional mitigations, offering practical
protection in a shared environement.
- `full`: which enables all the mitigations we have.
The tradeoff is that the more mitigations we enable, the bigger the
performance hit will be. But this let applications choose what's
best for their use case.
`medium` is the default.
Currently, this only affects software AES, but that setting can
later be used by other primitives.
For AES, our implementation is a traditional table-based, with 4
32-bit tables and a sbox.
Lookups in that table have been replaced by function calls. These
functions can add a configurable noise level, making cache-based
attacks more difficult to conduct.
In the `none` mitigation level, the behavior is exactly the same
as before. Performance also remains the same.
In other levels, we compress the T tables into a single one, and
read data from multiple cache lines (all of them in `full` mode),
for all bytes in parallel. More precise measurements and way more
attempts become necessary in order to find correlations.
In addition, we use distinct copies of the sbox for key expansion
and encryption, so that they don't share the same L1 cache entries.
The best known attacks target the first two AES round, or the last
one.
While future attacks may improve on this, AES achieves full
diffusion after 4 rounds. So, we can relax the mitigations after
that. This is what this implementation does, enabling mitigations
again for the last two rounds.
In `full` mode, all the rounds are protected.
The protection assumes that lookups within a cache line are secret.
The cachebleed attack showed that it can be circumvented, but
that requires an attacker to be able to abuse hyperthreading and
run code on the same core as the encryption, which is rarely a
practical scenario.
Still, the current AES API allows us to transparently switch to
using fixslicing/bitslicing later when the `full` mitigation level
is enabled.
* Software AES: use little-endian representation.
Virtually all platforms are little-endian these days, so optimizing
for big-endian CPUs doesn't make sense any more.
Make the Keccak permutation public, as it's useful for more than
SHA-3 (kMAC, SHAKE, TurboSHAKE, TupleHash, etc).
Our Keccak implementation was accepting f as a comptime parameter,
but always used 64-bit words and 200 byte states, so it actually
didn't work with anything besides f=1600.
That has been fixed. The ability to use reduced-round versions
was also added in order to support M14 and K12.
The state was constantly converted back and forth between bytes
and words, even though only a part of the state is actually used
for absorbing and squeezing bytes. It was changed to something
similar to the other permutations we have, so we can avoid extra
copies, and eventually add vectorized implementations.
In addition, the SHAKE extendable output function (XOF) was
added (SHAKE128, SHAKE256). It is required by newer schemes,
such as the Kyber post-quantum key exchange mechanism, whose
implementation is currently blocked by SHAKE missing from our
standard library.
Breaking change: `Keccak_256` and `Keccak_512` were renamed to
`Keccak256` and `Keccak512` for consistency with all other
hash functions.
Ascon has been selected as new standard for lightweight cryptography
in the NIST Lightweight Cryptography competition.
Ascon won over Gimli and Xoodoo.
The permutation is unlikely to change. However, NIST may tweak
the constructions (XOF, hash, authenticated encryption) before
standardizing them. For that reason, implementations of those
are better maintained outside the standard library for now.
In fact, we already had an Ascon implementation in Zig:
`std.crypto.aead.isap` is based on it. While the implementation was
here, there was no public API to access it directly.
So:
- The Ascon permutation is now available as `std.crypto.core.Ascon`,
with everything needed to use it in AEADs and other Ascon-based
constructions
- The ISAP implementation now uses std.crypto.core.Ascon instead of
keeping a private copy
- The default CSPRNG replaces Xoodoo with Ascon. And instead of an
ad-hoc construction, it's using the XOFa mode of the NIST submission.
Only a little bit of generalized logic for DER encoding is needed and so
it can live inside the Certificate namespace.
This commit removes the generic "parse object id" function which is no
longer used in favor of more specific, smaller sets of object ids used
with ComptimeStringMap.
* Update the AEGIS specification URL to the current draft
* std.crypto.auth: add AEGIS MAC
The Pelican-based authentication function of the AEGIS construction
can be used independently from authenticated encryption, as a faster
and more secure alternative to GHASH/POLYVAL/Poly1305.
We already expose GHASH, POLYVAL and Poly1305 for use outside AES-GCM
and ChaChaPoly, so there are no reasons not to expose the MAC from AEGIS
as well.
Like other 128-bit hash functions, finding a collision only requires
~2^64 attempts or inputs, which may still be acceptable for many
practical applications.
Benchmark (Apple M1):
siphash128-1-3: 3222 MiB/s
ghash: 8682 MiB/s
aegis-128l mac: 12544 MiB/s
Benchmark (Zen 2):
siphash128-1-3: 4732 MiB/s
ghash: 5563 MiB/s
aegis-128l mac: 19270 MiB/s
POLYVAL is GHASH's little brother, required by the AES-GCM-SIV
construction. It's defined in RFC8452.
The irreducible polynomial is a mirror of GHASH's (which doesn't
change anything in our implementation that didn't reverse the raw
bits to start with).
But most importantly, POLYVAL encodes byte strings as little-endian
instead of big-endian, which makes it a little bit faster on the
vast majority of modern CPUs.
So, both share the same code, just with comptime magic to use the
correct endianness and only double the key for GHASH.
...instead of hard-coding it to 20.
- This is consistent with the ChaCha implementation
- NaCl and libsodium, that this API is designed to interop with,
also support 8 and 12 round variants. The 12 round variant, in
particular, provides the same security level as the 20 round variant,
but is obviously faster.
- scrypt currently uses its own non optimized version of Salsa, just
because it use 8 rounds instead of 20. This will help remove code
duplication.
No behavior nor public API changes. The Salsa20 and XSalsa20 still
represent the 20-round variant.
Gimli was a game changer. A permutation that is large enough to be
used in sponge-like constructions, yet small enough to be compact
to implement and fast on a wide range of platforms.
And Gimli being part of the Zig standard library was awesome.
But since then, Gimli entered the NIST Lightweight Cryptography
Competition, competing againt other candidates sharing a similar set
of properties.
Unfortunately, Gimli didn't pass the 3rd round.
There are no practical attacks against Gimli when used correctly, but
NIST's decision means that Gimli is unlikely to ever get any traction.
So, maybe the time has come to move Gimli from the standard library
to another repository.
We shouldn't do it without providing an alternative, though.
And the best candidate for this is probably Xoodoo.
Xoodoo is the core function of Xoodyak, one of the finalists of the
NIST LWC competition, and the most direct competitor to Gimli. It is
also a 384-bit permutation, so it can easily be used everywhere Gimli
was used with no parameter changes.
It is the building block of Xoodyak (for actual encryption and hashing)
as well as Charm, that some Zig applications are already using.
Like Gimli that it was heavily inspired from, it is compact and
suitable for constrained environments.
This change adds the Xoodoo permutation to std.crypto.core.
The set of public functions includes everything required to later
implement existing Xoodoo-based constructions.
In order to prepare for the Gimli deprecation, the default
CSPRNG was changed to a Xoodoo-based that works exactly the same way.
A hash function cascade was a common way to avoid length-extension
attacks with traditional hash functions such as the SHA-2 family.
Add `std.crypto.hash.composition` to do exactly that using arbitrary
hash functions, and pre-define the common SHA2-based ones.
With this, we can now sign and verify Bitcoin signatures in pure Zig.
std.crypto.ecc: add support for the secp256k1 curve
Usage of the secp256k1 elliptic curve recently grew exponentially,
since this is the curve used by Bitcoin and other popular blockchains
such as Ethereum.
With this, Zig has support for all the widely deployed elliptic curves
today.
ECDSA is the most commonly used signature scheme today, mainly for
historical and conformance reasons. It is a necessary evil for
many standard protocols such as TLS and JWT.
It is tricky to implement securely and has been the root cause of
multiple security disasters, from the Playstation 3 hack to multiple
critical issues in OpenSSL and Java.
This implementation combines lessons learned from the past with
recent recommendations.
In Zig, the NIST curves that ECDSA is almost always instantied with
use formally verified field arithmetic, giving us peace of mind
even on edge cases. And the API rejects neutral elements where it
matters, and unconditionally checks for non-canonical encoding for
scalars and group elements. This automatically eliminates common
vulnerabilities such as https://sk.tl/2LpS695v .
ECDSA's security heavily relies on the security of the random number
generator, which is a concern in some environments.
This implementation mitigates this by computing deterministic
nonces using the conservative scheme from Pornin et al. with the
optional addition of randomness as proposed in Ericsson's
"Deterministic ECDSA and EdDSA Signatures with Additional Randomness"
document. This approach mitigates both the implications of a weak RNG
and the practical implications of fault attacks.
Project Wycheproof is a Google project to test crypto libraries against
known attacks by triggering edge cases. It discovered vulnerabilities
in virtually all major ECDSA implementations.
The entire set of ECDSA-P256-SHA256 test vectors from Project Wycheproof
is included here. Zero defects were found in this implementation.
The public API differs from the Ed25519 one. Instead of raw byte strings
for keys and signatures, we introduce Signature, PublicKey and SecretKey
structures.
The reason is that a raw byte representation would not be optimal.
There are multiple standard representations for keys and signatures,
and decoding/encoding them may not be cheap (field elements have to be
converted from/to the montgomery domain).
So, the intent is to eventually move ed25519 to the same API, which
is not going to introduce any performance regression, but will bring
us a consistent API, that we can also reuse for RSA.
After P-256, here comes P-384, also known as secp384r1.
Like P-256, it is required for TLS, and is the current NIST recommendation for key exchange and signatures, for better or for worse.
Like P-256, all the finite field arithmetic has been computed and verified to be correct by fiat-crypto.