465 Commits

Author SHA1 Message Date
Frank Denis
59af6417bb
crypto.ghash: define aggregate tresholds as blocks, not bytes (#13507)
These constants were read as a block count in initForBlockCount()
but at the same time, as a size in update().

The unit could be blocks or bytes, but we should use the same one
everywhere.

So, use blocks as intended.

Fixes #13506
2022-11-10 19:00:00 +01:00
Frank Denis
36e618aef1 crypto.ghash: compatibility with stage1
Defining the selector enum outside the function definition is
required for stage1.
2022-11-08 16:59:53 +01:00
Frank Denis
7d48cb1138
std.crypto: make ghash faster, esp. for small messages (#13464)
* std.crypto: make ghash faster, esp. for small messages

Aggregated reduction requires 5 additional multiplications (to
precompute the powers of H), in order to save 2 multiplications
per batch.

So, only use large batches when it's actually interesting to do so.

For the last blocks, reuse the precomputations in order to perform
a single reduction.

Also, even in .ReleaseSmall, allow 2-block aggregation.
The speedup is worth it, and the code increase is reasonable.

And in .ReleaseFast, bump the upper batch size up to 16.

Leverage comptime by the way instead of duplicating code.

std/crypto/benchmark.zig on Apple M1:

    Zig 0.10.0: 2769 MiB/s
        Before: 6014 MiB/s
         After: 7334 MiB/s

Normalize function names by the way.

* Change clmul() to accept the half to be processed

This avoids a bunch of truncate() calls.

* Add more ghash tests to check all code paths
2022-11-07 21:45:29 +01:00
Frank Denis
32563e6829
crypto.core.aes: process 6 block in parallel instead of 8 on aarch64 (#13473)
* crypto.core.aes: process 6 block in parallel instead of 8 on aarch64

At least on Apple Silicon, this is slightly faster than 8 blocks.

* AES: add parallel blocks for tigerlake, rocketlake, alderlake, zen3
2022-11-07 12:28:37 +01:00
Frank Denis
907f3ef887
crypto.salsa20: make the number of rounds a comptime parameter (#13442)
...instead of hard-coding it to 20.

- This is consistent with the ChaCha implementation
- NaCl and libsodium, that this API is designed to interop with,
also support 8 and 12 round variants. The 12 round variant, in
particular, provides the same security level as the 20 round variant,
but is obviously faster.
- scrypt currently uses its own non optimized version of Salsa, just
because it use 8 rounds instead of 20. This will help remove code
duplication.

No behavior nor public API changes. The Salsa20 and XSalsa20 still
represent the 20-round variant.
2022-11-06 23:52:41 +01:00
Frank Denis
96793530cd
std.crypto.pwhash.bcrypt: inline the Feistel network function (#13416)
std/crypto/benchmark.zig results:

* Intel i5

before: 3.144 s/ops
 after: 1.922 s/ops

* Apple M1

before: 2.067 s/ops
 after: 1.373 s/ops
2022-11-03 13:10:08 +01:00
Jacob Young
93d60d0de7 std: avoid vector usage with the C backend
Vectors are not yet implemented in the C backend, so no reason to
prevent code using the standard library from compiling in the meantime.
2022-11-01 20:38:37 -04:00
Frank Denis
0d192ee9ef
std.crypto.onetimeauth.Ghash: make GHASH 2 - 2.5x faster (#13374)
Rewrite GHASH to use 128-bit multiplication over non-reversed
integers, and up to 8 blocks aggregated reduction.

lib/std/crypto/benchmark.zig results:

Xeon E5:
  Before: 1604 MiB/s
   After: 4005 MiB/s

Apple M1:
  Before: 2769 MiB/s
   After: 6014 MiB/s

This also makes AES-GCM faster by the way.
2022-11-01 13:49:13 -04:00
Frank Denis
ddb9eac05c ed25519: recommend using the seed to recover a key pair 2022-11-01 07:26:32 +01:00
Frank Denis
9e44710fc4
Ed25519.KeyPair.fromSecretKey() didn't compile after the API changes (#13386)
Fixes #13378
2022-11-01 07:10:40 +01:00
Cody Tapscott
67fa3262b1 std.crypto: Use featureSetHas to gate intrinsics
This also fixes a bug where the feature gating was not taking
effect at comptime due to https://github.com/ziglang/zig/issues/6768
2022-10-28 17:17:08 -07:00
Cody Tapscott
f9fe548e41 std.crypto: Add isComptime guard around intrinsics
Comptime code can't execute assembly code, so we need some way to
force comptime code to use the generic path. This should be replaced
with whatever is implemented for #868, when that day comes.

I am seeing that the result for the hash is incorrect in stage1 and
crashes stage2, so presumably this never worked correctly. I will follow
up on that soon.
2022-10-28 15:21:10 -07:00
Cody Tapscott
4c1f71e866 std.crypto: Optimize SHA-256 intrinsics for AMD x86-64
This gets us most of the way back to the performance I had when
I was using the LLVM intrinsics:
  - Intel Intel(R) Core(TM) i7-1068NG7 CPU @ 2.30GHz:
       190.67 MB/s (w/o intrinsics) -> 1285.08 MB/s
  - AMD EPYC 7763 (VM) @ 2.45 GHz:
       240.09 MB/s (w/o intrinsics) -> 1360.78 MB/s
  - Apple M1:
       216.96 MB/s (w/o intrinsics) -> 2133.69 MB/s

Minor changes to this source can swing performance from 400 MB/s to
1400 MB/s or... 20 MB/s, depending on how it interacts with the
optimizer. I have a sneaking suspicion that despite LLVM inheriting
GCC's extremely strict inline assembly semantics, its passes are
rather skittish around inline assembly (and almost certainly, its
instruction cost models can assume nothing)
2022-10-28 15:21:10 -07:00
Cody Tapscott
ee241c47ee std.crypto: SHA-256 Properly gate comptime conditional
This feature detection must be done at comptime so that we avoid
generating invalid ASM for the target.
2022-10-28 15:21:10 -07:00
Cody Tapscott
10edb6d352 crypto.sha2: Use intrinsics for SHA-256 on x86-64 and AArch64
There's probably plenty of room to optimize these further in the
future, but for the moment this gives ~3x improvement on Intel
x86-64 processors, ~5x on AMD, and ~10x on M1 Macs.

These extensions are very new - Most processors prior to 2020 do
not support them.

AVX-512 is a slightly older alternative that we could use on Intel
for a much bigger performance bump, but it's been fused off on
Intel's latest hybrid architectures and it relies on computing
independent SHA hashes in parallel. In contrast, these SHA intrinsics
provide the usual single-threaded, single-stream interface, and should
continue working on new processors.

AArch64 also has SHA-512 intrinsics that we could take advantage
of in the future
2022-10-28 15:21:10 -07:00
Frank Denis
f28e4e03ee
std.sign.ecdsa: add support for incremental signatures (#13332)
Similar to what was done for EdDSA, allow incremental creation
and verification of ECDSA signatures.

Doing so for ECDSA is trivial, and can be useful for TLS as well
as the future package manager.
2022-10-28 16:25:37 +02:00
Frank Denis
9c0d975a09
Revamp the ed25519 API (#13309) 2022-10-27 19:07:42 +02:00
Naoki MATSUMOTO
cd4865d88c
std.crypto.sign.ecdsa: accepts unusual parameters like EcdsaP384Sha256 (#13302)
This commit accepts unusual parameters like EcdsaP384Sha256.
Some certifictes(below certs are in /etc/ssl/certs/ca-certificates.crt on Ubuntu 22.04) use EcdsaP384Sha256 to sign itself.
- Subject: C=GR, L=Athens, O=Hellenic Academic and Research Institutions Cert. Authority, CN=Hellenic Academic and Research Institutions ECC RootCA 2015
- Subject: C=US, ST=Texas, L=Houston, O=SSL Corporation, CN=SSL.com EV Root Certification Authority ECC
- Subject: C=US, ST=Texas, L=Houston, O=SSL Corporation, CN=SSL.com Root Certification Authority ECC

In verify(), hash array `h` is allocated to be larger than the scalar.encoded_length.
The array is regarded as big-endian.
Hash values are filled in the back of the array and the rest bytes in front are filled with zero.

In sign(), the hash array is allocated and filled as same as verify().
In deterministicScalar(), hash bytes are insufficient to generate `k`
To generate `k` without narrowing its value range,
this commit uses algorithm stage h. in  "Section 3.2 Generation of k" in RFC6979.
2022-10-26 13:18:06 +02:00
Frank Denis
22b71b1376 crypto/bcrypt: don't reimplement base64, just use a custom alphabet
Now that std.base64 supports everything bcrypt needs to encode its
parameters, we don't need to include another implementation.
2022-10-25 21:52:03 -07:00
Matheus C. França
b41b35f578
crypto/benchmark - replace testing allocator
Fix error: Cannot use testing allocator outside of test block
2022-10-20 14:04:59 +03:00
Andrew Kelley
d3d24874c9 std: remove deprecated API for the upcoming release
See #3811
2022-09-16 14:46:53 -04:00
Veikka Tuominen
62ff8871ed stage2+stage1: remove type parameter from bit builtins
Closes #12529
Closes #12511
Closes #6835
2022-08-22 11:19:20 +03:00
Veikka Tuominen
19af4b5cad std: add workaround for stage2 bug 2022-08-09 16:19:55 +03:00
Frank Denis
fa321a07cd
crypto.sign.ed25519: include a context string in blind key signatures (#12316)
The next revision of the specification is going to include a context
string in the way blinded scalars are computed.

See:
https://github.com/cfrg/draft-irtf-cfrg-signature-key-blinding/issues/30#issuecomment-1180516152
https://github.com/cfrg/draft-irtf-cfrg-signature-key-blinding/pull/37
2022-08-03 15:25:15 +02:00
InKryption
a0d3a87ce1 std.fmt: require specifier for unwrapping ?T and E!T 2022-07-26 11:25:49 -07:00
r00ster
cff5d9c805
std.mem: add first method to SplitIterator and SplitBackwardsIterator 2022-07-25 22:04:30 +03:00
Andrew Kelley
934573fc5d Revert "std.fmt: require specifier for unwrapping ?T and E!T."
This reverts commit 7cbd586ace46a8e8cebab660ebca3cfc049305d9.

This is causing a fail to build from source:

```
./lib/std/fmt.zig:492:17: error: cannot format optional without a specifier (i.e. {?} or {any})
                @compileError("cannot format optional without a specifier (i.e. {?} or {any})");
                ^
./src/link/MachO/Atom.zig:544:26: note: called from here
                log.debug("  RELA({s}) @ {x} => %{d} in object({d})", .{
                         ^
```

I looked at the code to fix it but none of those args are optionals.
2022-07-24 11:50:10 -07:00
InKryption
7cbd586ace
std.fmt: require specifier for unwrapping ?T and E!T.
Co-authored-by: Veikka Tuominen <git@vexu.eu>
2022-07-24 12:01:56 +03:00
Frank Denis
6f0807f50f
crypto.sign.ed25519: add support for blind key signatures (#11868)
Key blinding allows public keys to be augmented with a secret
scalar, making multiple signatures from the same signer unlinkable.

https://datatracker.ietf.org/doc/draft-dew-cfrg-signature-key-blinding/

This is required by privacy-preserving applications such as Tor
onion services and the PrivacyPass protocol.
2022-07-08 13:21:37 +02:00
Frank Denis
38096960fb
crypto.sign.ecdsa: fix toCompressedSec1()/toUnompressedSec1() (#12009)
The Ecdsa.PublicKey type is not a direct alias for a curve element.

So, use the inner field containing the curve element for serialization.
2022-07-06 08:30:43 +02:00
Frank Denis
ee01dd4032
crypto: add the Xoodoo permutation, prepare for Gimli deprecation (#11866)
Gimli was a game changer. A permutation that is large enough to be
used in sponge-like constructions, yet small enough to be compact
to implement and fast on a wide range of platforms.

And Gimli being part of the Zig standard library was awesome.

But since then, Gimli entered the NIST Lightweight Cryptography
Competition, competing againt other candidates sharing a similar set
of properties.

Unfortunately, Gimli didn't pass the 3rd round.

There are no practical attacks against Gimli when used correctly, but
NIST's decision means that Gimli is unlikely to ever get any traction.

So, maybe the time has come to move Gimli from the standard library
to another repository.

We shouldn't do it without providing an alternative, though.
And the best candidate for this is probably Xoodoo.

Xoodoo is the core function of Xoodyak, one of the finalists of the
NIST LWC competition, and the most direct competitor to Gimli. It is
also a 384-bit permutation, so it can easily be used everywhere Gimli
was used with no parameter changes.

It is the building block of Xoodyak (for actual encryption and hashing)
as well as Charm, that some Zig applications are already using.

Like Gimli that it was heavily inspired from, it is compact and
suitable for constrained environments.

This change adds the Xoodoo permutation to std.crypto.core.

The set of public functions includes everything required to later
implement existing Xoodoo-based constructions.

In order to prepare for the Gimli deprecation, the default
CSPRNG was changed to a Xoodoo-based that works exactly the same way.
2022-07-01 13:18:08 +02:00
Frank Denis
48fd92365a
std.crypto.hash: allow creating hash functions from compositions (#11965)
A hash function cascade was a common way to avoid length-extension
attacks with traditional hash functions such as the SHA-2 family.

Add `std.crypto.hash.composition` to do exactly that using arbitrary
hash functions, and pre-define the common SHA2-based ones.

With this, we can now sign and verify Bitcoin signatures in pure Zig.
2022-07-01 11:37:41 +02:00
Frank Denis
234ccb4a50
std.crypto.ecc: add support for the secp256k1 curve (#11880)
std.crypto.ecc: add support for the secp256k1 curve

Usage of the secp256k1 elliptic curve recently grew exponentially,
since this is the curve used by Bitcoin and other popular blockchains
such as Ethereum.

With this, Zig has support for all the widely deployed elliptic curves
today.
2022-06-29 15:11:33 +02:00
Frank Denis
41533fa6a1
std/crypto/{25519,pcurves}: make the scalar field order public (#11955)
For 25519, it's very likely that applications would ever need the
serialized representation. Expose the value as an integer as in
other curves. Rename the internal representation from `field_size`
to `field_order` for consistency.

Also fix a common typo in `scalar.sub()`.
2022-06-29 07:44:43 +02:00
Frank Denis
b2e4dda001
std.crypto.{p256,p384}: process the top nibble in mulDoubleBasePublic (#11956)
Unlike curve25519 where the scalar size is not large enough to fill
the top nibble, this can definitely be the case for p256 and p384.
2022-06-29 07:43:49 +02:00
Andrew Kelley
a71d00a4d5 std.crypto.25519.field: avoid excessive inlining
This valid zig code produces reasonable LLVM IR, however, on the
wasm32-wasi target, when using the wasmtime runtime, the number of
locals of the `isSquare` function exceeds 50000, causing wasmtime
to refuse to execute the binary.

The `inline` keyword in Zig is intended to be used only where it is
semantically necessary; not as an optimization hint. Otherwise, this may
produce unwanted binary bloat for the -OReleaseSmall use case.

In the future, it is possible that we may end up with both `inline`
keyword, which operates as it does in status quo, and additionally
`callconv(.inline_hint)` which has no semantic impact, but may be
observed by optimization passes.

In this commit, I also cleaned up `isSquare` by eliminating an
unnecessary mutable variable, replacing it with several local constants.

Closes #11947.
2022-06-27 19:11:55 -07:00
Veikka Tuominen
38a1222c87 std.crypto: fix invalid pass by value 2022-06-20 15:11:22 +03:00
Frank Denis
27610b0a0f
std/crypto: add support for ECDSA signatures (#11855)
ECDSA is the most commonly used signature scheme today, mainly for
historical and conformance reasons. It is a necessary evil for
many standard protocols such as TLS and JWT.

It is tricky to implement securely and has been the root cause of
multiple security disasters, from the Playstation 3 hack to multiple
critical issues in OpenSSL and Java.

This implementation combines lessons learned from the past with
recent recommendations.

In Zig, the NIST curves that ECDSA is almost always instantied with
use formally verified field arithmetic, giving us peace of mind
even on edge cases. And the API rejects neutral elements where it
matters, and unconditionally checks for non-canonical encoding for
scalars and group elements. This automatically eliminates common
vulnerabilities such as https://sk.tl/2LpS695v .

ECDSA's security heavily relies on the security of the random number
generator, which is a concern in some environments.

This implementation mitigates this by computing deterministic
nonces using the conservative scheme from Pornin et al. with the
optional addition of randomness as proposed in Ericsson's
"Deterministic ECDSA and EdDSA Signatures with Additional Randomness"
document. This approach mitigates both the implications of a weak RNG
and the practical implications of fault attacks.

Project Wycheproof is a Google project to test crypto libraries against
known attacks by triggering edge cases. It discovered vulnerabilities
in virtually all major ECDSA implementations.

The entire set of ECDSA-P256-SHA256 test vectors from Project Wycheproof
is included here. Zero defects were found in this implementation.

The public API differs from the Ed25519 one. Instead of raw byte strings
for keys and signatures, we introduce Signature, PublicKey and SecretKey
structures.

The reason is that a raw byte representation would not be optimal.
There are multiple standard representations for keys and signatures,
and decoding/encoding them may not be cheap (field elements have to be
converted from/to the montgomery domain).

So, the intent is to eventually move ed25519 to the same API, which
is not going to introduce any performance regression, but will bring
us a consistent API, that we can also reuse for RSA.
2022-06-15 08:55:39 +02:00
Frank Denis
7c660d17cd
crypto/pcurves: compute constants for inversion at comptime (#11780) 2022-06-13 08:13:52 +02:00
Veikka Tuominen
6b36774adc std: disable failing tests, add zig2 build test-std to CI 2022-06-12 10:43:28 +03:00
Veikka Tuominen
bb84c87a47 std: add necessary @alignCasts 2022-06-06 13:11:50 -07:00
Veikka Tuominen
6d44c0a16c std: update tests to stage2 semantics 2022-06-03 20:21:20 +03:00
Frank Denis
26aea8cfa1
crypto: add support for the NIST P-384 curve (#11735)
After P-256, here comes P-384, also known as secp384r1.

Like P-256, it is required for TLS, and is the current NIST recommendation for key exchange and signatures, for better or for worse.

Like P-256, all the finite field arithmetic has been computed and verified to be correct by fiat-crypto.
2022-05-31 17:29:38 +02:00
Frank Denis
b08d32ceb5 crypto/25519: add scalar.random(), use CompressedScalar type
Add the ability to generate a random, canonical curve25519 scalar,
like we do for p256.

Also leverage the existing CompressedScalar type to represent these
scalars.
2022-05-26 13:30:03 +02:00
Helio Machado
e0be22b6c0
std.crypto: cosmetic improvement to AES multiplication algorithm (#11616)
std.crypto: cosmetic improvement to AES multiplication algorithm (#11616)
2022-05-25 19:23:49 +02:00
Felix "xq" Queißner
6d27341b96 Fixes comptime 'error: cannot assign to constant' error in siphash. 2022-05-16 22:31:09 -04:00
Frank Denis
aaf4011c2c Typo 2022-05-10 15:57:46 +02:00
Helio Machado
941b6830b1
std.crypto: generate AES constants at compile time (#11612)
* std/crypto: generate AES constants at compile time

* Apply suggestions from code review

Co-authored-by: Frank Denis <124872+jedisct1@users.noreply.github.com>

* Update lib/std/crypto/aes/soft.zig

* Separate encryption and decryption tables

* Run `zig fmt`

* Increase branch quota and remove redundant align

* Update lib/std/crypto/aes/soft.zig

Co-authored-by: Frank Denis <124872+jedisct1@users.noreply.github.com>

* Rename identifiers and simplify dataflow

* Increase branch quota (again) and fix comment

Co-authored-by: Frank Denis <124872+jedisct1@users.noreply.github.com>
2022-05-09 18:50:01 +02:00
Frank Denis
098bee0e56
edwards25519 fixes (#11568)
* edwards25519: fix X coordinate of the base point

Reported by @OfekShochat -- Thanks!

* edwards25519: reduce public scalar when the top bit is set, not cleared

This is an optimization for the unexpected case of a scalar
larger than the field size.

Fixes #11563

* edwards25519: add a test implicit reduction of invalid scalars
2022-05-03 05:28:34 +02:00
Isaac Freund
6f4343b61a std: replace usage of std.meta.bitCount() with @bitSizeOf() 2022-04-27 11:10:52 +02:00