Frank Denis 7cfeae1ce7
std.crypto.onetimeauth.ghash: faster GHASH on modern CPUs (#13566)
* std.crypto.onetimeauth.ghash: faster GHASH on modern CPUs

Carryless multiplication was slow on older Intel CPUs, justifying
the need for using Karatsuba multiplication.

This is not the case any more; using 4 multiplications to multiply
two 128-bit numbers is actually faster than 3 multiplications +
shifts and additions.

This is also true on aarch64.
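
For illustration only, the two strategies can be sketched in Python. This is not the Zig code from this change: `clmul64`, `clmul128_schoolbook` and `clmul128_karatsuba` are hypothetical names, and the carry-less multiply is emulated in software rather than issued as a hardware instruction.

```python
MASK64 = (1 << 64) - 1


def clmul64(a: int, b: int) -> int:
    """Software stand-in for one carry-less multiply (XOR instead of add)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def clmul128_schoolbook(x: int, y: int) -> int:
    """128x128 -> 255-bit carry-less product using 4 multiplications."""
    x_hi, x_lo = x >> 64, x & MASK64
    y_hi, y_lo = y >> 64, y & MASK64
    lo = clmul64(x_lo, y_lo)
    hi = clmul64(x_hi, y_hi)
    mid = clmul64(x_lo, y_hi) ^ clmul64(x_hi, y_lo)
    return (hi << 128) ^ (mid << 64) ^ lo


def clmul128_karatsuba(x: int, y: int) -> int:
    """Same product using 3 multiplications plus extra XORs."""
    x_hi, x_lo = x >> 64, x & MASK64
    y_hi, y_lo = y >> 64, y & MASK64
    lo = clmul64(x_lo, y_lo)
    hi = clmul64(x_hi, y_hi)
    # (x_lo ^ x_hi) * (y_lo ^ y_hi) = lo ^ hi ^ cross terms, so XOR lo and hi out.
    mid = clmul64(x_lo ^ x_hi, y_lo ^ y_hi) ^ lo ^ hi
    return (hi << 128) ^ (mid << 64) ^ lo
```

Both functions return the same value. Which one is faster depends on how expensive the hardware multiply is relative to the extra XORs, which an emulation like this cannot show.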

Keep using Karatsuba only when targeting x86 (granted, this is a bit
of a brutal shortcut; we should really list all the CPU models that
had a slow clmul instruction).

Also remove the useless agg_2 threshold and restore the ability to
precompute only H and H^2 in ReleaseSmall.
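
As a rough sketch of why precomputing H and H^2 is enough to process two blocks per step, the Python below checks the underlying identity ((Y ^ X1)*H ^ X2)*H = (Y ^ X1)*H^2 ^ X2*H. It is not the code from this change: `gf128_mul`, `ghash_serial` and `ghash_agg2` are hypothetical names, blocks are plain integers, and GHASH's reflected bit order is deliberately ignored.

```python
POLY = (1 << 128) | 0x87  # x^128 + x^7 + x^2 + x + 1


def gf128_mul(a: int, b: int) -> int:
    """Multiply in GF(2^128): carry-less product, then reduce modulo POLY."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        b >>= 1
    # Clear every bit at position 128 or above, from the top down.
    for bit in range(product.bit_length() - 1, 127, -1):
        if (product >> bit) & 1:
            product ^= POLY << (bit - 128)
    return product


def ghash_serial(h: int, blocks: list, y: int = 0) -> int:
    """One block per step: Y = (Y ^ X) * H."""
    for x in blocks:
        y = gf128_mul(y ^ x, h)
    return y


def ghash_agg2(h: int, blocks: list, y: int = 0) -> int:
    """Two blocks per step, using only H and H^2."""
    h2 = gf128_mul(h, h)
    i = 0
    while i + 1 < len(blocks):
        y = gf128_mul(y ^ blocks[i], h2) ^ gf128_mul(blocks[i + 1], h)
        i += 2
    if i < len(blocks):  # odd block left over
        y = gf128_mul(y ^ blocks[i], h)
    return y
```

This sketch reduces after every multiplication. An optimized implementation would more likely XOR the unreduced products of a group together and reduce once, which is where aggregation pays off.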

Finally, avoid using u256. Using 128-bit registers is actually faster.

* Use a switch, add some comments
2022-11-17 13:07:07 +01:00