LemonBoy 61e9e82bdc std: Make the CRC32 calculation slightly faster
Speed up the slicing-by-8 code path slightly by replacing the
(load+shift+xor)*4 sequence with a single u32 load plus a xor.
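
For illustration only, the change can be sketched roughly as follows. This is
a minimal C sketch, not the Zig code from the patch: the function names
(`step_bytewise`, `step_wordload`, `crc32_bytewise`, `crc32_wordload`), the
table layout, and the use of the reflected polynomial 0xEDB88320 are all
assumptions made for this example. Both variants consume 8 input bytes per
step and differ only in how the first four bytes are combined with the
running CRC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* T[t][b] = CRC of byte b followed by t zero bytes (assumed table layout). */
static uint32_t T[8][256];
static int tables_ready = 0;

static void crc32_init(void) {
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = i;
        for (int k = 0; k < 8; k++)
            c = (c & 1) ? (c >> 1) ^ 0xEDB88320u : c >> 1;
        T[0][i] = c;
    }
    for (uint32_t i = 0; i < 256; i++)
        for (int t = 1; t < 8; t++)
            T[t][i] = (T[t - 1][i] >> 8) ^ T[0][T[t - 1][i] & 0xFF];
    tables_ready = 1;
}

/* Before: four separate byte loads, each xored with a shifted CRC byte. */
static uint32_t step_bytewise(uint32_t crc, const uint8_t *p) {
    return T[7][(uint8_t)(crc)       ^ p[0]] ^
           T[6][(uint8_t)(crc >> 8)  ^ p[1]] ^
           T[5][(uint8_t)(crc >> 16) ^ p[2]] ^
           T[4][(uint8_t)(crc >> 24) ^ p[3]] ^
           T[3][p[4]] ^ T[2][p[5]] ^ T[1][p[6]] ^ T[0][p[7]];
}

/* Little-endian 32-bit load; memcpy usually compiles to a single load. */
static uint32_t load_le32(const uint8_t *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    v = __builtin_bswap32(v); /* keep the result correct on big-endian hosts */
#endif
    return v;
}

/* After: one u32 load and one xor, then the same eight table lookups. */
static uint32_t step_wordload(uint32_t crc, const uint8_t *p) {
    uint32_t x = load_le32(p) ^ crc;
    return T[7][x & 0xFF] ^ T[6][(x >> 8) & 0xFF] ^
           T[5][(x >> 16) & 0xFF] ^ T[4][x >> 24] ^
           T[3][p[4]] ^ T[2][p[5]] ^ T[1][p[6]] ^ T[0][p[7]];
}

typedef uint32_t (*step_fn)(uint32_t, const uint8_t *);

static uint32_t crc32_with(step_fn step, const void *data, size_t len) {
    const uint8_t *p = data;
    uint32_t crc = 0xFFFFFFFFu;
    if (!tables_ready) crc32_init();
    for (; len >= 8; p += 8, len -= 8)   /* 8 bytes per step */
        crc = step(crc, p);
    for (; len > 0; p++, len--)          /* byte-at-a-time tail */
        crc = (crc >> 8) ^ T[0][(uint8_t)crc ^ *p];
    return ~crc;
}

uint32_t crc32_bytewise(const void *data, size_t len) {
    return crc32_with(step_bytewise, data, len);
}

uint32_t crc32_wordload(const void *data, size_t len) {
    return crc32_with(step_wordload, data, len);
}

/* Both variants must produce identical checksums for any input. */
void crc32_check_variants(const void *data, size_t len) {
    assert(crc32_bytewise(data, len) == crc32_wordload(data, len));
}
```

The two step functions compute the same value; the second merely lets the
compiler fold four byte loads and four xors into one word-sized load and one
xor, which is where a gain of the kind measured here would come from.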

Before:

```
iterative:  1018 MiB/s [000000006c3b110d]
small keys:  1075 MiB/s [0035bf3dcac00000]
```

After:

```
iterative:  1114 MiB/s [000000006c3b110d]
small keys:  1324 MiB/s [0035bf3dcac00000]
```
2020-09-13 16:32:21 -04:00