21063 Commits

Author SHA1 Message Date
Stevie Hryciw
32b97df50e langref: add appendix and explain 'container' terminology 2022-11-12 15:42:29 +02:00
IntegratedQuantum
fbc4331f18 Implements std.math.sign for float vectors. 2022-11-12 15:41:55 +02:00
Jakub Konka
7733246d6e pdb: make SuperBlock def public 2022-11-12 09:40:40 +01:00
Frank Denis
df7223c7f2 crypto.AesGcm: provision ghash for the final block 2022-11-11 18:04:22 +01:00
Veikka Tuominen
4f285d4dac GitHub: add issue template for error messages 2022-11-11 18:15:13 +02:00
Cody Tapscott
2897641fb9 stage2: Support modifiers in inline asm
These are supported using %[ident:mod] syntax. This allows requesting,
e.g., the "w" (32-bit) vs. "x" (64-bit) views of AArch64 registers.

See https://llvm.org/docs/LangRef.html#asm-template-argument-modifiers
2022-11-11 16:01:31 +02:00
Cody Tapscott
b605cb2742 cmake: Mark <root>/.git/HEAD as a configure dependency
This ensures that the Zig version will be re-computed when jumping
through the source tree, which is especially important if bisecting
across AstGen- or other changes that must not use the old cache.
2022-11-10 19:34:56 -05:00
Andrew Kelley
892fb0fc88
Merge pull request #13074 from topolarity/stage2-opt
stage2: Miscellaneous fixes to vector arithmetic and copy elision
2022-11-10 19:34:43 -05:00
Jacob Young
e40c38d258 Sema: avoid breaking hash contract when instantiating generic functions
 * Add tagName to Value which behaves like @tagName.
 * Add hashUncoerced to Value as an alternative to hash when we want to
   produce the same hash for values that can coerce to each other.
 * Hash owner_decl instead of module_fn in Sema.instantiateGenericCall
   since Module.Decl.Index is not affected by ASLR like *Module.Fn was,
   and also because GenericCallAdapter.eql was already doing this.
 * Use Value.hashUncoerced in Sema.instantiateGenericCall because
   GenericCallAdapter.eql uses Value.eqlAdvanced to compare args, which
   ignores coercions.
 * Add revealed missing cases to Value.eqlAdvanced.

Without these changes, we were breaking the hash contract for
monomorphed_funcs, and were generating different hashes for values that
compared equal.  This resulted in a 0.2% chance when compiling
self-hosted of producing a different output, which depended on
fingerprint collisions of hashes that were affected by ASLR.  Normally,
the different hashes would have resulted in equal checks being skipped,
but in the case of a fingerprint collision, the truth would be revealed
and the compiler's behavior would diverge.
2022-11-10 14:35:57 -05:00
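
The hash contract that e40c38d258 restores can be illustrated outside the compiler. The Python sketch below is not the Sema code: `Coercible`, `bits`, and `lookup_hits` are hypothetical names chosen for illustration. It assumes an equality that ignores a detail (here a bit width, standing in for coercion) and shows that a hash which still includes that detail can make a hash map miss a key that compares equal.

```python
class Coercible:
    """Hypothetical value: a number plus a bit width that equality ignores."""

    def __init__(self, bits, number, hash_uncoerced):
        self.bits = bits
        self.number = number
        self.hash_uncoerced = hash_uncoerced

    def __eq__(self, other):
        # Equality ignores `bits`, loosely mirroring a comparison that
        # ignores coercions.
        return isinstance(other, Coercible) and self.number == other.number

    def __hash__(self):
        if self.hash_uncoerced:
            # Consistent with __eq__: only hashes what equality looks at.
            return self.number
        # Breaks the contract: equal values can get different hashes.
        return 31 * self.bits + self.number


def lookup_hits(hash_uncoerced):
    """Insert a key with bits=8, then look up an equal key with bits=0."""
    cache = {Coercible(8, 1, hash_uncoerced): "instantiation"}
    return Coercible(0, 1, hash_uncoerced) in cache
```

With the consistent hash the lookup finds the existing entry; with the inconsistent one the equality check is never reached, because the stored and probed hashes differ.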
Cody Tapscott
7b978bf1e0 stage2: Rename Value.compare to compareAll, etc.
These functions have a very error-prone API. They are essentially
`all(cmp(op, ...))` but that's not reflected in the name.

This renames these functions to `compareAllAgainstZero...` etc.
for clarity and fixes >20 locations where the predicate was
incorrect.

In the future, the scalar `compare` should probably be split off
from the vector comparison. Rank-polymorphic programming is great,
but a proper implementation in Zig would decouple comparison and
reduction, which then needs a way to fuse ops at comptime.
2022-11-10 12:24:02 -07:00
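
The pitfall behind the rename in 7b978bf1e0 can be shown with a small Python sketch. `compare_all` is a hypothetical stand-in for an `all(cmp(op, ...))` style function, not the compiler's API. For vectors, negating the result is not the same as calling the function with the negated operator, which is the kind of incorrect predicate the commit message describes.

```python
import operator


def compare_all(lhs, op, rhs):
    """Return True only if `op` holds for every pair of elements."""
    return all(op(a, b) for a, b in zip(lhs, rhs))


vec = [-1, 0, 2]
zero = [0, 0, 0]

# "All elements are < 0" is false, because 0 and 2 are not negative.
all_lt = compare_all(vec, operator.lt, zero)

# "All elements are >= 0" is also false, because -1 is negative.
all_gte = compare_all(vec, operator.ge, zero)

# For scalars, `not (a < b)` equals `a >= b`. For vectors it does not.
negation_matches = (not all_lt) == all_gte
```

Here `negation_matches` is false, so code that treats `!compareAll(.lt)` as `compareAll(.gte)` would be wrong for mixed-sign vectors.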
Cody Tapscott
b1357091ae Add test case for #12043
This bug is already resolved; we just want to make sure we don't lose
the test case. Closes #12043
2022-11-10 12:23:59 -07:00
Cody Tapscott
fbda15632d stage2 sema: Make vector constants when operating on vectors
Resolves https://github.com/ziglang/zig/issues/13058
2022-11-10 12:22:40 -07:00
Cody Tapscott
a2f4de1663 stage2 llvm: Elide more loads
Adds optimizations for by-ref types to:
  - .struct_field_val
  - .slice_elem_val
  - .ptr_elem_val

I would have expected LLVM to be able to optimize away these
temporaries since we don't leak pointers to them and they are fed
straight from def to use, but empirically it does not.

Resolves https://github.com/ziglang/zig/issues/12713
Resolves https://github.com/ziglang/zig/issues/12638
2022-11-10 12:22:40 -07:00
Cody Tapscott
8f3880074f stage2: Be more strict about eliding loads
This change makes any of the `*_val` instructions check whether it's
safe to elide copies for by-ref types rather than performing this
elision blindly.

AIR instructions fixed:
 - .array_elem_val
 - .struct_field_val
 - .unwrap_errunion_payload
 - .try
 - .optional_payload

These now all respect value semantics, as expected.

P.S. Thanks to Andrew for the new way to approach this. Many of the
lines here are from his recommended change, which comes with the
significant advantage that loads are now as small as the intervening
memory access allows.

Co-authored-by: Andrew Kelley <andrew@ziglang.org>
2022-11-10 12:22:40 -07:00
Cody Tapscott
ff699722da stage2: Fix comptime array initialization
This is a follow-up to 9dc98fba, which made comptime initialization
patterns for union/struct more robust, especially when storing to
comptime-known pointers (and globals).

Resolves #13063.
2022-11-10 12:22:37 -07:00
Frank Denis
59af6417bb
crypto.ghash: define aggregate thresholds as blocks, not bytes (#13507)
These constants were read as a block count in initForBlockCount()
but at the same time, as a size in update().

The unit could be blocks or bytes, but we should use the same one
everywhere.

So, use blocks as intended.

Fixes #13506
2022-11-10 19:00:00 +01:00
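
The unit mismatch fixed in 59af6417bb can be sketched as follows. This is an illustration only, not the ghash code: `AGG_THRESHOLD` and both function names are hypothetical, and the threshold value is made up. The one fact relied on is that GHASH works on 16-byte blocks.

```python
BLOCK_LENGTH = 16   # GHASH operates on 16-byte blocks
AGG_THRESHOLD = 4   # hypothetical threshold, meant as a number of blocks


def use_aggregation_mixed_units(msg_len_bytes):
    """Mismatch: compares a byte length against a block-count threshold."""
    return msg_len_bytes >= AGG_THRESHOLD


def use_aggregation_blocks(msg_len_bytes):
    """Consistent: converts the length to blocks before comparing."""
    return msg_len_bytes // BLOCK_LENGTH >= AGG_THRESHOLD
```

With these made-up numbers, a single 16-byte block already passes the mixed-unit check, while the block-based check only passes from 64 bytes onward.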
Jakub Konka
04b8ce5fd3 Merge branch 'jcmoyer-lld-explicit-pdb' 2022-11-10 16:52:43 +01:00
Jakub Konka
4b3637820d
Merge pull request #13495 from ziglang/macho-dsym
stage2: misc DWARF debug info fixes and additions for x86_64 and aarch64
2022-11-10 16:50:57 +01:00
Jakub Konka
1357790ec9 win: combine PDB fixes into one changeset 2022-11-10 14:03:11 +01:00
J.C. Moyer
dd8df1caf3 Windows: Explicitly pass PDB paths to lld-link
On Windows, lld-link resolves PDB output paths using `/` and embeds the
result in the final executable, which breaks some native tooling like
WPR/WPA. This commit overrides the default behavior of lld-link by
explicitly setting the output PDB filename and binary-embedded path.
2022-11-10 13:43:35 +01:00
Jakub Konka
0914e0a4ec dwarf: do not assume unsigned 64bit integer for the enum value 2022-11-10 09:36:45 +01:00
Jakub Konka
2d5fbbb44e
Merge pull request #13485 from ziglang/arm64-non-null-actual
aarch64: optionals
2022-11-09 21:13:50 +01:00
Jakub Konka
31e755df6f aarch64: handle .stack_argument_offset as a valid local var 2022-11-09 19:58:14 +01:00
Jakub Konka
df09d9c14a x86_64: add DWARF encoding for vector registers
Clean up how we handle emitting of DWARF debug info for `x86_64`
codegen.
2022-11-09 18:35:06 +01:00
Jakub Konka
02852098ee aarch64: emit DWARF debug info for fn params and locals
We postpone emitting debug info until *after* we generate the function
so that we have an idea of the consumed stack space. The stack offsets
encoded within DWARF are with respect to the frame pointer `.fp`.
2022-11-09 18:35:06 +01:00
Jakub Konka
7007ecdc05 macho: create dSYM bundle directly in the emit dir 2022-11-09 18:35:06 +01:00
Veikka Tuominen
41b7e40d75
Merge pull request #13418 from ryanschneider/signal-alignment-13216
std.os: fix alignment of Sigaction.handler_fn
2022-11-09 17:36:40 +02:00
IntegratedQuantum
d1e7be0bd1
Handle sentinel slices in std.mem.zeroes
Fixes #13256
2022-11-09 17:33:48 +02:00
Veikka Tuominen
61842da9f7 llvm: implement packed unions
Closes #13340
2022-11-09 17:14:38 +02:00
bfredl
95f989a05b Fixes to linux/bpf/btf.zig
- The meaning of packed structs changed in Zig 0.10. Adjust accordingly.
  Use "extern struct" for the cases that directly map to C structs.

- Add new type info kinds, like enum64 and DeclTag

- Change the Type enum to use the canonical names from libbpf.
  This is more predictable when comparing with external BPF
  documentation (than invented synonyms that need to be guessed)
2022-11-09 17:14:22 +02:00
Jakub Konka
a2e67173d1
Merge pull request #13487 from ziglang/zld-dwarf-string
macho: misc DWARF parser fixes
2022-11-09 06:53:24 +01:00
Jakub Konka
188ad31cf3 macho: fix 32bit build 2022-11-08 21:21:25 +01:00
Andrew Kelley
a65ba6c85a CI: stop using cloud.drone.io
This service stopped working two days ago for unknown reasons. Until it
is determined how to get it working again, or we switch to a different
CI provider for aarch64, this CI test coverage is disabled so that
we can continue to use the CI for other targets.
2022-11-08 11:04:33 -07:00
Jakub Konka
9db63d4f1d macho: fix handling of DW_FORM_block* forms 2022-11-08 17:45:28 +01:00
Frank Denis
36e618aef1 crypto.ghash: compatibility with stage1
Defining the selector enum outside the function definition is
required for stage1.
2022-11-08 16:59:53 +01:00
Jakub Konka
7145064efd macho: fix parsing len of DW_FORM_string 2022-11-08 15:45:15 +01:00
Jakub Konka
e83590d0e8 aarch64: pass some tests dealing with optionals 2022-11-08 13:59:47 +01:00
Jakub Konka
179f16904f aarch64: circumvent zig0 inference problems 2022-11-08 13:59:06 +01:00
Jakub Konka
32ad218f5a aarch64: revert changes to .call 2022-11-08 13:50:30 +01:00
Jakub Konka
45f65c8445 aarch64: fix implementation of .is_null and .is_non_null 2022-11-08 13:42:58 +01:00
Jakub Konka
0d556877af aarch64: implement .wrap_optional always saving to the stack 2022-11-08 13:42:58 +01:00
Jakub Konka
a07449450f aarch64: implement optionalPayload when mcv is register 2022-11-08 13:42:58 +01:00
Jakub Konka
35bd5363ee aarch64: implement isNull() for non-pointer optionals 2022-11-08 13:42:58 +01:00
Jakub Konka
0de56d1722 aarch64: partially implement optionalPayload() 2022-11-08 13:42:58 +01:00
Jakub Konka
cd7cbed651 aarch64: partially implement isNull() 2022-11-08 13:42:58 +01:00
Frank Denis
7d48cb1138
std.crypto: make ghash faster, esp. for small messages (#13464)
* std.crypto: make ghash faster, esp. for small messages

Aggregated reduction requires 5 additional multiplications (to
precompute the powers of H), in order to save 2 multiplications
per batch.

So, only use large batches when it's actually interesting to do so.

For the last blocks, reuse the precomputations in order to perform
a single reduction.

Also, even in .ReleaseSmall, allow 2-block aggregation.
The speedup is worth it, and the code increase is reasonable.

And in .ReleaseFast, bump the upper batch size up to 16.

Also, leverage comptime instead of duplicating code.

std/crypto/benchmark.zig on Apple M1:

    Zig 0.10.0: 2769 MiB/s
        Before: 6014 MiB/s
         After: 7334 MiB/s

Also, normalize function names.

* Change clmul() to accept the half to be processed

This avoids a bunch of truncate() calls.

* Add more ghash tests to check all code paths
2022-11-07 21:45:29 +01:00
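
The trade-off stated in 7d48cb1138 (5 extra multiplications up front to save 2 per batch) can be turned into a toy cost model. This sketch takes the two numbers from the commit message at face value and assumes nothing else about the implementation; the function names are hypothetical.

```python
PRECOMPUTE_COST = 5   # extra multiplications, as stated in the commit message
SAVED_PER_BATCH = 2   # multiplications saved per batch, as stated


def net_multiplications_saved(batches):
    """Net saving after processing `batches` aggregated batches."""
    return SAVED_PER_BATCH * batches - PRECOMPUTE_COST


def break_even_batches():
    """Smallest number of batches for which aggregation is not a loss."""
    batches = 0
    while net_multiplications_saved(batches) < 0:
        batches += 1
    return batches
```

Under this model the precomputation only pays off from the third batch onward, which is consistent with the commit's choice to avoid large batches for small messages.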
Ganesan Rajagopal
88d2e4f66a langref.html.in: Simplify printing types in examples
The Zig stdlib fmt has a formatter for types which prints the type name.  So,
just use @TypeOf(type) instead of the longer @typeInfo(@TypeOf(type)).
2022-11-07 15:07:21 +02:00
Veikka Tuominen
dc128f403b
Merge pull request #13446 from Vexu/stage2-fixes
Stage2 bug fixes
2022-11-07 14:17:26 +02:00
Frank Denis
32563e6829
crypto.core.aes: process 6 blocks in parallel instead of 8 on aarch64 (#13473)
* crypto.core.aes: process 6 blocks in parallel instead of 8 on aarch64

At least on Apple Silicon, this is slightly faster than 8 blocks.

* AES: add parallel blocks for tigerlake, rocketlake, alderlake, zen3
2022-11-07 12:28:37 +01:00
Frank Denis
907f3ef887
crypto.salsa20: make the number of rounds a comptime parameter (#13442)
...instead of hard-coding it to 20.

- This is consistent with the ChaCha implementation
- NaCl and libsodium, which this API is designed to interop with,
also support 8- and 12-round variants. The 12-round variant, in
particular, provides the same security level as the 20-round variant,
but is obviously faster.
- scrypt currently uses its own non-optimized version of Salsa, just
because it uses 8 rounds instead of 20. This will help remove code
duplication.

No behavior or public API changes. The Salsa20 and XSalsa20 still
represent the 20-round variant.
2022-11-06 23:52:41 +01:00
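
The idea in 907f3ef887 of making the round count a parameter can be sketched in Python. This is not the std.crypto API: the function names are hypothetical, and Python has no comptime, so an ordinary run-time argument stands in for the comptime parameter. The quarter-round and round layout follow the published Salsa20 specification.

```python
MASK32 = 0xFFFFFFFF


def rotl32(value, shift):
    """Rotate a 32-bit word left by `shift` bits."""
    return ((value << shift) & MASK32) | (value >> (32 - shift))


def quarter_round(x, a, b, c, d):
    """Apply the Salsa20 quarter-round in place to x[a], x[b], x[c], x[d]."""
    x[b] ^= rotl32((x[a] + x[d]) & MASK32, 7)
    x[c] ^= rotl32((x[b] + x[a]) & MASK32, 9)
    x[d] ^= rotl32((x[c] + x[b]) & MASK32, 13)
    x[a] ^= rotl32((x[d] + x[c]) & MASK32, 18)


def salsa_core(state, rounds=20):
    """Salsa core over 16 words; `rounds` may be 8, 12, 20, or any even count."""
    if len(state) != 16:
        raise ValueError("state must hold 16 32-bit words")
    if rounds <= 0 or rounds % 2 != 0:
        raise ValueError("rounds must be a positive even number")
    x = list(state)
    for _ in range(rounds // 2):
        # Column round
        quarter_round(x, 0, 4, 8, 12)
        quarter_round(x, 5, 9, 13, 1)
        quarter_round(x, 10, 14, 2, 6)
        quarter_round(x, 15, 3, 7, 11)
        # Row round
        quarter_round(x, 0, 1, 2, 3)
        quarter_round(x, 5, 6, 7, 4)
        quarter_round(x, 10, 11, 8, 9)
        quarter_round(x, 15, 12, 13, 14)
    # Feed-forward: add the original input words.
    return [(a + b) & MASK32 for a, b in zip(x, state)]
```

Defaulting `rounds` to 20 mirrors the commit's note that the existing Salsa20 and XSalsa20 names keep their 20-round meaning, while an 8-round caller such as scrypt could share the same core.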