LLVM bitcode requires every type reference in a struct to point to
an earlier-defined type.
We keep types ordered in the builder itself, which avoids having to
iterate over the types multiple times and change the values of
type indices.
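
The ordering constraint can be illustrated with a minimal sketch. This is not the builder's actual code: `TypeTable` and `add` are hypothetical names, and the sketch only assumes that a type's index is assigned once, at creation, after all of its field types already have indices.

```python
class TypeTable:
    """Hypothetical type table: an index is assigned once and never changes."""

    def __init__(self):
        self.entries = []    # list of (name, tuple of field type indices)
        self.index_of = {}   # name -> index

    def add(self, name, field_types=()):
        """Add a type whose fields must already be in the table."""
        if name in self.index_of:
            return self.index_of[name]
        for field in field_types:
            if field not in self.index_of:
                raise ValueError(f"{name} references undefined type {field}")
        fields = tuple(self.index_of[f] for f in field_types)
        index = len(self.entries)
        self.entries.append((name, fields))
        self.index_of[name] = index
        return index


table = TypeTable()
table.add("i32")
table.add("Inner", ["i32"])
table.add("Outer", ["Inner", "i32"])
```

Because every field index is smaller than the index of the struct that uses it, the table can be written out in a single pass with no renumbering.
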
This introduces the new test step `test-c-import`, and removes the
ability of the behavior tests to `@cImport` paths relative to `test`.
This allows the behavior tests to be run without translate-c.
This commit works around #18967 by adding an `AccumulatingReader`, which
accumulates data read from the underlying packfile, and by keeping track
of the position in the packfile and hash/checksum information separately
rather than using reader composition. That is, the packfile position and
hashes/checksums are updated with the accumulated read history data only
after we can determine what data has actually been used by the
decompressor rather than merely being buffered.
The only addition to the standard library APIs to support this change is
the `unreadBytes` function in `std.compress.flate.Inflate`, which allows
the user to determine how many bytes have been read only for buffering
and not used as part of compressed data.
These changes can be reverted if #18967 is resolved with a decompressor
that reads precisely only the number of bytes needed for decompression.
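
The idea can be sketched in Python, whose `zlib` decompressor exposes the over-read bytes as `unused_data`, playing a role similar to `unreadBytes`. This is an analogy under stated assumptions, not the Zig implementation: `read_zlib_object` is a hypothetical helper, and the tiny `chunk_size` only serves to force over-reading.

```python
import hashlib
import io
import zlib


def read_zlib_object(stream, position, hasher, chunk_size=16):
    """Decompress one zlib stream starting at `position`.

    Returns (data, new_position). The position and the hash advance only
    by the bytes the decompressor actually used, not by what was buffered.
    """
    stream.seek(position)
    decomp = zlib.decompressobj()
    out = bytearray()
    history = bytearray()  # everything read from the underlying stream
    while not decomp.eof:
        chunk = stream.read(chunk_size)
        if not chunk:
            raise EOFError("truncated compressed stream")
        history += chunk
        out += decomp.decompress(chunk)
    # Bytes past the end of the compressed data were read but not used.
    used = len(history) - len(decomp.unused_data)
    hasher.update(history[:used])
    return bytes(out), position + used


first = zlib.compress(b"first object")
second = zlib.compress(b"second object " * 10)
pack = io.BytesIO(first + second)
sha = hashlib.sha1()
data1, pos = read_zlib_object(pack, 0, sha)
data2, pos = read_zlib_object(pack, pos, sha)
```

After both calls, `pos` equals the total length of the two streams and `sha` has hashed each byte exactly once, even though the first call read past the end of the first stream.
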
This code is run when printing a stack trace in a debug executable, so
it has to be fast even without compiler optimizations.
With a `@panic` added to the top of `main`, running a compiler built
with the x86_64 backend goes from `1m32.773s` to `0m3.232s`.
Until now, literal and distance code lengths were treated as two
different arrays. But according to RFC 1951 they can overlap:
The code length repeat codes can cross from HLIT + 257 to the
HDIST + 1 code lengths. In other words, all code lengths form
a single sequence of HLIT + HDIST + 258 values.
Now code lengths are decoded into a single array, which is then split
into the literal and distance parts.
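
A minimal sketch of the single-array approach follows. It assumes the code length symbols have already been Huffman-decoded into hypothetical `(symbol, extra_bits_value)` pairs; `expand_code_lengths` is an illustrative name, not the standard library function. The repeat semantics of symbols 16, 17 and 18 follow RFC 1951.

```python
def expand_code_lengths(symbols, hlit, hdist):
    """Expand code length symbols into one sequence, then split it."""
    n_lit = hlit + 257
    n_dist = hdist + 1
    total = n_lit + n_dist
    lengths = []
    for sym, extra in symbols:
        if sym < 16:
            lengths.append(sym)                           # literal code length
        elif sym == 16:
            if not lengths:
                raise ValueError("repeat code with no previous length")
            lengths.extend([lengths[-1]] * (3 + extra))   # copy previous 3-6 times
        elif sym == 17:
            lengths.extend([0] * (3 + extra))             # 3-10 zeros
        elif sym == 18:
            lengths.extend([0] * (11 + extra))            # 11-138 zeros
        else:
            raise ValueError(f"invalid code length symbol {sym}")
        if len(lengths) > total:
            raise ValueError("code lengths overflow HLIT + HDIST + 258")
    if len(lengths) != total:
        raise ValueError("too few code lengths")
    # A repeat may cross this boundary, so split only after full expansion.
    return lengths[:n_lit], lengths[n_lit:]


# 254 zeros, one length of 5, then "copy previous 3 times": the copy
# fills the last two literal lengths and the single distance length.
lit, dist = expand_code_lengths([(18, 127), (18, 105), (5, 0), (16, 0)], 0, 0)
```

With two separate arrays, the final repeat code in this example would have no previous length to copy when it reaches the distance part.
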
Similar to the previous commit, errors coercing the panic message to
`[]const u8` now point at the operand to `@panic` rather than the actual
builtin call.
When coercing the operand of a `ret_node` or similar instruction, the
source location for errors used to point to the entire `return`
statement. Instead, we now point to the operand, as would be expected if
there were an explicit `as_node` instruction (like there used to be).
Previously, the `src_node` field of `struct_decl`, `union_decl`,
`enum_decl`, and `opaque_decl` was optional, included in trailing data
only if a flag in `Small` was set. However, this was unnecessary logic:
AstGen always provided the source node. We can simplify a few bits of
logic by making this field non-optional, moving it into non-trailing
data.
There was one place where the field was actually omitted before: the
root struct of a file was at source node 0, so the node was
coincidentally elided. Therefore, this commit has a fixed cost of 4
bytes of ZIR per file.
In most cases where AstGen is coercing to a fixed type (such as `u29`,
`type`, `std.builtin.CallingConvention`) we do not necessarily require an
explicit coercion instruction. Instead, Sema knows the type that is
required, and can perform the coercion after the fact. This means we can
use the `coerced_ty` result location kind, saving unnecessary coercion
instructions and therefore ZIR bytes.
This required a few enhancements to Sema to introduce missing coercions.
`sema.src` is a failed experiment. It introduces complexity, and often
makes unwarranted assumptions about the existence of instructions
providing source locations, requiring an unreasonable amount of caution
in AstGen for correctness. Eliminating it simplifies the whole frontend.
This required adding source locations to a few instructions, but the
cost in ZIR bytes should be counteracted by the other work on this
branch.
AstGen has logic to elide leading `dbg_stmt` instructions when multiple
are emitted consecutively; however, it only applied in some cases. A
simple reshuffle here makes this logic apply universally, saving some
bytes in ZIR.
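
The elision can be sketched as follows. This is a hypothetical model, not AstGen's code: instructions are plain tuples, and `emit` is an illustrative helper that applies the same check on every append, so the rule holds regardless of which code path emits the instruction.

```python
def emit(instructions, inst):
    """Append `inst`; a new dbg_stmt overwrites a directly preceding one."""
    if inst[0] == "dbg_stmt" and instructions and instructions[-1][0] == "dbg_stmt":
        # The earlier dbg_stmt is followed by no real instruction, so it
        # contributes nothing and can be elided.
        instructions[-1] = inst
    else:
        instructions.append(inst)


body = []
emit(body, ("dbg_stmt", 1, 0))
emit(body, ("dbg_stmt", 2, 4))
emit(body, ("add", "a", "b"))
emit(body, ("dbg_stmt", 3, 0))
```

Here `body` ends up with three entries: the first `dbg_stmt` is replaced by the second, while the one after `add` is kept.
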
This is a small optimization to generated ZIR. In any function where the
return type is not a trivial Ref, we know it is almost certainly not
`void` (unless the user aliased it or did something else weird to fool
AstGen), and thus the return type is very likely to be required for
return value RLS at some point. Thus, we can just emit one `ret_type` at
the start of the function and use it throughout.
This yields a very small improvement in overall ZIR bytes.