After this change, the frontend and backend cooperate to keep track of
which Decls are actually emitted into the machine code. When any backend
sees a `decl_ref` Value, it must set the `alive` field of the
corresponding Decl to true.
This prevents unused comptime data from spilling into the output object
files. For example, with an `inline for` loop, any intermediate value
calculations previously went into the object file. Now they are garbage
collected immediately after the owner Decl has its machine code
generated.
In the frontend, when it is time to send a Decl to the linker, if it has
not been marked "alive" then it is deleted instead.
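The marking scheme described above can be sketched roughly as follows.
This is a toy model, not the compiler's code: the `Decl` struct here is a
pared-down stand-in, and `markDeclAlive` and `countUnused` are
hypothetical names chosen for illustration.

```zig
const std = @import("std");

/// Hypothetical, pared-down stand-in for the compiler's Decl.
const Decl = struct {
    name: []const u8,
    /// Set by a backend when it lowers a `decl_ref` pointing at this Decl.
    alive: bool = false,
};

/// What a backend is expected to do when it sees a `decl_ref` Value.
fn markDeclAlive(decl: *Decl) void {
    decl.alive = true;
}

/// Frontend side: counts the Decls that were never marked alive, i.e. the
/// ones that would be deleted instead of being sent to the linker.
fn countUnused(decls: []const Decl) usize {
    var unused: usize = 0;
    for (decls) |decl| {
        if (!decl.alive) unused += 1;
    }
    return unused;
}
```

In this model, a Decl produced as an intermediate of an `inline for` is
never passed to `markDeclAlive`, so it is counted as unused and would
not reach the object file.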
Additional improvements:
* Resolve type ABI layouts after successful semantic analysis of a
Decl. This is needed so that the backend has access to struct fields.
* Sema: fix incorrect logic in resolveMaybeUndefVal. It should return
"not comptime known" instead of a compile error for global variables.
* `Value.pointerDeref` now returns `null` in the case that the pointer
deref cannot happen at compile-time. This is true for global
variables, for example. Another example is if a comptime known
pointer has a hard coded address value.
* Binary arithmetic sets the requireRuntimeBlock source location to the
lhs_src or rhs_src as appropriate instead of on the operator node.
* Fix LLVM codegen for slice_elem_val which had the wrong logic for
when the operand was not a pointer.
As noted in the comment in the implementation of deleteUnusedDecl, a
future improvement will be to rework the frontend/linker interface to
remove the frontend's responsibility of calling allocateDeclIndexes.
I discovered some issues with the plan9 linker backend that are related
to this, and worked around them for now.
* AIR no longer has a `variables` array. Instead of the `varptr`
instruction, Sema emits a constant with a `decl_ref`.
* AIR no longer has a `ref` instruction. There is no longer any
instruction that takes a value and returns a pointer to it. If this
is desired, Sema must either create an anonymous Decl and return a
constant `decl_ref`, or in the case of a runtime value, emit an
`alloc` instruction, `store` the value to it, and then return the
`alloc`.
* The `ref_val` Value Tag is eliminated. `decl_ref` should be used
instead. Also added is `eu_payload_ptr` which points to the payload
of an error union, given an error union pointer.
In general, Sema should avoid calling `analyzeRef` if it can be helped.
For example in the case of field_val and elem_val, there should never be
a reason to create a temporary (alloc or decl). Recent previous commits
made progress along that front.
There is a new abstraction in Sema, which looks like this:
    var anon_decl = try block.startAnonDecl();
    defer anon_decl.deinit();
    // here `anon_decl.arena()` may be used
    const decl = try anon_decl.finish(ty, val);
    // decl is typically now used with `decl_ref`.
This pattern is used to upgrade `ref_val` usages to `decl_ref` usages.
Additional improvements:
* Sema: fix source location resolution for calling convention
expression.
* Sema: properly report "unable to resolve comptime value" for loads of
global variables. There is now a set of functions which can be
called if the caller wants to obtain the Value even if the tag is
`variable` (indicating comptime-known address but runtime-known value).
* Sema: `coerce` resolves builtin types before checking equality.
* Sema: fix `u1_type` missing from `addType`, making this type have a
slightly more efficient representation in AIR.
* LLVM backend: fix `genTypedValue` for tags `decl_ref` and `variable`
to properly do an LLVMConstBitCast.
* Remove unused parameter from `Value.toEnum`.
After this commit, some test cases are no longer passing. This is due to
the more principled approach to comptime references causing more
anonymous decls to get sent to the linker for codegen. However, in all
these cases the decls are not actually referenced by the runtime machine
code. A future commit in this branch will implement garbage collection
of decls so that unused decls do not get sent to the linker for codegen.
This will make the tests go back to passing.
* Add AIR instruction: struct_field_val
- This is part of an effort to eliminate the AIR instruction `ref`.
- It's implemented for C backend and LLVM backend so far.
* Rename `resolvePossiblyUndefinedValue` to `resolveMaybeUndefVal` just
to save some columns on long lines.
* Sema: add `fieldVal` alongside `fieldPtr` (renamed from
`namedFieldPtr`). This is part of an effort to eliminate the AIR
instruction `ref`. The idea is to avoid unnecessary loads, stores,
stack usage, and IR instructions, by paying a DRY cost.
LLVM backend improvements:
* internal linkage vs exported linkage is implemented, along with
aliases. There is an issue with incremental updates due to missing
LLVM API for deleting aliases; see the relevant comment in this commit.
- `updateDeclExports` is hooked up to the LLVM backend now.
* Fix code that checked `Type.tag() == .noreturn` instead of calling
`isNoReturn()`.
* Properly mark global variables as mutable/constant.
* Fix LLVM type generation of function pointers.
* Fix codegen for calls of function pointers.
* Implement LLVM type generation of error unions and error sets.
* Implement AIR instructions: addwrap, subwrap, mul, mulwrap, div,
bit_and, bool_and, bit_or, bool_or, xor, struct_field_ptr,
struct_field_val, unwrap_errunion_err, add for floats, sub for
floats.
After this commit, `zig test` on a file with `test "example" {}`
correctly generates and executes a test binary. However the
`test_functions` slice is undefined and just happens to be going into
the .bss section, causing the length to be 0. The next step towards
`zig test` will be replacing the `test_functions` Decl Value with the
set of test function pointers, before it is sent to linker/codegen.
* There is now a main_pkg in addition to root_pkg. They are usually the
same. When using `zig test`, main_pkg is the user's source file and
root_pkg has the test runner.
* scanDecl no longer looks for test decls outside the package being
tested. Honoring `--test-filter` is still TODO.
* The test runner main function has a void return value rather than
`anyerror!void`.
* Sema is improved to generate better AIR for `for` loops on slices.
* Sema: fix incorrect capacity calculation in zirBoolBr
* Sema: add compile errors for trying to use slice fields as an lvalue.
* Sema: fix type coercion for error unions
* Sema: fix analyzeVarRef generating garbage AIR
* C codegen: fix renderValue for error unions with 0 bit payload
* C codegen: implement function pointer calls
* CLI: fix usage text
Adds 4 new AIR instructions:
* slice_len, slice_ptr: to get the ptr and len fields of a slice.
* slice_elem_val, ptr_slice_elem_val: to get the element value given a
slice, and given a pointer to a slice, respectively.
AstGen gains a new functionality:
* One of the unused flags of struct decls is now used to indicate
structs that are known to have non-zero size based on the AST alone.
It incorrectly did not process the death of its operand. Additionally:
* delete dead code accidentally introduced in fe14e339458a578657f3890f00d654a15c84422c
* improve AIR printing code to include liveness data for operands.
Now an exclamation point ("!") indicates the tombstone of an AIR
instruction.
* Breaking language change: inline assembly must use string literal
syntax. This is in preparation for inline assembly improvements that
involve more integration with the Zig language. This means we cannot
rely on text substitution.
* Liveness: properly handle inline assembly and function calls with
more than 3 operands.
- More than 35 operands is not yet supported. This is a low priority
to implement.
- This required implementation in codegen.zig as well.
* Liveness: fix bug causing incorrect tomb bits.
* Sema: enable switch expressions that are evaluated at compile-time.
- Runtime switch instructions still need to be reworked in this
branch. There was a TODO left here (by me) with a suggestion to do
some bigger changes as part of the AIR memory reworking. Now that
time has come and I plan to honor the suggestion in a future commit
before merging this branch.
* AIR printing: fix missing ')' on alive instructions.
We're back to "hello world" working for the x86_64 backend.
Now the branch is compiling again, provided that one uses
`-Dskip-non-native`, but many code paths are disabled. The code paths
can now be re-enabled one at a time and updated to conform to the new
AIR memory layout.
to the link infrastructure, instead of being stored with Module.Fn. This
moves towards a strategy to make more efficient use of memory by not
storing Air or Liveness data in the Fn struct, but computing it on
demand, immediately sending it to the backend, and then immediately
freeing it.
Backends which want to defer codegen until flush() such as SPIR-V
must move the Air/Liveness data upon `updateFunc` being called and keep
track of that data in the backend implementation itself.
It's pretty compact, with each AIR instruction only taking up 4 bits,
plus a sparse table for special instructions such as conditional branch,
switch branch, and function calls with more than 2 arguments.
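As a rough illustration of how 4 bits per instruction can be packed,
here is a minimal sketch. It assumes the compact encoding above refers
to per-instruction liveness bits stored in an array of `usize` words.
The names `getTombBits` and `setTombBits` and the packing order are
assumptions made for this example, not taken from the implementation,
and the sparse table for special instructions is not modeled.

```zig
const std = @import("std");

/// Bits of data per AIR instruction, per the description above.
const bits_per_inst = 4;
/// How many instructions fit in one machine word (16 on 64-bit targets).
const insts_per_word = @bitSizeOf(usize) / bits_per_inst;

/// Returns the 4-bit entry for instruction index `inst`.
fn getTombBits(words: []const usize, inst: usize) usize {
    const shift = (inst % insts_per_word) * bits_per_inst;
    return std.math.shr(usize, words[inst / insts_per_word], shift) & 0xf;
}

/// ORs the low 4 bits of `bits` into the entry for instruction index `inst`.
fn setTombBits(words: []usize, inst: usize, bits: usize) void {
    const shift = (inst % insts_per_word) * bits_per_inst;
    words[inst / insts_per_word] |= std.math.shl(usize, bits & 0xf, shift);
}
```

With this layout, a function of N instructions needs only
N / `insts_per_word` words (rounded up) for the common case.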
This commit changes the AIR file and the documentation of the memory
layout. The actual work of modifying the surrounding code (in Sema and
codegen) is not yet done.
* rename files to adhere to conventions
* remove unnecessary function / optionality
* fix merge conflict
* better panic message
* remove unnecessary TODO comment
* proper namespacing of declarations
* clean up documentation comments
* no copyright header needed for a brand new zig file that is not
copied from anywhere
AstGen had the then-else logic backwards for if expressions
on error unions. This commit fixes it.
Turns out AstGen only really needs `is_non_null` and `is_non_err`,
and does not need the `is_null` or `is_err` variants. So I removed the
`is_null{,_ptr}` and `is_err{,_ptr}` ZIR instructions (-4) and
added `is_non_err`, `is_non_err_ptr` ZIR instructions (+2) for
a total of (-2) ZIR instructions, giving us a tiny bit more headroom
within the 256 tag limit. This required swapping the order of
then/else blocks in a handful of cases, but ultimately means the
ZIR will be in the same order as the source, which is convenient
when debugging.
AIR code on the other hand, gains the `is_non_err` and `is_non_err_ptr`
instructions.
Sema: fix logic in zirErrUnionCode and zirErrUnionCodePtr returning the
wrong result type.
* implement enough of ret_err_value to pass wasm tests
* only do the proper `@panic` implementation for the backends which
support it, which is currently only the C backend. The other backends
will see `@breakpoint(); unreachable;` same as before.
- I plan to do AIR memory layout reworking as a prerequisite to
fixing other backends, because that will help me put all the
constants up front, which will allow the codegen to lower to memory
without jumps.
* `@panic` is implemented using anon decls for the message. Makes it
easier on the backends. Might want to look into re-using decls for
this in the future.
* implement DWARF .debug_info for pointer-like optionals.
We can just use bitcast instead of error_to_int and int_to_error, since
errorToInt and intToError do not actually do anything except change
types. This allows us to remove 2 AIR ops that were exactly the same as
bitcast.
- hash/eql functions moved into a Context object
- *Context functions pass an explicit context
- *Adapted functions pass specialized keys and contexts
- new getPtr() function returns a pointer to value
- remove functions renamed to fetchRemove
- new remove functions return bool
- removeAssertDiscard deleted, use assert(remove(...)) instead
- Keys and values are stored in separate arrays
- Entry is now {*K, *V}, the new KV is {K, V}
- BufSet/BufMap functions renamed to match other set/map types
- fixed iterating-while-modifying bug in src/link/C.zig
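A short usage sketch of the reworked map API listed above, using
`std.AutoHashMap`. It assumes a Zig standard library that includes these
changes; the keys and values are arbitrary illustration data.

```zig
const std = @import("std");

test "reworked hash map API" {
    var map = std.AutoHashMap(u32, u32).init(std.testing.allocator);
    defer map.deinit();

    try map.put(1, 10);
    try map.put(2, 20);

    // getPtr() returns an optional pointer to the stored value.
    if (map.getPtr(1)) |value_ptr| value_ptr.* += 1;
    try std.testing.expectEqual(@as(?u32, 11), map.get(1));

    // Entry is now {*K, *V}: iteration yields pointers into the map.
    var it = map.iterator();
    while (it.next()) |entry| entry.value_ptr.* *= 2;

    // fetchRemove() (the old remove) returns the removed KV, a {K, V} pair.
    const kv = map.fetchRemove(1).?;
    try std.testing.expectEqual(@as(u32, 22), kv.value);

    // remove() now returns a bool; use assert(remove(...)) where
    // removeAssertDiscard was used before.
    std.debug.assert(map.remove(2));
    try std.testing.expect(!map.remove(2));
}
```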
We've settled on the nomenclature for the artifacts the compiler
pipeline produces:
1. Tokens
2. AST (Abstract Syntax Tree)
3. ZIR (Zig Intermediate Representation)
4. AIR (Analyzed Intermediate Representation)
5. Machine Code
Renaming `ir` identifiers to `air` will come with the inevitable
air-memory-layout branch that I plan to start after the 0.8.0 release.
Conflicts:
* src/codegen/spirv.zig
* src/link/SpirV.zig
We're going to want to improve the stage2 test harness to print
the source file name when a compile error occurs. Otherwise, std lib
contributors are going to see some confusing CI failures when they cause
stage2 AstGen compile errors.
Conflicts:
* build.zig
* src/Compilation.zig
* src/codegen/spirv/spec.zig
* src/link/SpirV.zig
* test/stage2/darwin.zig
- this one might be problematic; start.zig looks for `main` in the
root source file, not `_main`. Not sure why there is an underscore
there in master branch.
As it stands, the backend is incomplete, and there is no active contributor,
making it dead weight.
However, anyone is free to resurrect this backend at any time.
Conflicts:
* lib/std/os/linux.zig
* lib/std/os/windows/bits.zig
* src/Module.zig
* src/Sema.zig
* test/stage2/test.zig
Mainly I wanted Jakub's new macOS code for respecting stack size, since
we now depend on it for debug builds to be able to pass one of the test
cases for recursive comptime function calls with `@setEvalBranchQuota`.
The conflicts were all trivial.