It's pretty compact, with each AIR instruction only taking up 4 bits,
plus a sparse table for special instructions such as conditional branch,
switch branch, and function calls with more than 2 arguments.
This commit changes the AIR file and the documentation of the memory
layout. The actual work of modifying the surrounding code (in Sema and
codegen) is not yet done.
* rename files to adhere to conventions
* remove unnecessary function / optionality
* fix merge conflict
* better panic message
* remove unnecessary TODO comment
* proper namespacing of declarations
* clean up documentation comments
* no copyright header needed for a brand new zig file that is not
copied from anywhere
AstGen had the then-else logic backwards for if expressions
on error unions. This commit fixes it.
Turns out AstGen only really needs `is_non_null` and `is_non_err`,
and does not need the `is_null` or `is_err` variants. So I removed the
`is_null{,_ptr}` and `is_err{,_ptr}` ZIR instructions (-4) and
added `is_non_err`, `is_non_err_ptr` ZIR instructions (+2) for
a total of (-2) ZIR instructions, giving us a tiny bit more headroom
within the 256 tag limit. This required swapping the order of
then/else blocks in a handful of cases, but ultimately means the
ZIR will be in the same as source order, which is convenient
when debugging.
AIR code on the other hand, gains the `is_non_err` and `is_non_err_ptr`
instructions.
Sema: fix logic in zirErrUnionCode and zirErrUnionCodePtr returning the
wrong result type.
* implement enough of ret_err_value to pass wasm tests
* only do the proper `@panic` implementation for the backends which
support it, which is currently only the C backend. The other backends
will see `@breakpoint(); unreachable;` same as before.
- I plan to do AIR memory layout reworking as a prerequisite to
fixing other backends, because that will help me put all the
constants up front, which will allow the codegen to lower to memory
without jumps.
* `@panic` is implemented using anon decls for the message. Makes it
easier on the backends. Might want to look into re-using decls for
this in the future.
* implement DWARF .debug_info for pointer-like optionals.
We can just use bitcast instead of error_to_int, int_to_error since
errorToInt and intToError do not actually do anything, just change types.
This allows us to remove 2 air ops that were the exact same as bitcast
- hash/eql functions moved into a Context object
- *Context functions pass an explicit context
- *Adapted functions pass specialized keys and contexts
- new getPtr() function returns a pointer to value
- remove functions renamed to fetchRemove
- new remove functions return bool
- removeAssertDiscard deleted, use assert(remove(...)) instead
- Keys and values are stored in separate arrays
- Entry is now {*K, *V}, the new KV is {K, V}
- BufSet/BufMap functions renamed to match other set/map types
- fixed iterating-while-modifying bug in src/link/C.zig
We've settled on the nomenclature for the artifacts the compiler
pipeline produces:
1. Tokens
2. AST (Abstract Syntax Tree)
3. ZIR (Zig Intermediate Representation)
4. AIR (Analyzed Intermediate Representation)
5. Machine Code
Renaming `ir` identifiers to `air` will come with the inevitable
air-memory-layout branch that I plan to start after the 0.8.0 release.
Conflicts:
* src/codegen/spirv.zig
* src/link/SpirV.zig
We're going to want to improve the stage2 test harness to print
the source file name when a compile error occurs otherwise std lib
contributors are going to see some confusing CI failures when they cause
stage2 AstGen compile errors.
Conflicts:
* build.zig
* src/Compilation.zig
* src/codegen/spirv/spec.zig
* src/link/SpirV.zig
* test/stage2/darwin.zig
- this one might be problematic; start.zig looks for `main` in the
root source file, not `_main`. Not sure why there is an underscore
there in master branch.
As it stands, the backend is incomplete, and there is no active contributor,
making it dead weight.
However, anyone is free to resurrect this backend at any time.
Conflicts:
* lib/std/os/linux.zig
* lib/std/os/windows/bits.zig
* src/Module.zig
* src/Sema.zig
* test/stage2/test.zig
Mainly I wanted Jakub's new macOS code for respecting stack size, since
we now depend on it for debug builds able to pass one of the test cases
for recursive comptime function calls with `@setEvalBranchQuota`.
The conflicts were all trivial.
instead of node indexes.
* AstGen: dbg_stmt instructions now have line and column indexes,
relative to the parent declaration. This allows codegen to emit debug
info without having the source bytes, tokens, or AST nodes loaded
in memory.
* ZIR: each decl has the absolute line number. This allows computing
line numbers from offsets without consulting source code bytes.
Memory management: creating a function definition does not prematurely
set the Decl arena. Instead the function is allocated with the general
purpose allocator.
Codegen no longer looks at source code bytes for any reason. They can
remain unloaded from disk.
* AstGen: add missing `break_inline` for comptime blocks.
* Module: call getTree() in byteOffset(). This generates the AST when
using cached ZIR and compile errors need to be reported.
* Scope.File: distinguish between successful ZIR generation and AIR
generation (when Decls in scope have been scanned).
- `semaFile` correctly avoids doing work twice.
* Implement first pass at `lookupInNamespace`. It has various TODOs
left, such as `usingnamespace`, and setting up Decl dependencies.
This was also an experiment to see if it were easier to implement a new
feature when using the instruction encoder.
Verdict: It's not that much easier, but I think it's certainly much more
readable, because the description of the Instruction annotates what each
field means. Right now, precise knowledge of x86_64 instructions is
still required because things like when to set the 64-bit flag, how to
read x86_64 instruction references, etc. are still not automatically
done for you.
In the future, this interface might make it sligtly easier to write an
assembler for x86_64, by abstracting the bit-fiddling aspects of
instruction encoding.
From my very cursory reading, it seems that the register manager doesn't
distinguish between registers that are physically the same but have
different sizes.
In that case, this means that during codegen, we can't rely on
`reg.size()` when determining the width of the operations we have to
perform. Instead, we must use some form of `ty.abiSize(self.target.*)`
to determine the size of the type we're operating with. If this size is
64 bits, then we should enable 64-bit operation.
This fixed a bug in the codegen for spilling instructions, which was
overwriting the previous stack entry with zeroes. See the modified test
case in this commit.
There are parts of it that I didn't modify because the byte
representation was important (e.g. we need to know what the exact
byte position where we store the address into the offset table is)
Instead of Module setting up the root_scope with the root source file,
instead, Module relies on the package table graph being set up properly,
and inside `update()`, it does the equivalent of `_ = @import("std");`.
This, in term, imports start.zig, which has the logic to call main (or
not). `Module` no longer has `root_scope` - the root source file is no
longer special, it's just in the package table mapped to "root".
I also went ahead and implemented proper detection of updated files.
mtime, inode, size, and source hash are kept in `Scope.File`.
During an update, iterate over `import_table` and stat each file to find
out which ones are updated.
The source hash is redundant with the source hash used by the struct
decl that corresponds to the file, so it should be removed in a future
commit before merging the branch.
* AstGen: add "previously declared here" notes for variables shadowing
decls.
* Parse imports as structs. Module now calls `AstGen.structDeclInner`,
which is called by `AstGen.containerDecl`.
- `importFile` is a bit kludgy with how it handles the top level Decl
that kinda gets merged into the struct decl at the end of the
function. Be on the look out for bugs related to that as well as
possibly cleaner ways to implement this.
* Module: factor out lookupDeclName into lookupIdentifier and lookupNa
* Rename `Scope.Container` to `Scope.Namespace`.
* Delete some dead code.
This branch won't work until `usingnamespace` is implemented because it
relies on `@import("builtin").OutputMode` and `OutputMode` comes from a
`usingnamespace`.