This is part of an ongoing effort to reduce the size of the in-memory AST. This
enum flattening pattern is widespread throughout the self-hosted
compiler.
This is an API-breaking change for consumers of the self-hosted parser.
* AST: flatten ControlFlowExpression into Continue, Break, and Return.
* AST: unify identifiers and literals into the same AST type: OneToken
* AST: ControlFlowExpression uses TrailerFlags to optimize storage
space.
* astgen: support `var` as well as `const` locals, and support
explicitly typed locals. Corresponding Module and codegen code is not
implemented yet.
* astgen: support result locations.
* ZIR: add the following instructions (see the corresponding doc
comments for explanations of semantics):
- alloc
- alloc_inferred
- bitcast_result_ptr
- coerce_result_block_ptr
- coerce_result_ptr
- coerce_to_ptr_elem
- ensure_result_used
- ensure_result_non_error
- ret_ptr
- ret_type
- store
- param_type
* the skeleton structure for result locations is set up. It's looking
pretty clean so far.
* add compile error for unused result and compile error for discarding
errors.
* astgen: split builtin calls up so that each one is implemented manually,
and implement `@as`, `@bitCast` (and others) with respect to result locations.
* add CLI support for hex and raw object formats. They are not
supported by the self-hosted compiler yet, and emit errors.
* rename the `--c` CLI flag to `-ofmt=[objectformat]`, which accepts any of
the object formats. Only ELF and C are supported so far. Also added the
missing entries to the help text.
* Remove hard tabs from C backend test cases. Shame on you Noam, you
are grounded, you should know better, etc. Bad boy.
* Delete C backend code and test case that relied on comptime_int
incorrectly making it all the way to codegen.
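A minimal sketch of how a `-ofmt=[objectformat]` argument could be classified, in C rather than in the compiler's own source language. Every identifier here (`ObjectFormat`, `parse_ofmt`, `ofmt_is_supported`) is hypothetical, and the lowercase spellings `elf`, `c`, `hex`, and `raw` are assumed rather than taken from the actual CLI code. It only illustrates the behavior described above: hex and raw are recognized but rejected as unsupported.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; this is not the compiler's CLI code. */
typedef enum { OFMT_ELF, OFMT_C, OFMT_HEX, OFMT_RAW, OFMT_INVALID } ObjectFormat;

/* Parses an argument of the form "-ofmt=<name>". Returns OFMT_INVALID
 * when the prefix is missing or the name is unknown. */
static ObjectFormat parse_ofmt(const char *arg) {
    static const char prefix[] = "-ofmt=";
    const size_t prefix_len = sizeof(prefix) - 1;
    assert(arg != NULL);
    if (strncmp(arg, prefix, prefix_len) != 0) return OFMT_INVALID;
    const char *name = arg + prefix_len;
    if (strcmp(name, "elf") == 0) return OFMT_ELF;
    if (strcmp(name, "c") == 0) return OFMT_C;
    if (strcmp(name, "hex") == 0) return OFMT_HEX;
    if (strcmp(name, "raw") == 0) return OFMT_RAW;
    return OFMT_INVALID;
}

/* Only ELF and C are supported so far; hex and raw parse successfully
 * but the caller is expected to emit an error for them. */
static int ofmt_is_supported(ObjectFormat format) {
    return format == OFMT_ELF || format == OFMT_C;
}
```

Separating "recognized" from "supported" lets the CLI report an unsupported format differently from a misspelled one.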
InfixOp is flattened out so that each operator is an independent AST
node tag. The two kinds of structs are now Catch and SimpleInfixOp.
Beginning implementation of supporting codegen for const locals.
ast.Node.Id => ast.Node.Tag, matching recent style conventions.
Now multiple different AST node tags can map to the same AST node data
structures. In this commit, simple prefix operators now all map to
SimplePrefixOp.
`ast.Node.castTag` is now preferred over `ast.Node.cast`.
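The many-tags-to-one-struct arrangement can be sketched in C as follows. This is a reduced model with hypothetical names (`Tag`, `SimplePrefixOp`, `cast_simple_prefix_op`), not the actual `std.zig.ast` definitions. It shows why a tag-based cast is a natural fit: once several tags share one data struct, the tag is the only thing that tells the operators apart.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced set of node tags. */
typedef enum { TAG_NEGATION, TAG_BOOL_NOT, TAG_BIT_NOT, TAG_CATCH } Tag;

typedef struct Node { Tag tag; } Node;

/* One data struct shared by every simple prefix operator tag. */
typedef struct {
    Node base;        /* first member, so Node* and SimplePrefixOp* interconvert */
    size_t op_token;  /* index of the operator token */
    Node *rhs;        /* operand */
} SimplePrefixOp;

/* Analogue of a tag-based cast: the caller names the exact tag it
 * expects and gets NULL when the node has a different tag. */
static SimplePrefixOp *cast_simple_prefix_op(Node *node, Tag expected) {
    assert(expected == TAG_NEGATION || expected == TAG_BOOL_NOT ||
           expected == TAG_BIT_NOT);
    return node->tag == expected ? (SimplePrefixOp *)node : NULL;
}
```

A type-based cast would only say "this is some simple prefix operator"; the tag-based one answers which operator it is with a single comparison.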
Upcoming: InfixOp flattened out.
These AST nodes now have a flags field and then a bunch of optional
trailing objects. The end result is lower memory usage and consequently
better performance. This is part of an ongoing effort to reduce the
amount of memory parsed ASTs take up.
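The "flags field plus optional trailing objects" layout can be illustrated with a small C sketch. It is a simplification under stated assumptions: all names are hypothetical, and every optional field is a `uint32_t`, whereas the real trailer mechanism supports fields of different types. Only the fields whose flag bit is set occupy memory, and a field's slot is found by counting the set bits below its own bit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical optional fields, one flag bit each. */
enum { F_LABEL = 1u << 0, F_RHS = 1u << 1 };

typedef struct {
    uint32_t token;       /* always present */
    uint8_t flags;        /* which optional fields follow */
    uint32_t trailing[];  /* only the present fields, in bit order */
} FlaggedNode;

static size_t popcount8(uint8_t x) {
    size_t n = 0;
    while (x) { n += x & 1u; x >>= 1; }
    return n;
}

/* Allocates a node with exactly as many trailing slots as set flags. */
static FlaggedNode *node_create(uint32_t token, uint8_t flags) {
    size_t n = popcount8(flags);
    FlaggedNode *node = malloc(sizeof(FlaggedNode) + n * sizeof(uint32_t));
    if (!node) return NULL;
    node->token = token;
    node->flags = flags;
    for (size_t i = 0; i < n; i++) node->trailing[i] = 0;
    return node;
}

/* Returns the slot for the field identified by `bit`, or NULL if absent. */
static uint32_t *node_field(FlaggedNode *node, uint8_t bit) {
    assert(bit != 0 && (bit & (bit - 1)) == 0); /* exactly one bit set */
    if (!(node->flags & bit)) return NULL;
    return &node->trailing[popcount8(node->flags & (uint8_t)(bit - 1))];
}
```

A node with no optional fields pays only for the header, which is where the memory saving comes from when most fields are usually absent.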
Running `zig fmt` on the std lib:
* cache-misses: 2,554,321 => 2,534,745
* instructions: 3,293,220,119 => 3,302,479,874
* peak memory: 74.0 MiB => 73.0 MiB
Holding the entire std lib AST in memory at the same time:
93.9 MiB => 88.5 MiB
This is part of a larger effort to improve the memory layout of AST
nodes of the self-hosted parser to reduce wasted memory. Reduction of
wasted memory also translates to improved performance because of fewer
memory allocations, and fewer cache misses.
Compared to master, when running `zig fmt` on the std lib:
* cache-misses: 801,829 => 768,624
* instructions: 3,234,877,167 => 3,232,075,022
* peak memory: 81480 KB => 75964 KB
Start implementing https://github.com/ziglang/zig/issues/4917 which is to rename instream/outstream to reader/writer. This first change allows code to use Writer/writer instead of OutStream/outStream, but still maintains the old outstream names with "Deprecated" comments.
To prevent cache misses, token ids go in their own array, and the
start/end offsets go in a different one.
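A sketch of the split layout in C, with hypothetical names (`TokenLoc`, `TokenList`, `count_tokens_with_id`) and an assumed one-byte token id. Code that only inspects token ids walks a dense byte array and never pulls the start/end offsets into cache.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint32_t start, end; } TokenLoc;

/* Parallel arrays indexed by token index (struct-of-arrays layout). */
typedef struct {
    const uint8_t *ids;    /* one byte per token: dense and cache-friendly */
    const TokenLoc *locs;  /* only touched when source text is needed */
    size_t len;
} TokenList;

/* Scans only the id array; the locs array is never loaded. */
static size_t count_tokens_with_id(const TokenList *tokens, uint8_t id) {
    assert(tokens != NULL);
    size_t n = 0;
    for (size_t i = 0; i < tokens->len; i++) n += tokens->ids[i] == id;
    return n;
}
```

With an array of `{id, start, end}` structs, the same scan would drag the offsets of every token through the cache even though only the id is read.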
perf measurement before:
2,667,914 cache-misses:u
2,139,139,935 instructions:u
894,167,331 cycles:u
perf measurement after:
1,757,723 cache-misses:u
2,069,932,298 instructions:u
858,105,570 cycles:u
The DocComment AST node now only points to the first doc comment token.
API users are expected to iterate over the following tokens directly.
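Iterating from the first doc comment token might look like the C sketch below. The token id constants and the function name are hypothetical, and the stopping rule is an assumption: ordinary line comments inside the run are skipped, and any other token ends it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical token ids. */
enum { TOK_DOC_COMMENT = 1, TOK_LINE_COMMENT = 2, TOK_OTHER = 3 };

/* Counts the doc comment tokens in the run that starts at `first`.
 * Assumption: line comments inside the run are skipped, and any other
 * token ends the run. */
static size_t doc_comment_line_count(const uint8_t *ids, size_t len,
                                     size_t first) {
    assert(first < len && ids[first] == TOK_DOC_COMMENT);
    size_t count = 0;
    for (size_t i = first; i < len; i++) {
        if (ids[i] == TOK_DOC_COMMENT) count++;
        else if (ids[i] != TOK_LINE_COMMENT) break;
    }
    return count;
}
```

The node stores a single token index, and everything else is recovered by walking the token array, so no per-line list has to be allocated.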
After this commit there are no more linked lists in use in the
self-hosted AST API.
Performance impact is negligible. Memory usage slightly reduced.
* Extract Call ast node tag out of SuffixOp; parameters go in memory
after Call.
* Demote AsmInput and AsmOutput from AST nodes to structs inside the
Asm node.
* The following ast nodes get their sub-node lists directly following
them in memory:
- ErrorSetDecl
- Switch
- BuiltinCall
* ast.Node.Asm gets slices for inputs, outputs, clobbers instead of
singly linked lists
Performance changes:
throughput: 72.7 MiB/s => 74.0 MiB/s
maxrss: 72 KB => 69 KB (nice)
Block statements now directly follow the Block AST node in memory rather
than being stored in a singly linked list. This had negligible impact on
performance:
throughput: 72.3 MiB/s => 72.7 MiB/s
However, it greatly improves the API since the statements are laid out in
a flat array in memory.
These SuffixOp nodes have their own ast.Node tags now:
* ArrayInitializer
* ArrayInitializerDot
* StructInitializer
* StructInitializerDot
Their sub-expression lists are general-purpose-allocator allocated
and then copied into the arena after completion of parsing.
throughput: 72.9 MiB/s => 74.4 MiB/s
maxrss: 68 KB => 72 KB
The API is also nicer since the sub expression lists are now flat arrays
instead of singly linked lists.
Instead of being its own node, ParamDecl is now a struct inside FnProto.
Instead of FnProto having a SinglyLinkedList of ParamDecl nodes,
ParamDecls are appended directly in memory after the FnProto.
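In C, the "appended directly in memory after the node" layout corresponds to a flexible array member. The sketch below uses hypothetical field names and is not the actual FnProto definition. It shows one allocation holding both the header and its parameters, which the consumer then reads as a flat array.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fields; a plain struct, no longer an AST node of its own. */
typedef struct { uint32_t name_token; uint32_t type_node; } ParamDecl;

typedef struct {
    uint32_t fn_token;
    size_t params_len;
    ParamDecl params[];  /* stored inline, directly after the header */
} FnProto;

/* Allocates header and parameters together and copies the parameters in. */
static FnProto *fn_proto_create(uint32_t fn_token, const ParamDecl *params,
                                size_t n) {
    assert(n == 0 || params != NULL);
    FnProto *proto = malloc(sizeof(FnProto) + n * sizeof(ParamDecl));
    if (!proto) return NULL;
    proto->fn_token = fn_token;
    proto->params_len = n;
    if (n) memcpy(proto->params, params, n * sizeof(ParamDecl));
    return proto;
}
```

Compared with a linked list of separately allocated nodes, this removes one pointer and one allocation per parameter, and iteration becomes a simple indexed loop over `params[0 .. params_len]`.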
throughput: 72.2 MiB/s => 72.9 MiB/s
maxrss: 70 KB => 68 KB
Importantly, the API is improved as well since the data is arranged
linearly in memory.
This makes fields and decl ast nodes part of the Root and ContainerDecl
AST nodes.
Surprisingly, it's a performance regression compared to using a singly
linked list for these nodes:
throughput: 76.5 MiB/s => 69.4 MiB/s
However it has much better memory usage:
maxrss: 392 KB => 77 KB
It's also a better API for consumers of the parser, since it is a flat
list in memory.
std.ast now uses singly linked lists for lists of things. This is a
breaking change to the self-hosted parser API.
std.ast.Tree has been separated into a private "Parser" type which
represents in-progress parsing, and std.ast.Tree which has only
"output" data. This means cleaner, but breaking, API for parse results.
Specifically, `tokens` and `errors` are no longer SegmentedList but a
slice.
The way to iterate over AST nodes has necessarily changed since lists of
nodes are now singly linked lists rather than SegmentedList.
From these changes, I observe the following on the
self-hosted-parser benchmark from ziglang/gotta-go-fast:
throughput: 45.6 MiB/s => 55.6 MiB/s
maxrss: 359 KB => 342 KB
This commit breaks the build; more updates are necessary to fix API
usage of the self-hosted parser.