* Remove parser error on double ampersand
* Add failing test for double ampersand case
* Add error when encountering double ampersand in AstGen
"Bit and" operator should not make sense when one of its operands
is an address.
* Check that 2 ampersands are adjacent to each other in source string
* Remove cases of unused variables in tests
`return` statements use a new function `nodeMayEvalToError` which does
some basic checks on the AST node to return never, always, or maybe.
Depending on this result, AstGen skips the errdefers, always includes
the errdefers, or emits a conditional branch to check whether the return
value is an error that Sema will have to evaluate.
Closes#8821
Unblocks #9047
The Zig language specification will support identifiers and field access
in order to refer to which declaration to export with `@export`.
This commit implements the change in AstGen and updates the language
reference.
- hash/eql functions moved into a Context object
- *Context functions pass an explicit context
- *Adapted functions pass specialized keys and contexts
- new getPtr() function returns a pointer to value
- remove functions renamed to fetchRemove
- new remove functions return bool
- removeAssertDiscard deleted, use assert(remove(...)) instead
- Keys and values are stored in separate arrays
- Entry is now {*K, *V}, the new KV is {K, V}
- BufSet/BufMap functions renamed to match other set/map types
- fixed iterating-while-modifying bug in src/link/C.zig
Before this, if a compile error occurred, it would cause the previous
value for e.g. the function scope to not get reset. If the AstGen
process continued, it would result in a violation of the data
guarantees that it relies on.
This commit takes advantage of defer to ensure the previous value is
always reset, even in the case of an error.
Closes#8920
* Remove the ability for GenZir parent Scope to be null. Now there is a
Top Scope at the top.
* Introduce Scope.Namespace to contain a table of decl names in order
to emit a compile error for name conflicts.
* Fix use of invalid memory when reporting compile errors by
duplicating decl names into a temporary heap allocated buffer.
* Fix memory leak in while and for loops, not cleaning up their
labeled_breaks and store_to_block_ptr_list arrays.
* Fix stage2 test cases because now the source location of redundant
comptime keyword compile errors is improved.
* Implement compile error for local variable shadowing declaration.
which can be either parent, func, or anon. Here's the enum reproduced in
the commit message for convenience:
```zig
pub const NameStrategy = enum(u2) {
/// Use the same name as the parent declaration name.
/// e.g. `const Foo = struct {...};`.
parent,
/// Use the name of the currently executing comptime function call,
/// with the current parameters. e.g. `ArrayList(i32)`.
func,
/// Create an anonymous name for this declaration.
/// Like this: "ParentDeclName_struct_69"
anon,
};
```
With this information in the ZIR, a future commit can improve the
names of structs, unions, enums, and opaques.
In order to accomplish this, the following ZIR instruction forms were
removed and replaced with Extended op codes:
* struct_decl
* struct_decl_packed
* struct_decl_extern
* union_decl
* union_decl_packed
* union_decl_extern
* enum_decl
* enum_decl_nonexhaustive
By being extended opcodes, one more u32 is needed, however we more than
make up for it by repurposing the 16 "small" bits to provide shorter
encodings for when decls_len == 0, fields_len == 0, a source node is not
provided, etc. There tends to be no downside, and in fact sometimes
upsides, to using an extended op code when there is a need for flag
bits, which is the case for all three of these. Likewise, the container
layout can be encoded in these bits rather than into the opcode.
The following 4 ZIR instructions were added, netting a total of 4 freed
up ZIR enum tags for future use:
* opaque_decl_anon
* opaque_decl_func
* error_set_decl_anon
* error_set_decl_func
This is so that opaques and error sets can have the same name hint as
structs, enums, and unions.
`std.builtin.ContainerLayout` gets an explicit integer tag type so that
it can be used inside packed structs.
This commit also makes `Module.Namespace` use a separate set for
anonymous decls, thus allowing anonymous decls to share the same
`Decl.name` as their owner `Decl` objects.
* Sema: implement global variables
- Improved global constants to stop needlessly creating a Var
structure; they can just store the value directly.
- This required making memory management a bit more sophisticated to
detect when a Decl owns the Namespace associated with it, for the
purposes of deinitialization.
* Decl.name and Namespace decl table keys no longer directly
reference ZIR; instead they have heap-duped names, so that deleted
decls, which no longer have any ZIR to reference for their names, can
be removed from the parent Namespace table.
- In the future I would like to explore going a different direction
with this, where the strings would still point to the ZIR however
they would be removed from their owner Namespace objects during the
update detection. The design principle here is that the existence
of incremental compilation as a feature should not incur any cost
for the use case when it is not used. In this example Decl names
could simply point to ZIR string table memory, and it is only
because of incremental compilation that we duplicate their names.
* AstGen: implement threadlocal variables
* CLI: call cleanExit after building a compilation so that in release
modes we don't bother freeing memory or closing file descriptors,
allowing the OS to do it more efficiently.
* Avoid calling `freeDecl` in the linker for unreferenced Decl objects.
* Fix CBE test case expecting the compile error to point to the wrong
column.
AstGen is now completely independent from the rest of the compiler. It
ingests an AST tree and produces ZIR code as the output, without
depending on any of the glue code of the compiler.
This allows Sema to namespace them separately from function decls with
the same name. Ran into this in std.math.order conflicting with a test
with the same name.
instead of node indexes.
* AstGen: dbg_stmt instructions now have line and column indexes,
relative to the parent declaration. This allows codegen to emit debug
info without having the source bytes, tokens, or AST nodes loaded
in memory.
* ZIR: each decl has the absolute line number. This allows computing
line numbers from offsets without consulting source code bytes.
Memory management: creating a function definition does not prematurely
set the Decl arena. Instead the function is allocated with the general
purpose allocator.
Codegen no longer looks at source code bytes for any reason. They can
remain unloaded from disk.
* AstGen: LocalVal and LocalPtr use string table indexes for their
names. This is more efficient because local variable declarations do
need to include the variable names so that semantic analysis can emit
a compile error if a declaration is shadowed. So we take advantage of
this fact by comparing string table indexes when resolving names.
* The arg ZIR instructions are needed for the above reasoning, as well
as to emit equivalent AIR instructions for debug info.
Now that we have these arg instructions, get rid of the special
`Zir.Inst.Ref` range for parameters. ZIR instructions now refer
to the arg instructions for parameters.
* Move identAsString and strLitAsString from Module.GenZir to AstGen
where they belong.
* AstGen: add missing `break_inline` for comptime blocks.
* Module: call getTree() in byteOffset(). This generates the AST when
using cached ZIR and compile errors need to be reported.
* Scope.File: distinguish between successful ZIR generation and AIR
generation (when Decls in scope have been scanned).
- `semaFile` correctly avoids doing work twice.
* Implement first pass at `lookupInNamespace`. It has various TODOs
left, such as `usingnamespace`, and setting up Decl dependencies.
We do this by reserving string table indexes 0 and 1 in ZIR to be
special. Decls now have 0 to mean comptime or usingnamespace, and 1 to
mean an unnamed test decl.
* Remove some unused imports in AstGen.zig. I think it would make sense
to start decoupling AstGen from the rest of the compiler code,
similar to how the tokenizer and parser are decoupled.
* AstGen: For decls, move the block_inline instructions to the top of
the function so that they get lower ZIR instruction indexes. With
this, the block_inline instruction index combined with its corresponding
break_inline instruction index can be used to form a ZIR instruction
range. This is useful for allocating an array to map ZIR instructions
to semantically analyzed instructions.
* Module: extract emit-h functionality into a struct, and only allocate
it when emit-h is activated.
* Module: remove the `decl_table` field. This previously was a table of
all Decls in the entire Module. A "name hash" strategy was used to
find decls within a given namespace, using this global table. Now,
each Namespace has its own map of name to children Decls.
- Additionally, there were 3 places that relied on iterating over
decl_table in order to function:
- C backend and SPIR-V backend. These now have their own decl_table
that they keep populated when `updateDecl` and `removeDecl` are
called.
- emit-h. A `decl_table` field has been added to the new GlobalEmitH
struct which is only allocated when emit-h is activated.
* Module: fix ZIR serialization/deserialization bug in debug mode having
to do with the secret safety tag for untagged unions. There is still an
open TODO to investigate a friendlier solution to this problem with
the language.
* Module: improve deserialization of ZIR to allocate only exactly as
much capacity as length in the instructions array so as to not waste
space.
* Module: move `srcHashEql` to `std.zig` to live next to the definition
of `SrcHash` itself.
* Module: re-introduce the logic for scanning top level declarations
within a namespace.
* Compilation: add an `analyze_pkg` Job which is used to kick off the
start of semantic analysis by doing the equivalent of
`_ = @import("std");`. The `analyze_pkg` job is unconditionally added
to the work queue on every update(), with pkg set to the std lib pkg.
* Rename TZIR to AIR in a few places. A more comprehensive rename will
come later.
* Every decl provides a 16 byte source hash which can be used to detect
if the source code for any particular decl has changed.
* Include comptime decls, test decls, and usingnamespace decls in the
decls list of namespaces.
- Tests are encoded as extended functions with is_test bit set.