The main motivating change here is to prevent the creation of a fake
Decl object by the frontend in order to `@export()` a value.
Instead, `link.updateDeclExports` is renamed to `link.updateExports` and
accepts a tagged union which can be either a Decl.Index or an
InternPool.Index.
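Roughly, the new argument looks like this (a sketch with stand-in index
types; the exact names in the compiler may differ):

```zig
const DeclIndex = u32; // stand-in for Module.Decl.Index
const IpIndex = u32; // stand-in for InternPool.Index

// Tagged union passed to link.updateExports: the exported thing is either
// a named Decl or a plain interned value, so no fake Decl is needed.
const Exported = union(enum) {
    decl_index: DeclIndex,
    value: IpIndex,
};
```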
Commit 5393e56500d499753dbc39704c0161b47d1e4d5c has a flaw pointed out
by @mlugg: the `ty` field of pointer values changes when comptime values
are pointer-cast. This commit introduces a new encoding that
additionally stores the "original pointer type", which carries the
alignment of the anonymous decl and, potentially, other information
in the future such as section and pointer address space. However, this
new encoding is only used when the original pointer type differs from
the pointer type it was cast to in a meaningful way.
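As a purely hypothetical sketch of the extra data such an encoding
carries (names and layout are illustrative only):

```zig
const IpIndex = u32; // stand-in for InternPool.Index

// Only used when the original pointer type differs meaningfully from the
// type the value was cast to; otherwise the plain encoding suffices.
const PtrAnonDecl = struct {
    val: IpIndex, // the anonymous decl's value
    orig_ty: IpIndex, // original pointer type: source of alignment, and
    // later possibly section and address space
};
```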
I was able to make the LLVM backend and the C backend lower anonymous
decls with the appropriate alignment; however, I will need some help
figuring out how to do this for the backends that lower anonymous decls
via src/codegen.zig and the wasm backend.
Introduce the new mechanism, which the frontend now uses, for rendering
anonymous decls to C code.
The current strategy is to collect the set of used anonymous decls into
one ArrayHashMap for the entire compilation, and then render them during
flush().
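Sketched with stand-in types, the idea is a single deduplicating map
that flush() walks:

```zig
const std = @import("std");

const IpIndex = u32; // stand-in for InternPool.Index
const RenderedCode = []const u8; // stand-in for the rendered C source

// One map for the whole compilation; getOrPut deduplicates so each anon
// decl is rendered at most once, and flush() walks map.values().
const AnonDecls = std.AutoArrayHashMapUnmanaged(IpIndex, RenderedCode);

fn noteAnonDecl(gpa: std.mem.Allocator, map: *AnonDecls, val: IpIndex) !void {
    const gop = try map.getOrPut(gpa, val);
    if (!gop.found_existing) gop.value_ptr.* = ""; // rendered during flush()
}
```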
In the future this may need to be adjusted for incremental compilation
purposes, so that removing a Decl from decl_table means that newly
unused anonymous decls are no longer rendered. However, let's do one
thing at a time. The only goal of this branch is to stop using
Module.Decl objects for unnamed constants.
Start keeping track of dependencies on anon decls for dependency
ordering during flush().
Currently this results in references to undefined symbols, because these
dependencies still need to be rendered into the output.
When the compiler's state lives through multiple Compilation.update()
calls, the C backend stores the rendered C source code for each
decl's code body and forward declarations.
With this commit, the state is still stored, but it is managed in one
big array list in link/C.zig rather than many array lists, one for each
decl. This means simpler serialization and deserialization.
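A rough sketch of the single-buffer layout, with hypothetical names (the
real link/C.zig state tracks more than this):

```zig
const std = @import("std");

// All rendered code lives in `bytes`; each decl only remembers an
// offset/length pair, which serializes trivially.
const DeclBlock = struct { off: u32, len: u32 };

const CState = struct {
    bytes: std.ArrayListUnmanaged(u8) = .{},
    blocks: std.ArrayListUnmanaged(DeclBlock) = .{},

    fn addDecl(self: *CState, gpa: std.mem.Allocator, code: []const u8) !u32 {
        const off: u32 = @intCast(self.bytes.items.len);
        try self.bytes.appendSlice(gpa, code);
        try self.blocks.append(gpa, .{ .off = off, .len = @intCast(code.len) });
        return @intCast(self.blocks.items.len - 1);
    }

    fn declCode(self: CState, i: u32) []const u8 {
        const b = self.blocks.items[i];
        return self.bytes.items[b.off..][0..b.len];
    }
};
```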
Abridged summary:
* Move `Module.Fn` into `InternPool`.
* Delete a lot of confusing and problematic `Sema` logic related to
generic function calls.
This commit removes `Module.Fn` and replaces it with two new
`InternPool.Tag` values:
* `func_decl` - corresponding to a function declared in the source
code. This one contains line/column numbers, zir_body_inst, etc.
* `func_instance` - one for each monomorphization of a generic
function. Contains a reference to the `func_decl` from whence the
instantiation came, along with the `comptime` parameter values (or
types in the case of `anytype`)
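In simplified, hypothetical form the two payloads look roughly like this
(the real InternPool encodings pack more fields):

```zig
const DeclIndex = u32; // stand-in for Module.Decl.Index
const IpIndex = u32; // stand-in for InternPool.Index

const FuncDecl = struct {
    owner_decl: DeclIndex,
    zir_body_inst: u32,
    lbrace_line: u32,
    lbrace_column: u32,
};

const FuncInstance = struct {
    generic_owner: IpIndex, // the `func_decl` this instantiation came from
    comptime_args: []const IpIndex, // comptime values (types for `anytype`)
};
```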
Since `InternPool` provides deduplication on these values, these fields
are now deleted from `Module`:
* `monomorphed_func_keys`
* `monomorphed_funcs`
* `align_stack_fns`
Instead of these, Sema logic for generic function instantiation now
unconditionally evaluates the function prototype expression for every
generic callsite. This is technically required in order for type
coercions to work. The previous code had some dubious, probably wrong
hacks to make things work, such as `hashUncoerced`. I'm not 100% sure
how we were able to eliminate that function and still pass all the
behavior tests, but I'm pretty sure things were still broken without
doing type coercion for every generic function call argument.
After the function prototype is evaluated, it produces a deduplicated
`func_instance` `InternPool.Index` which can then be used for the
generic function call.
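As a conceptual illustration (user-level code, not compiler internals):
both call sites below need their arguments coerced while the prototype
is evaluated, and both resolve to the same deduplicated instantiation.

```zig
const std = @import("std");

fn pick(comptime T: type, a: T, b: T) T {
    return if (a > b) a else b;
}

test "two call sites, one instantiation" {
    const x: u8 = 3;
    // Both calls instantiate pick with T = u32; the u8 and the comptime
    // ints are coerced to u32 as part of evaluating the prototype.
    try std.testing.expect(pick(u32, x, 10) == 10);
    try std.testing.expect(pick(u32, 250, 7) == 250);
}
```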
Other nice results of this simplification are the removal of
`comptime_args_fn_inst` and `preallocated_new_func` from `Sema`, along
with the messy logic associated with them.
I have not yet been able to measure the perf of this against master
branch. On one hand, it reduces memory usage and pointer chasing of the
most heavily used `InternPool` Tag - function bodies - but on the other
hand, it does evaluate function prototype expressions more than before.
We will soon find out.
Most of this migration was performed automatically with `zig fmt`. There
were a few exceptions which I had to manually fix:
* `@alignCast` and `@addrSpaceCast` cannot be automatically rewritten
* `@truncate`'s fixup is incorrect for vectors
* Test cases are not formatted, and their error locations change
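For example, the `@truncate` rewrite moves the destination type out of
the argument list and into the result type (shown here in the new
syntax):

```zig
test "@truncate after the migration" {
    const wide: u32 = 0x1234;
    // Old form (no longer compiles): @truncate(u8, wide)
    const small: u8 = @truncate(wide);
    try @import("std").testing.expectEqual(@as(u8, 0x34), small);
}
```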
Pointers into `string_bytes` are invalidated whenever a string is
interned, so avoid creating them wherever possible. This is an
attempt to fix random CI failures.
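A minimal sketch of the safer pattern, with hypothetical names: keep an
offset into `string_bytes` and re-slice after any intern that may grow
the buffer.

```zig
const std = @import("std");

fn demo(gpa: std.mem.Allocator) !void {
    var string_bytes: std.ArrayListUnmanaged(u8) = .{};
    defer string_bytes.deinit(gpa);

    const start: u32 = @intCast(string_bytes.items.len);
    try string_bytes.appendSlice(gpa, "foo\x00");

    // Interning another string may grow and move string_bytes.items...
    try string_bytes.appendSlice(gpa, "bar\x00");

    // ...so re-slice from the stored offset instead of keeping a pointer.
    const foo = string_bytes.items[start..][0..3];
    std.debug.assert(std.mem.eql(u8, foo, "foo"));
}
```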
The main motivation for this commit is eliminating Decl.value_arena.
Everything else is dominoes.
Decl.name used to be stored in the GPA, now it is stored in InternPool.
It ended up being simpler to migrate other strings to be interned as
well, such as struct field names, union field names, and a few others.
This ended up requiring a big diff, sorry about that. But the changes
are pretty nice; we finally start to take advantage of InternPool's
existence.
global_error_set and error_name_list are simplified. Now it is a single
ArrayHashMap(NullTerminatedString, void) and the index is the error tag
value.
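Sketched with plain strings standing in for NullTerminatedString, the
lookup boils down to:

```zig
const std = @import("std");

// The index assigned on first insertion is the stable error tag value.
fn errorTagValue(
    gpa: std.mem.Allocator,
    set: *std.StringArrayHashMapUnmanaged(void),
    name: []const u8,
) !u32 {
    const gop = try set.getOrPut(gpa, name);
    return @intCast(gop.index);
}
```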
Module.tmp_hack_arena is re-introduced (it was removed in
eeff407941560ce8eb5b737b2436dfa93cfd3a0c) in order to deal with
comptime_args, optimized_order, and struct and union fields. After
structs and unions get moved into InternPool properly, tmp_hack_arena
can be deleted again.
* Support always_tail and never_tail/never_inline with a comptime callee using clang
* Support never_inline using gcc
* Support never_inline using msvc
Unfortunately, the behavior tests can't be enabled because the support is conditional on the host compiler.
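On the Zig side these map to `@call` modifiers, e.g.:

```zig
fn add(a: u32, b: u32) u32 {
    return a + b;
}

test "call modifiers lowered by the C backend" {
    // .always_tail and .never_tail are handled similarly when the host C
    // compiler supports them.
    const sum = @call(.never_inline, add, .{ 1, 2 });
    try @import("std").testing.expectEqual(@as(u32, 3), sum);
}
```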
Global constant initializers can reference functions, so forward declare
the constants and initialize them later, alongside the function
definitions, which guarantees that the initializers appear after all
declarations.
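For example, Zig code like this forces the generated C constant's
initializer to reference a function symbol:

```zig
fn handler() void {}

// The generated C forward declares the constant, then initializes it
// once handler's definition is available.
pub const callbacks = [_]*const fn () void{&handler};
```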
Now, instead of zig.h being baked into the compiler binary, it is a
header file shipped along with all the other header files that are
distributed with Zig.
Closes #11643
At least on Linux, the pwritev syscall checks the pointer and returns
EFAULT before it checks if the length is nonzero.
Perhaps this should be fixed in the standard library; however, these
changes are still improvements since they make the kernel do less work
within the syscall.
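A sketch of the caller-side guard (hypothetical helper name):

```zig
const std = @import("std");

fn pwriteAllNonEmpty(file: std.fs.File, bytes: []const u8, offset: u64) !void {
    // Skip the syscall entirely when there is nothing to write, so the
    // kernel never validates a possibly-bogus pointer paired with length 0.
    if (bytes.len == 0) return;
    try file.pwriteAll(bytes, offset);
}
```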
Rather than allocating Decl objects with an Allocator, we allocate them
in a SegmentedList. This provides four advantages:
* Stable memory so that one thread can access a Decl object while another
thread allocates additional Decl objects from this list.
* It allows us to use u32 indexes to reference Decl objects rather than
pointers, saving memory in Type, Value, and dependency sets.
* Using integers to reference Decl objects rather than pointers makes
serialization trivial.
* It provides a unique integer to be used for anonymous symbol names,
avoiding multi-threaded contention on an atomic counter.
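A minimal sketch of the allocation strategy, with a stand-in `Decl` type:

```zig
const std = @import("std");

const Decl = struct { name: []const u8 }; // stand-in for Module.Decl

fn allocateDecl(gpa: std.mem.Allocator, decls: *std.SegmentedList(Decl, 0)) !u32 {
    const index: u32 = @intCast(decls.len);
    try decls.append(gpa, .{ .name = "" });
    // decls.at(index) is a stable pointer: appending more Decls never moves
    // it, and `index` itself is the unique integer used in anon symbol names.
    return index;
}
```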
* The `@bitCast` workaround is removed in favor of `@ptrCast` properly
doing element casting for slice element types. This required an
enhancement both to stage1 and stage2.
* stage1 incorrectly accepts `.{}` instead of `{}`. stage2 code that
abused this is fixed.
* Make some parameters comptime to support functions in switch
expressions (as opposed to making them function pointers).
* Avoid relying on local temporaries being mutable.
* Workarounds for when stage1 and stage2 disagree on function pointer
types.
* Workaround recursive formatting bug with a `@panic("TODO")`.
* Remove unreachable `else` prongs for some inferred error sets.
All in effort towards #89.
1. Changed Zig pointers to functions to be typedef'd so that we can
   treat them the same as other types.
2. Distinguished between const slices (zig_L prefix) and mut slices
(zig_M prefix).
3. Changed lowering of Zig "const pointers" (e.g. *const u8) to C
   "pointers to const" (e.g. const char *) rather than C "const
   pointers" (e.g. char * const)
4. Ensured that all typedefs are "linked" even if the decl doesn't
require any forward declarations
5. Added test that exercises function pointer type rendering
6. Changed .slice_ptr instruction to allocate pointer local rather than
a uintptr_t local
Because ArrayList.initCapacity uses 'precise' capacity allocation, this should save memory on average, and it will definitely save memory in cases where an ArrayList is used where a regular allocated slice could also have been used.
The ensureUnusedCapacity call did not reserve enough capacity. I changed
it to no longer guess the capacity, because the number of possible items
is not determinable ahead of time; not guessing also avoids allocating
more memory than necessary.
The C backend is the only backend that requires each decl to be output
in an order that satisfies the dependency graph. Here it is implemented
with a simple algorithm based on a `remaining_decls` set, using the
`dependencies` edges that are already stored for each Decl.
This satisfies incremental compilation as well as how `zig test` works,
which calls `updateDecl` on `test_functions`.
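A simplified sketch of that algorithm over hypothetical integer decl
indexes (the real code walks the stored `dependencies` edges):

```zig
const std = @import("std");

// deps[i] lists the decls that decl i depends on; returns an emission
// order in which every decl appears after all of its dependencies.
fn renderOrder(gpa: std.mem.Allocator, deps: []const []const u32) ![]u32 {
    const order = try gpa.alloc(u32, deps.len);
    errdefer gpa.free(order);
    var emitted: usize = 0;

    var remaining: std.AutoArrayHashMapUnmanaged(u32, void) = .{};
    defer remaining.deinit(gpa);
    for (0..deps.len) |i| try remaining.put(gpa, @intCast(i), {});

    while (remaining.count() != 0) {
        var progress = false;
        var i: usize = 0;
        while (i < remaining.count()) {
            const decl = remaining.keys()[i];
            // Ready once none of its dependencies are still waiting.
            const ready = for (deps[decl]) |d| {
                if (remaining.contains(d)) break false;
            } else true;
            if (ready) {
                order[emitted] = decl;
                emitted += 1;
                remaining.swapRemoveAt(i);
                progress = true;
            } else {
                i += 1;
            }
        }
        if (!progress) return error.DependencyCycle;
    }
    return order;
}
```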
Previously, linker backends or machine code backends were able to hold
on to references into Sema's temporary arena. However, there can be
large objects stored there that we want to free after machine code is
generated.
The primary change in this commit is to use a temporary arena for Sema
of function bodies that gets freed after machine code backend finishes
handling `updateFunc` (at the same time that Air and Liveness get freed).
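A minimal sketch of the new lifetime, with stand-in names for Sema and
the backend:

```zig
const std = @import("std");

const Air = struct { instructions: []const u32 }; // stand-in for the real Air

fn runSema(arena: std.mem.Allocator) !Air {
    // Stand-in for semantic analysis of one function body: everything
    // allocated here dies with the arena.
    const insts = try arena.alloc(u32, 4);
    @memset(insts, 0);
    return .{ .instructions = insts };
}

fn backendUpdateFunc(air: Air) void {
    // Stand-in for the machine code backend: it must copy anything it wants
    // to keep, because `air` points into the temporary arena.
    _ = air;
}

pub fn lowerOneFunction(gpa: std.mem.Allocator) !void {
    var sema_arena = std.heap.ArenaAllocator.init(gpa);
    // Freed right after updateFunc, at the same time Air and Liveness would
    // be freed; nothing may retain pointers into it past this point.
    defer sema_arena.deinit();

    const air = try runSema(sema_arena.allocator());
    backendUpdateFunc(air);
}
```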
The other changes in this commit are fixing issues that fell out from
the primary change.
* The C linker backend is rewritten to handle updateDecl and updateFunc
separately. Also, all Decl updates get access to typedefs and
fwd_decls, not only functions.
* The C linker backend is updated to the new API that does not depend
on allocateDeclIndexes and does not have to handle garbage collected
decls.
* The C linker backend uses an arena for Type/Value objects that
`typedefs` references. These can be garbage collected every so often
  after flush(); however, that garbage collection code is not
  implemented at this time. It will be pretty simple: just allocate a
  new arena, copy all the Type objects into it, update the keys of the
  hash map, and free the old arena.
* Sema: fix a handful of instances of not copying Type/Value objects
from the temporary arena into the appropriate Decl arena.
* Type: fix some function types not reporting hasCodeGenBits()
correctly.