421 Commits

Author SHA1 Message Date
Jakub Konka
0134e5d2a1 codegen: separate getAnonDeclVAddr into lowerAnonDecl and the former
Implement the stub for Elf.

I believe that separating the concerns, namely, having an interface
function that is responsible for signalling the linker to lower
the anon decl only, and a separate function to obtain the decl's
vaddr is preferable since it allows us to handle codegen errors
in a simpler way.
2023-10-03 12:49:12 -07:00
Andrew Kelley
0483b4a512 link: stub out getAnonDeclVAddr 2023-10-03 12:12:51 -07:00
Andrew Kelley
c0b5512544 compiler: start handling anonymous decls differently
Instead of explicitly creating a `Module.Decl` object for each anonymous
declaration, each `InternPool.Index` value is implicitly understood to
be an anonymous declaration when encountered by backend codegen.

The memory management strategy for these anonymous decls then becomes to
garbage collect them along with standard InternPool garbage.

In the interest of a smooth transition, this commit only implements this
new scheme for string literals and leaves all the previous mechanisms in
place.
2023-10-03 12:12:50 -07:00
kcbanner
9f4649b197 codegen/sema: handle unions with unknown tags in more places 2023-09-23 14:34:01 -04:00
kcbanner
f2a24b48e1 sema: rework the comptime representation of comptime unions
When the tag is not known, it's set to `.none`. In this case, the value is either an
array of bytes (for extern unions) or an integer (for packed unions).
2023-09-23 13:05:04 -04:00
kcbanner
2fddd767ba sema: add support for unions in readFromMemory and writeToMemory 2023-09-23 13:04:56 -04:00
Andrew Kelley
55bc8a7fa9 compiler: fix compilation for 32-bit targets 2023-09-21 15:27:25 -07:00
Andrew Kelley
accd5701c2 compiler: move struct types into InternPool proper
Structs were previously using `SegmentedList` to be given indexes, but
were not actually backed by the InternPool arrays.

After this, the only remaining uses of `SegmentedList` in the compiler
are `Module.Decl` and `Module.Namespace`. Once those last two are
migrated to become backed by InternPool arrays as well, we can introduce
state serialization via writing these arrays to disk all at once.

Unfortunately there are a lot of source code locations that touch the
struct type API, so this commit is still work-in-progress. Once I get it
compiling and passing the test suite, I can provide some interesting
data points such as how it affected the InternPool memory size and
performance comparison against master branch.

I also couldn't resist migrating over a bunch of alignment API over to
use the log2 Alignment type rather than a mismash of u32 and u64 byte
units with 0 meaning something implicitly different and special at every
location. Turns out you can do all the math you need directly on the
log2 representation of alignments.
2023-09-21 14:48:40 -07:00
Jakub Konka
ce88df497c elf: do not store Symbol's index in Symbol 2023-09-13 21:51:43 +02:00
Jakub Konka
4d29b39678
Merge pull request #17113 from ziglang/elf-linker
elf: upstream zld/ELF functionality, part 1
2023-09-13 10:07:07 +02:00
Andrew Kelley
cb6201715a InternPool: prevent anon struct UAF bugs with type safety
Instead of using actual slices for InternPool.Key.AnonStructType, this
commit changes to use Slice types instead, which store a
long-lived index rather than a pointer.

This is a follow-up to 7ef1eb1c27754cb0349fdc10db1f02ff2dddd99b.
2023-09-12 20:08:56 -04:00
Jakub Konka
44e84af874 elf: add simplistic reloc scanning mechanism 2023-09-12 16:32:55 +02:00
Jakub Konka
6ad5db030c elf: store GOT index in symbol extra array; use GotSection for GOT 2023-09-08 18:12:53 +02:00
Jakub Konka
a9df098cd2 elf: make everything upside down - track by Symbol.Index rather than Atom.Index 2023-09-06 13:14:00 +02:00
Jakub Konka
bc37c95e56 elf: simplify accessors to symbols, atoms, etc 2023-09-04 11:23:19 +02:00
Andrew Kelley
ada0010471 compiler: move unions into InternPool
There are a couple concepts here worth understanding:

Key.UnionType - This type is available *before* resolving the union's
fields. The enum tag type, number of fields, and field names, field
types, and field alignments are not available with this.

InternPool.UnionType - This one can be obtained from the above type with
`InternPool.loadUnionType` which asserts that the union's enum tag type
has been resolved. This one has all the information available.

Additionally:

* ZIR: Turn an unused bit into `any_aligned_fields` flag to help
  semantic analysis know whether a union has explicit alignment on any
  fields (usually not).
* Sema: delete `resolveTypeRequiresComptime` which had the same type
  signature and near-duplicate logic to `typeRequiresComptime`.
  - Make opaque types not report comptime-only (this was inconsistent
    between the two implementations of this function).
* Implement accepted proposal #12556 which is a breaking change.
2023-08-22 13:54:14 -07:00
antlilja
928d43f61a Fix integer overflow in field padding calculation
The old code was iterating and generating symbols
for fields in their declared order instead of the
memory optimized order while getting offsets in
the memory optimized order.
2023-07-31 15:14:31 -07:00
r00ster91
d962ad5ea0 codegen: writer().writeByteNTimes -> appendNTimes
Both ways do the same thing but I think the compiler might have an
easier time optimizing `appendNTimes` because it does less
things/the path is shorter.

I have not done any benchmarking at runtime but have compared the
instruction count of both ways a little here: https://zig.godbolt.org/z/vr193W9oj
`b` (`appendNTimes`) is ~103 instructions while `a`
(`writer().writeByteNTimes`) is ~117 instructions.

And looking at the implementation of `writeByteNTimes`, it only seems to
buffer up 256 bytes before doing another `writeAll` which for
`std.ArrayList` probably means another allocation, whereas when directly
using `appendNTimes`, the entire exact additional capacity required is known from the start.

Either way, this would be more consistent anyway.
2023-07-21 21:32:18 -07:00
Andrew Kelley
db33ee45b7 rework generic function calls
Abridged summary:

 * Move `Module.Fn` into `InternPool`.
 * Delete a lot of confusing and problematic `Sema` logic related to
   generic function calls.

This commit removes `Module.Fn` and replaces it with two new
`InternPool.Tag` values:

 * `func_decl` - corresponding to a function declared in the source
   code. This one contains line/column numbers, zir_body_inst, etc.

 * `func_instance` - one for each monomorphization of a generic
   function. Contains a reference to the `func_decl` from whence the
   instantiation came, along with the `comptime` parameter values (or
   types in the case of `anytype`)

Since `InternPool` provides deduplication on these values, these fields
are now deleted from `Module`:

 * `monomorphed_func_keys`
 * `monomorphed_funcs`
 * `align_stack_fns`

Instead of these, Sema logic for generic function instantiation now
unconditionally evaluates the function prototype expression for every
generic callsite. This is technically required in order for type
coercions to work. The previous code had some dubious, probably wrong
hacks to make things work, such as `hashUncoerced`. I'm not 100% sure
how we were able to eliminate that function and still pass all the
behavior tests, but I'm pretty sure things were still broken without
doing type coercion for every generic function call argument.

After the function prototype is evaluated, it produces a deduplicated
`func_instance` `InternPool.Index` which can then be used for the
generic function call.

Some other nice things made by this simplification are the removal of
`comptime_args_fn_inst` and `preallocated_new_func` from `Sema`, and the
messy logic associated with them.

I have not yet been able to measure the perf of this against master
branch. On one hand, it reduces memory usage and pointer chasing of the
most heavily used `InternPool` Tag - function bodies - but on the other
hand, it does evaluate function prototype expressions more than before.
We will soon find out.
2023-07-18 19:02:05 -07:00
Jacob Young
3f13987a76 x86_64: add missing padding to global unions 2023-06-25 19:14:03 -04:00
Jacob Young
5b74278510 x86_64: fix global pointers to packed struct fields 2023-06-25 19:14:03 -04:00
mlugg
f26dda2117 all: migrate code to new cast builtin syntax
Most of this migration was performed automatically with `zig fmt`. There
were a few exceptions which I had to manually fix:

* `@alignCast` and `@addrSpaceCast` cannot be automatically rewritten
* `@truncate`'s fixup is incorrect for vectors
* Test cases are not formatted, and their error locations change
2023-06-24 16:56:39 -07:00
Eric Joldasov
50339f595a all: zig fmt and rename "@XToY" to "@YFromX"
Signed-off-by: Eric Joldasov <bratishkaerik@getgoogleoff.me>
2023-06-19 12:34:42 -07:00
Motiejus Jakštys
d41111d7ef mem: rename align*Generic to mem.align*
Anecdote 1: The generic version is way more popular than the non-generic
one in Zig codebase:

     git grep -w alignForward | wc -l
    56
     git grep -w alignForwardGeneric | wc -l
    149

     git grep -w alignBackward | wc -l
    6
     git grep -w alignBackwardGeneric | wc -l
    15

Anecdote 2: In my project (turbonss) that does much arithmetic and
alignment I exclusively use the Generic functions.

Anecdote 3: we used only the Generic versions in the Macho Man's linker
workshop.
2023-06-17 12:49:13 -07:00
Andrew Kelley
a10ddba921
Merge pull request #16064 from Luukdegram/wasm-linker
wasm/linker: symbol resolution improvements
2023-06-16 22:03:35 -07:00
Luuk de Gram
1cfad29f10
codegen: fix union padding
This regressed during the internpool merges. This commit
reinstates the padding logic for unions.
2023-06-16 17:16:56 +02:00
Jacob G-W
5343a2f566 plan9: revamp the relocation system to allow decl refs 2023-06-16 08:34:30 -04:00
Jacob G-W
9e8c7b104e Plan9: Add support for lazy symbols
This includes a renaming from DeclBlock to Atom.
2023-06-16 08:34:30 -04:00
Jacob Young
e23b0a01e6 InternPool: fix yet more key lifetime issues 2023-06-10 20:47:59 -07:00
Andrew Kelley
69b7b91092 compiler: eliminate Decl.value_arena and Sema.perm_arena
The main motivation for this commit is eliminating Decl.value_arena.
Everything else is dominoes.

Decl.name used to be stored in the GPA, now it is stored in InternPool.
It ended up being simpler to migrate other strings to be interned as
well, such as struct field names, union field names, and a few others.
This ended up requiring a big diff, sorry about that. But the changes
are pretty nice, we finally start to take advantage of InternPool's
existence.

global_error_set and error_name_list are simplified. Now it is a single
ArrayHashMap(NullTerminatedString, void) and the index is the error tag
value.

Module.tmp_hack_arena is re-introduced (it was removed in
eeff407941560ce8eb5b737b2436dfa93cfd3a0c) in order to deal with
comptime_args, optimized_order, and struct and union fields. After
structs and unions get moved into InternPool properly, tmp_hack_arena
can be deleted again.
2023-06-10 20:47:58 -07:00
Jacob Young
123cfab984 codegen: fix doubled global sentinels 2023-06-10 20:47:58 -07:00
Andrew Kelley
bb526426e7 InternPool: remove memoized_decl
This is neither a type nor a value. Simplifies `addStrLit` as well as
the many places that switch on `InternPool.Key`.

This is a partial revert of bec29b9e498e08202679aa29a45dab2a06a69a1e.
2023-06-10 20:47:58 -07:00
mlugg
a0d4ef0acf InternPool: add representation for value of empty enums and unions
This is a bit odd, because this value doesn't actually exist:
see #15909. This gets all the empty enum/union behavior tests passing.

Also adds an assertion to `Sema.analyzeBodyInner` which would have
helped figure out the issue here much more quickly.
2023-06-10 20:47:57 -07:00
Andrew Kelley
82f6f164a1 InternPool: improve hashing performance
Key.PtrType is now an extern struct so that hashing it can be done by
reinterpreting bytes directly. It also uses the same representation for
type_pointer Tag encoding and the Key. Accessing pointer attributes now
requires packed struct access, however, many operations are now a copy
of a u32 rather than several independent fields.

This function moves the top two most used Key variants - pointer types
and pointer values - to use a single-shot hash function that branches
for small keys instead of calling memcpy.

As a result, perf against merge-base went from 1.17x ± 0.04 slower to
1.12x ± 0.04 slower. After the pointer value hashing was changed, total
CPU instructions spent in memcpy went from 4.40% to 4.08%, and after
additionally improving pointer type hashing, it further decreased to
3.72%.
2023-06-10 20:47:57 -07:00
Jacob Young
a702af062b x86_64: fix InternPool regressions 2023-06-10 20:47:56 -07:00
Jacob Young
3064d2aa7b behavior: additional llvm fixes 2023-06-10 20:47:56 -07:00
Jacob Young
3b6ca1d35b Module: move memoized data to the intern pool
This avoids memory management bugs with the previous implementation.
2023-06-10 20:47:56 -07:00
Jacob Young
2d5bc01469 behavior: get more test cases passing with llvm 2023-06-10 20:47:56 -07:00
Andrew Kelley
fc358435cb C backend: InternPool fixes 2023-06-10 20:47:56 -07:00
Andrew Kelley
d5f0ee0d62 codegen: fix lowering of constant structs 2023-06-10 20:47:55 -07:00
Jacob Young
f2c716187c InternPool: fix more crashes 2023-06-10 20:47:55 -07:00
Jacob Young
9a738c0be5 Module: intern the values of decls when they are marked alive
I'm not sure if this is the right place for this to happen, and
it should become obsolete when comptime mutation is rewritten
and the remaining legacy value tags are remove, so keeping this
as a separate revertable commit.
2023-06-10 20:47:55 -07:00
Jacob Young
1a4626d2cf InternPool: remove more legacy values
Reinstate some tags that will be needed for comptime init.
2023-06-10 20:47:54 -07:00
Jacob Young
6e0de1d116 InternPool: port most of value tags 2023-06-10 20:47:54 -07:00
Andrew Kelley
9ff514b6a3 compiler: move error union types and error set types to InternPool
One change worth noting in this commit is that `module.global_error_set`
is no longer kept strictly up-to-date. The previous code reserved
integer error values when dealing with error set types, but this is no
longer needed because the integer values are not needed for semantic
analysis unless `@errorToInt` or `@intToError` are used and therefore
may be assigned lazily.
2023-06-10 20:47:53 -07:00
Andrew Kelley
7bf91fc79a compiler: eliminate legacy Type.Tag.pointer
Now pointer types are stored only in InternPool.
2023-06-10 20:47:53 -07:00
Andrew Kelley
17882162b3 stage2: move function types to InternPool 2023-06-10 20:47:53 -07:00
Andrew Kelley
88dbd62bcb stage2: move enum tag values into the InternPool
I'm seeing a new assertion trip: the call to `enumTagFieldIndex` in the
implementation of `@Type` is attempting to query the field index of an
union's enum tag, but the type of the enum tag value provided is not the
same as the union's tag type. Most likely this is a problem with type
coercion, since values are now typed.

Another problem is that I added some hacks in std.builtin because I
didn't see any convenient way to access them from Sema. That should
definitely be cleaned up before merging this branch.
2023-06-10 20:46:17 -07:00
Andrew Kelley
5881a2d637 stage2: move enum types into the InternPool
Unlike unions and structs, enums are actually *encoded* into the
InternPool directly, rather than using the SegmentedList trick. This
results in them being quite compact, and greatly improved the ergonomics
of using enum types throughout the compiler.

It did however require introducing a new concept to the InternPool which
is an "incomplete" item - something that is added to gain a permanent
Index, but which is then mutated in place. This was necessary because
enum tag values and tag types may reference the namespaces created by
the enum itself, which required constructing the namespace, decl, and
calling analyzeDecl on the decl, which required the decl value, which
required the enum type, which required an InternPool index to be
assigned and for it to be meaningful.

The API for updating enums in place turned out to be quite slick and
efficient - the methods directly populate pre-allocated arrays and
return the information necessary to output the same compilation errors
as before.
2023-06-10 20:42:30 -07:00
Andrew Kelley
3ba099bfba stage2: move union types and values to InternPool 2023-06-10 20:42:30 -07:00