163 Commits

Author SHA1 Message Date
Andrew Kelley
482b995a49 stage2: blaze the trail for std lib integration
This branch adds "builtin" and "std" to the import table when using the
self-hosted backend.

"builtin" gains one additional item:

```
pub const zig_is_stage2 = true; // false when using stage1 backend
```

This allows the std lib to do conditional compilation based on detecting
which backend is being used. This will be removed from builtin as soon
as self-hosted catches up to feature parity with stage1.
Keep a sharp eye out - people are going to be tempted to abuse this.
The general rule of thumb is do not use `builtin.zig_is_stage2`. However
this commit breaks the rule so that we can gain limited start.zig support
as we incrementally improve the self-hosted compiler.

This commit also implements `fullyQualifiedNameHash` and related
functionality, which effectively puts all Decls in their proper
namespaces. `fullyQualifiedName` is not yet implemented.

Stop printing "todo" log messages for test decls unless we are in test
mode.

Add "previous definition here" error notes for Decl name collisions.

This commit does not bring us yet to a newly passing test case.

Here's what I'm working towards:

```zig
const std = @import("std");

export fn main() c_int {
    const a = std.fs.base64_alphabet[0];
    return a - 'A';
}
```

Current output:

```
$ ./zig-cache/bin/zig build-exe test.zig
test.zig:3:1: error: TODO implement more analyze elemptr
zig-cache/lib/zig/std/start.zig:38:46: error: TODO implement structInitExpr ty
```

So the next steps are clear:
 * Sema: improve elemptr
 * AstGen: implement structInitExpr
2021-04-08 19:05:05 -07:00
Andrew Kelley
b9e508c410 stage2: revert to only has_decl and export ZIR support
Reverting most of the code from the previous commits in this branch.
Will pull in the code with modifications bit by bit.
2021-04-08 11:29:31 -07:00
Timon Kruiper
ac14b52e85 stage2: add support for start.zig
This adds a simplified start2.zig that the current stage2 compiler is
able to generate code for.
2021-04-08 14:23:18 +02:00
Timon Kruiper
a97efbd185 stage2: add support for root pkg
Fix some infinite recursions, because the code assumed that packages
cannot point to each other. But this assumption does not hold anymore.
2021-04-08 14:23:18 +02:00
Timon Kruiper
fb16cb9183 stage2: add initial support for builtin pkg
We currently emit a simplified builtin2.zig that is able to be parsed
and correctly interpreted in stage2.
2021-04-08 14:23:17 +02:00
Andrew Kelley
12087d4cba stage2: fix incremental compilation handling of parse errors
Before, incremental compilation would crash when trying to emit compile
errors for the update after introducing a parse error.

Parse errors are handled by not invalidating any existing semantic
analysis. However, only the parse error must be reported, with all the
other errors suppressed. Once the parse error is fixed, the new file can
be treated as an update to the previously-succeeded update.
2021-04-07 20:36:01 -07:00
Andrew Kelley
4996c2b6a9 stage2: fix incremental compilation Decl deletion logic
* `analyzeContainer` now has an `outdated_decls` set as well as
   `deleted_decls`. Instead of queuing up outdated Decls for re-analysis
   right away, they are added to this new set. When processing the
   `deleted_decls` set, we remove deleted Decls from the
   `outdated_decls` set, to avoid deleted Decl pointers from being in
   the work_queue. Only after processing the deleted decls do we add
   analyze_decl work items to the queue.

 * Module.deletion_set is now an `AutoArrayHashMap` rather than `ArrayList`.
   `declareDeclDependency` will now remove a Decl from it as appropriate.
   When processing the `deletion_set` in `Compilation.performAllTheWork`,
   it now assumes all Decl in the set are to be deleted.

 * Fix crash when handling parse errors. Currently we unload the
   `ast.Tree` if any parse errors occur. Previously the code emitted a
   LazySrcLoc pointing to a token index, but then when we try to resolve
   the token index to a byte offset to create a compile error message,
   the  ast.Tree` would be unloaded. Now we use
   `LazySrcLoc.byte_abs` instead of `token_abs` so the error message can
   be created even with the `ast.Tree` unloaded.

Together, these changes solve a crash that happened with incremental
compilation when Decls were added and removed in some combinations.
2021-04-07 19:54:28 -07:00
Andrew Kelley
281a7baaea Merge remote-tracking branch 'origin/master' into zir-memory-layout
Wanted to make sure those new test cases still pass.

Also grab that CI fix so we can get those green check marks.
2021-03-28 19:42:43 -07:00
jacob gw
0005b34637 stage2: implement sema for @errorToInt and @intToError 2021-03-28 18:22:01 -07:00
Jakub Konka
3913332145 Fix digest format specifier after std.fmt updates 2021-03-20 15:09:45 -07:00
Andrew Kelley
56677f2f2d astgen: support blocks
We are now passing this test:

```zig
export fn _start() noreturn {}
```

```
test.zig:1:30: error: expected noreturn, found void
```

I ran into an issue where we get an integer overflow trying to compute
node index offsets from the containing Decl. The problem is that the
parser adds the Decl node after adding the child nodes. For some things,
it is easy to reserve the node index and then set it later, however, for
this case, it is not a trivial code change, because depending on tokens
after parsing the decl determines whether we want to add a new node or
not.

Possible strategies here:

1. Rework the parser code to make sure that Decl nodes are before
   children nodes in the AST node array.

2. Use signed integers for Decl node offsets.

3. Just flip the order of subtraction and addition. Expect Decl Node
   index to be greater than children Node indexes.

I opted for (3) because it seems like the simplest thing to do. We'll
want to unify the logic for computing the offsets though because if the
logic gets repeated, it will probably get repeated wrong.
2021-03-19 23:15:18 -07:00
Andrew Kelley
aef3e534f5 stage2: *WIP*: rework ZIR memory layout; overhaul source locations
The memory layout for ZIR instructions is completely reworked. See
zir.zig for those changes. Some new types:

 * `zir.Code`: a "finished" set of ZIR instructions. Instead of allocating
   each instruction independently, there is now a Tag and 8 bytes of
   data available for all ZIR instructions. Small instructions fit
   within these 8 bytes; larger ones use 4 bytes for an index into
   `extra`. There is also `string_bytes` so that we can have 4 byte
   references to strings. `zir.Inst.Tag` describes how to interpret
   those 8 bytes of data.
   - This is shared by all `Block` scopes.

 * `Module.WipZirCode`: represents an in-progress `zir.Code`. In this
   structure, the arrays are mutable, and get resized as we add/delete
   things. There is extra state to keep track of things. This struct is
   stored on the stack. Once it is finished, it produces an immutable
   `zir.Code`, which will remain on the heap for the duration of a
   function's existence.
   - This is shared by all `GenZir` scopes.

 * `Sema`: represents in-progress semantic analysis of a `zir.Code`.
   This data is stored on the stack and is shared among all `Block`
   scopes. It is now the main "self" argument to everything in the file
   that was previously named `zir_sema.zig`.
   Additionally, I moved some logic that was in `Module` into here.

`Module.Fn` now stores its parameter names inside the `zir.Code`,
instead of inside ZIR instructions. When the TZIR memory layout
reworking time comes, codegen will be able to reference this data
directly instead of duplicating it.

astgen.zig is (so far) almost entirely untouched, but nearly all of it
will need to be reworked to adhere to this new memory layout structure.

I have no benchmarks to report yet, as I am still working through
compile errors and fixing various things that I broke in this branch.

Overhaul of Source Locations:

Previously we used `usize` everywhere to mean byte offset, but sometimes
also mean other stuff. This was error prone and also made us do
unnecessary work, and store unnecessary bytes in memory.

Now there are more types involved into source locations, and more ways
to describe a source location.

 * AllErrors.Message: embrace the assumption that files always have less
   than 2 << 32 bytes.
 * SrcLoc gets more complicated, to model more complicated source
   locations.
 * Introduce LazySrcLoc, which can model interesting source locations
   with very little stored state. Useful for avoiding doing unnecessary
   work when no compile errors occur.

Also, previously, we had `src: usize` on every ZIR instruction. This is
no longer the case. Each instruction now determines whether it even cares
about source location, and if so, how that source location is stored.
This requires more careful work inside `Sema`, but it results in fewer
bytes stored on the heap, without compromising accuracy and power of
compile error messages.

Miscellaneous:

 * std.zig: string literals have more helpful result values for
   reporting errors. There is now a lower level API and a higher level
   API.
   - side note: I noticed that the string literal logic needs some love.
     There is some unnecessarily hacky code there.
 * cut & pasted some TZIR logic that was in zir.zig to ir.zig. This
   probably broke stuff and needs to get fixed.
 * Removed type/Enum.zig, type/Union.zig, and type/Struct.zig. I don't
   think this quite how this code will be organized. Need some more
   careful planning about how to implement structs, unions, enums. They
   need to be independent Decls, just like a top level function.
2021-03-16 00:03:22 -07:00
Veikka Tuominen
0a7be71bc2
stage2 cbe: non pointer optionals 2021-03-08 00:33:56 +02:00
Andrew Kelley
8e6c2b7a47 Merge remote-tracking branch 'origin/master' into ast-memory-layout 2021-02-24 15:08:23 -07:00
jacob gw
1bd434fd18 std.Progress: improve support for "dumb" terminals 2021-02-21 12:12:17 +02:00
Michael Dusan
153cd4da0c macos: fix cond to enable ZIG_SYSTEM_LINKER_HACK
closes #8037
2021-02-19 23:37:13 -05:00
Andrew Kelley
70761d7c52 stage2: remove incorrect newlines from log statements 2021-02-19 20:27:06 -07:00
Veikka Tuominen
c0540967e9
translate-c: render array stuff 2021-02-16 16:40:43 +02:00
Andrew Kelley
3d0f4b9030 stage2: start reworking Module/astgen for memory layout changes
This commit does not reach any particular milestone, it is
work-in-progress towards getting things to build.

There's a `@panic("TODO")` in translate-c that should be removed when
working on translate-c stuff.
2021-02-11 23:29:55 -07:00
Andrew Kelley
1adac0a55b never pass -s to clang
We only use clang to produce object files; the idea of stripping is not
relevant here.

Fixes regression in previous commit.
2021-02-07 15:08:48 -07:00
Andrew Kelley
e197a03124 zig cc: recognize the -s flag to be "strip" 2021-02-07 14:51:27 -07:00
Andrew Kelley
0cfa39304b zig cc: recognize more coff linker options
Related: #7874
2021-01-24 14:30:28 -07:00
Andrew Kelley
0d4b6ac741 add LTO support
The CLI gains -flto and -fno-lto options to override the default.
However, the cool thing about this is that the defaults are great! In
general when you use build-exe in release mode, Zig will enable LTO if
it would work and it would help.

zig cc supports detecting and honoring the -flto and -fno-lto flags as
well. The linkWithLld functions are improved to all be the same with
regards to copying the artifact instead of trying to pass single objects
through LLD with -r. There is possibly a future improvement here as
well; see the respective TODOs.

stage1 is updated to support outputting LLVM bitcode instead of machine
code when lto is enabled. This allows LLVM to optimize across the Zig and
C/C++ code boundary.

closes #2845
2021-01-23 18:18:07 -07:00
Andrew Kelley
072d1e088c stage2: fix anonymous Decl ty/val wrong arena
string literals and error set types were allocating the ty/val fields of
the anonymous Decl into the owner Decl's arena, rather than the
new anonymous Decl's arena as intended. This caused use of undefined
value later on in the pipeline.
2021-01-19 16:25:55 -07:00
Andrew Kelley
1af31baf0b stage2: -Dlog enables all logging, log scopes can be set at runtime
Previously you had to recompile if you wanted to change the log scopes
that get printed. Now, log scopes can be set at runtime, and -Dlog
controls whether all logging is available at runtime.

Purpose here is a nicer development experience. Most likely stage2
developers will always want -Dlog enabled and then pass --debug-log
scopes when debugging particular issues.
2021-01-19 15:49:08 -07:00
Andrew Kelley
8c9ac4db97 stage2: implement error notes and regress -femit-zir
* Implement error notes
   - note: other symbol exported here
   - note: previous else prong is here
   - note: previous '_' prong is here
 * Add Compilation.CObject.ErrorMsg. This object properly converts to
   AllErrors.Message when the time comes.
 * Add Compilation.CObject.failure_retryable. Properly handles
   out-of-memory and other transient failures.
 * Introduce Module.SrcLoc which has not only a byte offset but also
   references the file which the byte offset applies to.
 * Scope.Block now contains both a pointer to the "owner" Decl and the
   "source" Decl. As an example, during inline function call, the
   "owner" will be the Decl of the caller and the "source" will be the
   Decl of the callee.
 * Module.ErrorMsg now sports a `file_scope` field so that notes can
   refer to source locations in a file other than the parent error
   message.
 * Some instances where a `*Scope` was stored, now store a
   `*Scope.Container`.
 * Some methods in the `Scope` namespace were moved to the more specific
   type, since there was only an implementation for one particular tag.
   - `removeDecl` moved to `Scope.Container`
   - `destroy` moved to `Scope.File`
 * Two kinds of Scope deleted:
   - zir_module
   - decl
 * astgen: properly use DeclVal / DeclRef. DeclVal was incorrectly
   changed to be a reference; this commit fixes it. Fewer ZIR
   instructions processed as a result.
   - declval_in_module is renamed to declval
   - previous declval ZIR instruction is deleted; it was only for .zir
     files.
 * Test harness: friendlier diagnostics when an unexpected set of errors
   is encountered.
 * zir_sema: fix analyzeInstBlockFlat by properly calling resolvingInst
   on the last zir instruction in the block.

Compile log implementation:
 * Write to a buffer rather than directly to stderr.
 * Only keep track of 1 callsite per Decl.
 * No longer mutate the ZIR Inst struct data.
 * "Compile log statement found" errors are only emitted when there are
   no other compile errors.

-femit-zir and support for .zir source files is regressed. If we wanted
to support this again, outputting .zir would need to be done as yet
another backend rather than in the haphazard way it was previously
implemented.

For parsing .zir, it was implemented previously in a way that was not
helpful for debugging. We need tighter integration with the test harness
for it to be useful; so clearly a rewrite is needed. Given that a
rewrite is needed, and it was getting in the way of progress and
organization of the rest of stage2, I regressed the feature.
2021-01-16 22:51:01 -07:00
Andrew Kelley
a9667b5a85 organize std lib concurrency primitives and add RwLock
* move concurrency primitives that always operate on kernel threads to
   the std.Thread namespace
 * remove std.SpinLock. Nobody should use this in a non-freestanding
   environment; the other primitives are always preferable. In
   freestanding, it will be necessary to put custom spin logic in there,
   so there are no use cases for a std lib version.
 * move some std lib files to the top level fields convention
 * add std.Thread.spinLoopHint
 * add std.Thread.Condition
 * add std.Thread.Semaphore
 * new implementation of std.Thread.Mutex for Windows and non-pthreads Linux
 * add std.Thread.RwLock

Implementations provided by @kprotty
2021-01-14 20:41:37 -07:00
Andrew Kelley
5b2a79848c stage2: cleanups regarding red zone CLI flags
* CLI: change to -mred-zone and -mno-red-zone to match gcc/clang.
 * build.zig: remove the double negative and make it an optional bool.
   This follows precedent from other flags, allowing the compiler CLI to
   be the decider of what is default instead of duplicating the default
   value into the build system code.
 * Compilation: make it an optional `want_red_zone` instead of a
   `no_red_zone` bool. The default is decided by a call to
   `target_util.hasRedZone`.
 * When creating a Clang command line, put -mred-zone on the command
   line if we are forcing it to be enabled.
 * Update update_clang_options.zig with respect to the recent {s}/{} format changes.
 * `zig cc` integration with red zone preference.
2021-01-11 22:07:21 -07:00
Lee Cannon
8932c2d745 Added support for no red zone 2021-01-11 22:07:14 -07:00
Jay Petacat
a0ad2dee6a builtin: Add zig_version
This will enable code to perform version checks and make it easier to
support multiple versions of Zig.

Within the SemVer implementation, an intermediate value needed to be
coerced to a slice to workaround a comptime bug.

Closes #6466
2021-01-09 12:50:39 -08:00
Jonathan Marler
31802c6c68 remove z/Z format specifiers
Zig's format system is flexible enough to add custom formatters.  This PR removes the new z/Z format specifiers that were added for printing Zig identifiers and replaces them with custom formatters.
2021-01-07 23:49:22 -08:00
Jay Petacat
a9b505fa77 Reduce use of deprecated IO types
Related: #4917
2021-01-07 23:48:58 -08:00
Andrew Kelley
5ee0431527 stage2: update to new ArrayListHashMap API 2021-01-06 17:40:25 -07:00
Andrew Kelley
d7d905696c
Merge pull request #7622 from tetsuo-cpp/array-hash-map-improvements
std: Support equivalent ArrayList operations in ArrayHashMap
2021-01-06 16:32:23 -08:00
Andrew Kelley
76870a2265
Merge pull request #7700 from FireFox317/more-stage2-stuff-llvm
stage2: improvements to LLVM backend
2021-01-06 16:06:32 -08:00
Andrew Kelley
efe94a9a12 stage2: C backend: support unused Decls 2021-01-06 16:47:09 -07:00
Timon Kruiper
b1cfa923be stage2: rename and move files related to LLVM backend 2021-01-06 10:52:20 +01:00
g-w1
ab5f7b5156 stage2: add compile log statement 2021-01-05 18:43:41 -07:00
Andrew Kelley
3e39d0c44f minor fixups from moving identifiers and files around 2021-01-05 17:41:22 -07:00
Andrew Kelley
1a2dd85570 stage2: C backend: re-implement emit-h
and also mark functions as `extern "C"` as appropriate to support c++
compilers.
2021-01-05 17:41:14 -07:00
Andrew Kelley
7b8cede61f stage2: rework the C backend
* std.ArrayList gains `moveToUnmanaged` and dead code
   `ArrayListUnmanaged.appendWrite` is deleted.
 * emit_h state is attached to Module rather than Compilation.
 * remove the implementation of emit-h because it did not properly
   integrate with incremental compilation. I will re-implement it
   in a follow-up commit.
 * Compilation: use the .codegen_failure tag rather than
   .dependency_failure tag for when `bin_file.updateDecl` fails.

C backend:
 * Use a CValue tagged union instead of strings for C values.
 * Cleanly separate state into Object and DeclGen:
   - Object is present only when generating a .c file
   - DeclGen is present for both generating a .c and .h
 * Move some functions into their respective Object/DeclGen namespace.
 * Forward decls are managed by the incremental compilation frontend; C
   backend no longer renders function signatures based on callsites.
   For simplicity, all functions always get forward decls.
 * Constants are managed by the incremental compilation frontend. C
   backend no longer has a "constants" section.
 * Participate in incremental compilation. Each Decl gets an ArrayList
   for its generated C code and it is updated when the Decl is updated.
   During flush(), all these are joined together in the output file.
 * The new CValue tagged union is used to clean up using of assigning to
   locals without an additional pointer local.
 * Fix bug with bitcast of non-pointers making the memcpy destination
   immutable.
2021-01-05 17:41:14 -07:00
Alex Cameron
89286376c6 std: Rename ArrayList shrink => shrinkAndFree 2021-01-06 00:55:51 +11:00
Andrew Kelley
2fe8a48215 ci: omit stage2 backend from stage1 on Windows
to avoid out-of-memory on the CI runs.
2021-01-04 14:59:18 -07:00
Timon Kruiper
7e5aacab69 stage2: add some missing deallocations in Compilation.zig 2021-01-03 17:39:43 +01:00
Andrew Kelley
006e7f6805 stage2: re-use ZIR for comptime and inline calls
Instead of freeing ZIR after semantic analysis, we keep it around so
that it can be used for comptime calls, inline calls, and generic
function calls. ZIR memory is now managed by the Decl arena.

Debug dump() functions are conditionally compiled; only available in
Debug builds of the compiler.

Add a test for an inline function call.
2021-01-02 19:11:55 -07:00
Andrew Kelley
9362f382ab stage2: implement function call inlining in the frontend
* remove the -Ddump-zir thing. that's handled through --verbose-ir
 * rework Fn to have an is_inline flag without requiring any more memory
   on the heap per function.
 * implement a rough first version of dumping typed zir (tzir) which is
   a lot more helpful for debugging than what we had before. We don't
   have a way to parse it though.
 * keep track of whether the inline-ness of a function changes because
   if it does we have to go update callsites.
 * add compile error for inline and export used together.

inline function calls and comptime function calls are implemented the
same way. A block instruction is set up to capture the result, and then
a scope is set up that has a flag for is_comptime and some state if the
scope is being inlined.

when analyzing `ret` instructions, zig looks for inlining state in the
scope, and if found, treats `ret` as a `break` instruction instead, with
the target block being the one set up at the inline callsite.

Follow-up items:
 * Complete out the debug TZIR dumping code.
 * Don't redundantly generate ZIR for each inline/comptime function
   call. Instead we should add a new state enum tag to Fn.
 * comptime and inlining branch quotas.
 * Add more test cases.
2021-01-02 19:11:19 -07:00
Andrew Kelley
974c008a0e convert more {} to {d} and {s} 2021-01-02 19:03:14 -07:00
LemonBoy
04f37dcd8e stage2: Use {z} instead of {s} in generated Zig code 2021-01-02 17:12:57 -07:00
LemonBoy
1c13ca5a05 stage2: Use {s} instead of {} when formatting strings 2021-01-02 17:12:57 -07:00
Jakub Konka
763d807377 Duplicate OSAtomic.h between aarch64 and x86_64
The reason this is required is for two reasons: 1) the libc
targeting `aarch64` macOS is slightly newer than that targeting
`x86_64`, and 2) `OSAtomic.h` uses relative imports rather than
system-wide imports for accompanying headers which clearly is an
oversight on Apple's part. Until such time when `libkern` headers
between `x86_64` and `aarch64` are identical, this will require a
manual intervention to duplicate the relevant headers between the
respective architectures.
2021-01-02 15:30:14 +01:00