3348 Commits

Author SHA1 Message Date
Andrew Kelley
f5279cbada zig fmt: implement top-level fields 2021-02-03 17:02:12 -07:00
Andrew Kelley
1a83b29bea zig fmt: implement if, call, field access, assignment 2021-02-02 21:05:53 -07:00
Andrew Kelley
0c6b98b825 zig fmt: implement simple test with doc comments 2021-02-01 21:31:41 -07:00
Andrew Kelley
272a0ab359 zig fmt: implement "line comment followed by top-level comptime" 2021-02-01 20:11:55 -07:00
Andrew Kelley
20554d32c0 zig fmt: start reworking with new memory layout
* start implementation of ast.Tree.firstToken and lastToken
 * clarify some ast.Node doc comments
 * reimplement renderToken
2021-02-01 17:23:49 -07:00
Andrew Kelley
bf8fafc37d stage2: tokenizer does not emit line comments anymore
only std.zig.render cares about these, and it can find them in the
original source easily enough.
2021-01-31 21:57:48 -07:00
Andrew Kelley
4dca99d3f6 stage2: rework AST memory layout
This is a proof-of-concept of switching to a new memory layout for
tokens and AST nodes. The goal is threefold:

 * smaller memory footprint
 * faster performance for tokenization and parsing
 * most importantly, a proof-of-concept that can be also applied to ZIR
   and TZIR to improve the entire compiler pipeline in this way.

I had a few key insights here:

 * Underlying premise: using less memory will make things faster, because
   of fewer allocations and better cache utilization. Also using less
   memory is valuable in and of itself.
 * Using a Struct-Of-Arrays for tokens and AST nodes, saves the bytes of
   padding between the enum tag (which kind of token is it; which kind
   of AST node is it) and the next fields in the struct. It also improves
   cache coherence, since one can peek ahead in the tokens array without
   having to load the source locations of tokens.
 * Token memory can be conserved by only having the tag (1 byte) and byte
   offset (4 bytes) for a total of 5 bytes per token. It is not necessary
   to store the token ending byte offset because one can always re-tokenize
   later, but also most tokens the length can be trivially determined from
   the tag alone, and for ones where it doesn't, string literals for
   example, one must parse the string literal again later anyway in
   astgen, making it free to re-tokenize.
 * AST nodes do not actually need to store more than 1 token index because
   one can poke left and right in the tokens array very cheaply.

So far we are left with one big problem though: how can we put AST nodes
into an array, since different AST nodes are different sizes?

This is where my key observation comes in: one can have a hash table for
the extra data for the less common AST nodes! But it gets even better than
that:

I defined this data that is always present for every AST Node:

 * tag (1 byte)
   - which AST node is it
 * main_token (4 bytes, index into tokens array)
   - the tag determines which token this points to
 * struct{lhs: u32, rhs: u32}
   - enough to store 2 indexes to other AST nodes, the tag determines
     how to interpret this data

You can see how a binary operation, such as `a * b` would fit into this
structure perfectly. A unary operation, such as `*a` would also fit,
and leave `rhs` unused. So this is a total of 13 bytes per AST node.
And again, we don't have to pay for the padding to round up to 16 because
we store in struct-of-arrays format.

I made a further observation: the only kind of data AST nodes need to
store other than the main_token is indexes to sub-expressions. That's it.
The only purpose of an AST is to bring a tree structure to a list of tokens.
This observation means all the data that nodes store are only sets of u32
indexes to other nodes. The other tokens can be found later by the compiler,
by poking around in the tokens array, which again is super fast because it
is struct-of-arrays, so you often only need to look at the token tags array,
which is an array of bytes, very cache friendly.

So for nearly every kind of AST node, you can store it in 13 bytes. For the
rarer AST nodes that have 3 or more indexes to other nodes to store, either
the lhs or the rhs will be repurposed to be an index into an extra_data array
which contains the extra AST node indexes. In other words, no hash table needed,
it's just 1 big ArrayList with the extra data for AST Nodes.

Final observation, no need to have a canonical tag for a given AST. For example:
The expression `foo(bar)` is a function call. Function calls can have any
number of parameters. However in this example, we can encode the function
call into the AST with a tag called `FunctionCallOnlyOneParam`, and use lhs
for the function expr and rhs for the only parameter expr. Meanwhile if the
code was `foo(bar, baz)` then the AST node would have to be `FunctionCall`
with lhs still being the function expr, but rhs being the index into
`extra_data`. Then because the tag is `FunctionCall` it means
`extra_data[rhs]` is the "start" and `extra_data[rhs+1]` is the "end".
Now the range `extra_data[start..end]` describes the list of parameters
to the function.

Point being, you only have to pay for the extra bytes if the AST actually
requires it. There's no limit to the number of different AST tag encodings.

Preliminary results:

 * 15% improvement on cache-misses
 * 28% improvement on total instructions executed
 * 26% improvement on total CPU cycles
 * 22% improvement on wall clock time

This is 1/4 items on the checklist before this can actually be merged:

 * [x] parser
 * [ ] render (zig fmt)
 * [ ] astgen
 * [ ] translate-c
2021-01-30 20:16:59 -07:00
Andrew Kelley
766b315b38 std.GeneralPurposeAllocator: logging improvements
It now uses the log scope "gpa" instead of "std".

Additionally, there is a new config option `verbose_log` which enables
info log messages for every allocation. Can be useful when debugging.
This option is off by default.
2021-01-30 20:15:26 -07:00
Andrew Kelley
0808d98e10 add std.MultiArrayList
Also known as "Struct-Of-Arrays" or "SOA". The purpose of this data
structure is to provide a similar API to ArrayList but instead of
the element type being a struct, the fields of the struct are in N
different arrays, all with the same length and capacity.

Having this abstraction means we can put them in the same allocation,
avoiding overhead with the allocator. It also saves a tiny bit of
overhead from the redundant capacity and length fields, since each
struct element shares the same value.

This is an alternate implementation to #7854.
2021-01-30 20:12:13 -07:00
Joran Dirk Greef
881ecdc72f Add MAX_RW_COUNT limit to std.os.pread()
Fixes: https://github.com/ziglang/zig/issues/7805
2021-01-25 10:41:38 -08:00
Timon Kruiper
e23bc1f76a render: fix bug when rendering struct initializer with length 1
This crashed the compiler when running translate-c. See the added test.
2021-01-25 10:40:00 -08:00
Andrew Kelley
4ca1f4ec2e
Merge pull request #7846 from LemonBoy/filtertest
stage1: don't filter test blocks with empty label
2021-01-25 10:39:11 -08:00
Joran Dirk Greef
68a040aec7 linux: add fallocate() to io_uring 2021-01-25 10:34:20 -08:00
Timon Kruiper
9238d12537 windows: make sure to handle PATH_NOT_FOUND when deleting files
Fixes #7879
2021-01-25 10:33:08 -08:00
Andrew Kelley
2b321c25ce std.Progress: call refreshWithHeldLock as appropriate 2021-01-24 12:22:17 -07:00
Timon Kruiper
4f7d76f19c fix windows bug in Progress.zig
This bug caused the compiler to deadlock when multiple c objects
were build in parallel.

Thanks @kprotty for finding this bug!
2021-01-24 12:20:51 -07:00
LemonBoy
134f5fd3d6 std: Update test "" to test where it makes sense 2021-01-22 15:46:58 +01:00
LemonBoy
ac004e1bf1 stage1: Allow nameless test blocks
Nameless blocks are never filtered, the test prefix is still applied.
2021-01-22 15:46:58 +01:00
Jakub Konka
843d91e75d Bring back stack trace printing on ARM Darwin
This temporary patch fixes a segfault caused by miscompilation
by the LLD when generating stubs for initialization of thread local
storage. We effectively bypass TLS in the default panic handler
so that no segfault is generated and the stack trace is correctly
reported back to the user.

Note that, this is linked directly to a bigger issue with LLD
ziglang/zig#7527 and when resolved, we only need to remove the
`comptime` code path introduced with this patch to use the default
panic handler that relies on TLS.

Co-authored-by: Andrew Kelley <andrew@ziglang.org>
2021-01-21 23:20:42 +01:00
Andrew Kelley
d5d0619aac stage2: ELF: avoid multiplication for ideal capacity
ideal capacity is now determined by e.g.
x += x / f
rather than
x = x * b / a

This turns a multiplication into an addition, making it less likely to
overflow the integer. This commit also introduces padToIdeal() which
does saturating arithmetic so that no overflow is possible when
calculating ideal capacity.

closes #7830
2021-01-19 13:47:51 -07:00
Andrew Kelley
0353c9601a
Merge pull request #7814 from LemonBoy/fix-7760
std: Fixed pipe2 fallback
2021-01-18 11:49:42 -08:00
Julian Maingot
4c5f69a065 update error return doc
Docs were out of sync with code
2021-01-18 11:04:33 -08:00
LemonBoy
6418f9ae91 std: Add missing cast when calling fcntl w/ constant args
comptime_int arguments are a big no no.
2021-01-18 18:02:09 +01:00
LemonBoy
f33bac2b12 std: define pipe2 only for os that support it 2021-01-18 17:24:26 +01:00
LemonBoy
9d18df142c std: Fixed pipe2 fallback
Use both F_SETFD and F_SETFL depending on what flag we're setting.

Closes #7760
2021-01-18 14:52:35 +01:00
Andrew Kelley
8436134499 std.ArrayHashMap: add "AssertDiscard" function variants
* Add `swapRemoveAssertDiscard`
 * Add `orderedRemoveAssertDiscard`
 * Deprecate `removeAssertDiscard`
2021-01-16 22:49:20 -07:00
Andrew Kelley
1f65828ec6
Merge pull request #7716 from koachan/sparc64-libs
stage1: SPARCv9 f128 enablement
2021-01-16 12:10:03 -08:00
Guillaume Ballet
f7d7cb6268 crypto: add legacy keccak hash functions 2021-01-15 12:36:38 -08:00
Koakuma
1d67ab8823 Fix _Qp_cmp definition 2021-01-15 19:07:39 +07:00
Koakuma
bbb58b10f6 Add compiler-rt stub for SPARC CPUs 2021-01-15 19:07:38 +07:00
Andrew Kelley
19f893c6bb std.Thread: avoid compile errors for single-threaded OS's 2021-01-14 22:42:29 -07:00
Andrew Kelley
ad301d687a fix namespace of kernel32 function calls 2021-01-14 21:42:49 -07:00
Andrew Kelley
9e1aeda3bf std.Thread.StaticResetEvent: call spinLoopHint appropriately 2021-01-14 21:34:30 -07:00
Andrew Kelley
9698ea3173 std.Thread.Mutex: restore the "Held" API
so that std.Thread.Mutex.Dummy can be used as a drop in replacement.
2021-01-14 21:28:22 -07:00
Andrew Kelley
a9667b5a85 organize std lib concurrency primitives and add RwLock
* move concurrency primitives that always operate on kernel threads to
   the std.Thread namespace
 * remove std.SpinLock. Nobody should use this in a non-freestanding
   environment; the other primitives are always preferable. In
   freestanding, it will be necessary to put custom spin logic in there,
   so there are no use cases for a std lib version.
 * move some std lib files to the top level fields convention
 * add std.Thread.spinLoopHint
 * add std.Thread.Condition
 * add std.Thread.Semaphore
 * new implementation of std.Thread.Mutex for Windows and non-pthreads Linux
 * add std.Thread.RwLock

Implementations provided by @kprotty
2021-01-14 20:41:37 -07:00
Asherah Connor
2b0e3ee228
std.os.uefi.protocols.FileProtocol: fix and expose get_position, set_position (#7762) 2021-01-13 21:46:22 -05:00
Jay Petacat
a021c7b1b2 Move fmt.testFmt to testing.expectFmt 2021-01-12 18:13:29 -08:00
Bill Nagel
2c79d669a7 add missing ECONNRESET from getsockoptError 2021-01-12 18:11:58 -08:00
Andrew Kelley
70c608add8
Merge pull request #7577 from semarie/emutls
implement emutls inside compiler_rt.zig
2021-01-12 17:54:02 -08:00
Andrew Kelley
e564d2ca3c
Merge pull request #7714 from mikdusan/target-macos
macos: reimplement OS version detection
2021-01-12 16:45:50 -08:00
Bill Nagel
1e2be14b6b define nfds_t for windows 2021-01-12 16:37:58 -08:00
Sébastien Marie
d7aa7dbab2 implement emutls in compiler_rt 2021-01-12 05:39:46 +00:00
Sébastien Marie
ebf2a7e9b9 add pthread_key functions 2021-01-12 05:39:46 +00:00
Andrew Kelley
8ea2b40e5f std.event.Loop: fix race condition when starting the time wheel
closes #7572
2021-01-11 22:23:03 -07:00
Andrew Kelley
5b2a79848c stage2: cleanups regarding red zone CLI flags
* CLI: change to -mred-zone and -mno-red-zone to match gcc/clang.
 * build.zig: remove the double negative and make it an optional bool.
   This follows precedent from other flags, allowing the compiler CLI to
   be the decider of what is default instead of duplicating the default
   value into the build system code.
 * Compilation: make it an optional `want_red_zone` instead of a
   `no_red_zone` bool. The default is decided by a call to
   `target_util.hasRedZone`.
 * When creating a Clang command line, put -mred-zone on the command
   line if we are forcing it to be enabled.
 * Update update_clang_options.zig with respect to the recent {s}/{} format changes.
 * `zig cc` integration with red zone preference.
2021-01-11 22:07:21 -07:00
Lee Cannon
8932c2d745 Added support for no red zone 2021-01-11 22:07:14 -07:00
Michael Dusan
4c3de99253
more fixups
- clarify comments
- `NativeTargetInfo.detect()` propagate macOS errors
- `macos.detect()` drop `std.log` usage
2021-01-11 20:58:31 -05:00
Michael Dusan
f2be1fb23e
macos: reimplement OS version detection
The macOS version is now obtained by parsing `SystemVersion.plist`.

Test cases added for plist files that date back to '2005 Panther and up
to the recent '2020 Big Sur 11.1 release of macOS.

Thus we are now able to reliably identify 10.3...11.1 and higher.

- drop use of kern.osproductversion sysctl
- drop use of kern.osversion sysctl (fallback)
- drop kern.osversion tests
- add `lib.std.zig.system.detect()`
- add minimalistic parser for `SystemVersion.plist`
- add test cases for { 10.3, 10.3.9, 10.15.6, 11.0, 11.1 }

closes #7569
2021-01-11 19:54:56 -05:00
Rohlem
c96272f618 std.os.windows.GetFinalPathNameByHandle: remove intermediate buffers
... and mem.copy operations. Requires slightly larger input buffers than result length. Add helper functions std.mem.alignInBytes and std.mem.alignInSlice.
2021-01-11 17:48:19 -07:00
Rohlem
f301a8467c std.os.windows.GetFinalPathNameByHandle: remove QueryInformationFile code path 2021-01-11 17:48:18 -07:00