This commit changes how we represent comptime-mutable memory
(`comptime var`) in the compiler in order to implement the intended
behavior that references to such memory can only exist at comptime.
It does *not* clean up the representation of mutable values, improve the
representation of comptime-known pointers, or fix the many bugs in the
comptime pointer access code. These will be future enhancements.
Comptime memory lives for the duration of a single Sema, and is not
permitted to escape that one analysis, either by becoming runtime-known
or by becoming comptime-known to other analyses. These restrictions mean
that we can represent comptime allocations not via Decl, but with state
local to Sema - specifically, the new `Sema.comptime_allocs` field. All
comptime-mutable allocations, as well as any comptime-known const allocs
containing references to such memory, live in here. This allows for
relatively fast checking of whether a value references any
comptime-mtuable memory, since we need only traverse values up to
pointers: pointers to Decls can never reference comptime-mutable memory,
and pointers into `Sema.comptime_allocs` always do.
This change exposed some faulty pointer access logic in `Value.zig`.
I've fixed the important cases, but there are some TODOs I've put in
which are definitely possible to hit with sufficiently esoteric code. I
plan to resolve these by auditing all direct accesses to pointers (most
of them ought to use Sema to perform the pointer access!), but for now
this is sufficient for all realistic code and to get tests passing.
This change eliminates `Zcu.tmp_hack_arena`, instead using the Sema
arena for comptime memory mutations, which is possible since comptime
memory is now local to the current Sema.
This change should allow `Decl` to store only an `InternPool.Index`
rather than a full-blown `ty: Type, val: Value`. This commit does not
perform this refactor.
Windows paths now use WTF-16 <-> WTF-8 conversion everywhere, which is lossless. Previously, conversion of ill-formed UTF-16 paths would either fail or invoke illegal behavior.
WASI paths must be valid UTF-8, and the relevant function calls have been updated to handle the possibility of failure due to paths not being encoded/encodable as valid UTF-8.
Closes#18694Closes#1774Closes#2565
Ill-formed UTF-8 byte sequences are replaced by the replacement character (U+FFFD) according to "U+FFFD Substitution of Maximal Subparts" from Chapter 3 of the Unicode standard, and as specified by https://encoding.spec.whatwg.org/#utf-8-decoder
The function returns the vector length, not the byte size of the vector or the bit size of individual elements. This distinction is very important and some usages of this function in the stdlib operated under these incorrect assumptions.
Use inline to vastly simplify the exposed API. This allows a
comptime-known endian parameter to be propogated, making extra functions
for a specific endianness completely unnecessary.
This reverts commit 0c99ba1eab63865592bb084feb271cd4e4b0357e, reversing
changes made to 5f92b070bf284f1493b1b5d433dd3adde2f46727.
This caused a CI failure when it landed in master branch due to a
128-bit `@byteSwap` in std.mem.
Originally inspired by Go's `utf8.Valid` function. Includes some test cases from Go's test suite.
Further optimized to be faster in all tested cases (short/long ascii/UTF8), in all release modes.
Takes advantage of SIMD for the ASCII fast path.
* std.unicode: Add more UTF-16 decoding functions
This mostly makes parts of Utf16LeIterator reusable
* std.json: Fix decoding of UTF-16 surrogate pairs
Before this commit, there were 524,288 codepoints that would get decoded improperly. After this commit, there are 0.
Fixes#16828
Most of this migration was performed automatically with `zig fmt`. There
were a few exceptions which I had to manually fix:
* `@alignCast` and `@addrSpaceCast` cannot be automatically rewritten
* `@truncate`'s fixup is incorrect for vectors
* Test cases are not formatted, and their error locations change
There are still a few occurrences of "stage1" in the standard library
and self-hosted compiler source, however, these instances need a bit
more careful inspection to ensure no breakage.
I hit the "quotes in an RSP file" issue when trying to compile gRPC using
"zig cc". As a fun exercise, I decided to see if I could fix it myself.
I'm fully open to this code being flat-out rejected. Or I can take feedback
to fix it up.
This modifies (and renames) _ArgIteratorWindows_ in process.zig such that
it works with arbitrary strings (or the contents of an RSP file).
In main.zig, this new _ArgIteratorGeneral_ is used to address the "TODO"
listed in _ClangArgIterator_.
This change closes#4833.
**Pros:**
- It has the nice attribute of handling "RSP file" arguments in the same way it
handles "cmd_line" arguments.
- High Performance, minimal allocations
- Fixed bug in previous _ArgIteratorWindows_, where final trailing backslashes
in a command line were entirely dropped
- Added a test case for the above bug
- Harmonized the _ArgIteratorXxxx._initWithAllocator()_ and _next()_ interface
across Windows/Posix/Wasi (Moved Windows errors to _initWithAllocator()_
rather than _next()_)
- Likely perf benefit on Windows by doing _utf16leToUtf8AllocZ()_ only once
for the entire cmd_line
**Cons:**
- Breaking Change in std library on Windows: Call
_ArgIterator.initWithAllocator()_ instead of _ArgIterator.init()_
- PhaseMage is new with contributions to Zig, might need a lot of hand-holding
- PhaseMage is a Windows person, non-Windows stuff will need to be double-checked
**Testing Done:**
- Wrote a few new test cases in process.zig
- zig.exe build test -Dskip-release (no new failures seen)
- zig cc now builds gRPC without error
Because ArrayList.initCapacity uses 'precise' capacity allocation, this should save memory on average, and definitely will save memory in cases where ArrayList is used where a regular allocated slice could have also be used.
We already have a LICENSE file that covers the Zig Standard Library. We
no longer need to remind everyone that the license is MIT in every single
file.
Previously this was introduced to clarify the situation for a fork of
Zig that made Zig's LICENSE file harder to find, and replaced it with
their own license that required annual payments to their company.
However that fork now appears to be dead. So there is no need to
reinforce the copyright notice in every single file.
* AstGen: implement "unreachable code" error for blocks. This works at
the statement level.
* stage1: remove the "unreachable code" error implementation, which
means removing the `is_gen` field from IrInstSrc. This is one small
step towards a smaller memory footprint for stage1. The benefits
won't be realized until a future commit because this flag took
advantage of padding.
There may be a regression here with "union has no associated enum"
error, and there is a regression with the following code:
```zig
const a = noreturn;
```
A future commit will address these regressions.
Conflicts:
* doc/langref.html.in
* lib/std/enums.zig
* lib/std/fmt.zig
* lib/std/hash/auto_hash.zig
* lib/std/math.zig
* lib/std/mem.zig
* lib/std/meta.zig
* test/behavior/alignof.zig
* test/behavior/bitcast.zig
* test/behavior/bugs/1421.zig
* test/behavior/cast.zig
* test/behavior/ptrcast.zig
* test/behavior/type_info.zig
* test/behavior/vector.zig
Master branch added `try` to a bunch of testing function calls, and some
lines also had changed how to refer to the native architecture and other
`@import("builtin")` stuff.