132 Commits

Author SHA1 Message Date
Andrew Kelley
377e8579f9 std.zig.tokenizer: simplify
I pointed a fuzzer at the tokenizer and it crashed immediately. Upon
inspection, I was dissatisfied with the implementation. This commit
removes several mechanisms:
* Removes the "invalid byte" compile error note.
* Dramatically simplifies tokenizer recovery by making recovery always
  occur at newlines, and never otherwise.
* Removes UTF-8 validation.
* Moves some character validation logic to `std.zig.parseCharLiteral`.

Removing UTF-8 validation is a regression of #663, however, the existing
implementation was already buggy. When adding this functionality back,
it must be fuzz-tested while checking the property that it matches an
independent Unicode validation implementation on the same file. While
we're at it, fuzzing should check the other properties of that proposal,
such as no ASCII control characters existing inside the source code.

Other changes included in this commit:

* Deprecate `std.unicode.utf8Decode` and its WTF-8 counterpart. This
  function has an awkward API that is too easy to misuse.
* Make `utf8Decode2` and friends use arrays as parameters, eliminating a
  runtime assertion in favor of using the type system.

After this commit, the crash found by fuzzing, which was
"\x07\xd5\x80\xc3=o\xda|a\xfc{\x9a\xec\x91\xdf\x0f\\\x1a^\xbe;\x8c\xbf\xee\xea"
no longer causes a crash. However, I did not feel the need to add this
test case because the simplified logic eradicates most crashes of this
nature.
2024-07-31 16:57:42 -07:00
Andrew Kelley
54b7e144b1 initial support for integrated fuzzing
* Add the `-ffuzz` and `-fno-fuzz` CLI arguments.
* Detect fuzz testing flags from zig cc.
* Set the correct clang flags when fuzz testing is requested. It can be
  combined with TSAN and UBSAN.
* Compilation: build fuzzer library when needed which is currently an
  empty zig file.
* Add optforfuzzing to every function in the llvm backend for modules
  that have requested fuzzing.
* In ZigLLVMTargetMachineEmitToFile, add the optimization passes for
  sanitizer coverage.
* std.mem.eql uses a naive implementation optimized for fuzzing when
  builtin.fuzz is true.

Tracked by #20702
2024-07-22 13:07:02 -07:00
Andrew Kelley
eb4028bf30 add std.fmt.hex
converts an unsigned integer into an array
2024-07-20 01:06:29 -07:00
Andrew Kelley
30ec43a6c7 Zcu: extract permanent state from File
Primarily, this commit removes 2 fields from File, relying on the data
being stored in the `files` field, with the key as the path digest, and
the value as the struct decl corresponding to the File. This table is
serialized into the compiler state that survives between incremental
updates.

Meanwhile, the File struct remains ephemeral data that can be
reconstructed the first time it is needed by the compiler process, as
well as operated on by independent worker threads.

A key outcome of this commit is that there is now a stable index that
can be used to refer to a File. This will be needed when serializing
error messages to survive incremental compilation updates.
2024-07-04 17:51:35 -07:00
Igor Anić
e0350859bb use unreachable keyword for unreachable code path 2024-07-03 12:12:53 -07:00
Igor Anić
035c1b6522 std.tar: add strip components error to diagnostics
This was the only kind of error which was raised in pipeToFileSystem and
not added to Diagnostics.
Shell tar silently ignores paths which are stripped out when used with
`--strip-components` switch. This enables that same behavior, errors
will be collected in diagnostics but caller is free to ignore that type
of diagnostics errors.
Enables use case where caller knows structure of the tar file and want
to extract only some deeply nested folders ignoring upper files/folders.

Fixes: #17620 by giving caller options:
- not provide diagnostic and get errors
- provide diagnostics and analyze errors
- provide diagnostics and ignore errors
2024-07-03 12:12:50 -07:00
Andrew Kelley
0fcd59eada rename src/Module.zig to src/Zcu.zig
This patch is a pure rename plus only changing the file path in
`@import` sites, so it is expected to not create version control
conflicts, even when rebasing.
2024-06-22 22:59:56 -04:00
Ryan Liptak
76fb2b685b std: Convert deprecated aliases to compile errors and fix usages
Deprecated aliases that are now compile errors:

- `std.fs.MAX_PATH_BYTES` (renamed to `std.fs.max_path_bytes`)
- `std.mem.tokenize` (split into `tokenizeAny`, `tokenizeSequence`, `tokenizeScalar`)
- `std.mem.split` (split into `splitSequence`, `splitAny`, `splitScalar`)
- `std.mem.splitBackwards` (split into `splitBackwardsSequence`, `splitBackwardsAny`, `splitBackwardsScalar`)
- `std.unicode`
  + `utf16leToUtf8Alloc`, `utf16leToUtf8AllocZ`, `utf16leToUtf8`, `fmtUtf16le` (all renamed to have capitalized `Le`)
  + `utf8ToUtf16LeWithNull` (renamed to `utf8ToUtf16LeAllocZ`)
- `std.zig.CrossTarget` (moved to `std.Target.Query`)

Deprecated `lib/std/std.zig` decls were deleted instead of made a `@compileError` because the `refAllDecls` in the test block would trigger the `@compileError`. The deleted top-level `std` namespaces are:

- `std.rand` (renamed to `std.Random`)
- `std.TailQueue` (renamed to `std.DoublyLinkedList`)
- `std.ChildProcess` (renamed/moved to `std.process.Child`)

This is not exhaustive. Deprecated aliases that I didn't touch:
  + `std.io.*`
  + `std.Build.*`
  + `std.builtin.Mode`
  + `std.zig.c_translation.CIntLiteralRadix`
  + anything in `src/`
2024-06-13 10:18:59 -04:00
Ryan Liptak
0cef727e59 More precise error message for unencodable \u escapes
The surrogate code points U+D800 to U+DFFF are valid code points but are not Unicode scalar values. This commit makes the error message more accurately reflect what is actually allowed in `\u` escape sequences.

From https://www.unicode.org/versions/Unicode15.0.0/ch03.pdf:

> D71 High-surrogate code point: A Unicode code point in the range U+D800 to U+DBFF.
> D73 Low-surrogate code point: A Unicode code point in the range U+DC00 to U+DFFF.
>
> 3.9 Unicode Encoding Forms
> D76 Unicode scalar value: Any Unicode code point except high-surrogate and low-surrogate code points.

Related: #20270
2024-06-12 16:49:00 -04:00
Andrew Kelley
3b77f1ed7e rename zig-cache to .zig-cache
closes #20077
2024-05-29 10:20:15 -07:00
Andrew Kelley
f97c2f28fd update the codebase for the new std.Progress API 2024-05-27 20:56:48 -07:00
Christofer Nolander
8f6b1f2c38
zig fetch: resolve branch/tag names to commit SHA (#19941)
* Revert "Revert "Merge pull request #19349 from nolanderc/save-commit""

This reverts commit 6ca4ed5948d8eaab28fc5e3706aeb1b113a210af.

* update to new URI changes, rework `--save` type

* initialize `latest_commit` to null everywhere
2024-05-11 14:19:35 -07:00
Andrew Kelley
6ca4ed5948 Revert "Merge pull request #19349 from nolanderc/save-commit"
This reverts commit 7fa2357d0586cef742bf691d69a6cffdd353b496, reversing
changes made to cb77bd672c3b398e3c5f6be80af03243bf8638e3.
2024-05-10 16:41:15 -07:00
Andrew Kelley
7fa2357d05
Merge pull request #19349 from nolanderc/save-commit
`zig fetch`: resolve branch/tag names to commit SHA
2024-05-10 16:27:30 -07:00
Andrew Kelley
a72292513e add std.Thread.Pool.spawnWg
This function accepts a WaitGroup parameter and manages the reference
counting therein. It also is infallible.

The existing `spawn` function is still handy when the job wants to
further schedule more tasks.
2024-05-03 20:58:02 -07:00
Jonathan Marler
a96b78c170 add std.zip and support zip files in build.zig.zon
fixes #17408

Helpful reviewers/testers include Joshe Wolfe, Auguste Rame, Andrew
Kelley and Jacob Young.

Co-authored-by: Joel Gustafson <joelg@mit.edu>
2024-05-03 16:58:53 -04:00
Igor Anić
9cfac4718d fetch: combine unpack error validation in a single function
Follow up of: #19500
[discussion](https://github.com/ziglang/zig/pull/19500#discussion_r1558238138)
2024-04-11 15:44:40 -07:00
Jacob Young
c4587dc9f4 Uri: propagate per-component encoding
This allows `std.Uri.resolve_inplace` to properly preserve the fact
that `new` is already escaped but `base` may not be.  I originally tried
just moving `raw_uri` around, but it made uri resolution unmanagably
complicated, so I instead added per-component information to `Uri` which
allows extra allocations to be avoided when constructing uris with
components from different sources, and in some cases, deferring the work
all the way to when the uri is printed, where an allocator may not even
be needed.

Closes #19587
2024-04-10 02:11:54 -07:00
Igor Anić
4151e6c31b fetch: use arena allocator for diagnostic/UnpackResult
Reference:
https://github.com/ziglang/zig/pull/19500#discussion_r1556476973

Arena is now used for Diagnostic (tar and git). `deinit` is not called on Diagnostic
allowing us to reference strings from Diagnostic in UnpackResult without
dupe.

That seamed reasonable to me. Instead of using gpa for Diagnostic, and
then dupe to arena. Or using arena for both and making dupe so we can deinit
Diagnostic.
2024-04-09 15:00:22 +02:00
Igor Anić
f8dd2a1064 fetch: add executable bit test 2024-04-09 15:00:22 +02:00
Igor Anić
8c58b8fe01 fetch: refactor package root in errors
Use stripRoot in less places. Strip it while copying error from
diagnostic to unpack result so other palaces can be free of this logic.
2024-04-09 15:00:22 +02:00
Igor Anić
b422e4a202 fetch: save test cases
Prepare test cases, store them in Fetch/testdata.
They cover changes in this PR as well from previous one #19111.
2024-04-09 15:00:22 +02:00
Igor Anić
3d5a9237f7 fetch: use empty string instead of null for root_dir
Make it consistent with Cache.Path sub_path.
Remove null check in many locations.
2024-04-09 15:00:22 +02:00
Igor Anić
5f0b434f90 fetch: remove root_dir from error messages
To be consistent with paths in manifest.
2024-04-09 15:00:22 +02:00
Igor Anić
22e9c50376 fetch: fix test tarball
Should include folder structure, at least root folder so it can be found
in pipeToFileSystem.
2024-04-09 15:00:21 +02:00
Igor Anić
fc745fb05c fetch: remove test with mount and net dependencies 2024-04-09 15:00:21 +02:00
Igor Anić
84fac2242c fix zig fmt 2024-04-09 15:00:21 +02:00
Igor Anić
dc61c2e904 add comments 2024-04-09 15:00:21 +02:00
Igor Anić
373d48212f fetch: add pathological packages test
Using test cases from:
https://github.com/ianprime0509/pathological-packages repository.
Depends on existence of the FAT32 file system. Folder is in FAT32 file
system because it is case insensitive and and does not support symlinks.

It is complicated test case requires internet connection, depends on
existence of FAT32 in the specific location. But it is so valuable for
development. Running `zig test Package.zig` is so much faster than
building zig binary and running `zig fetch URL`. Committing it here
although it should probably be removed.
2024-04-09 15:00:21 +02:00
Igor Anić
5a38924a7d fetch.git: collect file create diagnostic errors
On case insensitive file systems, don't overwrite files with same name
in different casing. Add diagnostic error so caller could decide what to do.
2024-04-09 15:00:21 +02:00
Igor Anić
ad60b6c1ed fetch: update comments 2024-04-09 15:00:21 +02:00
Igor Anić
dfec4918a3 fetch: remove absolute path from tests 2024-04-09 15:00:21 +02:00
Igor Anić
4d6a7e074b fetch: filter unpack errors
Report only errors which are not filtered by paths in build.zig.zon.
2024-04-09 15:00:21 +02:00
Igor Anić
a0790914b4 fetch: return UnpackResult from unpackResource
Test that we are still outputing same errors.
2024-04-09 15:00:21 +02:00
Igor Anić
b3339f3cd5 tar: store diagnostic errors into UnpackResult
Test that UnpackResult prints same error output as diagnostic.
2024-04-09 15:00:21 +02:00
Igor Anić
9e47857ae9 fetch: make failing test 2024-04-09 15:00:21 +02:00
Ian Johnson
129de47a71 Fix first-time package fetches
Closes #19557
Closes #19561

Previously, `build.zig` was not being detected correctly by
`computeHash` for packages where there is a containing root directory.
2024-04-07 09:07:37 +02:00
Igor Anić
34bb670bb6 package manager: set executable bit
Based on file content. Detects elf magic header or shebang line.

Executable bit is ignored in hash calculation, as it was before this. So
packages hashes are not changed.

Reference:
https://github.com/ziglang/zig/issues/17463#issuecomment-1984798880

Fixes: 17463

Test is here:
7c4600d7bb/src/main.zig (L307)
(if #19500 got accepted I'll move this test to the Fetch.zig)
2024-04-06 13:00:57 -07:00
Igor Anić
a60b7af2c1 fetch: fix manifest included paths filtering
Filter should be applied on path where package root folder (if
there is any) is stripped. Manifest is inside package root and has paths
relative to package root not temporary directory root.
2024-04-04 01:59:15 +02:00
Igor Anić
8545cb0147 fetch: use package root detection pipeToFileSystem
Based on:
https://github.com/ziglang/zig/pull/19111#discussion_r1548620985

In pipeToFileSystem we are already iterating over all files in tarball.
Inspecting them there for existence of single root folder saves two
syscalls later.
2024-04-03 21:20:39 +02:00
Igor Anić
363acf4991 fetch: update comments 2024-04-03 21:17:57 +02:00
Igor Anić
1431e34cb9 fetch: remove one openDir in runResource
Based on comment:
https://github.com/ziglang/zig/pull/19111#discussion_r1548640939

computeHash finds all files in temporary directory. There is no
difference on what path are they. When calculating hash normalized_path
must be set relative to package root. That's the place where we strip
root if needed.
2024-04-03 21:01:29 +02:00
Igor Anić
24304a4385 fetch: save syscall, and add comment
From review comments: https://github.com/ziglang/zig/pull/19111
2024-04-03 17:08:41 +02:00
Igor Anić
c4261bf562 fetch: find package root only for archives 2024-04-03 17:06:20 +02:00
Igor Anić
15ca0c9471 fix typo 2024-04-03 17:06:20 +02:00
Igor Anić
a5a928b966 package manager: don't strip components in tar
Unpack tar without removing leading root folder. Then find package root
in unpacked tmp folder.
2024-04-03 17:06:20 +02:00
Igor Anić
bc5076715b fetch: fix failing test
Prior to
[this](ef9966c985 (diff-08c935ef8c633bb630641d44230597f1cff5afb0e736d451e2ba5569fa53d915R805))
commit tar was not a valid extension. After that this one is valid case.
2024-04-03 17:06:20 +02:00
Andrew Kelley
7bc0b74b6d move Package.Path to std.Build.Cache.Path 2024-03-21 16:16:47 -07:00
Andrew Kelley
cd62005f19 extract std.posix from std.os
closes #5019
2024-03-19 11:45:09 -07:00
Christofer Nolander
57cfe0778c zig fetch: resolve branch/tag names to commit SHA 2024-03-18 19:48:11 +01:00