I pointed a fuzzer at the tokenizer and it crashed immediately. Upon inspection, I was dissatisfied with the implementation. This commit removes several mechanisms: * Removes the "invalid byte" compile error note. * Dramatically simplifies tokenizer recovery by making recovery always occur at newlines, and never otherwise. * Removes UTF-8 validation. * Moves some character validation logic to `std.zig.parseCharLiteral`. Removing UTF-8 validation is a regression of #663, however, the existing implementation was already buggy. When adding this functionality back, it must be fuzz-tested while checking the property that it matches an independent Unicode validation implementation on the same file. While we're at it, fuzzing should check the other properties of that proposal, such as no ASCII control characters existing inside the source code. Other changes included in this commit: * Deprecate `std.unicode.utf8Decode` and its WTF-8 counterpart. This function has an awkward API that is too easy to misuse. * Make `utf8Decode2` and friends use arrays as parameters, eliminating a runtime assertion in favor of using the type system. After this commit, the crash found by fuzzing, which was "\x07\xd5\x80\xc3=o\xda|a\xfc{\x9a\xec\x91\xdf\x0f\\\x1a^\xbe;\x8c\xbf\xee\xea" no longer causes a crash. However, I did not feel the need to add this test case because the simplified logic eradicates most crashes of this nature.
Test Case Quick Reference
Use comments at the end of the file to indicate metadata about the test case. Here are examples of different kinds of tests:
Compile Error Test
If you want it to be run with zig test and match expected error messages:
// error
// is_test=true
//
// :4:13: error: 'try' outside function scope
Execution
This will do zig run on the code and expect exit code 0.
// run
Translate-c
If you want to test translating C code to Zig use translate-c:
// translate-c
// c_frontend=aro,clang
// target=x86_64-linux
//
// pub const foo = 1;
// pub const immediately_after_foo = 2;
//
// pub const somewhere_else_in_the_file = 3:
Run Translated C
If you want to test translating C code to Zig and then executing it use run-translated-c:
// run-translated-c
// c_frontend=aro,clang
// target=x86_64-linux
//
// Hello world!
Incremental Compilation
Make multiple files that have ".", and then an integer, before the ".zig" extension, like this:
hello.0.zig
hello.1.zig
hello.2.zig
Each file can be a different kind of test, such as expecting compile errors, or expecting to be run and exit(0). The test harness will use these to simulate incremental compilation.
At the time of writing there is no way to specify multiple files being changed as part of an update.
Subdirectories
Subdirectories do not have any semantic meaning but they can be used for organization since the test harness will recurse into them. The full directory path will be prepended as a prefix on the test case name.
Limiting which Backends and Targets are Tested
// run
// backend=stage2,llvm
// target=x86_64-linux,x86_64-macos
Possible backends are:
stage1: equivalent to-fstage1.stage2: equivalent to passing-fno-stage1 -fno-LLVM.llvm: equivalent to-fLLVM -fno-stage1.