* Scan from line start when finding tag in tokenizer
This resolves a crash that can occur for invalid bytes like carriage
returns that are valid characters when not parsed from within literals.
There are potentially other edge cases this could resolve as well, as
the calling code for this function didn't account for any potential
'pending_invalid_tokens' that could be queued up by the tokenizer from
within another state.
* Fix carriage return crash in multiline string
Follow the guidance of #38:
> However CR directly before NL is interpreted as only a newline and not part of the multiline string. zig fmt will delete the CR.
Zig fmt already had code for deleting carriage returns, but would still
crash - now it no longer does so. Carriage returns encountered before
line-feeds are now appropriately removed on program compilation as well.
* Only accept carriage returns before line feeds
Previous commit was much less strict about this, this more closely
matches the desired spec of only allow CR characters in a CRLF pair, but
not otherwise.
* Fix CR being rejected when used as whitespace
Missed this comment from ziglang/zig-spec#83:
> CR used as whitespace, whether directly preceding NL or stray, is still unambiguously whitespace. It is accepted by the grammar and replaced by the canonical whitespace by zig fmt.
* Add tests for carriage return handling
* improve error message when build manifest file is missing
* update std.zig.Ast to support ZON
* Compilation.AllErrors.Message: make the notes field a const slice
* move build manifest parsing logic into src/Manifest.zig and add more
checks, and make the checks integrate into the standard error
reporting code so that reported errors look sexy
closes#14290
* std.zig.parse is moved to std.zig.Ast.parse
* the new function has an additional parameter that requires passing
Mode.zig or Mode.zon
* moved parser.zig code to Parse.zig
* added parseZon function next to parseRoot function
This fixes a bug in std.net caused during the introduction of
meta.assumeSentinel due to the unfortunate semantics of mem.span()
This leaves only 3 remaining uses of meta.assumeSentinel() in the
standard library, each of which could be a simple @ptrCast([*:0]T, foo)
instead. I think this function should likely be removed.
- Add cpuid / getXCR0 functions for the cbe to use instead of asm blocks
- Don't cast between 128 bit types during truncation
- Fixup truncation to use functions for shifts / adds
- Fixup float casts for undefined values
- Add test for 128 bit integer truncation
Move the parsing of Root from the parse function to the Parser.parseRoot
method.
Add an empty line between each rule, for consistency with grammar.y.
Ensure that normal doc-comments are always on the top, for consistency.
Add extract-grammar.zig to the tools directory; it is used to extract
the PEG grammar from parse.zig, so that it can be compared with
grammar.y.
* allow file level `union {}` to parse as tuple field
this was found while fuzzing zls.
* before this patch the input `union {}` crashed the parser. after
this, it parses correctly just like `struct {}`.
* adds behavior tests for both inputs `struct {}` and `union {}`,
checking that each becomes a file level tuple field.
these were found while fuzzing zls.
this patch prevents overflow for the following file contents and adds
tests for them.
* `enum(u32)` - causes overflow in std.zig.Ast.fullContainerDecl()
* `*x` - causes overflow in std.zig.Ast.fullPtrType()
* `**x` - causes overflow in std.zig.Ast.firstToken()
There are still a few occurrences of "stage1" in the standard library
and self-hosted compiler source, however, these instances need a bit
more careful inspection to ensure no breakage.