From 06e9b2c4e360e738c3be3d3e5d5a36006fd5b224 Mon Sep 17 00:00:00 2001
From: Manlio Perillo
Date: Mon, 16 Jan 2023 19:14:43 +0100
Subject: [PATCH] langref: document UTF-8 BOM handling

The current compiler ignores the UTF-8 BOM if it is at the start of the
file, and disallows it anywhere else.

Document it in the Source Encoding section.
---
 doc/langref.html.in | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/doc/langref.html.in b/doc/langref.html.in
index bb4657f3ee..e7d124ac3a 100644
--- a/doc/langref.html.in
+++ b/doc/langref.html.in
@@ -11480,6 +11480,10 @@ fn readU32Be() u32 {}
 but use of hard tabs is discouraged. See {#link|Grammar#}.

+ For compatibility with other tools, the compiler ignores a UTF-8-encoded byte order mark (U+FEFF)
+ if it is the first Unicode code point in the source text. A byte order mark is not allowed anywhere else in the source.
+

+

Note that running zig fmt on a source file will implement all recommendations mentioned here. Note also that the stage1 compiler does not yet support CR or HT control characters.
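
For reviewers, the rule documented by this patch can be sketched as follows. This is only an illustration of the described behavior, not the compiler's actual implementation: `stripBom` and `error.BomNotAtStart` are hypothetical names, and the sketch assumes that "anywhere else in the source" means any later occurrence of the encoded bytes EF BB BF, including inside comments and string literals.

```zig
const std = @import("std");

// UTF-8 encoding of U+FEFF (byte order mark).
const bom = "\xEF\xBB\xBF";

// Hypothetical helper: returns the source without a leading BOM and
// fails if the BOM byte sequence appears at any later position.
fn stripBom(source: []const u8) error{BomNotAtStart}![]const u8 {
    const body = if (std.mem.startsWith(u8, source, bom))
        source[bom.len..]
    else
        source;
    if (std.mem.indexOf(u8, body, bom) != null) return error.BomNotAtStart;
    return body;
}

test "leading BOM is ignored, later BOM is rejected" {
    try std.testing.expectEqualStrings("const x = 1;", try stripBom("\xEF\xBB\xBFconst x = 1;"));
    try std.testing.expectEqualStrings("const x = 1;", try stripBom("const x = 1;"));
    try std.testing.expectError(error.BomNotAtStart, stripBom("const \xEF\xBB\xBFx = 1;"));
}
```

Under these assumptions, a second BOM immediately after the first is also rejected, because only the first code point of the source text is skipped.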