The name of the game here is to avoid CreateProcessW calls at all costs,
and only ever try calling it when we have a real candidate for execution.
Secondarily, we want to minimize the number of syscalls used when checking
for each PATHEXT-appended version of the app name.
An overview of the technique used:
- Open the search directory for iteration (either cwd or a path from PATH)
- Use NtQueryDirectoryFile with a wildcard filename of `<app name>*` to
check if anything that could possibly match either the unappended version
of the app name or any of the versions with a PATHEXT value appended exists.
- If the wildcard NtQueryDirectoryFile call found nothing, we can exit early
without needing to use PATHEXT at all.
This allows us to use a <open dir, NtQueryDirectoryFile, close dir> sequence
for any directory that doesn't contain any possible matches, instead of having
to use a separate look up for each individual filename combination (unappended +
each PATHEXT appended). For directories where the wildcard *does* match something,
we only need to do a maximum of <number of supported PATHEXT extensions> more
NtQueryDirectoryFile calls.
---
In addition, we now only evaluate the extensions in PATHEXT that we know we can handle (.COM, .EXE, .BAT, .CMD) and ignore the rest.
---
This commit also makes two edge cases match Windows behavior:
- If an app name has the extension .exe and it is attempted to be executed, that is now treated as unrecoverable and InvalidExe is immediately returned no matter where the .exe is (cwd or in the PATH). This matches the behavior of the Windows cmd.exe.
- If the app name contains more than just a filename (e.g. it has path separators), then it is excluded from PATH searching and only does a cwd search. This matches the behavior of Windows cmd.exe.
This matches `cmd.exe` behavior. For example, if there is only a file named `mycommand` in the cwd but it is a Linux executable, then running the command `mycommand` will result in:
'mycommand' is not recognized as an internal or external command, operable program or batch file.
However, if there is *both* a `mycommand` (that is a Linux executable) and a `mycommand.exe` that is a valid Windows exe, then running the command `mycommand` will successfully run `mycommand.exe`.
Avoid a lot of unnecessary utf8 -> utf16 conversion and use a single ArrayList buffer for all the joined paths instead of a separate allocation for each join
For example, if the command is specified as `something.exe`, the retry will now try:
```
C:\some\path\something.exe
C:\some\path\something.exe.COM
C:\some\path\something.exe.EXE
C:\some\path\something.exe.BAT
... etc ...
```
whereas before it would only try the versions with an added extension from `PATHEXT`, which would cause the retry to fail on things that it should find.
Given `main.go`:
package main
import _ "os/user"
func main() {}
Compiling it to linux/arm64:
$ CGO_CFLAGS='-O0' GOOS=linux GOARCH=arm64 CGO_ENABLED=1 CC="zig cc -target aarch64-linux-gnu.2.28" go build main.go
Results in this error:
runtime/cgo(.text): unknown symbol memset in callarm64
runtime/cgo(.text): unknown symbol memset in callarm64
runtime/cgo(.text): relocation target memset not defined
In the midst of intermediate compilations files we can see this commmand:
ld.lld -o _cgo_.o <...> /tmp/go-build206961058/b043/_x009.o <...> ~/.cache/zig/.../libcompiler_rt.a <...> ~/.cache/.../libc.so.6
`_x009.o` needs memset:
$ readelf -Ws ./b043/_x009.o | grep memset
22: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND memset
Both `libcompiler_rt.a` and `libc.so.6` provide it:
$ readelf -Ws ~/.cache/zig/.../libcompiler_rt.a | grep memset
870: 0000000000000000 318 FUNC WEAK DEFAULT 519 memset
$ readelf -Ws ~/.cache/zig/.../libc.so.6 | grep -w memset
476: 000000000001d34c 0 FUNC GLOBAL DEFAULT 7 memset@@GLIBC_2.2.5
Since `libcompiler_rt.a` comes before libc in the linker line, the
resulting `_cgo_.o` still links to a weak, unversioned memset:
$ readelf -Ws ./b043/_cgo_.o | grep -w memset
40: 000000000022c07c 160 FUNC WEAK DEFAULT 14 memset
719: 000000000022c07c 160 FUNC WEAK DEFAULT 14 memset
Since the final linking step is done by Golang's linker, it does not
know of `libcompiler_rt.a`, and fails to link with the error message
above. However, Go linker does recognize memset from glibc. If we
specify an `-lc` equivalent before the `libcompiler_rt.a`, it will link
to memset from libc:
$ readelf -Wa ./b043/_x009.o |grep memset
14: 0000000000000000 0 FUNC GLOBAL DEFAULT UND memset@GLIBC_2.17 (2)
157: 0000000000000000 0 FUNC GLOBAL DEFAULT UND memset@GLIBC_2.17
... and then `main.go` will compile+link successfully.
Why doesn't Go linker take memset from glibc? An educated guess: Go
determines whether to link with glibc from what the program asks (I
presume `.dynsym`). Since `memset` is no longer attributed to glibc, Go
skips linking to glibc altogether.
Bonus question: curious why `-O0` is necessary? Because when
optimizations are enabled (the default), the C compiler replaces
`memset` function call with plain `stp` instructions (on aarch64).
By pulling out the parallel hashing setup from `CodeSignature.zig`,
we can now reuse it different places across MachO linker (for now;
I can totally see its usefulness beyond MachO, eg. in COFF or ELF too).
The parallel hasher is generic over actual hasher such as Sha256 or MD5.
The implementation is kept as it was.
For UUID calculation, depending on the linking mode:
* incremental - since it only supports debug mode, we don't bother with MD5
hashing of the contents, and populate it with random data but only once
per a sequence of in-place binary patches
* traditional - in debug, we use random string (for speed); in release,
we calculate the hash, however we use LLVM/LLD's trick in that we
calculate a series of MD5 hashes in parallel and then one an MD5 of MD5
final hash to generate digest.
- Previously, some of the compress tests used `@src()` in combination with `dirname` and `openDirAbsolute` to read test files at runtime, which both excludes platforms that `openDirAbsolute` is not implemented for (WASI) and platforms that `SourceLocation.file` is not absolute (this was true for me locally on Windows). Instead of converting the tests to use `fs.cwd().openDir`, they have been converted to use `@embedFile` to avoid any potential problems with the runtime cwd.
- In order to use `@embedFile`, some of the `[]u8` parameters needed to be changed to `[]const u8`; none of them needed to be non-const anyway
- The tests now use `expectEqual` and `expectEqualSlices` where appropriate for better diagnostics