The name of the game here is to avoid CreateProcessW calls at all costs,
and only ever try calling it when we have a real candidate for execution.
Secondarily, we want to minimize the number of syscalls used when checking
for each PATHEXT-appended version of the app name.
An overview of the technique used:
- Open the search directory for iteration (either cwd or a path from PATH)
- Use NtQueryDirectoryFile with a wildcard filename of `<app name>*` to
check if anything that could possibly match either the unappended version
of the app name or any of the versions with a PATHEXT value appended exists.
- If the wildcard NtQueryDirectoryFile call found nothing, we can exit early
without needing to use PATHEXT at all.
This allows us to use a <open dir, NtQueryDirectoryFile, close dir> sequence
for any directory that doesn't contain any possible matches, instead of having
to use a separate look up for each individual filename combination (unappended +
each PATHEXT appended). For directories where the wildcard *does* match something,
we only need to do a maximum of <number of supported PATHEXT extensions> more
NtQueryDirectoryFile calls.
---
In addition, we now only evaluate the extensions in PATHEXT that we know we can handle (.COM, .EXE, .BAT, .CMD) and ignore the rest.
---
This commit also makes two edge cases match Windows behavior:
- If an app name has the extension .exe and it is attempted to be executed, that is now treated as unrecoverable and InvalidExe is immediately returned no matter where the .exe is (cwd or in the PATH). This matches the behavior of the Windows cmd.exe.
- If the app name contains more than just a filename (e.g. it has path separators), then it is excluded from PATH searching and only does a cwd search. This matches the behavior of Windows cmd.exe.
* Handle a `null` return from `llvmFieldIndex`.
* Add a behavior test to test this code path.
* Reword this test name, which incorrectly described how pointers to
zero-bit fields behave, and instead describe the actual test.
Previously we used the relocation index to find the corresponding
symbol that represents the type. However, the index actually
represents the index into the list of types. We solved this by
first retrieving the original type, and then finding its location
in the new list of types. When the atom file is 'null', it means
the type originates from a Zig function pointer or a synthetic
function. In both cases, the final type index was already resolved
and therefore equals to relocation's index value.
This was a poor naming choice; these are parameters, not arguments.
Parameters specify what kind of arguments are expected, whereas the arguments are the actual values passed.
This matches `cmd.exe` behavior. For example, if there is only a file named `mycommand` in the cwd but it is a Linux executable, then running the command `mycommand` will result in:
'mycommand' is not recognized as an internal or external command, operable program or batch file.
However, if there is *both* a `mycommand` (that is a Linux executable) and a `mycommand.exe` that is a valid Windows exe, then running the command `mycommand` will successfully run `mycommand.exe`.
Avoid a lot of unnecessary utf8 -> utf16 conversion and use a single ArrayList buffer for all the joined paths instead of a separate allocation for each join
For example, if the command is specified as `something.exe`, the retry will now try:
```
C:\some\path\something.exe
C:\some\path\something.exe.COM
C:\some\path\something.exe.EXE
C:\some\path\something.exe.BAT
... etc ...
```
whereas before it would only try the versions with an added extension from `PATHEXT`, which would cause the retry to fail on things that it should find.
When parsing the table of contents containing the symbols and their
positions we initially used the index within the map to retrieve
the offset. However, during resizing of the underlaying array this
would invalidate those indexes which meant incorrect offsets were
being stored for symbols. We now use the current symbol index
to also get the index into the symbol position instead.
Given `main.go`:
package main
import _ "os/user"
func main() {}
Compiling it to linux/arm64:
$ CGO_CFLAGS='-O0' GOOS=linux GOARCH=arm64 CGO_ENABLED=1 CC="zig cc -target aarch64-linux-gnu.2.28" go build main.go
Results in this error:
runtime/cgo(.text): unknown symbol memset in callarm64
runtime/cgo(.text): unknown symbol memset in callarm64
runtime/cgo(.text): relocation target memset not defined
In the midst of intermediate compilations files we can see this commmand:
ld.lld -o _cgo_.o <...> /tmp/go-build206961058/b043/_x009.o <...> ~/.cache/zig/.../libcompiler_rt.a <...> ~/.cache/.../libc.so.6
`_x009.o` needs memset:
$ readelf -Ws ./b043/_x009.o | grep memset
22: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND memset
Both `libcompiler_rt.a` and `libc.so.6` provide it:
$ readelf -Ws ~/.cache/zig/.../libcompiler_rt.a | grep memset
870: 0000000000000000 318 FUNC WEAK DEFAULT 519 memset
$ readelf -Ws ~/.cache/zig/.../libc.so.6 | grep -w memset
476: 000000000001d34c 0 FUNC GLOBAL DEFAULT 7 memset@@GLIBC_2.2.5
Since `libcompiler_rt.a` comes before libc in the linker line, the
resulting `_cgo_.o` still links to a weak, unversioned memset:
$ readelf -Ws ./b043/_cgo_.o | grep -w memset
40: 000000000022c07c 160 FUNC WEAK DEFAULT 14 memset
719: 000000000022c07c 160 FUNC WEAK DEFAULT 14 memset
Since the final linking step is done by Golang's linker, it does not
know of `libcompiler_rt.a`, and fails to link with the error message
above. However, Go linker does recognize memset from glibc. If we
specify an `-lc` equivalent before the `libcompiler_rt.a`, it will link
to memset from libc:
$ readelf -Wa ./b043/_x009.o |grep memset
14: 0000000000000000 0 FUNC GLOBAL DEFAULT UND memset@GLIBC_2.17 (2)
157: 0000000000000000 0 FUNC GLOBAL DEFAULT UND memset@GLIBC_2.17
... and then `main.go` will compile+link successfully.
Why doesn't Go linker take memset from glibc? An educated guess: Go
determines whether to link with glibc from what the program asks (I
presume `.dynsym`). Since `memset` is no longer attributed to glibc, Go
skips linking to glibc altogether.
Bonus question: curious why `-O0` is necessary? Because when
optimizations are enabled (the default), the C compiler replaces
`memset` function call with plain `stp` instructions (on aarch64).