Mostly on macOS, since Loris showed me a not-great stack trace, and I
spent 8 hours trying to make it better. The dyld shared cache is
designed in a way which makes this really hard to do right, and
documentation is non-existent, but this *seems* to work pretty well.
I'll leave the ruling on whether I did a good job to CI and our users.
Our usage of `ucontext_t` in the standard library was kind of
problematic. We unnecessarily mimiced libc-specific structures, and our
`getcontext` implementation was overkill for our use case of stack
tracing.
This commit introduces a new namespace, `std.debug.cpu_context`, which
contains "context" types for various architectures (currently x86,
x86_64, ARM, and AARCH64) containing the general-purpose CPU registers;
the ones needed in practice for stack unwinding. Each implementation has
a function `current` which populates the structure using inline
assembly. The structure is user-overrideable, though that should only be
necessary if the standard library does not have an implementation for
the *architecture*: that is to say, none of this is OS-dependent.
Of course, in POSIX signal handlers, we get a `ucontext_t` from the
kernel. The function `std.debug.cpu_context.fromPosixSignalContext`
converts this to a `std.debug.cpu_context.Native` with a big ol' target
switch.
This functionality is not exposed from `std.c` or `std.posix`, and
neither are `ucontext_t`, `mcontext_t`, or `getcontext`. The rationale
is that these types and functions do not conform to a specific ABI, and
in fact tend to get updated over time based on CPU features and
extensions; in addition, different libcs use different structures which
are "partially compatible" with the kernel structure. Overall, it's a
mess, but all we need is the kernel context, so we can just define a
kernel-compatible structure as long as we don't claim C compatibility by
putting it in `std.c` or `std.posix`.
This change resulted in a few nice `std.debug` simplifications, but
nothing too noteworthy. However, the main benefit of this change is that
DWARF unwinding---sometimes necessary for collecting stack traces
reliably---now requires far less target-specific integration.
Also fix a bug I noticed in `PageAllocator` (I found this due to a bug
in my distro's QEMU distribution; thanks, broken QEMU patch!) and I
think a couple of minor bugs in `std.debug`.
Resolves: #23801Resolves: #23802
This abstraction isn't really tied to DWARF at all! Really, we're just
loading some information from an ELF file which is useful for debugging.
That *includes* DWARF, but it also includes other information. For
instance, the other change here:
Now, if DWARF information is missing, `debug.SelfInfo.ElfModule` will
name symbols by finding a matching symtab entry. We actually already do
this on Mach-O, so it makes obvious sense to do the same on ELF! This
change is what motivated the restructuring to begin with.
The symtab work is derived from #22077.
Co-authored-by: geemili <opensource@geemili.xyz>
The downside of this commit is that more precise errors are no longer
propagated up. However, these errors were pretty useless in isolation
due to them having no context; and regardless, we intentionally swallow
most of them in `std.debug` anyway. Therefore, this is better in
practice, because it allows `std.debug` to give slightly more useful
warnings when handling errors. This commit does that for unwind errors,
for instance, which differentiate between the unwind info being corrupt
vs missing vs inaccessible vs unsupported.
A better solution would be to also include more detailed information via
the diagnostics pattern, but this commit is an incremental improvement.