11610 Commits

Author SHA1 Message Date
mlugg
3a9c680ad7
std: allow disabling stack tracing
This option disables both capturing and printing stack traces. The
default is to disable if debug info is stripped.
2025-09-30 13:44:55 +01:00
mlugg
abb2b1e2da
std.debug: update support checks 2025-09-30 13:44:55 +01:00
mlugg
dd8d59686a
std.debug: miscellaneous fixes
Mostly on macOS, since Loris showed me a not-great stack trace, and I
spent 8 hours trying to make it better. The dyld shared cache is
designed in a way which makes this really hard to do right, and
documentation is non-existent, but this *seems* to work pretty well.
I'll leave the ruling on whether I did a good job to CI and our users.
2025-09-30 13:44:54 +01:00
mlugg
a18fd41064
std: rework/remove ucontext_t
Our usage of `ucontext_t` in the standard library was kind of
problematic. We unnecessarily mimiced libc-specific structures, and our
`getcontext` implementation was overkill for our use case of stack
tracing.

This commit introduces a new namespace, `std.debug.cpu_context`, which
contains "context" types for various architectures (currently x86,
x86_64, ARM, and AARCH64) containing the general-purpose CPU registers;
the ones needed in practice for stack unwinding. Each implementation has
a function `current` which populates the structure using inline
assembly. The structure is user-overrideable, though that should only be
necessary if the standard library does not have an implementation for
the *architecture*: that is to say, none of this is OS-dependent.

Of course, in POSIX signal handlers, we get a `ucontext_t` from the
kernel. The function `std.debug.cpu_context.fromPosixSignalContext`
converts this to a `std.debug.cpu_context.Native` with a big ol' target
switch.

This functionality is not exposed from `std.c` or `std.posix`, and
neither are `ucontext_t`, `mcontext_t`, or `getcontext`. The rationale
is that these types and functions do not conform to a specific ABI, and
in fact tend to get updated over time based on CPU features and
extensions; in addition, different libcs use different structures which
are "partially compatible" with the kernel structure. Overall, it's a
mess, but all we need is the kernel context, so we can just define a
kernel-compatible structure as long as we don't claim C compatibility by
putting it in `std.c` or `std.posix`.

This change resulted in a few nice `std.debug` simplifications, but
nothing too noteworthy. However, the main benefit of this change is that
DWARF unwinding---sometimes necessary for collecting stack traces
reliably---now requires far less target-specific integration.

Also fix a bug I noticed in `PageAllocator` (I found this due to a bug
in my distro's QEMU distribution; thanks, broken QEMU patch!) and I
think a couple of minor bugs in `std.debug`.

Resolves: #23801
Resolves: #23802
2025-09-30 13:44:54 +01:00
mlugg
604fb3001d
std.start: also don't print error trace targeting .other
This only matters if `callMain` is called by a user, since `std.start`
will never itself call `callMain` when `target.os.tag == .other`.
However, it *is* a valid use case for a user to call
`std.start.callMain` in their own startup logic, so this makes sense.
2025-09-30 13:44:54 +01:00
mlugg
d289667856
std.debug.Pdb: fix leak 2025-09-30 13:44:54 +01:00
mlugg
51d08f4b9b
fix compile errors and minor bugs 2025-09-30 13:44:54 +01:00
mlugg
344ab62b3f
std.debug: don't attempt SelfInfo unwinding when unsupported 2025-09-30 13:44:53 +01:00
mlugg
e6eccc3c8f
SelfInfo: remove x86-windows unwinding path
Turns out that RtlCaptureStackBackTrace is actually just doing FP (ebp)
unwinding under the hood, making this logic completely redundant with
our own FP-walking implementation; see added comment for details.
2025-09-30 13:44:53 +01:00
mlugg
1a8a8c610d
tests: split up and enhance stack trace tests
Previously, the `test-stack-traces` step was essentially just testing
error traces, and even there we didn't have much coverage. This commit
solves that by splitting the "stack trace" tests into two separate
harnesses: the "stack trace" tests are for actual stack traces (i.e.
involving stack unwinding), while the "error trace" tests are
specifically for error return traces.

The "stack trace" tests will test different configurations of:

* `-lc`
* `-fPIE`
* `-fomit-frame-pointer`
* `-fllvm`
* unwind tables (currently disabled)
* strip debug info (currently disabled)

The main goal there is to test *stack unwinding* under different
conditions. Meanwhile, the "error trace" tests will test different
configurations of `-O` and `-fllvm`; the main goal here, aside from
checking that error traces themselves do not miscompile, is to check
whether debug info is still working even in optimized builds. Of course,
aggressive optimizations *can* thwart debug info no matter what, so as
before, there is a way to disable cases for specific targets / optimize
modes.

The program which converts stack traces into a more validatable format
by removing things like addresses (previously `check-stack-trace.zig`,
now `convert-stack-trace.zig`) has been rewritten and simplified. Also,
thanks to various fixes in this branch, several workarounds have become
unnecessary: for instance, we don't need to ignore the function name
printed in stack traces in release modes, because `std.debug.Dwarf` now
uses the correct DIE for inlined functions!

Neither `test-stack-traces` nor `test-error-traces` does general foreign
architecture testing, because it seems that (at least for now) external
executors often aren't particularly good at handling stack tracing
correctly (looking at you, Wine). Generally, they just test the native
target (this matches the old behavior of `test-stack-traces`). However,
there is one exception: when on an x86_64 or aarch64 host, we will also
test the 32-bit version (x86 or arm) if the OS supports it, because such
executables can be trivially tested without an external executor.

Oh, also, I wrote a bunch of stack trace tests. Previously there was,
erm, *one* test in `test-stack-traces` which wasn't for error traces.
Now there are a good few!
2025-09-30 13:44:53 +01:00
mlugg
cedd9de64f
std.debug.Dwarf: fix names of inlined functions 2025-09-30 13:44:53 +01:00
mlugg
a12ce28224
std: fix os.linux.x86.syscall6
It was possible for `arg6` to be passed as an operand relative to esp.
In that case, the `push` at the top clobbered esp and hence made the
reference to arg6 invalid. This was manifesting in this branch as broken
stack traces on x86-linux due to an `mmap2` syscall accidentally passing
the page offset as non-zero!

This commit fixes a bug introduced in cb0e6d8aa.
2025-09-30 13:44:53 +01:00
mlugg
9901b9389e
std: fix 32-bit build and some unsafe casts 2025-09-30 13:44:53 +01:00
mlugg
7601b397ef
fix bad merge
The API of `std.debug.Pdb` changed.
2025-09-30 13:44:53 +01:00
mlugg
02a0ade138
std.debug: never attempt FP unwind under fomit-frame-pointer 2025-09-30 13:44:53 +01:00
mlugg
4e45362529
link.Elf: fix static PIE
We mustn't emit the DT_PLTGOT entry in `.dynamic` in a statically-linked
PIE, because there's no dl to relocate it (and `std.pie.relocate`, or
the PIE relocator in libc, won't touch it). In that case, there cannot
be any PLT entries, so there's no point emitting the `.got.plt` section
at all. If we just don't create that section, `link.Elf` already knows
not to add the DT_PLTGOT entry to `.dynamic`.

Co-authored-by: Jacob Young <jacobly0@users.noreply.github.com>
2025-09-30 13:44:53 +01:00
mlugg
1123741fd5
Dwarf: use 'gpa' terminology 2025-09-30 13:44:52 +01:00
mlugg
bfbbda7751
compiler: fix new panic handler in release builds 2025-09-30 13:44:52 +01:00
mlugg
c1a30bd0d8
std: replace debug.Dwarf.ElfModule with debug.ElfFile
This abstraction isn't really tied to DWARF at all! Really, we're just
loading some information from an ELF file which is useful for debugging.
That *includes* DWARF, but it also includes other information. For
instance, the other change here:

Now, if DWARF information is missing, `debug.SelfInfo.ElfModule` will
name symbols by finding a matching symtab entry. We actually already do
this on Mach-O, so it makes obvious sense to do the same on ELF! This
change is what motivated the restructuring to begin with.

The symtab work is derived from #22077.

Co-authored-by: geemili <opensource@geemili.xyz>
2025-09-30 13:44:52 +01:00
mlugg
f798048739
std.debug: don't include dumpCurrentStackTrace frame
If it's not given, we should set `first_address` to the return address
of `dumpCurrentStackTrace` to avoid the call to `writeCurrentStackTrace`
appearing in the trace. However, we must only do that if no `context` is
given; if there's a context then we're starting the stack unwind
elsewhere.
2025-09-30 13:44:52 +01:00
mlugg
e6adddf80c
small reasonable change 2025-09-30 13:44:52 +01:00
mlugg
2743fdb7ce
std.debug: try removing a probably-redundant condition 2025-09-30 13:44:52 +01:00
mlugg
229f0a01b8
std.debug: handle ThreadContext slightly better
It's now user-overrideable, and uses `noreturn` types to neatly stop
analysis.
2025-09-30 13:44:52 +01:00
mlugg
1392a7af17
std.debug: unwinding on Windows
...using `RtlVirtualUnwind` on x86_64 and aarch64, and
`RtaCaptureStackBackTrace` on x86.
2025-09-30 13:44:52 +01:00
mlugg
ac4d633ed6
std: fix debug.Info and debug.Coverage 2025-09-30 13:44:52 +01:00
mlugg
3a561da38d
std: doc comments and tweaks 2025-09-30 13:44:51 +01:00
mlugg
0c7b2a7bd5
fix compiler ftbfs from std.macho and std.dwarf changes 2025-09-30 13:44:51 +01:00
mlugg
202aeacc05
std: fixes 2025-09-30 13:44:51 +01:00
mlugg
f1215adeda
SelfInfo.DarwinModule: rename field 2025-09-30 13:44:51 +01:00
mlugg
253fdfce70
SelfInfo: be honest about how general unwinding is
...in that it isn't: it's currently very specialized to DWARF unwinding.

Also, make a type unmanaged.
2025-09-30 13:44:51 +01:00
mlugg
9859440d83
add freestanding support IN THEORY
untested because this branch has errors rn
2025-09-30 13:44:51 +01:00
mlugg
c2ada49354
replace usages of old std.debug APIs
src/crash_handler.zig is still TODO though, i am planning bigger changes there
2025-09-30 13:44:51 +01:00
mlugg
5709369d05
std.debug: improve the APIs and stuff 2025-09-30 13:44:51 +01:00
mlugg
d4f710791f
tweaks 2025-09-30 13:44:51 +01:00
mlugg
67fa5664b7
std.posix: mark getcontext as unsupported by default 2025-09-30 13:44:51 +01:00
mlugg
ba5d9d5a41
remove redundant test
turns out this actually has coverage in std.debug
2025-09-30 13:44:50 +01:00
mlugg
405075f745
SelfInfo: load eh_frame/debug_frame from ELF file if eh_frame_hdr omitted 2025-09-30 13:44:50 +01:00
mlugg
c895aa7a35
std.debug.SelfInfo: concrete error sets
The downside of this commit is that more precise errors are no longer
propagated up. However, these errors were pretty useless in isolation
due to them having no context; and regardless, we intentionally swallow
most of them in `std.debug` anyway. Therefore, this is better in
practice, because it allows `std.debug` to give slightly more useful
warnings when handling errors. This commit does that for unwind errors,
for instance, which differentiate between the unwind info being corrupt
vs missing vs inaccessible vs unsupported.

A better solution would be to also include more detailed information via
the diagnostics pattern, but this commit is an incremental improvement.
2025-09-30 13:44:50 +01:00
mlugg
dd9cb1beea
doc comments 2025-09-30 13:44:50 +01:00
mlugg
5e6a1919c7
fix aarch64-macos DWARF unwinding
turns out this isn't technically specific to that target at all; other
targets just don't emit mid-function 'ret' instructions as much so
certain CFI instruction patterns were only seen on aarch64.

thanks to jacob for finding the bug <3
2025-09-30 13:44:50 +01:00
mlugg
4b47a37717
stash? more like no 2025-09-30 13:44:50 +01:00
mlugg
665f13b0cd
SelfInfo deinit magic 2025-09-30 13:44:50 +01:00
mlugg
ba3f38959a
split SelfInfo into a file per impl 2025-09-30 13:44:50 +01:00
mlugg
1397b95143
std.debug.Dwarf: eliminate host pointer size dependency 2025-09-30 13:44:50 +01:00
mlugg
b762cd30fd
remove TODOs which are done or which i'm not actually gonna do lol 2025-09-30 13:44:50 +01:00
mlugg
e4dbfc109b
dont dupe state you silly billy 2025-09-30 13:44:50 +01:00
mlugg
8fdcdb8c69
the world if Dwarf.ElfModule was like REALLY good: 2025-09-30 13:44:49 +01:00
mlugg
84b65860cf
the world if ElfModule didn't suck: 2025-09-30 13:44:49 +01:00
mlugg
55a7affea4
me when i did a thing 2025-09-30 13:44:49 +01:00
mlugg
25e02bed4c
less hacky :D 2025-09-30 13:44:49 +01:00