Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime/trace: flush trace data on non-throw crashes #65319

Closed
mknyszek opened this issue Jan 26, 2024 · 3 comments
Closed

runtime/trace: flush trace data on non-throw crashes #65319

mknyszek opened this issue Jan 26, 2024 · 3 comments
Assignees
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@mknyszek
Copy link
Contributor

If a Go program has tracing enabled and crashes, chances are that the most recent data (the most useful data) won't be properly flushed, and the trace will be broken. We can discard this broken part of the trace in the tooling (#65316), but it doesn't change the fact that we might loose a lot of information.

The thing is, many crashes that only impact user program state (such as nil dereferences and uncaught-but-recoverable panics) can absolutely still go through with a global buffer flush (runtime.traceAdvance) since the runtime state is still OK.

I'd like to suggest explicitly flushing all trace data on an uncaught panic or a crash due to some "easier" case, like nil dereferences, so that as much of the data comes out in-tact as possible.

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jan 26, 2024
@mknyszek
Copy link
Contributor Author

Note: this is related to #63185 (flight recording) as well, since this could make recovering trace data from a crash while flight recording was enabled much more successful in the future. We could consider also adding the ability to install an optional handler to the flight recorder for writing out trace data in these cases, though that should probably go in the flight recording proposal.

@mknyszek mknyszek added this to the Backlog milestone Jan 26, 2024
@mknyszek
Copy link
Contributor Author

This also goes hand-in-hand with #65316, since it's still likely the tail end of the trace data will be broken, since the crash still has to happen.

@cherrymui cherrymui added the NeedsFix The path to resolution is known, but the work has not been done. label Jan 26, 2024
@mknyszek mknyszek self-assigned this Jan 31, 2024
@gopherbot
Copy link

Change https://go.dev/cl/562616 mentions this issue: runtime: call traceAdvance before exiting

ezz-no pushed a commit to ezz-no/go-ezzno that referenced this issue Feb 18, 2024
This ensures the trace buffers are as up-to-date as possible right
before crashing. It increases the chance of finding the culprit for the
crash when looking at core dumps, e.g. if slowness is the cause for the
crash (monitor kills process).

Fixes golang#65319.

Change-Id: Iaf5551911b3b3b01ba65cb8749cf62a411e02d9c
Reviewed-on: https://go-review.googlesource.com/c/go/+/562616
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

3 participants