Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime/pprof: syscall exit trace processing is buggy #11262

Closed
rsc opened this issue Jun 17, 2015 · 3 comments
Closed

runtime/pprof: syscall exit trace processing is buggy #11262

rsc opened this issue Jun 17, 2015 · 3 comments
Milestone

Comments

@rsc
Copy link
Contributor

rsc commented Jun 17, 2015

When tracing is enabled, the trace begins with a sequence of GoCreate/GoWaiting/GoInSyscall events describing the state of all the extant goroutines. This is done with the world stopped.

On system call exit, we wait until the world has started again to write the GoSysExit event.

The trace reader requires that the GoSysExit event be preceded by an event putting the goroutine into the system call. During ordinary execution that event is the GoSysCall event; at the beginning of a trace that is the GoInSyscall event.

As written, the trace has this property.

However, the tracer reader overwrites the time stamp on the GoSysExit event with the time stamp of the actual return from the system call, recorded in the trace file as ev.Args[1]. The actual system call exit can happen concurrently with the generation of the initial trace events, and in real time before the writing of the corresponding goroutine's GoInSyscall event. The entries are ordered correctly by recorded time stamp, but then when the reader smashes the GoSysExit event's recorded time stamp with this earlier time stamp and sorts the entries, it ends up with the GoSysExit before the GoInSyscall.

This makes the log reader unhappy, which makes many of the trace tests flaky.

CL https://go-review.googlesource.com/11180 works around this by making the error message about bad syscall exits contain the text 'SKIP TEST' and then if the test gets an error with that text, it calls t.Skip instead of t.Fatal.

But this needs a proper fix for the release: presumably the same problem makes the traces unusable in real life should this occur.

@rsc rsc added this to the Go1.5 milestone Jun 17, 2015
@rsc
Copy link
Contributor Author

rsc commented Jun 17, 2015

I should add: to reproduce easily, take out the SKIP TEST message in internal/trace/parser.go and then run

cd runtime/pprof
GOARCH=386 go test -c
cat >bad <<EOF
#!/bin/bash
GOGC=2 GOTRACEBACK=crash ./pprof.test -test.short -test.v -test.run=Trace
EOF
chmod +x bad
stress ./bad

@gopherbot
Copy link

CL https://golang.org/cl/11180 mentions this issue.

@dvyukov
Copy link
Member

dvyukov commented Jun 17, 2015

cl/9132 is meant to fix it

@golang golang locked and limited conversation to collaborators Jun 25, 2016
@rsc rsc unassigned dvyukov Jun 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants