cmd/trace: large traces fail to open in the trace-viewer #15482

Closed
mkobetic opened this issue Apr 28, 2016 · 11 comments
Comments

@mkobetic

mkobetic commented Apr 28, 2016

  1. What version of Go are you using (go version)?

Recent master (https://go.googlesource.com/go/+/0436a89a2c5afad41356dc1dff7c745cd30636a7)

Martins-MacBook-Pro:src martin$ git rev-parse master...
681c388e65af000772ed9e5484942610de917846
0436a89a2c5afad41356dc1dff7c745cd30636a7
^0436a89a2c5afad41356dc1dff7c745cd30636a7
Martins-MacBook-Pro:src martin$ $GOROOT/bin/go version
go version devel +1ad70de Wed Apr 27 14:58:01 2016 -0400 darwin/amd64
  2. What operating system and processor architecture are you using (go env)?
Martins-MacBook-Pro:src martin$ go env
GOARCH="amd64"
GOBIN="/Users/martin/bin"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/martin/shopify/go:/Users/martin/go"
GORACE=""
GOROOT="/Users/martin/go/go"
GOTOOLDIR="/Users/martin/go/go/pkg/tool/darwin_amd64"
GO15VENDOREXPERIMENT="1"
CC="clang"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fno-common"
CXX="clang++"
CGO_ENABLED="1"
  3. What did you do?

I'm trying to use the tracing facilities to diagnose a production issue in our system. A 30-second trace produces a ~200MB file containing ~45M events. Running the trace tool on it takes a while to parse but seems to succeed (I instrumented the trace cmd binary to see what's going on inside). Attempting to open the full trace at this size crashes the JS side ("Aw snap!" screen). Going into the goroutine list and filtering down to a single goroutine (~1M events) fails on the JS side with Uncaught TypeError: tr.importer.Import is not a constructor, which I had hoped https://go-review.googlesource.com/#/c/22013 had already fixed.

I can't upload the production trace here, so I tried to reproduce the problem with a different trace. A shorter trace (~1MB, ~200K events) did not reproduce it, so in the end I instrumented the trace command itself to record a trace while it processed the production trace. That gave me a trace of ~45MB (~13M events), which does seem to reproduce the issue.

Here's how I compiled the trace command (given the above environment):

Martins-MacBook-Pro:src martin$ $GOROOT/bin/go build -x -o ~/bin/gotrace cmd/trace
WORK=/var/folders/d6/sqkwj26x2mx7f2tf_wk_9vym0000gn/T/go-build090429502
mkdir -p $WORK/cmd/trace/_obj/
mkdir -p $WORK/cmd/trace/_obj/exe/
cd /Users/martin/go/go/src/cmd/trace
/Users/martin/go/go/pkg/tool/darwin_amd64/compile -o $WORK/cmd/trace.a -trimpath $WORK -p main -complete -buildid 1ec24b4565ae96218700656adedb607074140dfa -D _/Users/martin/go/go/src/cmd/trace -I $WORK -pack ./goroutines.go ./main.go ./pprof.go ./trace.go
cd .
/Users/martin/go/go/pkg/tool/darwin_amd64/link -o $WORK/cmd/trace/_obj/exe/a.out -L $WORK -extld=clang -buildmode=exe -buildid=1ec24b4565ae96218700656adedb607074140dfa $WORK/cmd/trace.a
mkdir -p /Users/martin/bin/
mv $WORK/cmd/trace/_obj/exe/a.out /Users/martin/bin/gotrace

And this is how I ran the trace. The logging instrumentation is mine, as is the trimming of the trace to 1M events, because anything more than that seems to crash the JS viewer:

Martins-MacBook-Pro:pprof2 martin$ gotrace ~/bin/gotrace trace
2016/04/28 14:35:50 parsing trace
2016/04/28 14:36:09 parsed 12894141 events
2016/04/28 14:36:13 tracing 12894141 events
2016/04/28 14:36:18 traced 1000000 events, 1814 frames
2016/04/28 14:36:21 sent serialized trace

I'll upload the trace file and binary as a gist or something and link it below.

  4. What did you expect to see?

Well, I was hoping to take a look at the production trace, no luck so far.

  5. What did you see instead?

:crash: & :burn: :-). Seriously though, I realize these are big traces, but that's what they seem to be from production systems. Either we can figure out how to deal with them in full size or we provide some ways to filter the traces to something the viewer can manage. Otherwise the tool won't be very usable in production scenarios.

@mkobetic
Author

This is a tarball with the gotrace binary and the trace file itself https://drive.google.com/file/d/0B0JvI45O5NdxemtUN0pVRzVQbnc/view?usp=sharing

@ianlancetaylor ianlancetaylor changed the title from "cmd/trace large traces fail to open in the trace-viewer" to "cmd/trace: large traces fail to open in the trace-viewer" Apr 28, 2016
@ianlancetaylor ianlancetaylor added this to the Go1.7Maybe milestone Apr 28, 2016
@dvyukov
Member

dvyukov commented May 2, 2016

@hyangah @Sajmani

I can think of only the following short-term stopgap fix:
Split the trace into <256MB parts so that the trace command page shows:

View trace (0-X sec)
View trace (X-Y sec)
View trace (Y-Z sec)

Does it sound good to you?
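The splitting idea can be sketched roughly as follows. This is only an illustration under stated assumptions: Event, Range and splitBySize are hypothetical names, the per-event Size field stands in for whatever serialized size the real tool would measure, and none of this is taken from the actual change.

```go
package main

import "fmt"

// Event is a hypothetical, simplified trace event. The real parser's event
// type is internal to the Go tree and carries many more fields.
type Event struct {
	Ts   int64 // timestamp in nanoseconds since the start of the trace
	Size int   // assumed serialized size of the event in bytes
}

// Range describes one part of the trace: events[Start:End] and the
// timestamps of its first and last event.
type Range struct {
	Start, End     int
	StartTs, EndTs int64
}

// splitBySize walks time-ordered events and starts a new part whenever
// adding the next event would push the current part past maxBytes.
// A single event larger than maxBytes still gets a part of its own.
func splitBySize(events []Event, maxBytes int) []Range {
	var parts []Range
	start, size := 0, 0
	for i, ev := range events {
		if i > start && size+ev.Size > maxBytes {
			parts = append(parts, Range{
				Start: start, End: i,
				StartTs: events[start].Ts, EndTs: events[i-1].Ts,
			})
			start, size = i, 0
		}
		size += ev.Size
	}
	if start < len(events) {
		parts = append(parts, Range{
			Start: start, End: len(events),
			StartTs: events[start].Ts, EndTs: events[len(events)-1].Ts,
		})
	}
	return parts
}

func main() {
	// Ten synthetic events, one per second, 100 bytes each.
	events := make([]Event, 10)
	for i := range events {
		events[i] = Event{Ts: int64(i) * 1e9, Size: 100}
	}
	for _, p := range splitBySize(events, 300) {
		fmt.Printf("View trace (%d-%d sec)\n", p.StartTs/1e9, p.EndTs/1e9)
	}
}
```

Splitting on a size budget rather than on fixed time intervals keeps each part within what the viewer can load, at the cost of parts that cover uneven time spans.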

@mkobetic
Author

mkobetic commented May 2, 2016

The best I was able to get was trimming the trace down to the first ~1M events. IIRC that covered only about 100ms of time, maybe even less. I think it may be more useful to slice the traces into layers by event type, or by even more specific filtering criteria. I haven't run the stats on the traces to see which types of events prevail, but I'm hoping that suppressing the ones that cause trouble may still leave enough to give you some sense of what's going on.
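As a rough sketch of what "running the stats" and filtering by event type could look like, assuming the events have already been parsed into memory: Event, countByType and dropTypes are hypothetical names, and the event type strings are illustrative rather than a list of what the tracer emits.

```go
package main

import (
	"fmt"
	"sort"
)

// Event is a hypothetical, simplified stand-in for a parsed trace event.
type Event struct {
	Type string // event kind; the names used below are illustrative
	G    uint64 // goroutine ID
}

// countByType tallies how many events of each type the trace contains.
func countByType(events []Event) map[string]int {
	counts := make(map[string]int)
	for _, ev := range events {
		counts[ev.Type]++
	}
	return counts
}

// dropTypes returns the events whose type is not in drop, preserving order.
func dropTypes(events []Event, drop map[string]bool) []Event {
	var kept []Event
	for _, ev := range events {
		if !drop[ev.Type] {
			kept = append(kept, ev)
		}
	}
	return kept
}

func main() {
	events := []Event{
		{"GoStart", 1}, {"HeapAlloc", 1}, {"HeapAlloc", 2},
		{"GoBlock", 1}, {"HeapAlloc", 2}, {"GoStart", 2},
	}

	// Print event types from most to least frequent.
	counts := countByType(events)
	types := make([]string, 0, len(counts))
	for t := range counts {
		types = append(types, t)
	}
	sort.Slice(types, func(i, j int) bool { return counts[types[i]] > counts[types[j]] })
	for _, t := range types {
		fmt.Printf("%-10s %d\n", t, counts[t])
	}

	// Suppress the dominant type and see how much is left.
	kept := dropTypes(events, map[string]bool{"HeapAlloc": true})
	fmt.Printf("kept %d of %d events\n", len(kept), len(events))
}
```

Whether a trace with some event types removed would still render sensibly in the viewer is an open question that this sketch does not address.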

@mkobetic
Author

mkobetic commented May 2, 2016

BTW, all the traces I've had were <256MB. The viewer seems to handle ~1M events at best, which is ~3MB of the binary trace on average.
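To put these figures together: if the viewer tops out at roughly 1M events, the number of parts a trace would need follows from a ceiling division. The sketch below uses the event counts reported in this thread; partsNeeded is a hypothetical helper and the 1M limit is the approximate figure quoted above, not a measured constant.

```go
package main

import "fmt"

// partsNeeded returns how many parts of at most perPart events are required
// to cover totalEvents (ceiling division).
func partsNeeded(totalEvents, perPart int) int {
	if totalEvents <= 0 || perPart <= 0 {
		return 0
	}
	return (totalEvents + perPart - 1) / perPart
}

func main() {
	const viewerLimit = 1000000 // ~1M events, the approximate limit reported above

	// ~45M events: the 30-second production trace.
	fmt.Println("production trace parts:", partsNeeded(45000000, viewerLimit))
	// 12894141 events: the trace of the trace command itself.
	fmt.Println("reproduction trace parts:", partsNeeded(12894141, viewerLimit))
}
```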

@dvyukov
Member

dvyukov commented May 3, 2016

> I think it may be more useful to slice the traces into layers by event type or even more specific filtering criteria.

What would those criteria be?
Note also that the tree is closed for new development now, so anything complex and non-obvious won't go in.

@mkobetic
Author

mkobetic commented May 3, 2016

I was thinking of something along the lines of the -focus option in the pprof tool, but I certainly haven't thought that through. I really need to take a closer look at the contents of those profiles to come up with something more specific.

But that's definitely beyond the scope of simple and obvious. In the near future I was hoping that fixing the Uncaught TypeError: tr.importer.Import is not a constructor error would be doable, so that the goroutine filtering would have a better chance of working in at least some cases.

@dvyukov
Member

dvyukov commented May 3, 2016

@mkobetic Are you sure you have GOROOT set properly? The trace command picks up the trace viewer HTML from GOROOT.
With the pending https://go-review.googlesource.com/#/c/22731/ I can open your trace (both a goroutine trace and individual parts of the full trace).

@gopherbot

CL https://golang.org/cl/22731 mentions this issue.

@mkobetic
Author

mkobetic commented May 3, 2016

Hm, I'm not sure whether I had it set in the terminal where I ran the gotrace binary; I'll double-check.

@mkobetic
Author

mkobetic commented May 4, 2016

I think you're right, I most likely forgot to set GOROOT when I ran the binary (I ran it in a different directory in a different terminal, so it was easy to miss). With GOROOT set properly I can't reproduce the JS error anymore. My apologies.

@dvyukov
Member

dvyukov commented May 4, 2016

@mkobetic No problem. Thanks for confirming that it works for you now.

@golang golang locked and limited conversation to collaborators May 12, 2017
@rsc rsc unassigned dvyukov Jun 23, 2022