cmd/trace: support for perfetto #57315

Open
felixge opened this issue Dec 14, 2022 · 21 comments
Assignees
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. Proposal Proposal-Accepted
Milestone

Comments

@felixge
Contributor

felixge commented Dec 14, 2022

The current catapult trace viewer is bit rotting and struggles with large traces. Perfetto UI could offer better performance and UX.

This was already proposed in #57175 but no dedicated issue existed yet.

Also see #57159 which could be a follow-up enhancement.

@gopherbot gopherbot added this to the Proposal milestone Dec 14, 2022
@gopherbot

Change https://go.dev/cl/457716 mentions this issue: cmd/trace: experimental support for perfetto

@felixge
Contributor Author

felixge commented Dec 14, 2022

The CL linked above implements a prototype for this. In practice it looks like this:

Notes on the big trace:

  • The catapult UI frequently freezes up completely while trying to view it. Perfetto seems to handle it without issues. I'm not showing it in the video because I don't want to share too many details from the big trace.
  • The perfetto integration currently requires two clicks. This perfetto-dev thread explains the underlying browser popup blocking issue.

@ianlancetaylor
Contributor

CC @golang/runtime

@prattmic prattmic added the compiler/runtime Issues related to the Go compiler and/or runtime. label Dec 15, 2022
@prattmic prattmic changed the title proposal: runtime/trace: support for perfetto proposal: cmd/trace: support for perfetto Dec 15, 2022
@prattmic
Member

Thanks for filing this! We certainly need to replace Catapult and Perfetto is the best alternative that I have seen, so overall I am very supportive of this.

Regarding #57315 (comment), until recently I hadn't really considered adding Perfetto support without implementing the trace proto format (#57159), as previously Perfetto didn't handle the JSON very well. But I tried again a few days ago, and it seems to work surprisingly well.

The Perfetto format looks daunting due to a huge number of custom event types for Chrome/Android [1], but the core ones (TracePacket, TrackEvent) seem straightforward, so I don't think that implementing proto support would be too difficult.

Perhaps we could add "experimental" links like in https://go.dev/cl/457716, and also work on the proto format. If both end up in the same release, great. If not, JSON is an OK fallback.

[1] These make me wonder if they would be open to adding Go-specific types for Ps and Gs rather than hacking them into existing constructs.

@prattmic
Member

prattmic commented Dec 15, 2022

One issue I did see was that some (but not all) flow events displayed incorrectly.

For example, these events somehow became a loop in Perfetto?

Catapult: image

Perfetto: image (1)
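
For readers unfamiliar with flow events, here is a minimal sketch of how a flow arrow is commonly expressed in the legacy Trace Event JSON format: a start event (`"ph": "s"`) and a finish event (`"ph": "f"`) that share an `id`, each binding to the slice on its thread at its timestamp. This is an illustration only, not the output of `go tool trace`. The `flowEvent` type, the `flowPair` helper, the event name, the category, and the pid/tid values are all hypothetical. How each viewer resolves the binding (for example via the `bp` field) may differ, which is one plausible place for rendering discrepancies like the one above to arise.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// flowEvent is a hypothetical, minimal subset of the fields a flow event
// carries in the Trace Event JSON format.
type flowEvent struct {
	Name string  `json:"name"`
	Cat  string  `json:"cat"`
	Ph   string  `json:"ph"` // "s" = flow start, "f" = flow finish
	ID   string  `json:"id"` // start and finish are matched by this id
	Ts   float64 `json:"ts"` // timestamp in microseconds
	Pid  int     `json:"pid"`
	Tid  int     `json:"tid"`
	Bp   string  `json:"bp,omitempty"` // "e" = bind to the enclosing slice
}

// flowPair builds the start/finish pair for a single arrow going from
// (fromTs, fromTid) to (toTs, toTid). Pid 1 is an arbitrary placeholder.
func flowPair(name, id string, fromTs float64, fromTid int, toTs float64, toTid int) [2]flowEvent {
	return [2]flowEvent{
		{Name: name, Cat: "flow", Ph: "s", ID: id, Ts: fromTs, Pid: 1, Tid: fromTid},
		{Name: name, Cat: "flow", Ph: "f", ID: id, Ts: toTs, Pid: 1, Tid: toTid, Bp: "e"},
	}
}

func main() {
	pair := flowPair("unblock", "1", 100, 1, 250, 2)
	out, err := json.MarshalIndent(pair, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```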

@felixge
Contributor Author

felixge commented Dec 15, 2022

Perhaps we could add "experimental" links like in https://go.dev/cl/457716, and also work on the proto format. If both end up in the same release, great. If not, JSON is an OK fallback.

Sounds good.

My colleague @nsrip-dd is already looking into the proto format. It looks daunting, but as you said, perhaps we just need to worry about a small subset of it.

One issue I did see was that some (but not all) flow events displayed incorrectly.

I could imagine other issues as well. I'm hesitant to spend a lot of time debugging this before a prototype for the protobuf format is ready. So maybe let's give that a few more weeks before deciding on the JSON option.

Meanwhile it would be great to know if the deep link integration itself seems okay. IMO the only thing that would have to change is the URLs if we switch to protobuf.

@prattmic
Member

prattmic commented Dec 15, 2022

Meanwhile it would be great to know if the deep link integration itself seems okay. IMO the only thing that would have to change is the URLs if we switch to protobuf.

Of course I'd prefer a direct link, but from https://perfetto.dev/docs/visualization/deep-linking-to-perfetto-ui it seems like this is our only option, so it is what it is.

On the scalability front, it would be even better if Perfetto didn't need to load the entire trace up front and instead loaded only what it needed on demand, which would eliminate the need for us to cut the trace into chunks for the browser.

Perfetto has a tool [1] for this, where a local process parses the trace and the UI makes RPC queries against it for the current view. In an ideal world, go tool trace could implement that RPC interface to get a really streamlined browser experience.

Unfortunately, that RPC interface is fairly large, and IIUC the core of it is a SQL query interface against the trace data. I think implementing this full API in Go and keeping it up to date as Perfetto evolves is probably too high of a maintenance burden.

I don't think we could ship the trace_processor binary either [2], as we don't have the tooling to build a C++ application for all ports during the release process. We could perhaps check PATH, and use that binary as an optimization if users have installed it themselves?

[1] My understanding is that this same processor is compiled to WASM and run in the browser for the typical full-file mode.
[2] Way out idea is to build trace_processor to WASM and run that from Go!
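
The "check PATH" idea could look roughly like the sketch below. It assumes the user-installed binary is named `trace_processor` and is started with the `--httpd` flag that appears later in this thread. The helper names `findTraceProcessor` and `traceProcessorCmd` are hypothetical, and the sketch only builds the command rather than running it.

```go
package main

import (
	"fmt"
	"os/exec"
)

// findTraceProcessor looks for a binary called name on PATH.
// It returns ok=false when the binary is missing, so callers can fall
// back to the built-in behavior.
func findTraceProcessor(name string) (path string, ok bool) {
	p, err := exec.LookPath(name)
	if err != nil {
		return "", false
	}
	return p, true
}

// traceProcessorCmd builds (but does not start) the command that would
// serve traceFile over HTTP. The flag is taken from this thread, not
// verified against a specific trace_processor release.
func traceProcessorCmd(bin, traceFile string) *exec.Cmd {
	return exec.Command(bin, "--httpd", traceFile)
}

func main() {
	if p, ok := findTraceProcessor("trace_processor"); ok {
		cmd := traceProcessorCmd(p, "big.json")
		fmt.Println("would run:", cmd.Args)
		return
	}
	fmt.Println("trace_processor not found on PATH; using the built-in trace splitting")
}
```

Treating the binary as an optional optimization keeps the default path free of any C++ build dependency, which is the constraint described above.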

@felixge
Contributor Author

felixge commented Dec 15, 2022

On the scalability front, it would be even better if Perfetto didn't need to load the entire trace up front and instead loaded only what it needed on demand, which would eliminate the need for us to cut the trace into chunks for the browser.

Yeah, that'd be awesome. But I think CL 457716 already gets us a little bit closer. I'm able to bump the split size from 100 to 500MB like shown below, and the Perfetto UI still works really well on the large trace I have.

-	s, c := splittingTraceConsumer(100 << 20) // 100M
+	s, c := splittingTraceConsumer(500 << 20) // 500M

But if I go to 1000MB it fails:

(screenshot: CleanShot 2022-12-15 at 21 12 18@2x)
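
As a side note on the arithmetic in the diff above: `n << 20` multiplies `n` by 2^20, so the "100M" and "500M" limits are 100 MiB and 500 MiB in bytes. The sketch below shows that conversion and a rough chunk count for a given split size. The helper names and the 2000 MiB total are hypothetical, and the chunk count assumes every chunk fills up to the limit, which the real splitter may not do.

```go
package main

import "fmt"

// splitSizeBytes converts a size in MiB to bytes using the same shift
// idiom as the diff above (n << 20 == n * 1024 * 1024).
func splitSizeBytes(mib int64) int64 {
	return mib << 20
}

// chunksNeeded returns how many chunks of at most splitBytes are needed
// to cover totalBytes (ceiling division). It returns 0 for a
// non-positive split size.
func chunksNeeded(totalBytes, splitBytes int64) int64 {
	if splitBytes <= 0 {
		return 0
	}
	return (totalBytes + splitBytes - 1) / splitBytes
}

func main() {
	const totalMiB = 2000 // hypothetical JSON size, not a figure from this thread
	total := splitSizeBytes(totalMiB)
	for _, mib := range []int64{100, 500, 1000} {
		split := splitSizeBytes(mib)
		fmt.Printf("split=%4d MiB (%d bytes) -> about %d chunk(s)\n",
			mib, split, chunksNeeded(total, split))
	}
}
```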

Unfortunately, that RPC interface is fairly large, and IIUC the core of it is a SQL query interface against the trace data. I think implementing this full API in Go and keeping it up to date as Perfetto evolves is probably too high of a maintenance burden.

Agreed.

[1] My understanding is that this same processor is compiled to WASM and run in the browser for the typical full-file mode.
[2] Way out idea is to build trace_processor to WASM and run that from Go!

That would be a fun thing to try out at some point :). But this would still require the Go runtime to ship a WASM runtime such as wazero (Apache 2). Is that realistic?

@prattmic
Member

Yeah, that'd be awesome. But I think CL 457716 already gets us a little bit closer. I'm able to bump the split size from 100 to 500MB like shown below, and the Perfetto UI still works really well on the large trace I have.

Nice. I imagine that the proto format will be even better here, as it should be more information dense.

That would be a fun thing to try out at some point :). But this would still require the Go runtime to ship a WASM runtime such as wazero (Apache 2). Is that realistic?

I don't know, it was just a fun thought, not particularly serious. :)

@felixge
Contributor Author

felixge commented Dec 15, 2022

Yeah, that'd be awesome. But I think CL 457716 already gets us a little bit closer. I'm able to bump the split size from 100 to 500MB like shown below, and the Perfetto UI still works really well on the large trace I have.

Nice. I imagine that the proto format will be even better here, as it should be more information dense.

It's possible, but the Visualising large traces page hints at a 0.5-1GB file size limit even for protobuf inputs (2GB divided by 2-4x), so I wouldn't bet on it.

I don't know, it was just a fun thought, not particularly serious. :)

Okay :). Alternatively we could just document how to use trace_processor for people working with very large traces.

@mknyszek
Contributor

Asking this as someone who knows very little about perfetto: does perfetto have any kind of protocol for like, streaming the trace? In other words, is there something like a perfetto "server" protocol wherein the client tells the server what time slice it wants to look at, and the server provides data for that time slice (potentially aggregating/approximating to limit the size of the data sent)?

@prattmic
Member

Asking this as someone who knows very little about perfetto: does perfetto have any kind of protocol for like, streaming the trace? In other words, is there something like a perfetto "server" protocol wherein the client tells the server what time slice it wants to look at, and the server provides data for that time slice (potentially aggregating/approximating to limit the size of the data sent)?

The trace processor mentioned in #57315 (comment) has an RPC interface, but the interface is SQL queries about the trace data. i.e., much higher level than just chunks of raw trace data.

I think what you are describing/what we'd want is an interface below the trace processor, which sends chunks of raw trace proto. AFAIK that doesn't exist. In fact, I think the trace processor currently depends on parsing the entire trace proto before it can serve SQL queries.

@felixge
Contributor Author

felixge commented Dec 20, 2022

I just used the trace_processor on the 280MB trace I have lying around. I first converted it to JSON by hitting the /tracejson endpoint. This resulted in 8.8GB of JSON. Running ./trace_processor --httpd big.json took 7.5 minutes before being ready. After that the experience was fairly good. But as @prattmic says, there seems to be no streaming. Everything is loaded into an in-memory column store on startup. The UI then queries this store using various SQL queries over a websocket connection.

So at this point I don't think Perfetto will allow us to "stream" very large traces to the UI. And given the high startup time of trace_processor, I probably prefer the current trace splitting done by go tool trace in most cases.

That being said, we should probably raise this in the perfetto-dev mailing list at some point.

@rsc
Contributor

rsc commented Dec 21, 2022

The current thing we use is unmaintained, and we have to move to something. Perfetto seems like the best game in town and it sounds like people here are generally positive about moving. Do I have that right? Are there any objections to agreeing to move to Perfetto? (I realize there is work to be done...)

@mknyszek
Contributor

I'm not sure there are any specific objections.

Finding the right point of integration to get all the features (e.g. trace streaming) looks like it's going to be somewhat hard. We may want to reach out to the Perfetto maintainers for guidance.

In the meantime, we can always start with the simple thing of just using the legacy JSON interface, which I think is already better than what we currently have.
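
For context, the legacy JSON interface refers to the catapult Trace Event format, which Perfetto can also import. A minimal sketch of producing such a file from Go follows. The `event` and `traceFile` types and the `encodeTrace` helper are hypothetical and cover only a tiny subset of the format (a single "complete" event); they are not the types used by `cmd/trace`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// event is a hypothetical, minimal subset of a Trace Event JSON record.
type event struct {
	Name string  `json:"name"`
	Ph   string  `json:"ph"`            // phase; "X" is a complete event with a duration
	Ts   float64 `json:"ts"`            // start timestamp in microseconds
	Dur  float64 `json:"dur,omitempty"` // duration in microseconds
	Pid  int     `json:"pid"`
	Tid  int     `json:"tid"`
}

// traceFile is the JSON object form of a trace: a wrapper with a
// traceEvents array.
type traceFile struct {
	TraceEvents []event `json:"traceEvents"`
}

// encodeTrace serializes events into the object form of the format.
func encodeTrace(evs []event) ([]byte, error) {
	return json.Marshal(traceFile{TraceEvents: evs})
}

func main() {
	b, err := encodeTrace([]event{
		{Name: "G1 main.work", Ph: "X", Ts: 0, Dur: 150, Pid: 1, Tid: 1},
		{Name: "G2 main.idle", Ph: "X", Ts: 40, Dur: 60, Pid: 1, Tid: 2},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```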

@rsc
Contributor

rsc commented Dec 21, 2022

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@chrisguiney
Contributor

For what it's worth, the json endpoints used by the current trace implementation are importable into perfetto. It's not perfect, because you can't import the entire trace. Going off of a rough memory of my attempts, the server would only output information about a given task.

I went as far as looking into forking the trace command to be able to translate the entire trace output to the json format. I ultimately decided not to pursue it, just due to the time investment combined with not knowing what the future of the trace tool would be.

@felixge
Contributor Author

felixge commented Jan 2, 2023

For what it's worth, the json endpoints used by the current trace implementation are importable into perfetto. It's not perfect, because you can't import the entire trace. Going off of a rough memory of my attempts, the server would only output information about a given task.

Not sure what issue you hit with importing an entire trace. It works fine for me (unless the trace is too big, but that's handled by splitting).

I went as far as looking into forking the trace command to be able to translate the entire trace output to the json format. I ultimately decided not to pursue it, just due to the time investment combined with not knowing what the future of the trace tool would be.

You might want to take a look at my patch here and this video showing it in action. Feedback welcome!

@rsc
Contributor

rsc commented Jan 4, 2023

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

@rsc
Contributor

rsc commented Jan 11, 2023

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

@rsc rsc changed the title proposal: cmd/trace: support for perfetto cmd/trace: support for perfetto Jan 11, 2023
@rsc rsc modified the milestones: Proposal, Backlog Jan 11, 2023
@felixge
Contributor Author

felixge commented Jan 11, 2023

Awesome. Next steps:

  1. I could use some feedback on my patch for adding perfetto as an option to the current HTML pages.
  2. My colleague @nsrip-dd is looking into emitting perfetto's protocol buffer format instead of JSON. Hopefully this will solve problems with event linking not working correctly right now.
  3. More work needs to go into comparing the output produced by both viewers to make sure there is nothing else broken or problematic in Perfetto.
  4. We need to decide how long we want to keep the old viewer around. But I'd suggest keeping it at least for the first release that includes Perfetto.

mauri870 pushed a commit to mauri870/go that referenced this issue Dec 18, 2023
The current catapult trace viewer is bit rotting and struggles with
large traces. Perfetto UI [1] could offer better performance and UX.

This patch gives users the ability to view their traces in both viewers.
If perfetto works well, catapult can be removed in the future.

For golang#57315

[1] https://ui.perfetto.dev/

Change-Id: I3fe248b8b7f1820f0847af5a4a234314fef3b36f
Projects
Status: In Progress
Status: Accepted

7 participants