Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime/trace: implement user events #16619

Open
FiloSottile opened this issue Aug 5, 2016 · 15 comments
Open

runtime/trace: implement user events #16619

FiloSottile opened this issue Aug 5, 2016 · 15 comments
Labels
early-in-cycle A change that should be done early in the 3 month dev cycle. help wanted NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@FiloSottile
Copy link
Contributor

https://golang.org/s/go15trace specifies user events:

Runtime/pprof package also exposes a set of functions for emission of user events:

package pprof
func TraceEvent(id string)
func TraceScopeStart(id string)
func TraceScopeEnd(id string)

But there is no implementation at this time.

Those would be invaluable to properly debug latency problems.

(According to @dvyukov there is no one actively working on it, so I might give it a try.)

@dvyukov
Copy link
Member

dvyukov commented Aug 5, 2016

@aclements

@quentinmit quentinmit added this to the Go1.8 milestone Aug 8, 2016
@aclements
Copy link
Member

For reference, there's been a lot of discussion about the user events API since the original proposal here: https://go-review.googlesource.com/c/8552/3/src/runtime/trace/trace.go

Also, user events came up in today's meeting with @hyangah.

@hyangah
Copy link
Contributor

hyangah commented Aug 10, 2016

Yes, I plan to work on it for go1.8 but it's possible we end up with slightly different API.
Will update this issue soon with the plan we are thinking about. @matloob

@quentinmit quentinmit added the NeedsFix The path to resolution is known, but the work has not been done. label Oct 10, 2016
@quentinmit quentinmit modified the milestones: Go1.9Early, Go1.8 Oct 13, 2016
@quentinmit
Copy link
Contributor

This is too late for 1.8.

@hyangah hyangah self-assigned this Feb 8, 2017
@bradfitz bradfitz modified the milestones: Go1.10Early, Go1.9Early May 3, 2017
@bradfitz bradfitz added early-in-cycle A change that should be done early in the 3 month dev cycle. and removed early-in-cycle A change that should be done early in the 3 month dev cycle. labels Jun 14, 2017
@bradfitz bradfitz modified the milestones: Go1.10Early, Go1.10 Jun 14, 2017
@rakyll
Copy link
Contributor

rakyll commented Jul 15, 2017

@hyangah, what's the status of this? This probably needs a proposal first, I already have background and opinions of this one and can create a proposal if no one is actively working on it.

@aclements
Copy link
Member

I have an old prototype of user events at https://go-review.googlesource.com/c/34031/. No effort was made to define a good API, but that could be a useful starting point for an implementation if we decide on an API.

Earlier today @heschik, @rhysh, a few others who's usernames I don't know, and I were talking about the possibility of simply using the existing profiler labels as the user spans, since they often contain exactly the request information you might want to include in a user span. (I'm not personally convinced this is a good idea, but writing it down for the record.)

@rakyll
Copy link
Contributor

rakyll commented Jul 15, 2017

I were talking about the possibility of simply using the existing profiler labels as the user spans

I don't think this is a good idea. There is an obvious possibility to annotate spans with profiler labels but we shouldn't assume they are representing the spans' themselves. In tracing, one of the fundamental features is to give to user the flexibility to be able to mark the sections they are interested in and find statistically meaningful.

@dvyukov
Copy link
Member

dvyukov commented Jul 15, 2017

FWIW is would be useful if we could understand span type, i.e. these spans are of type A (say, request processing), while these are of type B (say, database accesses). Then we could automatically infer cases of tail latency and give explanation as to why a particular span takes so long. For example, this span is the longest of its type, it has comparable with other spans of the same type cpu time, scheduler latency and network wait time, but was blocked on mutexes 10x longer.
That's what we do with goroutines today, using goroutine start function as type.

@heschi
Copy link
Contributor

heschi commented Jul 17, 2017

I think I get why having profiler labels create spans is undesirable, but I don't see any reason not to propagate the profiler labels to the execution trace. Concretely, just emit a trace event whenever a goroutine's labels change, and the trace viewer can attach them on all of the slices (rectangles in the UI) it emits to the JSON.

@hyangah
Copy link
Contributor

hyangah commented Aug 24, 2017

I compiled my thought in https://goo.gl/iqJfJ3
Comments are welcome.

@gopherbot
Copy link

Change https://golang.org/cl/59572 mentions this issue: runtime/trace/trace: user annotation API skeleton

@navytux
Copy link
Contributor

navytux commented Oct 25, 2017

Something related for the reference - some of us are doing user-level Linux-style traceevents/tracepoints in Go:

https://lab.nexedi.com/kirr/go123/blob/c0f14991/tracing/tracing.go
https://lab.nexedi.com/kirr/go123/blob/c0f14991/tracing/cmd/gotrace/gotrace.go

https://lab.nexedi.com/kirr/go123/commit/3cf17be3
https://lab.nexedi.com/kirr/go123/commit/d89b1be1
https://lab.nexedi.com/kirr/go123/commit/c0f14991

@gopherbot
Copy link

Change https://golang.org/cl/63274 mentions this issue: runtime/trace: user annotation API

gopherbot pushed a commit that referenced this issue Feb 15, 2018
This CL presents the proposed user annotation API skeleton.
This CL bumps up the trace version to 1.11.

Design doc https://goo.gl/iqJfJ3

Implementation CLs are followed.

The API introduces three basic building blocks. Log, Span, and Task.

Log is for basic logging. When called, the message will be recorded
to the trace along with timestamp, goroutine id, and stack info.

   trace.Log(ctx, messageType message)

Span can be thought as an extension of log to record interesting
time interval during a goroutine's execution. A span is local to a
goroutine by definition.

   trace.WithSpan(ctx, "doVeryExpensiveOp", func(ctx context) {
      /* do something very expensive */
   })

Task is higher-level concept that aids tracing of complex operations
that encompass multiple goroutines or are asynchronous.
For example, an RPC request, a HTTP request, a file write, or a
batch job can be traced with a Task.

Note we chose to design the API around context.Context so it allows
easier integration with other tracing tools, often designed around
context.Context as well. Log and WithSpan APIs recognize the task
information embedded in the context and record it in the trace as
well. That allows the Go execution tracer to associate and group
the spans and log messages based on the task information.

In order to create a Task,

   ctx, end := trace.NewContext(ctx, "myTask")
   defer end()

The Go execution tracer measures the time between the task created
and the task ended for the task latency.

More discussion history in golang.org/cl/59572.

Update #16619

R=go1.11

Change-Id: I59a937048294dafd23a75cf1723c6db461b193cd
Reviewed-on: https://go-review.googlesource.com/63274
Reviewed-by: Austin Clements <austin@google.com>
@ianlancetaylor ianlancetaylor modified the milestones: Go1.11, Unplanned Jun 28, 2018
@nhooyr
Copy link
Contributor

nhooyr commented Sep 11, 2018

Was this done in Go 1.11?

I see https://godoc.org/runtime/trace#hdr-User_annotation

@ammario
Copy link

ammario commented Sep 23, 2018

Why isn't there an average for each task?

@rsc rsc unassigned hyangah Jun 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
early-in-cycle A change that should be done early in the 3 month dev cycle. help wanted NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests