Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime/trace: emit events traceEv(String|Stack|Frequency) before dependent events #18744

Closed
cstockton opened this issue Jan 22, 2017 · 8 comments
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. Suggested Issues that may be good for new contributors looking for work to do.

Comments

@cstockton
Copy link

This is a follow up from a few weeks ago from the mailing list thread I wrote when I was taking a stab at implementing a Go trace Encoder/Decoder. While plenty of this feature still escapes me like the state machine that makes events consistent, it was a good learning exercise and I see the potential for a separate category of tooling. You could stream events to simple dashboards/gui's in real time for a high level view of whats going on in your application or use it during runtime within the same process and filter uninteresting outbound events.

Based on my LIMITED knowledge (disclaimer here) I believe we may be able to construct a complete (processor, goroutineID's, timing, stack, string id refs) stream of events from a shallow look-behind state with a couple changes. First is to emit traceEvString and traceEvStack when discovered so dependents are emitted with referential integrity. The runtime/trace is a bit over my head, but I was able to implement this change as a POC to ensure it worked and tests passed.

The next I am less sure about, that is timing via traceEvFrequency. If it was emitted after EvBatch then all future events could calculate time based on the previous events offset. If i understand correctly it's the cpu ticks, which should be consistent at any point in the trace. But I haven't implemented the timing yet in my library and am not entirely sure on this and my suggestion may not make sense if the timing is based on the elapsed trace duration in some way.

My apologies if these suggestions don't make sense, thanks for taking the time to review regardless.

@ALTree
Copy link
Member

ALTree commented Jan 23, 2017

Hi,

is this a proposal (to be evaluated) or a feature-request-issue that you plan to fix by sending those patches you mention?

@cstockton
Copy link
Author

If the changes are agreed upon by the Go team, I am comfortable cleaning up my patches and sending them for review.

@ALTree
Copy link
Member

ALTree commented Jan 23, 2017

No proposal label then, just cc @dvyukov for opinions.

@ALTree ALTree added NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. Suggested Issues that may be good for new contributors looking for work to do. labels Jan 23, 2017
@gopherbot
Copy link

CL https://golang.org/cl/36325 mentions this issue.

@hyangah
Copy link
Contributor

hyangah commented Feb 6, 2017

@heschik

@aclements
Copy link
Member

Taking a step back here, is the high-level goal here to make the trace format (more) streamable?

If so, how do you plan to deal with time being out of order in the trace? In the current format, it may be necessary to buffer the entire trace before it can be sorted; for example, if some P is producing events so slowly that they all get buffered until the end of the trace.

@cstockton
Copy link
Author

cstockton commented Feb 13, 2017

@aclements Yes, that would be one of the high level goals. Currently the entire trace HAS to be buffered for complete events regardless of batch ordering, my goal with this change is to lift that restriction to allow tackling the second problem you mention with slow P's. The built in Go tool which gives the very detailed report and holistic view of a trace would not benefit much, but it does enable a new tool category that focuses on aggregate information in sliding windows. For example a tool could keep 30 second buffer and run for many minutes, or hours if needed and you could create a simple threshold for specific metrics that you have trouble catching in production to send to a trace file for deeper analysis with the go trace tool later. Maybe this wouldn't be very useful to most but I think it might allow some cool stuff (that likely does't lose value in the event of a single slow P, as it wouldn't effect a large aggregate view in most cases).

The problem with slow P's I think can be managed now that I have a better understanding of tracing. The general idea being once a buffer has been active for a given duration to flush it. I think the problem of solving a single slow P that has many other active P's covers a lot of cases and is much less nuanced than a scenario of ALL slow P's. The latter would need the wake event expanded for the parking of the ReadTrace G, I likely see obstacles (or don't see) where you and Dmitry may not since you have much better understandings of these things.

@cstockton
Copy link
Author

@aclements Is this something the Go team would accept a patch for if implemented sanely?

@golang golang locked and limited conversation to collaborators Feb 28, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. Suggested Issues that may be good for new contributors looking for work to do.
Projects
None yet
Development

No branches or pull requests

5 participants