runtime: expose number of running/runnable goroutines #17089
Comments
/cc @aclements
I'm going to tentatively mark this as a feature request for runtime, instead of a proposal, since it seems pretty uncontroversial to me.
It doesn't seem like a very interesting number, it'll always be less than On Wed, 14 Sep 2016, 00:43 Quentin Smith notifications@github.com wrote:
@davecheney The suggestion counts runnable goroutines, so it can be larger than GOMAXPROCS. My concern is that I don't see that this adds anything very useful over
I think more importantly he doesn't want to count goroutines which are blocked (which NumGoroutine does count).
Oh, I see, but then the goroutines in state
Indeed. Only things blocked in Go (select{}, ...) would not be counted if we used the raw goroutine states.
That is correct.
Using
Based on the POC that I made, it seems that at least some system goroutines always appear active. I could be wrong here, as they might be blocking on a syscall (like, say, the netpoller). The other reason I think system goroutines should be excluded is because
That is correct maybe the
How can this be detected? Here is the POC code I wrote: https://gist.github.com/fd/7136de67a56e174d8c06cb505f7278aa
Goroutines blocked in the Go runtime have Gwaiting state. You probably don't want to count those, they contribute nothing to CPU load (but do consume some memory). It is not clear whether you should count goroutines in the Gsyscall state. Whether you want to count them depends on whether they are doing real work in the syscall (reading a large file, say) or waiting (read on an idle network socket). I don't think the runtime has the information needed to make that call, although we might be able to make some approximation. That's what makes this problem hard.
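The classification question in the comment above can be sketched in ordinary Go. This is only an illustration: the gState type, its constants, and countActive are hypothetical names made up for this example, not the runtime's internal types, and the includeSyscall flag stands in for the open question of whether Gsyscall goroutines should count.

```go
package main

import "fmt"

// gState is a hypothetical, simplified stand-in for a goroutine status.
// It is not the runtime's own type.
type gState int

const (
	stateRunnable gState = iota // ready to run, waiting for a P
	stateRunning                // currently executing Go code
	stateSyscall                // in a system call (or cgo)
	stateWaiting                // blocked inside the Go runtime
)

// countActive counts running and runnable states. Goroutines in a syscall
// are counted only when includeSyscall is true, because it is unclear
// whether they are doing real work or merely waiting.
func countActive(states []gState, includeSyscall bool) int {
	n := 0
	for _, s := range states {
		switch s {
		case stateRunnable, stateRunning:
			n++
		case stateSyscall:
			if includeSyscall {
				n++
			}
		}
	}
	return n
}

func main() {
	states := []gState{stateRunning, stateRunnable, stateSyscall, stateWaiting, stateWaiting}
	fmt.Println(countActive(states, false)) // 2
	fmt.Println(countActive(states, true))  // 3
}
```

Waiting goroutines are never counted in this sketch, matching the point that they contribute nothing to CPU load.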
So, how about this:
So unless you are heavily using something like Including Remember, it is not my goal to find an accurate estimation of the CPU utilisation. Instead, it is my goal to find a good-enough estimation of the application utilisation. I included an excerpt from Site Reliability Engineering: How Google Runs Production Systems, which seems to suggest that Google uses a similar metric/approach. Site Reliability Engineering: How Google Runs Production Systems - p. 366
As you say, you are looking for an approximation, and you care about load shedding. Unless you start a long-running goroutine for each incoming request, the number of long-running goroutines should be a tiny fraction of the total number of goroutines, and is therefore ignorable for approximation purposes. I agree that proxy servers are a problem. Since you have proof of concept code, do you have a way to see the difference between I would be less concerned about adding One possibility would be to return two numbers: the number of running/runnable goroutines and the number of goroutines waiting for a system call or C code. But that seems to me to be too tied to the current details of how system calls and cgo are implemented. I assume you are looking for some sort of general framework here, because for any specific program that wants to do load shedding I would say just count the number of active requests.
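The closing suggestion, counting active requests in the program itself, might look roughly like the sketch below. The limiter type and its tryAcquire and release methods are hypothetical names for this example, and the capacity of 2 is arbitrary.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// limiter is a hypothetical helper that tracks in-flight requests and
// rejects new ones once a fixed capacity is reached.
type limiter struct {
	inFlight atomic.Int64 // number of requests currently being served
	max      int64        // capacity chosen by the application
}

// tryAcquire reserves a slot for a request. It reports false, and reserves
// nothing, when the limiter is already at capacity.
func (l *limiter) tryAcquire() bool {
	if l.inFlight.Add(1) > l.max {
		l.inFlight.Add(-1) // roll back the optimistic increment
		return false
	}
	return true
}

// release frees a slot previously reserved by tryAcquire.
func (l *limiter) release() { l.inFlight.Add(-1) }

func main() {
	l := &limiter{max: 2}
	fmt.Println(l.tryAcquire(), l.tryAcquire(), l.tryAcquire()) // true true false
	l.release()
	fmt.Println(l.tryAcquire()) // true
}
```

A server would call tryAcquire at the start of each request, shed the request when it returns false, and call release when handling finishes.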
The problem NumActiveGoroutines is trying to solve is when to shed load. Wouldn't monitoring the latency of an application request be a more direct and ultimately more correct way to do this? If latency increases, shed load. If latency improves, increase load. Is there a use case where this doesn't work but NumActiveGoroutines does? Discussing the nuances of what _Gidle, _Grunnable, _Grunning, _Gsyscall, _Gwaiting plus what _Gscanrunning, _Gscanrunnable, _Gscansyscall, and _Gscanidle mean in this context is a very implementation-dependent discussion.
Even NumGoroutine does not capture all the work C is doing; the C code may have spawned threads that are independently doing work as well. I think it's reasonable to say that goroutines in C are not active from the perspective of Go, regardless of what they're calling.
This is not uncontroversial.
CL https://golang.org/cl/38180 mentions this issue.
DO NOT SUBMIT. This is an experimental API.
This introduces a runtime.SchedStats API to mirror the existing runtime.MemStats API. Currently, SchedStats reports the number of goroutines in four major states: running, runnable, non-go (syscall/cgo), and blocked.
The intent is that these can be used to determine the CPU load of a Go process and use this to perform load shedding. This is *not* a complete solution since the Go scheduler cannot account for threads in syscalls or cgo; however, a complete solution can be built by combining these statistics with kernel-provided statistics. The comments on SchedStats attempt to make this clear.
ReadSchedStats collects these counts efficiently by scanning the P states and using a running count of the number of goroutines in syscalls that don't own a P (which avoids doing any additional accounting in the syscall fast path). This way, it can avoid scanning all of the goroutines, which could potentially be expensive. With this approach, at GOMAXPROCS=4, ReadSchedStats takes only 33 ns.
Updates golang#15490, golang#17089.
Change-Id: I202f33eea5d10c83dbf41cb45c8c619ff17fa4c4
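The SchedStats API in the commit message is marked experimental, so the sketch below does not use it. As a related illustration, it reads the total goroutine count through the standard runtime/metrics package, using the "/sched/goroutines:goroutines" metric. That metric covers all live goroutines, not the per-state breakdown described above, and readGoroutineCount is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"runtime/metrics"
)

// readGoroutineCount reads the live goroutine count via runtime/metrics.
// The boolean is false if the metric is not supported by this Go version.
func readGoroutineCount() (uint64, bool) {
	samples := []metrics.Sample{{Name: "/sched/goroutines:goroutines"}}
	metrics.Read(samples)
	if samples[0].Value.Kind() != metrics.KindUint64 {
		return 0, false
	}
	return samples[0].Value.Uint64(), true
}

func main() {
	if n, ok := readGoroutineCount(); ok {
		fmt.Println("live goroutines:", n)
	} else {
		fmt.Println("goroutine metric not available")
	}
}
```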
Summary
I'd like to propose a way to expose the number of active (running + runnable) goroutines.
Background
My primary use case for this metric is to estimate application load (num-active-goroutines / num-cpu) in order to implement load shedding. Other metrics, like the times() syscall, don't expose application overload and don't work well in the presence of noisy neighbours.
Plan
Currently the runtime package includes runtime.NumGoroutine() int, which returns the number of live, non-system goroutines.
The runtime package could be extended to include runtime.NumActiveGoroutine() int. NumActiveGoroutine() should count all goroutines where isSystemGoroutine() is false and where status is _Grunnable|_Grunning|_Gsyscall.
It seems that such a function would need to acquire sched.lock and allglock. This could have some performance implications.