runtime: unexpected goroutine starvation #7126
As I understand it, the scheduler will bias towards the active goroutine (the one that just returned to the scheduler) if it looks like there is more work to do. So, in tight loops, it is possible for this goroutine to spin and starve the others. Is this a contrived example, or does it come from a larger program? You could try stracing the program and seeing if those os.Stderr writes even make it to syscalls. Assigning to Dmitry for his comments; I have not assigned this to a release. Labels changed: added release-none, repo-main. Owner changed to @dvyukov. Status changed to Accepted. |
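For illustration, a minimal sketch (not code from this issue) of the pattern under discussion: one goroutine writes to stderr in a tight loop while another goroutine sleeps. On the Go 1.2-era scheduler with GOMAXPROCS=1, the sleeper could wake far later than requested; current Go releases preempt the loop and wake on time.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

func main() {
	go func() {
		for {
			os.Stderr.Write([]byte(".")) // short Write syscall in a tight loop
		}
	}()

	start := time.Now()
	time.Sleep(10 * time.Millisecond) // on the Go 1.2-era scheduler, this could wake far late
	fmt.Printf("slept for %v\n", time.Since(start))
}
```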
I ran into this while writing a benchmark for a real logging library, so I guess that makes it contrived, in that I don't expect a production scenario wherein we'd be using this library in a tight loop. I straced, and at least one goroutine is writing to stderr at the syscall level. The problem is that time.Sleep never "returns". |
Here is the benchmark with the (unfortunately not mine to post) logging library omitted. I'm logging to 5 different files at maximum speed, and trying to cleanly shut the system down by Close()'ing the files out from under the goroutines. The time.Sleep() call isn't essential, but being able to shut down within a reasonable amount of time is. http://play.golang.org/p/h8d_so7MLS |
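The actual benchmark is at the playground link above; the following is only an approximate sketch of the shutdown pattern described (file names, counts, and durations are stand-ins, and os.CreateTemp assumes a modern Go toolchain):

```go
package main

import (
	"fmt"
	"os"
	"time"
)

func main() {
	// Open five files and hammer each one from its own goroutine.
	files := make([]*os.File, 5)
	for i := range files {
		f, err := os.CreateTemp("", "benchlog")
		if err != nil {
			panic(err)
		}
		files[i] = f
		go func(f *os.File) {
			for {
				if _, err := f.Write([]byte("log line\n")); err != nil {
					return // the file was closed out from under us
				}
			}
		}(f)
	}

	// The sleep that, per the report, never "returned" in time.
	time.Sleep(100 * time.Millisecond)

	// Shut down by closing the files out from under the writers.
	for _, f := range files {
		f.Close()
	}
	fmt.Println("shut down")
}
```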
IMO, the program itself is buggy: it has two goroutines looping endlessly, which prevents the third (main) goroutine from executing when GOMAXPROCS=1. The reason it used to work is that we originally treated all Write syscalls as blocking; the new scheduler, however, treats syscalls that finish in a very short amount of time as nonblocking (so executing Write like this won't cause the scheduler to schedule another goroutine, hence the starvation). This scheduler behavior change is good for throughput, but not as good for latency. Either fixing the multiple-ready-goroutines problem or increasing GOMAXPROCS is the correct way to go. |
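A sketch of the two remedies suggested above: raising GOMAXPROCS, or restructuring the tight loop so other goroutines get scheduled (shown here with an explicit runtime.Gosched(), one possible way to do that):

```go
package main

import (
	"os"
	"runtime"
	"time"
)

func main() {
	// Remedy 1: give the runtime more than one OS thread, so busy
	// goroutines cannot monopolize the scheduler.
	runtime.GOMAXPROCS(runtime.NumCPU())

	go func() {
		for {
			os.Stderr.Write([]byte("."))
			// Remedy 2: yield explicitly inside the tight loop, so
			// other runnable goroutines get a turn even at GOMAXPROCS=1.
			runtime.Gosched()
		}
	}()

	time.Sleep(10 * time.Millisecond) // now wakes close to the requested time
}
```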
Long comment deleted. Thanks to those who helped fix my code to avoid the undesired behavior. Feel free to close this bug as invalid if Go is working as desired. I'm concerned that optimizing for throughput to the extent that a timer fires 30x to 90x later than requested, even when the tight loop has both I/O and function calls (two things that I think of as opportunities to switch goroutines), will make writing working code much harder than it should be. Maybe this is actually a feature request: "Wake up from sleep within an order of magnitude of the requested time, even if other goroutines are doing I/O." |
This is not an issue with preemption; control periodically passes through the scheduler due to the frequent garbage collections, but the scheduler fails to schedule all goroutines fairly. I've mailed https://golang.org/cl/53080043/ with a fix. Shane, the program causes extremely frequent garbage collections (the heap size is almost zero, and the loop generates garbage). You can increase the portion of time spent in the logging library (rather than in GC) by adding some garbage ballast, e.g. var ballast = make([]byte, 64<<20). This will reduce the frequency of GCs tremendously. Labels changed: added release-go1.3, removed release-none. |
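Expanding the ballast suggestion into a self-contained sketch (the loop body is a stand-in for the real logging calls): the GC trigger threshold is proportional to the live heap, so a large always-live allocation makes the loop's small garbage trigger far fewer collections.

```go
package main

// A large, always-live allocation: with it, the GC trigger threshold
// (proportional to the live heap) is ~64 MiB instead of near zero, so
// the loop's small allocations cause far fewer collections.
var ballast = make([]byte, 64<<20)

var sink []byte // package-level, so each make below escapes to the heap

func main() {
	for i := 0; i < 1000000; i++ {
		sink = make([]byte, 128) // per-iteration garbage
	}
}
```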
This issue was closed by revision 90eca36. Status changed to Fixed. |