runtime: goroutines aren't scheduled in time #29394
Comments
Hi @zhao-kun - 1.8.3 is quite an old version. Please try 1.12 beta1 and let us know. Also, without proper steps to reproduce this issue, it is very hard to understand what's going on just by looking at the stack trace. Can you give us the exact steps to reproduce it?
Hi @agnivade, it's hard to reproduce. We perform no unusual operations on our K8s cluster (or the kubelet program); we run it in the normal way. We do know there are some issues in K8s 1.7.4, especially in the cAdvisor implementation, and we plan to upgrade to the latest K8s version in the future. If the current information is too little to diagnose the problem, can you give me some advice from the Go side on what we can do to help diagnose it the next time the issue occurs?
You could try with the 1.12beta version and see. Overall, it is hard to say whether the problem is with Go or K8s. I would advise filing an issue on the K8s repo and investigating that. Only when there is some concrete evidence that this is a Go issue, file a new issue with proper repro steps.
Timed out in state WaitingForInfo. Closing. (I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)
What version of Go are you using (`go version`)?
Does this issue reproduce with the latest release?
Unknown
What operating system and processor architecture are you using (`go env`)?
The program runs on CentOS 7.4.
What did you do?
I have a Kubernetes cluster at version 1.7.4, built with Go 1.8.3. Yesterday at noon, one of the nodes in my cluster stopped working and no longer reported its status to the master. The cause was that the kubelet program on that node had hung. I checked the logs and found that the program stopped printing logs at 12:09. After 20 minutes I killed the kubelet process with `ABRT`; it dumped all goroutine stack traces and exited.

I grepped the dump for `goroutine` and noticed that many goroutines had been blocked for about `20` minutes, which almost matches how long the kubelet had been hung.

I checked the stack trace of goroutine 614744 and found that it hung at `time.Sleep(0x2ba79f8ef)`. The parameter value `0x2ba79f8ef` is about 11.7 seconds, which is what the code logic is expected to produce.

My question is: the goroutine should have slept for roughly 11 seconds, so why was it blocked for almost 22 minutes? I think the Go runtime didn't schedule it in time. What happened at 12:09 that stopped the scheduler from working?
PS: there are about 83 goroutines blocked at sleep.
The attachment is the whole goroutine stack trace; I have redacted the machine names.
abort.log