Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: gc pauseNs larger than 4s #36080

Closed
xvsfekcn opened this issue Dec 11, 2019 · 13 comments
Closed

runtime: gc pauseNs larger than 4s #36080

xvsfekcn opened this issue Dec 11, 2019 · 13 comments
Labels
FrozenDueToAge WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.

Comments

@xvsfekcn
Copy link

What version of Go are you using (go version)?

$ go version
go version go1.9 linux/amd64

Does this issue reproduce with the latest release?

go version go1.9

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GORACE=""
GOROOT="/usr/local/golang"
GOTOOLDIR="/usr/local/golang/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build725674040=/tmp/go-build"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config

What did you do?

What did you expect to see?

What did you see instead?

@xvsfekcn
Copy link
Author

xvsfekcn commented Dec 11, 2019

I started my program and everything works fine over time, STW started to rise in time, after about 2.3 days later. My pauseNs:

curl -s http://localhost:6060/debug/pprof/heap?debug=1 | grep -A 100 runtime.MemStats
# runtime.MemStats
# Alloc = 9152929216
# TotalAlloc = 24497596006112
# Sys = 14643025808
# Lookups = 24640021
# Mallocs = 314983702799
# Frees = 314890606713
# HeapAlloc = 9152929216
# HeapSys = 13834256384
# HeapIdle = 3843006464
# HeapInuse = 9991249920
# HeapReleased = 784695296
# HeapObjects = 93096086
# Stack = 14942208 / 14942208
# MSpan = 143016040 / 172310528
# MCache = 27776 / 131072
# BuckHashSys = 9887243
# GCSys = 577380352
# OtherSys = 34118021
# NextGC = 12637023712
# LastGC = 1576051983906690682
# PauseNs = [203828819 833122 44497010 159325516 319338315 375776346 208292466 133034718 141559087 247169403 224303382 512235442 165825001 89714485 129990939 148599231 490780744 133338423 154782262 812745 60532623 436341938 238022496 319964271 454474982 518773328 332954413 368718104 858266 457445676 464195037 834154 505901643 936110 3611034 495067537 289896022 524796682 140914877 848957 168438529 318007899 404385567 431675930 288005687 177508673 978077 91788246 340347753 540147043 421783343 264586748 90422906 6624034 417814921 53855855 453283135 40211925 118634926 96312914 368066690 235327760 100536648 1045337 257594391 1074007 66337537 142560983 480373240 392492774 418955865 32178967 26348115 293009917 394083984 219093549 803052 182271365 10284976 534806307 346830504 500074245 1021712 322887198 162602787 457591186 958799 283020383 28184437 345535081 828362 2446129720 889084 441424548 50855043 205107794 733906 415573895 1017907 455905395 879362 380071322 66247272 476012181 877939 521751696 1032588 30851995 485583989 2458300127 115004157 437713696 230794838 810304 246137432 939081 312121432 903018 951949 93242582 464245951 798639 288326403 235891135 121719955 64269077 815271 328020282 520650549 223382584 1004947 199729943 889266 278668733 432818802 98651203 311842240 499257411 260718907 269543337 819032 419187197 73016408 844057 396913394 117294816 96768203 277352403 235586928 349492503 880960 244180721 138064221 257537378 336688385 158606093 2762971 931411 2478892018 417227872 220396927 2446837344 314316950 54964367 67772899 153282247 1047737 208754885 919618 82454641 69999744 341117367 486141465 430776285 895392 135104892 954775 303320998 293714345 174196878 340615474 261347792 345574367 501635317 181690032 85475951 496206341 493427235 451805458 445636806 258290995 83218325 343698814 2502440845 81088232 459480913 368038349 42508786 508837597 326986278 461034688 72136756 957324 411684055 494013055 927757 95024833 208899784 326070156 315869167 129176389 1243124 483039370 453715852 815319 444773746 222427607 273437516 9229028 978643 356583344 359076061 499989934 438760605 791907 29874462 821385 260563387 318759644 302369941 213534383 43918041 101796913 104227525 324251770 38787048 159455648 409513370 3015769897 87574452 281620809 995836 142908763 382185848 9009796 334681354 142779759 100080809 312912939 190165085 319839487 127021161 1058189 469217422 929459 93705930]

If I restart the program,everything is returned to the normal.
but when the problem has happened, found the number of inuse_objects at the time of the problem is the same as when it was started, and the same is true for inuse_space.
Using top command, the memory occupied by the program has not increased

@ianlancetaylor
Copy link
Contributor

Go 1.9 is quite old and is no longer supported. The most recent release is 1.13.5. There have been many improvements to the garbage collector since 1.9. Is it possible for you to try a newer release?

@xvsfekcn
Copy link
Author

xvsfekcn commented Dec 11, 2019

the cpu profile flat flat% sum% cum cum%
1.03mins 27.30% 27.30% 1.04mins 27.43% runtime.gentraceback /root/go/src/runtime/traceback.go
0.31mins 8.22% 35.52% 0.31mins 8.22% runtime.usleep /root/go/src/runtime/sys_linux_amd64.s
0.17mins 4.49% 40.01% 0.17mins 4.49% runtime.heapBitsForObject /root/go/src/runtime/mbitmap.go
0.17mins 4.48% 44.49% 0.30mins 7.97% runtime.cgocall /root/go/src/runtime/cgocall.go
0.14mins 3.67% 51.98% 0.46mins 12.22% runtime.scanobject /root/go/src/runtime/mgcmark.go
0.09mins 2.49% 54.47% 0.09mins 2.49% runtime.greyobject /root/go/src/runtime/mbitmap.go
0.08mins 2.23% 56.69% 0.08mins 2.23% runtime.futex /root/go/src/runtime/sys_linux_amd64.s
0.07mins 1.98% 58.68% 1.40mins 37.02% runtime.mallocgc /root/go/src/runtime/malloc.go

@xvsfekcn
Copy link
Author

Go 1.9 is quite old and is no longer supported. The most recent release is 1.13.5. There have been many improvements to the garbage collector since 1.9. Is it possible for you to try a newer release?

iI wonder if it's a problem with our code itself

@ianlancetaylor
Copy link
Contributor

The most likely cause of this problem is an unpreemptible loop: a loop that runs, perhaps checking some package variable, without making any function calls. That will break the Go scheduler; see #10958 (this problem is fixed in the upcoming 1.14 release). The fix is to add calls to runtime.Gosched in the loop.

@ALTree ALTree changed the title runtime: my gc pauseNs is larger than 4s. runtime: gc pauseNs larger than 4s Dec 11, 2019
@ALTree ALTree added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Dec 12, 2019
@xvsfekcn
Copy link
Author

I turned off the request to send,the GC PauseNs return to normal. But I only need to send 1 request, the system GC PauseNs larger than hundreds of mililseconds. How can I know what GC is doing at this time?
Why do I have the highest percentage of gentraceback?thx and very much.

@xvsfekcn
Copy link
Author

The most likely cause of this problem is an unpreemptible loop: a loop that runs, perhaps checking some package variable, without making any function calls. That will break the Go scheduler; see #10958 (this problem is fixed in the upcoming 1.14 release). The fix is to add calls to runtime.Gosched in the loop.

Any better suggestions? we use the cpu profile and dichotomy also can not find the tight or busy loop.. so sad.

@josharian
Copy link
Contributor

Any better suggestions?

Have you tried go 1.13?

And if it is a non-preemptible loop, try go 1.14. A beta should be coming very soon.

@xvsfekcn
Copy link
Author

the reason for this is the grpc goroutine lead to the gentraceback with time growth

@josharian
Copy link
Contributor

Are you saying that grpc is leaking goroutines?

@xvsfekcn
Copy link
Author

Are you saying that grpc is leaking goroutines?

Yes, the error disappeared when we replaced grpc with http. I'm not sure if it's because our usage is wrong.

@agnivade
Copy link
Contributor

It sounds like a bug with grpc then. Please file an issue on their issue tracker.

@josharian
Copy link
Contributor

pprof's goroutine profiling may be useful for tracking down where the leaked goroutines are being created. See e.g.

https://stackoverflow.com/questions/38401778/how-to-profile-number-of-goroutines#38414527

I'm going to close this issue, but if the issue does turn out to originate in grpc, please do file an issue with them. Thanks!

@golang golang locked and limited conversation to collaborators Dec 28, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
Projects
None yet
Development

No branches or pull requests

6 participants