cmd/pprof: unable to profile CPU intensive code (even with available threads) #32652

jaffee · 2019-06-17T18:00:39Z

What version of Go are you using (`go version`)?

$ go version
go version go1.12.6 linux/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (`go env`)?

go env Output

$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/ubuntu/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/ubuntu/go"
GOPROXY=""
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build117733995=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Ran a simple program which launches a configurable number of goroutines all which do CPU intensive work in a tight loop. It also imports net/http/pprof and starts a server to expose the debugging endpoint.

go get github.com/jaffee/pegger/cmd/pegger
pegger -concurrency 3
# in another terminal:
go tool pprof localhost:6060/debug/pprof/profile?seconds=3

Try setting the concurrency argument lower than what runtime.NumCPU() reports on your system. By default it is equal to runtime.NumCPU().

What did you expect to see?

A profile after 3ish seconds.

What did you see instead?

A few different things:

$ go tool pprof localhost:6060/debug/pprof/profile
Fetching profile over HTTP from http://localhost:6060/debug/pprof/profile
localhost:6060/debug/pprof/profile: Get http://localhost:6060/debug/pprof/profile: net/http: timeout awaiting response headers
failed to fetch any source profiles

$ go tool pprof localhost:6060/debug/pprof/profile
Fetching profile over HTTP from http://localhost:6060/debug/pprof/profile
localhost:6060/debug/pprof/profile: Get http://localhost:6060/debug/pprof/profile: dial tcp 127.0.0.1:6060: connect: connection refused
failed to fetch any source profiles

On OSX, I see the connection refused message if -concurrency is equal to the number of hardware threads available, and otherwise it works.

On a 32 core Linux VM in Azure, I've seen both the timeout and connection refused messages at concurrencies ranging from 2 to 32. At concurrency 1, things work as expected. At concurrency 2, I've seen it work, but it usually doesn't. It has failed consistently at all higher concurrencies in my testing.

I can understand why, if I have 8 cores, and I'm running 8 goroutines all in tight loops, that nothing else might get to run, and so I wouldn't be able to connect to the debug endpoint. But why if I'm only running 3 or 4 goroutines on a 32 core machine?

The text was updated successfully, but these errors were encountered:

jaffee · 2019-06-18T13:00:09Z

Here is a condensed version of the pegger code in the playground: https://play.golang.org/p/BVSZLO6Kc3B

(You know, in case you don't want to go get and blindly run arbitrary code from a stranger)

katiehockman · 2019-06-18T22:05:46Z

/cc @hyangah

hyangah · 2019-06-19T00:08:38Z

Is it reproducible other than on VM?

yuce · 2019-06-19T13:55:54Z

@hyangah I can reproduce this with -concurrency 6 and up on Ubuntu 18.04 on an 8 thread CPU (i7-6700HQ). /cc @jaffee .

go version:

go version go1.12.5 linux/amd64

uname -a:

Linux ichi 4.15.0-51-generic #55-Ubuntu SMP Wed May 15 14:27:21 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

go env:

GOARCH="amd64"
GOBIN=""
GOCACHE="/home/yuce/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/yuce/go_path"
GOPROXY=""
GORACE=""
GOROOT="/home/yuce/go"
GOTMPDIR=""
GOTOOLDIR="/home/yuce/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build113306053=/tmp/go-build -gno-record-gcc-switches"

EDIT: Added the output of go version.

hyangah · 2019-06-19T15:09:12Z

I think this is an inherent limitation of profiling/monitoring cpu or memory from runtime, especially, when the runtime doesn't treat profiling and monitoring work differently from any other work the runtime is doing. How can we rely on the process to reliably capture the correct profile or performance signals when the process itself is overloaded or about to oom?

Maybe runtime team may have a better idea to mitigate the issue from runtime (@aclements)

Otherwise, for processes with extremely high cpu usage, I think we may have better luck by collecting profiles outside runtime (e.g. linux perf, ...)

jaffee · 2019-06-19T15:32:48Z

Keep in mind, that we're talking about running with 4 CPU-heavy goroutines on a 32 core host. There are 28 other threads available to the runtime for execution, memory usage is minimal.

I attached to the pegger process with strace, and interestingly, when the pprof request occurs, you see SIGPROF events:

--- SIGPROF {si_signo=SIGPROF, si_code=SI_KERNEL} ---
strace: [ Process PID=13912 runs in 64 bit mode. ]
rt_sigreturn({mask=[]})                 = 5542608
--- SIGPROF {si_signo=SIGPROF, si_code=SI_KERNEL} ---
rt_sigreturn({mask=[]})                 = 7136556669451735562
--- SIGPROF {si_signo=SIGPROF, si_code=SI_KERNEL} ---
rt_sigreturn({mask=[]})                 = 824635219968
--- SIGPROF {si_signo=SIGPROF, si_code=SI_KERNEL} ---

then they abruptly stop, and shortly thereafter the pprof request times out.

If you try to hit the pprof endpoint again, you see NO new system calls through strace.

hyangah · 2019-06-19T15:58:08Z

@jaffee Yeah, that's unarguably strange that pprof couldn't find cpu to run on when there are 28 other cores are idle in theory. But I am not familiar with Azure's VM environment, so I was curious whether such dramatic differences (32 cores >> 4 working cores) were observed in non-VM environment.

jaffee · 2019-06-19T16:28:01Z

Thanks @hyangah. I have a couple boxes I can test on when I get home (nothing with 32 cores though!). Can also pretty easily test on AWS/GCP/OCI if that helps.

jaffee · 2019-07-10T13:08:10Z

After more investigation (and opening another ticket #33015), this is likely also a dup of #10958 and will likely be fixed by #24543.

I think the missing piece of my understanding was that the runtime is waiting for all running goroutines to reach a pre-emption point so that it can do a STW phase of GC (or something), which is why more things won't run even though there are available threads.

katiehockman changed the title ~~unable to profile CPU intensive code (even with available threads)~~ cmd/pprof: unable to profile CPU intensive code (even with available threads) Jun 18, 2019

katiehockman added the NeedsInvestigation label Jun 18, 2019

jaffee closed this as completed Jul 10, 2019

golang locked and limited conversation to collaborators Jul 9, 2020

gopherbot added the FrozenDueToAge label Jul 9, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cmd/pprof: unable to profile CPU intensive code (even with available threads) #32652

cmd/pprof: unable to profile CPU intensive code (even with available threads) #32652

jaffee commented Jun 17, 2019

jaffee commented Jun 18, 2019

katiehockman commented Jun 18, 2019

hyangah commented Jun 19, 2019

yuce commented Jun 19, 2019 •

edited

Loading

hyangah commented Jun 19, 2019

jaffee commented Jun 19, 2019

hyangah commented Jun 19, 2019

jaffee commented Jun 19, 2019

jaffee commented Jul 10, 2019

cmd/pprof: unable to profile CPU intensive code (even with available threads) #32652

cmd/pprof: unable to profile CPU intensive code (even with available threads) #32652

Comments

jaffee commented Jun 17, 2019

What version of Go are you using (go version)?

Does this issue reproduce with the latest release?

What operating system and processor architecture are you using (go env)?

What did you do?

What did you expect to see?

What did you see instead?

jaffee commented Jun 18, 2019

katiehockman commented Jun 18, 2019

hyangah commented Jun 19, 2019

yuce commented Jun 19, 2019 • edited Loading

hyangah commented Jun 19, 2019

jaffee commented Jun 19, 2019

hyangah commented Jun 19, 2019

jaffee commented Jun 19, 2019

jaffee commented Jul 10, 2019

What version of Go are you using (`go version`)?

What operating system and processor architecture are you using (`go env`)?

yuce commented Jun 19, 2019 •

edited

Loading