Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: timerproc does not get to run under load #15706

Closed
aclements opened this issue May 16, 2016 · 5 comments
Closed

runtime: timerproc does not get to run under load #15706

aclements opened this issue May 16, 2016 · 5 comments

Comments

@aclements
Copy link
Member

  1. What version of Go are you using (go version)?

go version devel +64770f6 Mon May 16 18:28:38 2016 +0000 linux/amd64

  1. What operating system and processor architecture are you using (go env)?
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/austin/r/go"
GORACE=""
GOROOT="/home/austin/go.dev"
GOTOOLDIR="/home/austin/go.dev/pkg/tool/linux_amd64"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build809401652=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"

Also reproduced on the linux-arm64-buildlet builder (while tracking down an unrelated problem).

  1. What did you do?
go test -c runtime/pprof
GOMAXPROCS=1 GOGC=1 GODEBUG=gcstackbarrierall=1 GOTRACEBACK=crash ./pprof.test -test.v -test.short -test.run=TestStackBarrierProfiling -test.timeout=1m
  1. What did you expect to see?

The test should finish in under a second.

  1. What did you see instead?

The test runs until you kill it.

The test is actively running and spinning around the main test loop, but the time.After that's supposed to fire after 0.2 seconds and stop the test never fires. Digging into the runtime state, it looks like timeproc was started, but never actually gets scheduled (timers.created is true, but timers.gp is 0), so the timer never fires. Most likely it's being starved by something. Possibly the test is tricking the scheduler into switching the one available P back and forth between the test goroutine and the GC mark worker and never getting to the rest of the run queue. The scheduler state is below:

(gdb) print-sched 
G state:
       1 waiting (chan receive)
         started at runtime.main
         created by runtime.rt0_go
       2 waiting (force gc (idle))
         started at runtime.forcegchelper
         created by runtime.init.4
       3 runnable
         started at runtime.bgsweep
         created by runtime.gcenable
       4 running, m 0, p 0, fractional mark worker
         started at runtime.gcBgMarkWorker
         created by runtime.gcBgMarkStartWorkers
       5 runnable
         started at runtime.runfinq
         created by runtime.createfing
       6 runnable
         started at runtime.timerproc
         created by runtime.addtimerLocked
       7 scanwaiting (GC assist wait)
         started at testing.tRunner
         created by testing.(*T).Run
       8 runnable
         started at runtime/pprof.profileWriter
         created by runtime/pprof.StartCPUProfile

M state:
       0 PID 12395, g 4, p 0, 2 locks
       1 PID 12399
       2 PID 12400, blocked

P state:
       0 running, g 4, m 0, mark worker g 4
         runq: 5 6 8

global runq: 3

/cc @RLH @dvyukov

@aclements aclements added this to the Go1.7Maybe milestone May 16, 2016
@mikioh
Copy link
Contributor

mikioh commented May 17, 2016

If it's hard to fix in go1.7, I'll change the existing test cases in net package not to depend on timers much, well, if possible.

@dvyukov
Copy link
Member

dvyukov commented May 17, 2016

We need a test for this (frequent GCs effect of fairness). We've already been here: #7126.

Execution alternates between a single user gorotuine and GC goroutine (and even a goroutine from globrunq runs episodically), but other goroutines are started.

FWIW, the following fixes this particular test:

        if gp == nil && gcBlackenEnabled != 0 {
                gp = gcController.findRunnableGCWorker(_g_.m.p.ptr())
+               inheritTime = true
        }

@rsc rsc changed the title runtime: timer doesn't fire runtime: timerproc does not get to run under load May 17, 2016
@aclements
Copy link
Member Author

My concern with adding inheritTime = true there is that I think it could lead to a scheduler loop: suppose a user goroutine has very nearly run out the quantum when it blocks; the scheduler picks a mark worker to run; the mark worker inherits the quantum, quickly runs out what's left of it, and returns to the scheduler; since the worker didn't run for very long, the scheduler picks it again, and it again inherits the quantum; since its quantum is up it gets preempted again after just 20us; and the scheduler goes into a loop picking and preempting the mark worker.

You're right, of course, that the high-priority scheduling of the GC worker is the crux of the problem. In effect, by triggering GC and blocking in an assist, user code can control which goroutine runs next and which goroutine runs after that, without inheriting the quantum across these executions.

Another possible solution is for the GC to always add goroutines to the end of the run queue (either a local run queue or the global run queue). Then user code can't exploit it to skip the run queue without consuming quantum.

@gopherbot
Copy link

CL https://golang.org/cl/23171 mentions this issue.

@gopherbot
Copy link

CL https://golang.org/cl/23172 mentions this issue.

gopherbot pushed a commit that referenced this issue May 19, 2016
Currently ready always puts the readied goroutine in runnext. We're
going to have to change this for some uses, so add a flag for whether
or not to use runnext.

For now we always pass true so this is a no-op change.

For #15706.

Change-Id: Iaa66d8355ccfe4bbe347570cc1b1878c70fa25df
Reviewed-on: https://go-review.googlesource.com/23171
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
@golang golang locked and limited conversation to collaborators May 19, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants