runtime: goroutines linger in GC assist queue #22654
Labels
compiler/runtime
Issues related to the Go compiler and/or runtime.
NeedsInvestigation
Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Performance
Milestone
What version of Go are you using (
go version
)?Does this issue reproduce with the latest release?
This report is about the behavior on tip. I've also seen this recently with Go 1.9 and Go 1.8.
What operating system and processor architecture are you using (
go env
)?linux/amd64
What did you do?
I have a server that makes HTTPS requests for large JSON blobs consisting mainly of base64 encoded protobuf messages. It decodes and processes the messages, retaining them in memory for a short period. I collected and viewed an execution trace.
What I found in the execution trace didn't match the way I think the GC is expected to work.
This server isn't very latency-sensitive, but since the GC is preventing mutators from doing work the program's throughput is stifled.
What did you expect to see?
I expected to see the background GC worker running on 1/4th of the threads (9 of 36 here), regularly releasing assist credit to the goroutines that have become blocked on assist. I expected to see any goroutines that blocked on assist credit to be satisfied in significantly less than 1ms.
What did you see instead?
I see goroutines sitting in the GCWaiting state for several milliseconds, and see less than a quarter of Ps running the background mark worker. User goroutines that allocate memory during those several milliseconds are transitioned to the GCWaiting state.
This doesn't happen during every GC cycle for this program—maybe during one out of five cycles, and to varying degrees. This is the "best" example I've seen out of maybe 40 GC cycles (three screens of execution trace UI).
I'd expect the background workers to flush their credit at least dozens of times each millisecond, since
gcCreditSlack = 2000
seems pretty small. I'd expect there to be enough work for many background mark workers, since the GC splits large jobs are split into oblets of 128 kB. I'd expect for the GC-aware parts of the scheduler to allow and encourage 1/4th of the Ps to run background mark workers. If background mark workers intentionally go idle when they aren't able to find work, I'd expect them to be reactivated whengcFlushBgCredit
finds that there's unsatisfied demand for scan credit.Here's an overview of my program's execution trace. We'll zoom in on the GC cycle at 215–229 ms in a moment:
A GC workers start at 216ms, and 9/36 Ps run the gcBgMarkWorker for 6ms. All Ps are busy for another 4ms (hard to see if 1/4th of them are running the background worker).
Around 225ms, the number of goroutines in the GCWaiting state climbs to the 80s, where it remains until 228ms. Only Ps 9 and 12 run the background mark worker. About a dozen Ps run user goroutines. A couple of those goroutines attempt to allocate and so also enter the GCWaiting state.
When the background mark worker on P 9 stops, its assist credit allows a bunch of the blocked goroutines to run. Those goroutines soon return to the GCWaiting state.
@aclements do you understand what's going on here?
The text was updated successfully, but these errors were encountered: