New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runtime: gcResetMarkState contributes significantly to sweep termination time #14420
Comments
Sure enough. The goroutines don't even have to be doing anything. The cost is something like 25ns/goroutine on my laptop. Amusingly, I can cut that roughly in half just by putting gcAssistBytes on the same cache line as gcscandone and gcscanvalid. :) I think we can just do this concurrently in gcStart before stopping the world. The downside of that is that it will still block allocation on Gs that go outside of their span cache, but that's better than blocking everything. We could also make it just plain faster by pulling those fields out into their own contiguous arrays we can bulk zero. We're paying the same per-goroutine cost in markroot during STW mark termination when we look for stacks to scan. That's trickier to avoid. We might be able to fix that by lifting the gcscanvalid check up in to markroot if we also move gcscanvalid in to a contiguous array. @RLH, any thoughts? Here's a simplified reproducer (I love the term "ballast", BTW):
|
Looks like I caused this performance degradation in b0d5e5c. |
CL https://golang.org/cl/20147 mentions this issue. |
CL https://golang.org/cl/22043 mentions this issue. |
Currently we reset the mark state during STW sweep termination. This involves looping over all of the goroutines. Each iteration of this loop takes ~25ns, so at around 400k goroutines, we'll exceed our 10ms pause goal. However, it's safe to do this before we stop the world for sweep termination because nothing is consuming this state yet. Hence, move the reset to just before STW. This isn't perfect: a long reset can still delay allocating goroutines that block on GC starting. But it's certainly better to block some things eventually than to block everything immediately. For 1.6.x. Fixes #14420. name \ 95%ile-time/sweepTerm old new delta 500kIdleGs-12 11312µs ± 6% 18.9µs ± 6% -99.83% (p=0.000 n=16+20) Change-Id: I9815c4d8d9b0d3c3e94dfdab78049cefe0dcc93c Reviewed-on: https://go-review.googlesource.com/20147 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-on: https://go-review.googlesource.com/22043 Reviewed-by: Austin Clements <austin@google.com>
When profiling the reproducer for #14406 with "go version go1.6 linux/amd64", stack shrinking off, and numactl disabling memory migration, I see calls to
runtime.gcResetMarkState
as called byruntime.gcStart
in nearly all sampled stacks that includeruntime.gcStart
.The lines reported by GODEBUG=gctrace=1 show sweep termination durations of around 30ms when running with 1e6 goroutines. It looks like the sequential processing of
runtime.allgs
is responsible for these long pauses.Full gctrace output is in #14406
@aclements
The text was updated successfully, but these errors were encountered: