runtime: lock cycle between mheap_.lock and gcSweepBuf.spineLock #34156
Labels: compiler/runtime, NeedsInvestigation
As the title says, there's a runtime lock cycle between `mheap_.lock` and `gcSweepBuf.spineLock`. The cycle is effectively of the following form:

On one thread:

1. `(*mheap).alloc_m` (which runs on the system stack) allocates a span and calls into `(*gcSweepBuf).push`.
2. `(*gcSweepBuf).push` acquires the spine lock.

Meanwhile on another thread:

1. `deductSweepCredit` (on a g's stack) calls into `sweepone`, then `(*mspan).sweep`.
2. `(*mspan).sweep` calls into `(*gcSweepBuf).push`.
3. `(*gcSweepBuf).push` acquires the spine lock.
4. `(*gcSweepBuf).push` calls into either `persistentalloc` or `unlock`.
5. Either call can trigger a stack growth, which acquires `mheap_.lock`.

Note that `(*gcSweepBuf).push` would have the potential for self-deadlock in the `alloc_m` case, but because it runs on the system stack, stack growths won't happen.

This must be an extremely rare deadlock because
`git` history indicates that it's been around since 2016 and we've never received a single bug report (AFAICT). With that being said, if we want any sort of automated lock cycle detection, we need to fix this.

It's unclear to me what the right thing to do here is. The "easy" thing would be to make `(*gcSweepBuf).push` run on the system stack, that way it'll never trigger a stack growth, but this seems wrong. It feels better to instead only acquire `spineLock` after `mheap_.lock`, but this may not be possible. My concern is that the allocated span's `sweepgen` could end up skewed with respect to the `gcSweepBuf` it's in, but I haven't looked closely at the concurrency requirements of the relevant pieces.

CC @aclements
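
To make the cycle concrete, here is a minimal, standalone sketch of the kind of lock-order check that automated detection might perform. It is not runtime code: `lockGraph`, `acquire`, and `findCycle` are hypothetical names, locks are modeled as plain strings, and it assumes (as the cycle implies) that the `alloc_m` path holds `mheap_.lock` when `push` takes the spine lock.

```go
package main

import (
	"fmt"
	"sort"
)

// lockGraph records "acquired B while holding A" edges between lock names.
// Hypothetical illustration only; the runtime has no type by this name.
type lockGraph struct {
	edges map[string]map[string]bool
}

func newLockGraph() *lockGraph {
	return &lockGraph{edges: make(map[string]map[string]bool)}
}

// acquire records that next was taken while every lock in held was already
// held, and returns a fresh slice containing held plus next.
func (g *lockGraph) acquire(held []string, next string) []string {
	for _, h := range held {
		if g.edges[h] == nil {
			g.edges[h] = make(map[string]bool)
		}
		g.edges[h][next] = true
	}
	return append(append([]string(nil), held...), next)
}

// sortedKeys returns the map's keys in sorted order so traversal is deterministic.
func sortedKeys[V any](m map[string]V) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

// findCycle returns one lock-order cycle as a path (first and last element
// equal), or nil if the recorded acquisition order is acyclic.
func (g *lockGraph) findCycle() []string {
	const (
		unvisited = iota
		inProgress
		done
	)
	state := make(map[string]int)
	var path []string
	var visit func(n string) []string
	visit = func(n string) []string {
		state[n] = inProgress
		path = append(path, n)
		for _, m := range sortedKeys(g.edges[n]) {
			switch state[m] {
			case inProgress:
				// m is already on the current DFS path: report path[m..] + m.
				for i, p := range path {
					if p == m {
						return append(append([]string(nil), path[i:]...), m)
					}
				}
			case unvisited:
				if c := visit(m); c != nil {
					return c
				}
			}
		}
		path = path[:len(path)-1]
		state[n] = done
		return nil
	}
	for _, n := range sortedKeys(g.edges) {
		if state[n] == unvisited {
			if c := visit(n); c != nil {
				return c
			}
		}
	}
	return nil
}

func main() {
	g := newLockGraph()

	// Thread 1 (assumed): alloc_m holds mheap_.lock, then push takes spineLock.
	held := g.acquire(nil, "mheap_.lock")
	g.acquire(held, "gcSweepBuf.spineLock")

	// Thread 2: push holds spineLock, then a stack growth takes mheap_.lock.
	held = g.acquire(nil, "gcSweepBuf.spineLock")
	g.acquire(held, "mheap_.lock")

	fmt.Println("lock-order cycle:", g.findCycle())
}
```

If both paths instead took `mheap_.lock` before `spineLock`, the graph would contain a single edge and `findCycle` would return `nil`, which is the ordering the second option above is aiming for.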