runtime: lock cycle between mheap_.lock and gcSweepBuf.spineLock #34156

Open
mknyszek opened this issue Sep 6, 2019 · 0 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
mknyszek commented Sep 6, 2019

As the title says, there's a runtime lock cycle between mheap_.lock and gcSweepBuf.spineLock. The cycle is effectively of the following form:

On one thread:

  1. (*mheap).alloc_m (which runs on the system stack) allocates a span with mheap_.lock held and calls into (*gcSweepBuf).push.
  2. (*gcSweepBuf).push acquires the spine lock.

Meanwhile on another thread:

  1. deductSweepCredit (on a g's stack) calls into sweepone, then (*mspan).sweep.
  2. (*mspan).sweep calls into (*gcSweepBuf).push.
  3. (*gcSweepBuf).push acquires the spine lock.
  4. (*gcSweepBuf).push calls into either persistentalloc or unlock.
  5. In the prologue of either of these functions, a stack growth is triggered, which acquires mheap_.lock.

Note that (*gcSweepBuf).push would have the potential for self-deadlock in the alloc_m case, but because it runs on the system stack, stack growths won't happen.

This must be an extremely rare deadlock because git history indicates that it's been around since 2016 and we've never received a single bug report (AFAICT). With that being said, if we want any sort of automated lock cycle detection, we need to fix this.
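For illustration, here is a minimal sketch of the kind of check an automated lock cycle detector might perform. It is not the runtime's implementation: the `lockGraph` type and its `acquire` and `hasCycle` methods are hypothetical names invented for this example, and it assumes (as the cycle above implies) that alloc_m holds mheap_.lock when it calls push. The sketch records a "held, then acquired" edge for each nested acquisition and reports a cycle once both orderings have been seen.

```go
package main

import "fmt"

// lockGraph is a hypothetical lock-order graph: g[a][b] is true when
// lock b was acquired while lock a was already held.
type lockGraph map[string]map[string]bool

// acquire records that lock next was taken while every lock in held was held.
func (g lockGraph) acquire(held []string, next string) {
	for _, h := range held {
		if g[h] == nil {
			g[h] = map[string]bool{}
		}
		g[h][next] = true
	}
}

// hasCycle reports whether the recorded orderings contain a cycle,
// using a depth-first search with three node states.
func (g lockGraph) hasCycle() bool {
	const (
		unvisited = iota
		inProgress
		done
	)
	state := map[string]int{}
	var visit func(n string) bool
	visit = func(n string) bool {
		switch state[n] {
		case inProgress:
			return true // back edge: n is already on the current path
		case done:
			return false
		}
		state[n] = inProgress
		for m := range g[n] {
			if visit(m) {
				return true
			}
		}
		state[n] = done
		return false
	}
	for n := range g {
		if visit(n) {
			return true
		}
	}
	return false
}

func main() {
	g := lockGraph{}

	// First thread: alloc_m holds mheap_.lock (assumed), push takes spineLock.
	g.acquire([]string{"mheap_.lock"}, "spineLock")
	fmt.Println(g.hasCycle()) // false: only one ordering seen so far

	// Second thread: push holds spineLock, stack growth takes mheap_.lock.
	g.acquire([]string{"spineLock"}, "mheap_.lock")
	fmt.Println(g.hasCycle()) // true: the two orderings form a cycle
}
```

Note that a checker like this flags the cycle from the observed orderings alone, even if the two threads never actually interleave badly, which is why a rarely triggered deadlock such as this one would still need to be fixed before such a detector could be enabled.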

It's unclear to me what the right thing to do here is. The "easy" thing would be to make (*gcSweepBuf).push run on the system stack so that it never triggers a stack growth, but this seems wrong. It feels better to instead only acquire spineLock after mheap_.lock, but this may not be possible. My concern is that the allocated span's sweepgen could end up skewed with respect to the gcSweepBuf it's in, but I haven't looked closely at the concurrency requirements of the relevant pieces.

CC @aclements

@mknyszek mknyszek modified the milestones: Go1.14, Go1.15 Sep 6, 2019
@mknyszek mknyszek added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Sep 6, 2019
@rsc rsc modified the milestones: Go1.14, Backlog Oct 9, 2019
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 7, 2022