Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: gcWriteBarrier requires g.m.p != nil even if g.m.dying > 0 #26575

Closed
ianlancetaylor opened this issue Jul 24, 2018 · 3 comments
Closed
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. release-blocker
Milestone

Comments

@ianlancetaylor
Copy link
Contributor

ianlancetaylor commented Jul 24, 2018

A signal can occur on a thread with no p, that is, where g != nil && g.m != nil && g.m.p == nil. Thus, the signal hander must not have any write barriers. A signal can cause a panic, which the signal handler implements by calling startpanic_m. The code in startpanic_m is permitted to have write barriers, because write barriers are permitted, even if g.m.p == nil, if g.m.dying > 0. This check is made in wbBufFlush.

However, write barriers are currently implemented by calling gcWriteBarrier. And that function, written in assembler, assumes that g.m.p != nil. So when startpanic_m has a write barrier, which it does when setting _g_.writebuf = nil, we can get a segmentation violation while handling a signal.

I believe this can happen starting in the 1.10 release.

cc @aclements

@ianlancetaylor ianlancetaylor added NeedsFix The path to resolution is known, but the work has not been done. release-blocker labels Jul 24, 2018
@ianlancetaylor ianlancetaylor added this to the Go1.12 milestone Jul 24, 2018
@aclements
Copy link
Member

Yuck. I can think of a few solutions:

  1. Check if g.m.dying > 0 in gcWriteBarrier. This is the obvious one, but adds more instructions and another memory load to the write barrier hot path.

  2. Check if g.m.p is nil in gcWriteBarrier and jump to a slow path that checks g.m.dying. This adds two instructions to the hot path, but no more loads. The slow path could just be the flush path and we could do the detailed check in the Go code to keep the assembly simple.

  3. The _g_.writebuf = nil write is the only write barrier in startpanic_m. We could change how that works and then disallow write barriers in startpanic_m.

    1. We could just do that write without a write barrier, since the GC can't possibly transition any more once we've reached this point.
    2. We could ignore g.writebuf if g.m.dying > 0.
    3. If we cleaned up tracebacking (which I started to do but got stuck on write barriers, coincidentally), we might be able to get rid of g.writebuf. The sole user is runtime.Stack. That would also shrink the g.

@ianlancetaylor
Copy link
Contributor Author

See #29305 for an actual occurrence of this problem.

@gopherbot
Copy link

Change https://golang.org/cl/154837 mentions this issue: runtime: avoid write barrier in startpanic_m

@golang golang locked and limited conversation to collaborators Dec 19, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. release-blocker
Projects
None yet
Development

No branches or pull requests

4 participants