Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: Windows profilem may profile Ms after unminit #50557

Open
prattmic opened this issue Jan 11, 2022 · 5 comments
Open

runtime: Windows profilem may profile Ms after unminit #50557

prattmic opened this issue Jan 11, 2022 · 5 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@prattmic
Copy link
Member

profileLoop partially synchronizes with unminit via mp.threadLock, but it drops threadLock prior to _SuspendThread and profilem.

Thus profiling may race with dropm, resulting in profilem running on an "extra" M which is no longer in use by Go and whose previous thread is running C code.

In theory, this should result in gFromSP returning nil (as logically there should not be a G at all since this isn't a Go thread), which may make sigprof crash, as it assumes gp != nil. However, in practice the SP likely still falls in the mp.g0 stack range and things end up not crashing, though with a non-meaningful profile sample.

I've been able to easily detect this condition in practice with a C thread repeatedly calling a Go function and profilem checking if mp.thread is still set.

cc @cherrymui

@prattmic prattmic added the NeedsFix The path to resolution is known, but the work has not been done. label Jan 11, 2022
@prattmic prattmic added this to the Backlog milestone Jan 11, 2022
@prattmic
Copy link
Member Author

It looks like preemptM has a similar issue. It synchronizes with "external code" using mp.preemptExtLock, which is set by cgocall, but is not set by dropm/unminit. Like profileLoop, preemptM does not fully synchronize with unminit via mp.threadLock. As a result, I believe that preemptM could attempt to preempt a C thread after unminit, which may call ExitProcess.

cc @aclements

@prattmic
Copy link
Member Author

prattmic commented Jan 13, 2022

I see a few different options here:

  1. Hold mp.threadLock until _SuspendThread completes (i.e., after _GetThreadContext), thus unminit can't run until we have stopped the thread.
    • I think this is the simplest approach, but does introduce possibility of contention on mp.threadLock. A preemptM call may have to wait a long time for profileLoop to unlock before it can continue.
    • On the other hand, maybe that is OK. In the standard case, preemptM is going to suspend the thread anyways, so in theory waiting for profileLoop too only makes preemptM take 2x as long.
  2. Same as (1), but add an atomic flag that profile suspend is in progress. preemptM immediately fails preemption if this flag is set.
  3. Clear mp.g0.stack on unminit and skip sigprof/preempt if gFromSP can't match up with a G.
    • I'm not a big fan of this because it still allows racing with needm picking up the M again and setting mp.g0.stack, which may not even be assigning the M to the same thread that we are profiling.
  4. For preemptM, we can set mp.preemptExtLock in unminit, which will make it skip uninitialized Ms as "external code".
    • For profilem, do nothing? We get nonsense profiling samples, and could crash sigprof with a nil G, but we could skip profiling entirely on a nil G.

@cherrymui
Copy link
Member

I think 4 is a good idea, for preemption. Not sure about profiling, maybe we could do 3. Maybe we put setting/unsetting stack bounds in the same lock. Or we could re-check mp.thread after the thread is suspended? (is there any ABA problem?)

Not sure about the implication of suspending thread with the lock held. I haven't thought it carefully, but it sounds a bit dangerous (for deadlock). Maybe it is fine.

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 7, 2022
@mknyszek
Copy link
Contributor

mknyszek commented Sep 6, 2022

2022-09-05T08:08:24-af7f417/windows-amd64-longtest

Is this exactly this issue? It looks really similar. A nil pointer dereference in profilem.

@prattmic
Copy link
Member Author

prattmic commented Sep 6, 2022

I believe that is #54885.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done.
Projects
Development

No branches or pull requests

4 participants