Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os: Process.Signal on the current PID should signal the current thread #19326

Open
bcmills opened this issue Feb 28, 2017 · 12 comments
Open

os: Process.Signal on the current PID should signal the current thread #19326

bcmills opened this issue Feb 28, 2017 · 12 comments
Labels
NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Milestone

Comments

@bcmills
Copy link
Contributor

bcmills commented Feb 28, 2017

The Go standard library currently lacks an equivalent to the standard C raise function.

The closest portable analogue is something like:

func raise(sig os.Signal) error {
  p, err := os.FindProcess(os.Getpid())
  if err != nil {
    return err
  }
  return p.Signal(sig)
}

Unfortunately, this pattern is somewhat error-prone: on many platforms (e.g. Linux), it can deliver the signal to any thread in the process group. If the purpose of raising the signal is to produce a stack trace or trigger the invocation of a C fatal-signal handler (e.g. by raising SIGQUIT or SIGABRT), that can result in the handler executing on a thread unrelated to the failure.

It's tempting to request a new function call, os.Raise, for this purpose, but it seems like we could address this use-case automatically without an API change. We could have (*os.Process).Signal check whether the Process in question is the Go program itself and raise the signal synchronously to the current thread (e.g. by using the tgkill syscall on Linux, or calling pthread_kill or raise in cgo-enabled builds).

@ianlancetaylor
Copy link
Contributor

This might be a good idea; I'm not sure. I just want to note that it is possible to write a program today that will work today but would fail with this suggestion. For example, if you block SIGHUP before starting a program, and the program then uses signal.Notify to look for SIGHUP signals, the program will be in a state in which most threads have SIGHUP blocked but one thread (the ensureSigM thread) will have SIGHUP unblocked. If you send a SIGHUP to the process, it will be delivered to that thread. If you send a SIGHUP to the currently running thread using tgkill, it will be put on the pending list until the thread unblocks SIGHUP, which in practice will never happen.

@bcmills
Copy link
Contributor Author

bcmills commented Feb 28, 2017

That's a good point. Perhaps we could kill the current thread if the signal is unblocked, and the overall process otherwise?

Do you have any opinion as to whether that would be better or worse than adding an explicit os.Raise function or similar?

@ianlancetaylor
Copy link
Contributor

Go programs normally interact with signals via the os/signal package, and of course we don't have to actually raise a signal to pretend that we received one. So the only possible purposes I can see for explicitly raising a signal are

  1. when linked with C, invoke a previously installed C signal handler if there is one;
  2. trigger a core dump;
  3. exit with the exit status indicating failure due to a signal.

You can get a core dump by calling debug.SetTraceback("crash") and then panicking.

To invoke a C signal handler or to set the exit status, I don't see why it matters which thread receives the signal.

So to me the argument for this does not yet seem compelling. You can always use syscall.Tgkill(-1, syscall.Gettid(), signo).

@bcmills
Copy link
Contributor Author

bcmills commented Feb 28, 2017

The use-case I have in mind is for invoking C handlers, particularly for fatal signals (such as SIGABRT).

It's nice to have the signal to arrive at the specific thread because the C handler may itself trigger a core dump (or otherwise capture a traceback of the signaled thread), and keeping the signal on the current thread ensures that that traceback is for the goroutine that actually encountered the problem.

You can always use syscall.Tgkill(-1, syscall.Gettid(), signo).

Indeed, although in practice it seems that I have to pass syscall.Getpid() instead of -1, and I believe I also need to call runtime.LockOSThread() before syscall.Gettid().

(There are certainly workarounds, but they aren't as portable.)

@ianlancetaylor
Copy link
Contributor

If you know for sure there is a C handler, then it would not be unreasonable to use cgo/SWIG to call the C function raise.

@nightlyone
Copy link
Contributor

I wonder how the requested behavior matches to the Go runtime assumption of "there are no threads visible to the user"?

If this is a feature to improve cooperation with C, it could be implemented in the C part of the hybrid program.

Given the above consideration, is this feature about the runtime being aware of the fact that some signals are not intended for the whole process, but individual threads? And then again how does this interact with the runtime assumption of "there are no user visible threads".?

So again the receiving part could be implemented in the C part of a CGO enabled program with a Go callback.

Now the question (which I didn't research yet) is: Does the Go runtime handle per thread signals well enough, on all platforms that support it, to allow both C helpers to be written?

Or is this whole feature intended to only support a dedicated thread reserved via runtime.LockOSThread?

@bcmills
Copy link
Contributor Author

bcmills commented Mar 1, 2017

If you know for sure there is a C handler, then it would not be unreasonable to use cgo/SWIG to call the C function raise.

Agreed. The use-case I have in mind is for reporting a major invariant violation in a Go library that may be used in pure Go programs, mostly-C programs (via cgo export), or anything in between. I do not want the library to add a libc dependency for Go programs that would not otherwise have one.

I cannot use panic for that purpose because it may be inadvertently recovered by the caller (e.g. if the invariant violation occurs within a function transitively called by a fmt.Stringer being passed to fmt.Sprint).

@bcmills
Copy link
Contributor Author

bcmills commented Mar 1, 2017

I wonder how the requested behavior matches to the Go runtime assumption of "there are no threads visible to the user"?

In a Go program that includes C code, threads are visible to the C portion of the program. The runtime can only assume that threads are not visible to the Go portion of the program.

Since a non-blocked signal sent to the process could be delivered to the current thread anyway, making it so that the signal is always sent to the current thread only changes the distribution of observed behaviors — it does not add a new behavior per se.

If this is a feature to improve cooperation with C, it could be implemented in the C part of the hybrid program.

The Go libraries in use by a program do not, in general, know whether "the C part of the hybrid program" even exists; see my previous comment.

Now the question (which I didn't research yet) is: Does the Go runtime handle per thread signals well enough, on all platforms that support it, to allow both C helpers to be written?

The Go runtime does handle per-thread signals reasonably, or can be made to do so. I do not see any viable way to implement the requested behavior using only C helpers, regardless of the runtime's ability to handle the signals.

Or is this whole feature intended to only support a dedicated thread reserved via runtime.LockOSThread?

An explicit call to runtime.LockOSThread should be unnecessary. With the proposed change, the os package would be responsible for doing any necessary locking / unlocking for the duration of the Signal call.

Current workarounds do require an explicit call to runtime.LockOSThread, but only to avoid the goroutine switching threads in between the calls to syscall.Gettid() and syscall.Tgkill.

@bradfitz bradfitz added this to the Go1.9Maybe milestone Mar 21, 2017
@bradfitz bradfitz modified the milestones: Go1.9Maybe, Go1.10 Jul 20, 2017
@bradfitz bradfitz modified the milestones: Go1.10, Go1.11 Nov 30, 2017
@bradfitz bradfitz added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Nov 30, 2017
@gopherbot gopherbot modified the milestones: Go1.11, Unplanned May 23, 2018
@rsc
Copy link
Contributor

rsc commented Jun 11, 2018

As @ianlancetaylor pointed out, C programs can call C.raise.
In general, while the C signal call does not guarantee which thread will get a signal, I believe nearly all operating systems implement a strong preference for the current thread. Do you know of important, common situations where the current thread doesn't get the signal with the current Process.Signal?

@rsc
Copy link
Contributor

rsc commented Jun 11, 2018

If not, seems like we should close this.

@bcmills
Copy link
Contributor Author

bcmills commented Jun 11, 2018

Do you know of important, common situations where the current thread doesn't get the signal with the current Process.Signal?

Yes: for programs that (a) have crash-reporting signal handlers that capture stack traces or core files (such as those used for mobile apps) and (b) exit by raising SIGABRT or a similar signal, the signal typically arrives either too late on the signaling thread or on a different thread entirely (I'm not sure which).

In the Google-internal bug report I received, by the time the signal arrived, the handler saw runtime.usleep and runtime.sysmon running on the signaled thread instead of the goroutine that raised the signal. (I can get you a reference to the internal bugreport if you'd like more detail.)

@bcmills
Copy link
Contributor Author

bcmills commented Jun 11, 2018

To be clear: I'm not sure whether the problem is due to asynchronous signal delivery or thread identity. tgkill seems to address both: it delivers the signal synchronously to the requested thread.

moroten added a commit to moroten/bb-storage that referenced this issue Aug 1, 2022
remote-apis-testing shows the following log in
https://gitlab.com/remote-apis-testing/remote-apis-testing/-/jobs/2773967861

2022/07/26 20:20:14 Received terminated signal. Initiating graceful shutdown.
2022/07/26 20:20:14 Graceful shutdown complete
panic: Raising the original signal didn't cause us to shut down

Apparently, the panic line was reached before the signal had been
handled and the process killed, probably due to a different thread is
receiving the signal. For more details, see
golang/go#19326
EdSchouten pushed a commit to buildbarn/bb-storage that referenced this issue Aug 2, 2022
remote-apis-testing shows the following log in
https://gitlab.com/remote-apis-testing/remote-apis-testing/-/jobs/2773967861

2022/07/26 20:20:14 Received terminated signal. Initiating graceful shutdown.
2022/07/26 20:20:14 Graceful shutdown complete
panic: Raising the original signal didn't cause us to shut down

Apparently, the panic line was reached before the signal had been
handled and the process killed, probably due to a different thread is
receiving the signal. For more details, see
golang/go#19326
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Projects
None yet
Development

No branches or pull requests

6 participants