Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os: FindProcess should document how to test if a process is alive since on UNIX it always returns a process #34396

Closed
srowles opened this issue Sep 19, 2019 · 13 comments · Fixed by ferrmin/go#645 · May be fixed by #60461
Closed
Labels
Documentation NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@srowles
Copy link

srowles commented Sep 19, 2019

What version of Go are you using (go version)?

$go version go1.12.6 linux/amd64

Does this issue reproduce with the latest release?

Yes, Documentation issue, see live doc at:

https://golang.org/pkg/os/#FindProcess

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
$ go env
GOARCH="amd64"
GOHOSTARCH="amd64"
GOHOSTOS="linux"

What did you do?

Tried to find out if a process exists using os.FindProcess()

Read the documentation to find that the FindProcess call always returns a nil error on unix systems so it doesn't really "find" the process:

On Unix systems, FindProcess always succeeds and returns a Process for the given pid, regardless of whether the process exists.

Found closed documentation ticket saying you can send a signal to find if the process really exists or not:

#14146 (comment)

What did you expect to see?

More helpful information in the documentation about how someone might actually test for a process existing, for example as a starter on Linux:

	// error is always nil on Unix systems, just creates a process object
	p, _ := os.FindProcess(pid)
	// send SIGCONT (on Linux) which is a safe signal to the process to test if it exists
	err = p.Signal(syscall.SIGCONT)

What did you see instead?

No help without how the user might find out that a process really exists. Just a comment that it doesn't actually tell you if the process exists:

On Unix systems, FindProcess always succeeds and returns a Process for the given pid, regardless of whether the process exists.
@toothrot
Copy link
Contributor

Hey @srowles, thanks for taking the time to file this.

I am not totally confident if the suggested example is within scope for os.FindProcess. /cc @bradfitz who chimed in last time this issue came up. #14146 (comment)

@toothrot toothrot added this to the Go1.14 milestone Sep 20, 2019
@toothrot toothrot added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Sep 20, 2019
@bradfitz
Copy link
Contributor

I'm not sure SIGCONT is necessarily safe to send. It might be stopped on purpose, and then you just resumed it.

Maybe we need some sort of portable os.(*Process).State method? But then it'd be tempting to use ProcessState as its return type, which might not be a great fit, despite the name. Notably, its ExitCode method says:

ExitCode returns the exit code of the exited process, or -1 if the process hasn't exited or was terminated by a signal.

That doesn't really let you know if it's still running or not. Perhaps we could add a new Running() bool method.

Then what you could do:

    p, _ := os.FindProcess(123)
    if p != nil && p.State().Running() {
              ....
    }
}

/cc @ianlancetaylor

@srowles
Copy link
Author

srowles commented Sep 20, 2019

I'd be happy with a new method, that would be preferable to the workaround of sending signals to try and determine the process state.

@odeke-em odeke-em changed the title os: documentation for os.FindProcess could be more helpful os: FindProcess should document how to test if a process is alive since on UNIX it always returns a process Sep 25, 2019
@rsc rsc modified the milestones: Go1.14, Backlog Oct 9, 2019
@odeke-em
Copy link
Member

Given the number of times this has come up (here and on forums such as Stackoverflow), the suggestions for sending signal 0 which is in the user's way and requires them to go Googling, this issue is haunting me a little and I can't sleep thinking about it. I think that perhaps we could do more so os.FindProcess should do some more work on the UNIXes instead of blindly returning an unchecked process.

I noticed that the now defunct CL https://codereview.appspot.com/7392048/ by Akshat Kumar which timed out had some of the changes that I was just thinking.

My suggestions here would be that:
a) On Linux, FreeBSD and Plan 9, we stat the /proc/target_pid file as per Linux /proc FreeBSD /proc Plan9 /proc
b) On Darwin/NetBSD, we can do a sysctl(CTL_KERN, KERN_PROC, KERN_PROC_PID, target_pid)
c) For other UNIXes that we haven't covered, we can seek contributions

and with that, perhaps we can turn this issue instead into a change/proposal for Go1.15.

Kindly pinging @ianlancetaylor @mdempsky @randall77 as well to ask if this might seem like something viable.

@ianlancetaylor
Copy link
Contributor

To me it seems like kind of an impossible problem. The program could test that the process exists, and then the process could exit, and then the program could try to do something with the process, but it wouldn't work. Or it would affect an entirely different process that was reusing the process ID. In general doing arbitrary things with a PID isn't safe on Unix systems.

@bradfitz
Copy link
Contributor

Not portable but Linux has pidfd: https://lwn.net/Articles/794707/

@odeke-em
Copy link
Member

To me it seems like kind of an impossible problem. The program could test that the process exists, and then the process could exit, and then the program could try to do something with the process, but it wouldn't work. Or it would affect an entirely different process that was reusing the process ID. In general doing arbitrary things with a PID isn't safe on Unix systems.

Indeed Ian, this is tricky if we are sending a signal to the process, but regardless of whether this might be racy or unsafe on UNIX systems, I think the same thing can happen on Windows but we strive hard to provide a FindProcess implementation for the user :) What we can do perhaps is document this unsafe behavior.

Not portable but Linux has pidfd: https://lwn.net/Articles/794707/

Cool, thanks for the reference, Brad!

@gopherbot
Copy link

Change https://golang.org/cl/211801 mentions this issue: os: implement FindProcess on Darwin instead of noop

rissson added a commit to goauthentik/authentik that referenced this issue Apr 28, 2023
This is the first step to handle configuration reloading. With those
changes, it is already possible to do so, by sending a SIGUSR2 signal to
the Go server process. The next step would be to watch for changes to
configuration files and call the Restart function of the GoUnicorn
instance.

SIGHUP is catched by the go server and forwarded as-is to gunicorn,
which causes it to restart its workers. However, that does not trigger
a reload of the Django settings, probably because they are already
loaded in the master, before creating any of the worker instances.

SIGUSR2, however, can be used to spawn a new gunicorn master process,
but handling it is a bit trickier. Please refer to Gunicorn's
documentation[0] for details, especially the "Upgrading to a new binary
on the fly" section.

As we are now effectively killing the gunicorn processed launched by the
server, we need to handle some sort of check to make sure it is still
running. That's done by using the already existing healthchecks, making
them useful not only for the application start, but also for its
lifetime. If a check is failed too many times in a given time period,
the gunicorn processed is killed (if necessary) and then restarted.

[0] https://docs.gunicorn.org/en/20.1.0/signals.html

Other relevant links and documentation:

Python library handling the processing swaping upon a SIGUSR2:
https://github.com/flupke/rainbow-saddle/

Golang cannot easily check if a process exists on Unix systems:
golang/go#34396

Signed-off-by: Marc 'risson' Schmitt <marc.schmitt@risson.space>
mpldr added a commit to mpldr-pulls/go that referenced this issue May 15, 2023
The current implementation of always returning a process can lead to
unexpected behavior and is generally not very helpful.

This commit adds a simple check whether the process is still alive or
not, as described in the POSIX standard, by attempting to send a null
signal to the process.

Fixes golang#34396

Link: https://pubs.opengroup.org/onlinepubs/9699919799/
mpldr added a commit to mpldr-pulls/go that referenced this issue May 15, 2023
The current implementation of always returning a process can lead to
unexpected behavior and is generally not very helpful.

This commit adds a simple check whether the process is still alive or
not, as described in the POSIX standard, by attempting to send a null
signal to the process.

Fixes golang#34396

Link: https://pubs.opengroup.org/onlinepubs/9699919799/
@gopherbot
Copy link

Change https://go.dev/cl/494895 mentions this issue: os: add a FindProcess implementation for UNIX

mpldr added a commit to mpldr-pulls/go that referenced this issue May 15, 2023
The current implementation of always returning a process can lead to
unexpected behavior and is generally not very helpful.

This commit adds a simple check whether the process is still alive or
not, as described in the POSIX standard, by attempting to send a null
signal to the process.

Fixes golang#34396

Link: https://pubs.opengroup.org/onlinepubs/9699919799/
mpldr added a commit to mpldr-pulls/go that referenced this issue May 15, 2023
The current implementation of always returning a process can lead to
unexpected behavior and is generally not very helpful.

This commit adds a simple check whether the process is still alive or
not, as described in the POSIX standard, by attempting to send a null
signal to the process.

Fixes golang#34396

Link: https://pubs.opengroup.org/onlinepubs/9699919799/functions/kill.html
@mitar
Copy link
Contributor

mitar commented May 20, 2023

I would also mention that blockUntilWaitable already does the necessary check, it does waitid with WNOWAIT flag. Isn't this enough to determine if process exist if you combine it with WNOHANG as well? So I think FindProcess should do this before returning.

mitar added a commit to mitar/go that referenced this issue May 26, 2023
Using pidfd allows us to have a handle on the process and
poll the handle to non-blocking wait for the process to
exit.

Fixes golang#34396
Fixes golang#60321
Fixes golang#60320
@mitar
Copy link
Contributor

mitar commented May 26, 2023

I made #60461 with implementation which uses pidfd.

mitar added a commit to mitar/go that referenced this issue May 26, 2023
Using pidfd allows us to have a handle on the process and
poll the handle to non-blocking wait for the process to
exit.

Fixes golang#34396
Fixes golang#60321
Fixes golang#60320
@gopherbot
Copy link

Change https://go.dev/cl/498615 mentions this issue: os/exec: use pidfd for waiting and signaling of processes

mitar added a commit to mitar/go that referenced this issue May 26, 2023
Using pidfd allows us to have a handle on the process and
poll the handle to non-blocking wait for the process to
exit.

Fixes golang#34396
Fixes golang#60321
Fixes golang#60320
@gopherbot
Copy link

Change https://go.dev/cl/502815 mentions this issue: os/exec: document a method to check if a process is alive

mpldr added a commit to mpldr-pulls/go that referenced this issue Jun 16, 2023
mpldr added a commit to mpldr-pulls/go that referenced this issue Jun 16, 2023
@dmitshur dmitshur added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Jun 17, 2023
@dmitshur dmitshur modified the milestones: Backlog, Go1.21 Jun 17, 2023
rissson added a commit to goauthentik/authentik that referenced this issue Aug 25, 2023
This is the first step to handle configuration reloading. With those
changes, it is already possible to do so, by sending a SIGUSR2 signal to
the Go server process. The next step would be to watch for changes to
configuration files and call the Restart function of the GoUnicorn
instance.

SIGHUP is catched by the go server and forwarded as-is to gunicorn,
which causes it to restart its workers. However, that does not trigger
a reload of the Django settings, probably because they are already
loaded in the master, before creating any of the worker instances.

SIGUSR2, however, can be used to spawn a new gunicorn master process,
but handling it is a bit trickier. Please refer to Gunicorn's
documentation[0] for details, especially the "Upgrading to a new binary
on the fly" section.

As we are now effectively killing the gunicorn processed launched by the
server, we need to handle some sort of check to make sure it is still
running. That's done by using the already existing healthchecks, making
them useful not only for the application start, but also for its
lifetime. If a check is failed too many times in a given time period,
the gunicorn processed is killed (if necessary) and then restarted.

[0] https://docs.gunicorn.org/en/20.1.0/signals.html

Other relevant links and documentation:

Python library handling the processing swaping upon a SIGUSR2:
https://github.com/flupke/rainbow-saddle/

Golang cannot easily check if a process exists on Unix systems:
golang/go#34396

Signed-off-by: Marc 'risson' Schmitt <marc.schmitt@risson.space>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment