Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os/exec: CommandContext does not forward the context's error on timeout #21880

Open
aruiz14 opened this issue Sep 14, 2017 · 12 comments
Open

os/exec: CommandContext does not forward the context's error on timeout #21880

aruiz14 opened this issue Sep 14, 2017 · 12 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@aruiz14
Copy link

aruiz14 commented Sep 14, 2017

Go version: Latest release - 1.9

Using CommandContext in combination with Context.WithTimeout shows very few information about the reason of the error when the execution times out.
Using a small variation of the CommandContext example
proves this:

package main

import (
        "context"
        "fmt"
        "os/exec"
        "time"
)

func main() {
        ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
        defer cancel()

        if err := exec.CommandContext(ctx, "sleep", "5").Run(); err != nil {
                fmt.Println("cmd: ", err)
                fmt.Println("ctx: ", ctx.Err())
        }
}

Output:

cmd:  signal: killed
ctx:  context deadline exceeded

Also, if the context is already done then the cmd.Run gets called, the error matches the context's error:

package main

import (
        "context"
        "fmt"
        "os/exec"
        "time"
)

func main() {
        ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
        defer cancel()
        time.Sleep(150 * time.Millisecond)

        if err := exec.CommandContext(ctx, "sleep", "5").Run(); err != nil {
                fmt.Println("cmd: ", err.Error())
                fmt.Println("ctx: ", ctx.Err().Error())
        }
}

Output:

cmd:  context deadline exceeded
ctx:  context deadline exceeded
@ianlancetaylor
Copy link
Contributor

What do you suggest should happen?

@ianlancetaylor ianlancetaylor added this to the Go1.10 milestone Sep 15, 2017
@aruiz14
Copy link
Author

aruiz14 commented Oct 20, 2017

I'm very sorry for the long delay. I missed the notification.
I would expect the error from Run() to match ctx.Err(), given that it's the actual cause of the command being aborted.
In case it helps, I'm using the following workaround:

	// return errors.Trace(cmd.Run())
	if err := cmd.Run(); err != nil {
		if ctx.Err() != nil {
			return errors.Annotate(err, ctx.Err().Error())
		}
		return errors.Trace(err)
	}

BTW thanks for adding this to the 1.10 milestone! :)

@mkmik
Copy link

mkmik commented Oct 20, 2017

I guess it makes sense that cmd.Run returns an exec.ExitError when a process has been actually created; otherwise one couldn't access e.g. os.ProcessState which might make sense even if the process has been killed because the context deadlined.

What if we the ExitError's Error() method was able to access the context error and render e.g.:

context deadline exceeded, signal: killed

@bcmills
Copy link
Contributor

bcmills commented Oct 23, 2017

This gets at a more general problem of chaining errors: ideally, I agree with @mmikulicic that Run should return an ExitError that shows both the immediate cause (the signal) and the secondary one (the cancellation).

It would be nice if we had some standard means to express that.
(CC: @dsnet, @rogpeppe, @davecheney, @crawshaw)

@bcmills
Copy link
Contributor

bcmills commented Oct 23, 2017

Also note that if you cancel a context.Context in general, you have no guarantee that the function you passed it to actually returns ctx.Err() even when you are only making calls within the same (Go) process. The convention of returning ctx.Err() is nice to follow, but not universally enforced.

In general, if you cancel a Context you should ignore whatever error is returned by the call you passed it to. (As a corollary, if you return an error you should not also log it: it is up to the caller to determine whether it is still relevant.)

@mkmik
Copy link

mkmik commented Oct 23, 2017

Does this mean it's up to the caller to do something with it, e.g.:

func Foo() error {
    if err := exec.CommandContext(ctx, "sleep", "5").Run(); err != nil {
        if err := ctx.Err(); err != nil {
            return err
        }
        return err
    }
}

or:

import "github.com/juju/errors"

func Foo() error {
    if err := exec.CommandContext(ctx, "sleep", "5").Run(); err != nil {
        if ctx.Err() != nil {
            err = errors.Wrap(err, ctx.Err())
        }
        return err
    }
}

or some custom (or yet to be defined) helper:

func Foo() error {
    if err := exec.CommandContext(ctx, "sleep", "5").Run(); err != nil {
        return WrapContextError(ctx, err) // passes err through when ctx is not cancelled
    }
}

?

@bcmills
Copy link
Contributor

bcmills commented Oct 23, 2017

Does this mean it's up to the caller to do something with it, e.g.:

None of the callers you're describing there have a Context parameter. I would expect to see something more like:

func Foo(ctx context.Context) (Bar, error) {
	out, err := exec.CommandContext(ctx, …).Output()
	if err != nil {
		return Bar{}, err
	}
	return parseBar(out)
}

…

func SomeOtherFunction(ctx context.Context) error {
	ctx, cancel := context.WithTimeout(ctx, veryLongWatchdogTimeout)
	defer cancel()
	bar, err := Foo(ctx)
	if err != nil {
		if ctx.Err() == context.DeadlineExceeded {
			return watchdogError(err)
		}
		return internalError(err)
	}
	return doSomethingWith(bar)
}

Honestly, though, I don't really understand most uses of context.WithTimeout to begin with. With explicit cancellation, it's much easier to know whether you intentionally canceled the operation (and thus whether you should ignore the details of the error).

@mkmik
Copy link

mkmik commented Oct 23, 2017

yeah I forgot declaring the signature in the Foo function

Honestly, though, I don't really understand most uses of context.WithTimeout to begin with. [...]

are you suggesting that it's better to explicitly set up a timer that calls cancel instead of having this functionality built in the context library?

@bcmills
Copy link
Contributor

bcmills commented Oct 23, 2017

are you suggesting that it's better to explicitly set up a timer that calls cancel instead of having this functionality built in the context library?

No, I'm suggesting that most callers should propagate cancellation (e.g. from an incoming http.Request, or from a signal as described in #21521) instead of applying arbitrary timeouts.

@mkmik
Copy link

mkmik commented Oct 23, 2017

ah, well, the parent timeout is preserved, I guess this is just about whether or not it makes sense to allow a call to further narrow down the timeout; we're getting off topic :-)

@odeke-em
Copy link
Member

How's it going here @aruiz14 @mmikulicic @bcmills? Just a gentle ping.

@ianlancetaylor ianlancetaylor modified the milestones: Go1.10, Unplanned Jan 3, 2018
@ianlancetaylor ianlancetaylor added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jan 3, 2018
@epels
Copy link

epels commented Feb 10, 2020

I agree it makes sense to return an error that indicates the process was terminated when a process is spawned but killed, rather than returning a context.DeadlineExceeded (even if that was the reason for termination).
However, I do think it is good to reflect this behaviour in the docs (or at least in the example for CommandContext). Possibly hint that callers may inspect context.Err() themselves to find whether the deadline has exceeded. What do you all think?

A vaguely related incosistency I'd like to point out: Cmd.Wait() returns an errorString, but Cmd.Run() returns ExitError. I feel it makes sense to return an ExitError from both methods.

npepinpe added a commit to camunda/zeebe that referenced this issue Feb 21, 2022
As shown in [this issue](golang/go#21880),
running a command with a deadline or timeout causes the process to be
killed properly, but the error returned is that it was killed, without
any more information. To know that it timed out, one has to use the
context itself via `context.Err()`. We now do this and also set the
timeout flag of the command to add a second, to ensure we can properly
detect time outs, as otherwise it's difficult to know why the command
exited since it always uses the same exit code on error.
npepinpe added a commit to camunda/zeebe that referenced this issue Feb 21, 2022
As shown in [this issue](golang/go#21880),
running a command with a deadline or timeout causes the process to be
killed properly, but the error returned is that it was killed, without
any more information. To know that it timed out, one has to use the
context itself via `context.Err()`. We now do this and also set the
timeout flag of the command to add a second, to ensure we can properly
detect time outs, as otherwise it's difficult to know why the command
exited since it always uses the same exit code on error.
npepinpe added a commit to camunda/zeebe that referenced this issue Feb 21, 2022
As shown in [this issue](golang/go#21880),
running a command with a deadline or timeout causes the process to be
killed properly, but the error returned is that it was killed, without
any more information. To know that it timed out, one has to use the
context itself via `context.Err()`. We now do this and also set the
timeout flag of the command to add a second, to ensure we can properly
detect time outs, as otherwise it's difficult to know why the command
exited since it always uses the same exit code on error.
npepinpe added a commit to camunda/zeebe that referenced this issue Feb 23, 2022
As shown in [this issue](golang/go#21880),
running a command with a deadline or timeout causes the process to be
killed properly, but the error returned is that it was killed, without
any more information. To know that it timed out, one has to use the
context itself via `context.Err()`. We now do this and also set the
timeout flag of the command to add a second, to ensure we can properly
detect time outs, as otherwise it's difficult to know why the command
exited since it always uses the same exit code on error.
LawnGnome added a commit to sourcegraph/src-cli that referenced this issue Jun 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

6 participants