Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os/exec: Wait behavior changed from Go 1.9 to Go 1.10 on Windows #25835

Closed
rsc opened this issue Jun 11, 2018 · 26 comments
Closed

os/exec: Wait behavior changed from Go 1.9 to Go 1.10 on Windows #25835

rsc opened this issue Jun 11, 2018 · 26 comments
Labels
FrozenDueToAge help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Windows
Milestone

Comments

@rsc
Copy link
Contributor

rsc commented Jun 11, 2018

Copying from #23019 (comment)

Here is the reproduction as a main function with correct markdown :-)
THX @ysmolsky!

Result does not change: works with go 1.9, breaks with go 1.10.

package main

import (
	"bufio"
	"context"
	"fmt"
	"os"
	"os/exec"
	"strings"
	"syscall"
	"time"
)

func fatal(format string, args ...interface{}) {
	fmt.Println(fmt.Sprintf(format, args...))
	os.Exit(1)
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()

	cmd := exec.CommandContext(ctx, "sh", "-c", "echo hello world && sleep 5")
	stdoutPipe, err := cmd.StdoutPipe()
	if err != nil {
		fatal("no pipe: %v", err)
	}
	start := time.Now()

	if err = cmd.Start(); err != nil {
		fatal("start failed: %v", err)
	}

	go func() {
		var stdout string
		buf := bufio.NewReader(stdoutPipe)
		for {
			line, err := buf.ReadString('\n')
			if len(line) > 0 {
				stdout = stdout + line + "\n"
			}
			if err != nil {
				if !strings.HasPrefix(stdout, "hello world") {
					fatal("wrong output: %q", stdout)
				}
				return
			}
		}
	}()

	err = cmd.Wait()
	d := time.Since(start)

	if err != nil {
		exiterr := err.(*exec.ExitError)
		status := exiterr.Sys().(syscall.WaitStatus)
		if status.ExitStatus() == 0 {
			fatal("wrong exit status: %v", status.ExitStatus())
		}
	}

	if d.Seconds() >= 3 {
		fatal("Cancelation took too long: %v", d)
	}
	fmt.Println("Success!")
}
@rsc rsc added this to the Go1.11 milestone Jun 11, 2018
@rsc rsc added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jun 11, 2018
@rsc
Copy link
Contributor Author

rsc commented Jun 11, 2018

/cc @ianlancetaylor

@dsymonds
Copy link
Contributor

Ah, I have also tripped over this recently! I have been observing the output of subprocesses writing to stderr (by the parent's os.Stderr being assigned to cmd.Stderr) happening after Wait has returned. For me it is logging, so not a big deal, but it was certainly confusing and I hadn't had time to track it down.

@dsymonds
Copy link
Contributor

Sorry, my mistake. My situation also has cmd.Stderr connected to a pipe like the reproduction above.

@ianlancetaylor
Copy link
Contributor

@kgadams I can't recreate the problem. On my GNU/Linux system, the example program shown above behaves the same with Go 1.9.6, 1.10.3, and tip. In all cases it prints Success!.

What system are you running on? What exact version of Go are you using? Thanks.

@ianlancetaylor ianlancetaylor added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Jun 13, 2018
@dsymonds
Copy link
Contributor

I've got what I believe is a simpler reproduction:

package main

import (
        "context"
        "fmt"
        "log"
        "os/exec"
        "time"
)

func main() {
        ctx, cancel := context.WithTimeout(context.Background(), time.Second)
        defer cancel()

        cmd := exec.CommandContext(ctx, "sh", "-c", "echo hello world && sleep 5")

        start := time.Now()
        output, err := cmd.CombinedOutput()
        d := time.Since(start)
        fmt.Printf("output: %q\n", output)
        fmt.Printf("err: %v\n", err)
        fmt.Printf("d: %v\n", d)

        if err == nil {
                log.Fatalf("command didn't fail after %v", d)
        }

        if d.Seconds() >= 3 {
                log.Fatalf("Cancelation took too long: %v", d)
        }
        fmt.Println("Success!")
}

This fails (with the "Cancelation took too long" message) on go version devel +e5f0c1f6c9 Thu Jun 7 07:00:46 2018 +0000 darwin/amd64. However, it also fails on go version go1.7.4 linux/amd64, so maybe it's been buggy all that time.

@dsymonds
Copy link
Contributor

The possible connection to the case reported up the top is this: if I change from cmd.CombinedOutput() to cmd.Run(), I get a signal: killed error after 1s (the context cancelation time). So it seems to be an issue with capturing the output.

@ianlancetaylor
Copy link
Contributor

I think we are seeing shell dependent behavior. What is sh on your PATH?

@dsymonds
Copy link
Contributor

sh is /bin/sh, which is GNU bash, version 3.2.57(1)-release. But does it matter? cmd.Wait() or cmd.CombinedOutput() shouldn't last much longer than the context provided to exec.CommandContext. In my real world case, I am executing a Mach-O binary, not sh.

@kgadams
Copy link

kgadams commented Jun 13, 2018

I tested all go/os combinations I could get my hands on, and I found that the version of @dsymonds fails consistently on all of them.
The original version of the test program however succeeds on

  • go version go1.9.7 windows/amd64
  • go version go1.9.3 windows/amd64
  • go version go1.10.3 linux/amd64
  • go version go1.9.7 linux/amd64
  • go version go1.10.3 linux/ppc64le
  • go version go1.9.7 linux/ppc64le

It fails on:

  • go version go1.10.3 windows/amd64
  • go version go1.10.1 windows/amd64

So it looks like a windows only problem to me. And somehow there is a difference between using CombinedOutput() and StdoutPipe()

FWIW: I run it with $GOROOT/bin/go run cmd.go under a git environment, so the sh used on Windows is
GNU bash, version 4.4.12(2)-release (x86_64-pc-msys)

@ianlancetaylor
Copy link
Contributor

There is definitely a significant difference between CombinedOutput and StdoutPipe. The former creates a goroutine to pull all the data from the pipe, and waits for that goroutine to complete, which means that it waits for the descriptor to close. That is #23019. StdoutPipe does not create a goroutine.

@ianlancetaylor
Copy link
Contributor

@kgadams Thanks for explaining that your failure is only on Windows. That makes a huge difference.

@ianlancetaylor ianlancetaylor added OS-Windows and removed WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. labels Jun 13, 2018
@ianlancetaylor
Copy link
Contributor

The reason @dsymonds 's program works for me is indeed a shell feature. With the shell I am using (bash 4.4.12(1)-release on GNU/Linux) the sleep 5 is executed directly using execve, rather than calling fork first. If I change the program to run echo hello world && sleep 5 && echo slept then I see the delay. This is simply #23019, and is presumably unrelated to the original program's failure on WIndows.

@ianlancetaylor
Copy link
Contributor

There is more about exec on Windows with child processes in #17245. I don't know why it would have changed between 1.9 and 1.10, though.

@ianlancetaylor ianlancetaylor changed the title os/exec: Wait behavior changed from Go 1.9 to Go 1.10 os/exec: Wait behavior changed from Go 1.9 to Go 1.10 on Windows Jun 13, 2018
@dsymonds
Copy link
Contributor

@ianlancetaylor: Thanks, but #23019 doesn't seem to have much to say about the provided context not being respected. In my real case, I'm running a real binary with a 5s context timeout, and it runs for several minutes. That's running on Linux, not Windows. Is there a bug open that would be relevant to that? I couldn't find one with a bit of searching.

@ianlancetaylor
Copy link
Contributor

@dsymonds The problem you are seeing--CombinedOutput does not return even though the context has been canceled--is exactly what #23019 is about. It's because the Wait does not complete until all the child processes have exited.

@crvv
Copy link
Contributor

crvv commented Aug 22, 2018

The behavior changing comes from 382d492

cmd.Wait() is waiting for closing the pipe reader at

c.closeDescriptors(c.closeAfterWait)

In Go1.10, (*os.File).Close is blocked by (*os.File).Read. This won't happen in Go1.9
This only happens on Windows because of #19098

(*os.File).Read is waiting for grandchildren, which is similar to #23019

https://golang.org/pkg/os/exec/#Cmd.StdoutPipe
The document says "it is incorrect to call Wait before all reads from the pipe have completed"
So maybe this is not a bug of Go stdlib.

@kgadams
Copy link

kgadams commented Aug 22, 2018

Does anybody know of a way to work around this problem on Windows?
I very much would like to have the same behavior on Windows with Go 1.10 then we have on Linux with Go 1.9 and Go 1.10 and on Windows with Go 1.9.
In my specific scenario I have no control about the grandchildren but I want to be able to cancel the shell reliably and quickly, even if some output from grandchildren is lost.

@ianlancetaylor
Copy link
Contributor

@kgadams This problem only arises when pointing the Stdout or Stderr field to something that isn't a *os.File, so you can always work around it by setting them to the writer returned by os.Pipe.

@kgadams
Copy link

kgadams commented Aug 22, 2018

@ianlancetaylor: First I thought it works, but in the real code it does not.
Also: Cmd.StdoutPipe() does just what you suggested as a workaround: it creates an os.Pipe() and assigns it to cmd.Stdout. Why would it make a difference if I do it manually?

@ianlancetaylor
Copy link
Contributor

Sure, use StdoutPipe. That is even better if it works for you.

If this doesn't solve the problem then I don't understand what the problem is.

@kgadams
Copy link

kgadams commented Aug 22, 2018

See the opening post in this thread. It uses StdoutPipe().
cmd.Wait() does not return after canceling the Context until the grandchild is dead on Windows/go1.10.
It returns immediately on Unix and with go1.9 on Windows.

@ianlancetaylor
Copy link
Contributor

Ah, OK, StdoutPipe does wait to close the reading end of the pipe before returning from Wait. So I was wrong: using StdoutPipe is not equivalent to using your own os.Pipe. But I still don't see why os.Pipe doesn't work, as long as you coordinate closing the pipe with reading from the pipe.

@crvv
Copy link
Contributor

crvv commented Aug 23, 2018

The problem is that Go doesn't use the runtime poller to operate pipe on Windows.
Pipes are treated as regular files.
For a pipe, (*os.File).Read() can block indefinitely, and it won't be interrupted by (*os.File).Close()

So the problem itself is how to

coordinate closing the pipe with reading from the pipe

How about make os/exec put the pipe into runtime poller, so that the Read() will be interrupted by Close()?

@ianlancetaylor
Copy link
Contributor

Can we just fix the os package to add pipes to the poller when you call os.Pipe? Is there any reason that wouldn't work?

@crvv
Copy link
Contributor

crvv commented Aug 23, 2018

That is #19098
(*os.File).Fd() won't work as expected.

StdoutPipe() also returns an *os.File. Changing os/exec just make that file's Fd() not work.

@ianlancetaylor ianlancetaylor modified the milestones: Go1.12, Unplanned Dec 20, 2018
crvv added a commit to crvv/go that referenced this issue Mar 4, 2019
When closing a pipe, use CancelIoEx to cancel pending I/O.
This makes concurrent Read and Write calls to return os.ErrClosed.

This change also enables pipe_test on Windows.

Fixes golang#28477, golang#25835

Change-Id: If52bb7d80895763488a61632e4682a78336e8420
crvv added a commit to crvv/go that referenced this issue Mar 4, 2019
When closing a pipe, use CancelIoEx to cancel pending I/O.
This makes concurrent Read and Write calls to return os.ErrClosed.

This change also enables some pipe tests on Windows.

Fixes golang#28477, golang#25835

Change-Id: If52bb7d80895763488a61632e4682a78336e8420
@gopherbot
Copy link

Change https://golang.org/cl/164721 mentions this issue: internal/poll, os: cancel pending I/O when closing pipes on Windows

@golang golang locked and limited conversation to collaborators Mar 18, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Windows
Projects
None yet
Development

No branches or pull requests

6 participants