archive/tar: io.copy hangs when using io.pipe and tarReader #53110

Closed
jquick opened this issue May 27, 2022 · 4 comments

@jquick

jquick commented May 27, 2022

What version of Go are you using (go version)?

go version go1.18.2 darwin/arm64
go version go1.17.6 darwin/arm64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
❯ go env
GO111MODULE=""
GOARCH="arm64"
GOBIN="/Users/jared.quick/go/bin"
GOCACHE="/Users/jared.quick/Library/Caches/go-build"
GOENV="/Users/jared.quick/Library/Application Support/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/jared.quick/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/jared.quick/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/opt/homebrew/Cellar/go/1.18.2/libexec"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/opt/homebrew/Cellar/go/1.18.2/libexec/pkg/tool/darwin_arm64"
GOVCS=""
GOVERSION="go1.18.2"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/Users/jared.quick/dev/test_pipes/go.mod"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/yq/hv1ztnz90dg3kw91rn3m_gym0000gq/T/go-build3766060246=/tmp/go-build -gno-record-gcc-switches -fno-common"
(Also tested on an amd64 machine.)

What did you do?

In this small example I show three cases:

  1. Reading a raw tar file through an io.Pipe - works
  2. Reading a tar file via tar.Reader from a bytes.Buffer - works
  3. Reading a tar file via tar.Reader through an io.Pipe - does not work

https://go.dev/play/p/cDqxssuAaT2

What did you expect to see?

The io.Pipe should close after the file has been fully read to EOF when using tar.Reader with io.Pipe.

What did you see instead?

io.Copy hangs indefinitely after EOF is reached and the pipe is empty.

@Jorropo
Member

Jorropo commented May 28, 2022

This behavior sounds correct to me.
Your code seems wrong.

// reading tarReader bytes using a io.pipe from file
func doesNotWork() error {
 	pr, pw := io.Pipe()
 	errC := make(chan error, 1)
 	go func() {
-		defer func() { _ = pw.Close() }()
 		errC <- readTar(DummyTar, pw)
 	}()

 	tarR := tar.NewReader(pr)
 	bArray := make([]byte, 1)
 	for {
 		_, err := tarR.Read(bArray)
 		if err == io.EOF {
 			fmt.Println("EOF reached!")
 			break
 		} else if err != nil {
 			return err
 		}
 	}
+	pw.Close()
 	<-errC
 	fmt.Println("doesNotWork finished successfully")
 	return nil
 }

This works.

As far as I can tell, reading the file from tarR doesn't fully exhaust the underlying reader. That happens because (*tar.Reader).Read returns io.EOF based on the length recorded in the tar metadata, not because the underlying reader returned EOF.
Copying the data into a buffer first would have the same effect.
Calling tarR.Next() also fixes it by exhausting the underlying reader.

It could also be that some padding data is left in the underlying reader and that tarR.Next() flushes it out, but my memory of the tar format is unclear and I don't really remember trailing padding bytes.
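
To make the point above concrete, here is a minimal sketch (not taken from the playground link) that builds a one-file archive in memory and counts the bytes still sitting in the underlying reader at two points: right after the entry's data has been read to io.EOF, and after Next() has returned io.EOF. The helper name leftoverBytes and the file name hello.txt are made up for illustration, and the archive is produced by Go's own tar.Writer, so archives from other tools may behave differently.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
)

// leftoverBytes (hypothetical helper) builds a one-file archive in memory and
// reports how many bytes remain unread in the underlying buffer
// (a) after the entry's data has been read to io.EOF and
// (b) after Next has returned io.EOF.
func leftoverBytes() (afterRead, afterNext int, err error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	data := []byte("hello")
	hdr := &tar.Header{Name: "hello.txt", Mode: 0o600, Size: int64(len(data))}
	if err = tw.WriteHeader(hdr); err != nil {
		return 0, 0, err
	}
	if _, err = tw.Write(data); err != nil {
		return 0, 0, err
	}
	if err = tw.Close(); err != nil { // writes padding and the end-of-archive blocks
		return 0, 0, err
	}

	tr := tar.NewReader(&buf)
	if _, err = tr.Next(); err != nil { // position on the first (only) entry
		return 0, 0, err
	}
	if _, err = io.Copy(io.Discard, tr); err != nil { // read the entry to io.EOF
		return 0, 0, err
	}
	afterRead = buf.Len() // bytes the tar.Reader has not consumed yet

	if _, err = tr.Next(); err != io.EOF {
		return 0, 0, fmt.Errorf("expected io.EOF from Next, got %v", err)
	}
	afterNext = buf.Len()
	return afterRead, afterNext, nil
}

func main() {
	afterRead, afterNext, err := leftoverBytes()
	if err != nil {
		panic(err)
	}
	fmt.Println("unread after entry EOF:", afterRead)
	fmt.Println("unread after Next EOF: ", afterNext)
}
```

With an io.Pipe in place of the bytes.Buffer, any bytes counted in the first number are still pending in the writer's Write call, which is why the writing goroutine cannot finish until something reads them.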

@jquick
Author

jquick commented May 28, 2022

@Jorropo Thanks for your help with this! You're right, I found 512 bytes in the pipe reader that needed to be drained after the tar io.EOF was reached. You can use pr.Read() or tarR.Next() after the tar io.EOF to fully flush the pipe, and then io.Copy will return as expected.

It is still a little odd to me that tarR.Read() will not flush the underlying reader after EOF but tarR.Next() will. I can add another tarR.Next() in my code after fully reading the tar to fix this, but it seems a little confusing. I wonder if there is any merit to automatically draining the underlying reader when tar.Reader hits io.EOF.

@seankhliao
Member

This is an incorrect use of archive/tar.Reader.
Next() has to be called before reading any entry, including the first, and it is expected to be called until it returns io.EOF.
This is less obvious in your example because it uses zero-sized files as input.
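
For reference, the usage pattern described here can be sketched as below. This is only an illustration: buildArchive, readEntries and roundTrip are hypothetical helpers, and the archive is built in memory with tar.Writer rather than taken from the original report.

```go
package main

import (
	"archive/tar"
	"bytes"
	"errors"
	"fmt"
	"io"
)

// buildArchive (hypothetical helper) writes name/content pairs into an
// in-memory tar archive.
func buildArchive(files [][2]string) ([]byte, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	for _, f := range files {
		hdr := &tar.Header{Name: f[0], Mode: 0o600, Size: int64(len(f[1]))}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := tw.Write([]byte(f[1])); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// readEntries calls Next before reading each entry, including the first,
// and keeps calling it until it returns io.EOF.
func readEntries(r io.Reader) (map[string]string, error) {
	out := make(map[string]string)
	tr := tar.NewReader(r)
	for {
		hdr, err := tr.Next()
		if errors.Is(err, io.EOF) {
			return out, nil // end of archive
		}
		if err != nil {
			return nil, err
		}
		body, err := io.ReadAll(tr) // reads only the current entry
		if err != nil {
			return nil, err
		}
		out[hdr.Name] = string(body)
	}
}

// roundTrip (hypothetical helper) builds an archive and reads it back.
func roundTrip(files [][2]string) (map[string]string, error) {
	archive, err := buildArchive(files)
	if err != nil {
		return nil, err
	}
	return readEntries(bytes.NewReader(archive))
}

func main() {
	entries, err := roundTrip([][2]string{{"a.txt", "alpha"}, {"b.txt", "beta"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(len(entries), entries["a.txt"], entries["b.txt"])
}
```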

Closing as not a bug

@jquick
Author

jquick commented May 28, 2022

For anyone who comes across this issue: I still see this hang occasionally with some tar files (I will try to upload an example), even when explicitly calling tarR.Next() until it returns io.EOF. The only workaround I have found is to drain the pipe reader manually afterwards. tarR.Next() just keeps returning io.EOF even if there are still bytes in the pipe.
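
A rough sketch of that manual drain is below. One plausible explanation, which nothing in this thread confirms, is that some tools pad the archive with extra zero blocks beyond the end-of-archive marker (GNU tar, for example, pads to its record size), and tar.Reader stops reading once it has seen the marker. The example simulates that by appending 8192 zero bytes to an archive written by tar.Writer. The names paddedArchive and readAndDrain, and the 8192 figure, are assumptions made for the illustration.

```go
package main

import (
	"archive/tar"
	"bytes"
	"errors"
	"fmt"
	"io"
)

// paddedArchive (hypothetical helper) builds a one-file archive and appends
// extra zero bytes after the end-of-archive marker to simulate trailing padding.
func paddedArchive(extra int) ([]byte, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	body := []byte("hi")
	hdr := &tar.Header{Name: "a.txt", Mode: 0o600, Size: int64(len(body))}
	if err := tw.WriteHeader(hdr); err != nil {
		return nil, err
	}
	if _, err := tw.Write(body); err != nil {
		return nil, err
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	buf.Write(make([]byte, extra)) // simulated trailing padding
	return buf.Bytes(), nil
}

// readAndDrain streams the archive through an io.Pipe, reads every entry,
// then drains whatever tar.Reader left in the pipe so the writing goroutine
// can finish. It returns the number of drained bytes.
func readAndDrain(archive []byte) (int64, error) {
	pr, pw := io.Pipe()
	errC := make(chan error, 1)
	go func() {
		_, err := pw.Write(archive) // blocks until the reader consumes everything
		pw.CloseWithError(err)      // with a nil err, readers see io.EOF
		errC <- err
	}()

	tr := tar.NewReader(pr)
	for {
		_, err := tr.Next()
		if errors.Is(err, io.EOF) {
			break
		}
		if err != nil {
			pr.CloseWithError(err) // unblock the writer before returning
			return 0, err
		}
		if _, err := io.Copy(io.Discard, tr); err != nil {
			pr.CloseWithError(err)
			return 0, err
		}
	}

	// Drain the bytes tar.Reader did not consume; without this the writer
	// stays blocked in Write and the receive on errC below never completes.
	drained, err := io.Copy(io.Discard, pr)
	if err != nil {
		return drained, err
	}
	return drained, <-errC
}

func main() {
	archive, err := paddedArchive(8192)
	if err != nil {
		panic(err)
	}
	drained, err := readAndDrain(archive)
	fmt.Println("drained bytes:", drained, "err:", err)
}
```

Draining with io.Copy into io.Discard avoids guessing how many bytes are left, which matters if the amount of padding varies between archives.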
