Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

archive/tar: invalid header (file worked in Go 1.9, broke in Go 1.10+) #28843

Closed
mitchellh opened this issue Nov 17, 2018 · 9 comments
Closed
Labels
FrozenDueToAge WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.

Comments

@mitchellh
Copy link

mitchellh commented Nov 17, 2018

What version of Go are you using (go version)?

$ go version
go version go1.11.2 darwin/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/mitchellh/Library/Caches/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/mitchellh/code/go"
GOPROXY=""
GORACE=""
GOROOT="/Users/mitchellh/code/3rdparty/go"
GOTMPDIR=""
GOTOOLDIR="/Users/mitchellh/code/3rdparty/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/25/vnk5bx1d5j1_bvn0p3_mgbt80000gn/T/go-build271954328=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

Input file is this: https://gist.github.com/mitchellh/e89242770d34660a2bc627557b6449c5
(base64 encoded tar file)

The file in question has extended headers. It is identical to the file added in this commit:
hashicorp/go-getter@668c725#diff-0d8d3e8dc13e64092481d7dd63191a0f

tarR := tar.NewReader(input)
_, err := tarR.Next()
panic(err)

What did you expect to see?

No error.

What did you see instead?

Error: archive/tar: invalid tar header

This was working with Go 1.9, broke in Go 1.10, and is still broken in Go 1.11.

I'm unsure if this was intended or not. I tried looking through various archive/tar issues and couldn't find anything definitive and I'm not super familiar with the variety of tar header formats.

And note: this works fine with BSD tar.

mitchellh added a commit to hashicorp/go-getter that referenced this issue Nov 17, 2018
Go 1.10 and 1.11 can't even parse the tar header anymore (errors with
"invalid header"). I'm not sure if this is a bug so I opened an issue
here: golang/go#28843

/cc @jbardin
@agnivade
Copy link
Contributor

Hi @mitchellh, thank you for filing the issue ! Could you kindly paste a complete code sample so that it is easy to debug this issue ?

/cc @dsnet @mvdan for further comments.

@mitchellh
Copy link
Author

mitchellh commented Nov 17, 2018

Sure! Expects "extended_header.tar" in the cwd. Made no attempt at cleanup for obvious reasons:

package main

import (
	"archive/tar"
	"os"
)

func main() {
	input, err := os.Open("extended_header.tar")
	if err != nil {
		panic(err)
	}

	tarR := tar.NewReader(input)
	_, err = tarR.Next()
	panic(err) // should be nil, but is "invalid header"
}

@dsnet
Copy link
Member

dsnet commented Nov 17, 2018

Hi @mitchellh, thanks for the reproduction, made it easy to debug.

Here's a hexdump of the first 1024 bytes:

00000000  70 61 78 5f 67 6c 6f 62  61 6c 5f 68 65 61 64 65  |pax_global_heade|
00000010  72 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |r...............|
00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000060  00 00 00 00 30 30 30 30  36 36 36 00 30 30 30 30  |....0000666.0000|
00000070  30 30 30 00 30 30 30 30  30 30 30 00 30 30 30 30  |000.0000000.0000|
00000080  30 30 30 30 30 36 33 00  30 30 30 30 30 30 30 30  |0000063.00000000|
00000090  30 30 30 00 30 31 32 36  34 32 00 20 67 00 00 00  |000.012642. g...|
000000a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000100  00 75 73 74 61 72 00 30  30 00 00 00 00 00 00 00  |.ustar.00.......|
00000110  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000140  00 00 00 00 00 00 00 00  00 30 30 30 30 30 30 30  |.........0000000|
00000150  00 30 30 30 30 30 30 30  00 00 00 00 00 00 00 00  |.0000000........|
00000160  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000200  35 32 20 63 6f 6d 6d 65  6e 74 3d 37 63 39 32 37  |52 comment=7c927|
00000210  65 62 30 63 33 65 32 36  31 61 31 34 65 38 62 66  |eb0c3e261a14e8bf|
00000220  62 64 33 36 35 64 31 34  65 38 39 66 38 65 37 38  |bd365d14e89f8e78|
00000230  63 62 31 00 00 00 00 00  00 00 00 00 00 00 00 00  |cb1.............|
00000240  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*

At offsets 0x7c..0x88, you see the string "00000000063", which indicates that the file size is 51 bytes (encoded in octal). If you skip down to offsets 0xc8..0xe9, you'll notice that it's a string of "52 comment=7c927...". This claims to say that there is a PAX record of 52 bytes in length (PAX is weird where the size field counts the size field itself). However, this length is bogus since it is larger than the entire length of the data section (reported as 51 bytes earlier).

How was this file generated? If it was by archive/tar, then that is a bug we should fix. If not, then we'll probably close this as working-as-intended. As an additional data point, GNU tar also rejects this file.

@mitchellh
Copy link
Author

Thanks for the detailed write up!

@jbardin do you recall how you made this file? (He made it, not me) I'll ping him on Slack next week too if he doesn't respond here. 😄

@agnivade agnivade added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Nov 17, 2018
@jbardin
Copy link
Contributor

jbardin commented Nov 17, 2018

I'm fairly certain I used git to create this. The git-archive command by default stores the commit ID in a global extended pax header, which was being seen as a file causing the original bug. FWIW bsd and gnu tar handle this example file correctly as well.

@jbardin
Copy link
Contributor

jbardin commented Nov 17, 2018

Oh wait, @dsnet just mentioned that gnu tar does not accept this file. I'll see if I can find out any more info here.

@dsnet
Copy link
Member

dsnet commented Nov 18, 2018

What version of git are you using? At least on v1.9.1, it doesn't have this issue:

$ git version
git version 1.9.1
$ git archive --format=tar HEAD | head -c 1024 | hexdump -C
00000000  70 61 78 5f 67 6c 6f 62  61 6c 5f 68 65 61 64 65  |pax_global_heade|
00000010  72 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |r...............|
00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000060  00 00 00 00 30 30 30 30  36 36 36 00 30 30 30 30  |....0000666.0000|
00000070  30 30 30 00 30 30 30 30  30 30 30 00 30 30 30 30  |000.0000000.0000|
00000080  30 30 30 30 30 36 34 00  31 33 33 37 33 31 35 32  |0000064.13373152|
00000090  37 32 37 00 30 30 31 34  35 32 34 00 67 00 00 00  |727.0014524.g...|
000000a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000100  00 75 73 74 61 72 00 30  30 72 6f 6f 74 00 00 00  |.ustar.00root...|
00000110  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000120  00 00 00 00 00 00 00 00  00 72 6f 6f 74 00 00 00  |.........root...|
00000130  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000140  00 00 00 00 00 00 00 00  00 30 30 30 30 30 30 30  |.........0000000|
00000150  00 30 30 30 30 30 30 30  00 00 00 00 00 00 00 00  |.0000000........|
00000160  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000200  35 32 20 63 6f 6d 6d 65  6e 74 3d 61 33 31 36 34  |52 comment=a3164|
00000210  39 64 33 33 31 32 38 65  63 61 33 65 30 35 61 39  |9d33128eca3e05a9|
00000220  35 38 66 62 30 34 36 31  31 62 38 30 34 65 34 33  |58fb04611b804e43|
00000230  37 65 62 0a 00 00 00 00  00 00 00 00 00 00 00 00  |7eb.............|
00000240  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*

Here you see that it correctly reports the filesize as 064, which is 52 bytes in octal, matching the PAX record in the body.

@jbardin
Copy link
Contributor

jbardin commented Nov 19, 2018

Hi @dsnet, thanks for verifying that.
I'm not sure what version of git was used originally, but if current versions are working correctly, I think we can just update our test file accordingly.

I've re-created the test file with git 2.18.0, and our tests now pass.
Thanks again!

@dsnet
Copy link
Member

dsnet commented Nov 19, 2018

Copy. Closing as "working as intended" then.

@dsnet dsnet closed this as completed Nov 19, 2018
@golang golang locked and limited conversation to collaborators Nov 19, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
Projects
None yet
Development

No branches or pull requests

5 participants