archive/tar: Does not handle pax archives with members larger than 8GB #15573

Closed
dkolbly opened this issue May 6, 2016 · 2 comments

dkolbly commented May 6, 2016

  • I am using Go 1.6.2: go version go1.6.2 linux/amd64
  • Here's my full environment (go env)
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH=""
GORACE=""
GOROOT="/usr/local/go"
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GO15VENDOREXPERIMENT="1"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0"
CXX="g++"
CGO_ENABLED="1"
  • I created a 9GB file of NULs and archived it with tar -H pax on Linux:
dd if=/dev/zero bs=1000000 count=9000 of=big.dat
tar -H pax -cf big.tar big.dat

Then I scanned the tar file using the archive/tar library.

Here's the utility program I used to do so, also as a gist

package main

import (
    "archive/tar"
    "crypto/sha1"
    "encoding/hex"
    "fmt"
    "io"
    "os"
)

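func main() {
    // Expect exactly one argument: a tar file path, or "-" for stdin.
    if len(os.Args) != 2 {
        fmt.Fprintf(os.Stderr, "usage: %s <file.tar | ->\n", os.Args[0])
        os.Exit(2)
    }

    var src io.Reader

    if os.Args[1] == "-" {
        src = os.Stdin
    } else {
        s, err := os.Open(os.Args[1])
        if err != nil {
            panic(err)
        }
        defer s.Close()
        src = s
    }
    r := tar.NewReader(src)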
    for {
        hdr, err := r.Next()
        if err == io.EOF {
            fmt.Printf("done\n")
            break
        }
        if err != nil {
            panic(err)
        }
        fmt.Printf("%12d '%c' %s\n", hdr.Size, hdr.Typeflag, hdr.Name)
        hash := sha1.New()
        n, err := io.Copy(hash, r)
        if err != nil {
            panic(err)
        }
        digest := hex.EncodeToString(hash.Sum(nil))
        fmt.Printf("   ... copied %d bytes: %s\n", n, digest)
    }
}
  • I expected it to read all 9GB from the tar file and compute a SHA1 of 6e55... (i.e., the SHA1 of 9GB of NULs)

Something like this:

  9000000000 '0' big.dat
   ... copied 9000000000 bytes: 6e55e4710d345e664f11d2306b8400da36648971
done
  • What I actually saw was a report of a 9GB file member (correct), but no bytes read and the SHA1 of an empty string (wrong):
  9000000000 '0' big.dat
   ... copied 0 bytes: da39a3ee5e6b4b0d3255bfef95601890afd80709
done
  • Best guess about what is wrong...

I'm pretty sure the issue is that by the time we call mergePAX(), the numBytesReader has already been configured with the wrong size. Even though mergePAX fixes up hdr.Size, it's too late for curr, the numBytesReader, which is already a *regFileReader with nb=0.

bradfitz added this to the Go1.8Maybe milestone May 6, 2016
dsnet (Member) commented May 6, 2016

This is caused by #15564, where the general problem is that the PAX header (which records a size of 9GB) does not take precedence over the USTAR header (which records a size of 0).

You are right that we need to apply regFileReader again after mergePAX, similar to what's done here. For that change I had previously made an abstraction to consolidate the common logic in dealing with regular files.

@gopherbot

CL https://golang.org/cl/27454 mentions this issue.

gopherbot pushed a commit that referenced this issue Aug 25, 2016
Factor out the regular file handling logic into handleRegularFile
from nextHeader. We will need to reuse this logic when fixing #15573
in a future CL.

Factor out the sparse file handling logic into handleSparseFile.
Currently this logic is split between nextHeader (for GNU sparse
files) and Next (for PAX sparse files). Instead, we move this
related code into a single method.

There is no overall logic change. Thus, no unit tests.

Updates #15573 #15564

Change-Id: I3b8270d8b4e080e77d6c0df6a123d677c82cc466
Reviewed-on: https://go-review.googlesource.com/27454
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
golang locked and limited conversation to collaborators Sep 2, 2017
rsc unassigned dsnet Jun 23, 2022