Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

archive/zip: Reader.Read returns ErrChecksum on certain files #21876

Closed
ericpromislow opened this issue Sep 14, 2017 · 3 comments
Closed

archive/zip: Reader.Read returns ErrChecksum on certain files #21876

ericpromislow opened this issue Sep 14, 2017 · 3 comments
Labels
FrozenDueToAge help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@ericpromislow
Copy link

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go version go1.7.1 darwin/amd64

Does this issue reproduce with the latest release?

Don't know

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/pivotal/go"
GORACE=""
GOROOT="/usr/local/Cellar/go/1.7.1/libexec"
GOTOOLDIR="/usr/local/Cellar/go/1.7.1/libexec/pkg/tool/darwin_amd64"
CC="clang"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/60/czc9bh9j6w17gl13v_g9vxs80000gn/T/go-build433256889=/tmp/go-build -gno-record-gcc-switches -fno-common"
CXX="clang++"
CGO_ENABLED="1"

What did you do?

This is fun.

  1. Create a zip file on Windows, or any other platform where the
    directory entries don't have the 'd' bit set. Here's the one we worked on: https://github.com/cloudfoundry/cf-acceptance-tests/blob/master/assets/java-spring/java-spring.jar

  2. Copy bad-win.zip to orig-bad-win.zip, so you can run step 3 repeatedly:

  3. Run the zip file through some ruby code like this:

    require 'zip'
    def get_dirs_from_zip(zip_path)
    Zip::File.open(zip_path) do |in_zip|
    in_zip.select(&:directory?)
    end
    end

    dir_list = get_dirs_from_zip(zip_path)
    Zip::File.open(zip_file) do |in_zip|
    dir_list.each do |dir|
    in_zip.remove(dir)
    end
    end

  4. The go code is attached in a file called transform.go

  5. I ran this command (note that the code wants ok-win.tar to exist in advance):

touch ok-win.tar ; go run transform.go bad-win.zip ok-win.tar

What did you expect to see?

I expected to see a tar file called ok-win.tar

What did you see instead?

"zip: checksum error"

Notes

There are three likely sources for this error (4 but we're allowing winzip to not set the 'd' bit because zip is weird about directories anyway):

  1. Our ruby code
  2. The ruby zip library
  3. The go archive/zip library

Our ruby code seems ok, because it's not a problem with the zip files where the 'd' bit is set.

The error might be in the ruby zip library, but I can't really report it until I have a better idea
of the cause of the checksum error. I left some printf's in the code, so you can see the error occurs in the call io.Copy(tarWriter, zipFileReader), and not earlier calls that read the zip file.

@ericpromislow
Copy link
Author

transform.go.txt

@dsnet dsnet changed the title zip/reader.Reader.Read => ErrChecksum error on a particular kind of zip file archive/zip: Reader.Read returns ErrChecksum on certain files Sep 14, 2017
@dsnet dsnet self-assigned this Sep 14, 2017
@dsnet dsnet added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Sep 14, 2017
@dsnet dsnet added this to the Go1.10 milestone Sep 14, 2017
@rsc rsc modified the milestones: Go1.10, Go1.11 Nov 22, 2017
@ianlancetaylor ianlancetaylor modified the milestones: Go1.11, Unplanned Jun 13, 2018
@marcellanz
Copy link

marcellanz commented Feb 9, 2019

I've investigated this issue and extracted both the go and ruby code together with a test to reproduce the issue and a possible fix for the issuers ruby code here:
https://github.com/marcellanz/go-21876

what happens

IMO, this is a non-issue from Go's point of view, here is why:

I was able to reproduce the issue on "go version go1.11.5 darwin/amd64" with the *.jar file and ruby code provided to this issue.

Also, I was able to create a similar zip file using ZIP(1L) on macOS that behaved similarly with the ruby script provided and then processed with the go code of this issue.

What seems to be happening is the following:

  1. The given *.jar file contains data descriptor entries [4] for each entry in the zip file.

  2. The produced zip file by the ruby code [2] left bit 3 set on 1 of the general purpose bit flag that is part of the central directory header and that indicates the presence of a corresponding data descriptor entry for a file within the zip file…

  3. … but Ruby's zip implementation Rubyzip [5] does not write data descriptor entries at all [6] (at least for all non-encryption zip files IMO / see ::Zip::NullDecrypter) and ignores the general purpose bit flag on bit 3 that was copied over initially [7].

  4. Go's zip reader implementation reads the general purpose flag on bit 3 when reading the central directory header entry of every file in the zip file [8] and then expects to get to find a data descriptor entry after the corresponding file data section within the zip file.

  5. Ultimately Go's zip reader reads a wrong CRC [9] since no data descriptor entry is present and then emits ErrChecksum

ZIP file structure (simplified)

      [local file header 1]
      [file data         1]
      [data descriptor   1] <- these are missing (incl. crc) after Rubyzip modified the zip file
      ...
      [local file header n]
      [file data         n]
      [data descriptor   n]
      ...
      [central directory header 1] <- bit 3 on the general purpose flags is still set
      ...
      [central directory header n]

fix

To fix this issue on the issuers side, general purpose flags can be cleared.

e.gp_flags &= ~0x8

Also (I'm not a ruby expert) Rubyzip's zip output stream might have to be fixed for entries that where copied over from an existing zip file. There is an issue [10] on Rubyzip that is Open and indicates to be similar to Issue21876.

Similar to Go's zip reader, macOS's Archive Utility.app cannot open the ZIP file produced by Rubyzip.

issue reproduced

Here we produce a zip file with data descriptors included for zip file entries using "-fd" flag with ZIP(1L) on macOS. Then, as described in the issue, the ruby script is used to remove directories out of the zip file and the zip file is transformed to a tar file using Go.

Run the following script (found here: https://github.com/marcellanz/go-21876) to reproduce the issue:

cd src/org.golang.go.issues/21876
# see it fail
./run.sh
# see it not fail with cleared gp_flags (see below)
./run.sh -fixed

@dsnet
Copy link
Member

dsnet commented Feb 9, 2019

Hi @marcellanz, thank you for your wonderful analysis. It's thorough and seems to conclusively show that the issue is with buggy file generation in Ruby. Given that there is nothing actionable on Go's end, I'm closing the issue.

@dsnet dsnet closed this as completed Feb 9, 2019
@golang golang locked and limited conversation to collaborators Feb 9, 2020
@rsc rsc unassigned dsnet Jun 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

6 participants