archive/zip: cannot parse file header with compressed size or local file header offset of 0xffffffff #31692

AxbB36 · 2019-04-26T08:16:59Z

archive/zip misinterprets (I believe) APPNOTE.TXT 4.5.3, such that it wrongly requires a Zip64 Extended Information extra field to be present whenever the compressed size or local file header offset of a central directory header is exactly 0xffffffff. #14185 fixed the problem for the uncompressed size as a special case, but really there is nothing special about the uncompressed size and all three fields should be treated equally.

APPNOTE.TXT 4.5.3 says:

The order of the fields in the zip64 extended information record is fixed, but the fields MUST only appear if the corresponding Local or Central directory record field is set to 0xFFFF or 0xFFFFFFFF.

archive/zip interprets the statement as:

if a field is 0xffffffff:
    require zip64 extended information to be present

But that logic is backwards—it's an "only if", not an "if". I think the interpretation should rather be

if zip64 extended information is present:
    replace only those fields that are 0xffffffff

In other words, 0xffffffff, by itself, is not a magic value that indicates special handling is required. It is the presence of a Zip64 Extended Information extra field that indicates special handling, and only then does the value 0xffffffff become significant. 0xffffffff is a perfectly valid field value to have in a non-Zip64 file.
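
To make the second interpretation concrete, here is a minimal Go sketch of it. This is not archive/zip's actual code: the `header` type and the `applyZip64` function are hypothetical names invented for illustration, and the sketch ignores the disk start number field that a Zip64 record may also carry.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	zip64ExtraID = 0x0001     // header ID of the Zip64 Extended Information extra field
	uint32max    = 0xffffffff // significant only when a Zip64 field is present
)

// header holds the three central directory fields discussed in this issue.
// It is a hypothetical type, not part of archive/zip.
type header struct {
	USize, CSize, Offset uint64
}

var errShort = errors.New("zip64 extra field too short")

// applyZip64 scans the extra data for a Zip64 field. Only if one is found
// does it replace the fields equal to 0xffffffff, in the fixed order
// uncompressed size, compressed size, local file header offset.
func applyZip64(h header, extra []byte) (header, error) {
	for len(extra) >= 4 {
		id := binary.LittleEndian.Uint16(extra[0:2])
		size := int(binary.LittleEndian.Uint16(extra[2:4]))
		extra = extra[4:]
		if size > len(extra) {
			return h, errShort
		}
		body := extra[:size]
		extra = extra[size:]
		if id != zip64ExtraID {
			continue
		}
		for _, f := range []*uint64{&h.USize, &h.CSize, &h.Offset} {
			if *f != uint32max {
				continue // field holds a real value; nothing to replace
			}
			if len(body) < 8 {
				return h, errShort
			}
			*f = binary.LittleEndian.Uint64(body[:8])
			body = body[8:]
		}
		return h, nil
	}
	// No Zip64 field: every value, including 0xffffffff, is taken literally.
	return h, nil
}

func main() {
	h := header{USize: 5, CSize: 5, Offset: uint32max}
	got, err := applyZip64(h, nil)
	fmt.Printf("%#x %v\n", got.Offset, err)
}
```

With no extra data, the offset 0xffffffff comes back unchanged and without an error, which is the behavior argued for here.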

I'm attaching a zip file that demonstrates the problem, ffffffff.zip.gz.gz. (It is gzipped twice to reduce the size of the attachment, but the gzip layers have nothing to do with the issue and you should remove them before testing.) The zip file was produced by Info-ZIP Zip 3.0 and contains 2 files, with a maximum compressed/uncompressed size of 0xffffffde and a maximum local file header offset of 0xffffffff. Zip has decided to write a non-Zip64 zip file, as none of the values exceeds 0xffffffff. Info-ZIP UnZip 6.00 can parse the file, but archive/zip cannot. The sample file was created as follows:

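# Pad so that the second local file header lands at offset 0xffffffff:
# 216186 * 19867 = 0xffffffff - len("pad") - 30 (30 = fixed size of a local file header)
dd if=/dev/zero bs=216186 count=19867 of=pad
echo test > test.txt
rm -f ffffffff.zip
# -0 stores without compression; -X omits extra file attributes
zip -0 -X ffffffff.zip pad test.txt
gzip -9 < ffffffff.zip | gzip -9 > ffffffff.zip.gz.gz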

archive/zip doesn't have a problem if the local file header appears one byte earlier or later—the easiest way to test that is to use a 2- or 4-byte filename instead of "pad" in the recipe above. In the former case it's because the value is 0xfffffffe and in the latter case it's because the value is 0xffffffff but Zip64 information is present.
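
The offset arithmetic for the three filename lengths can be checked with a short Go sketch. It assumes the first entry is stored with no extra field and no data descriptor, so it occupies exactly 30 + name length + data length bytes; `secondHeaderOffset` is a hypothetical helper name.

```go
package main

import "fmt"

// localHeaderFixed is the size of the fixed part of a local file header.
const localHeaderFixed = 30

// secondHeaderOffset returns where the second local file header starts when
// the first entry is stored with no extra field (assumed layout).
func secondHeaderOffset(nameLen, dataLen uint64) uint64 {
	return localHeaderFixed + nameLen + dataLen
}

func main() {
	const padSize = 216186 * 19867 // bytes written by dd in the recipe
	for _, n := range []uint64{2, 3, 4} {
		fmt.Printf("name length %d: offset %#x\n", n, secondHeaderOffset(n, padSize))
	}
}
```

Under those assumptions the three offsets are 0xfffffffe, 0xffffffff, and 0x100000000. Only the last one does not fit in 32 bits, which is why only the 4-byte filename forces Zip64.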

For corroboration, see the function getZip64Data in process.c of UnZip 6.00. It puts the Zip64 check outside the field value checks:

        if (eb_id == EF_PKSZ64) {
          if (G.crec.ucsize == 0xffffffff || G.lrec.ucsize == 0xffffffff){

Fixing this issue will allow removing the special case introduced in #14185 because it will be handled by the general case: a value of 0xffffffff means what it says, in the absence of a Zip64 extra field.

This issue is only a problem when reading a zip file, not when writing. archive/zip currently writes Zip64 information whenever a field is exactly 0xffffffff—that's probably a good idea for interoperability, even if it's not required. Compare Zip 3.0's strict inequality (function putend in zipfile.c):

  if( n > ZIP_UWORD16_MAX || s > ZIP_UWORD32_MAX || c > ZIP_UWORD32_MAX ||

with archive/zip's non-strict inequality:

	if records >= uint16max || size >= uint32max || offset >= uint32max {
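
The difference between the two comparisons only shows at the boundary value. The Go sketch below uses just the three terms visible in the quoted lines (the Zip 3.0 line is truncated and may test more); the function names are hypothetical.

```go
package main

import "fmt"

const (
	uint16max = 0xffff
	uint32max = 0xffffffff
)

// needsZip64Strict mirrors the strict comparison quoted from Zip 3.0.
func needsZip64Strict(records, size, offset uint64) bool {
	return records > uint16max || size > uint32max || offset > uint32max
}

// needsZip64Inclusive mirrors the non-strict comparison quoted from archive/zip.
func needsZip64Inclusive(records, size, offset uint64) bool {
	return records >= uint16max || size >= uint32max || offset >= uint32max
}

func main() {
	// Two records and an offset of exactly 0xffffffff.
	fmt.Println(needsZip64Strict(2, 100, uint32max), needsZip64Inclusive(2, 100, uint32max))
}
```

At exactly 0xffffffff the strict form stays with plain zip while the inclusive form switches to Zip64. One byte further on, both agree.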

What version of Go are you using (`go version`)?

$ go version
go version go1.11.5 linux/amd64

Does this issue reproduce with the latest release?

Yes, I tried go version devel +a62887aade Fri Apr 26 05:16:33 2019 +0000 linux/amd64.

What operating system and processor architecture are you using (`go env`)?

go env Output

$ go env
GOARCH="amd64"
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"

What did you do?

Put in ziplist.go:

package main

import (
	"archive/zip"
	"fmt"
	"os"
)

func main() {
	z, err := zip.OpenReader(os.Args[1])
	if err != nil {
		panic(err)
	}
	defer z.Close()
	for _, f := range z.File {
		fmt.Printf("0x%09x 0x%09x %+q\n", f.CompressedSize64, f.UncompressedSize64, f.Name)
	}
}

Now run:

$ gzip -dc ffffffff.zip.gz.gz | gzip -dc > ffffffff.zip
$ go run ziplist.go ffffffff.zip

What did you expect to see?

0x0ffffffde 0x0ffffffde "pad"
0x000000005 0x000000005 "test.txt"

What did you see instead?

panic: zip: not a valid zip file

goroutine 1 [running]:
main.main()
        ziplist.go:12 +0x202
exit status 2


ianlancetaylor · 2019-04-26T13:25:13Z

CC @dsnet

rsc · 2019-05-21T22:50:35Z

The full comment on the relevant code says:

	// Assume that uncompressed size 2³²-1 could plausibly happen in
	// an old zip32 file that was sharding inputs into the largest chunks
	// possible (or is just malicious; search the web for 42.zip).
	// If needUSize is true still, it means we didn't see a zip64 extension.
	// As long as the compressed size is not also 2³²-1 (implausible)
	// and the header is not also 2³²-1 (equally implausible),
	// accept the uncompressed size 2³²-1 as valid.
	// If nothing else, this keeps archive/zip working with 42.zip.
	_ = needUSize

	if needCSize || needHeaderOffset {
		return ErrFormat
	}

I think this code is probably still best as written: no real zip encoder is going to write out 2³²-1 compressed bytes that uncompress to 2³²-1 or fewer bytes (and if it uncompressed to more it would need a zip64 header). The far more likely possibility is that the input is somehow corrupted or malformed (ErrFormat). That justifies the needCSize check.

The needHeaderOffset check is maybe slightly more debatable, but even so it still seems incredibly implausible and far more likely to be an invalid (or malicious) file than an innocently-created actual zip file. I think we should probably leave the code as is.
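
For readers following along, the effect of the check quoted above can be modeled in isolation. This is a simplified stand-in, not the archive/zip source: `checkSentinels` is a hypothetical function, and it covers only the case where no Zip64 extra field was found.

```go
package main

import (
	"archive/zip"
	"fmt"
)

const uint32max = 0xffffffff

// checkSentinels models the quoted check for a header without a Zip64
// extra field (hypothetical helper, simplified from the quoted snippet).
func checkSentinels(usize, csize, offset uint32) error {
	needUSize := usize == uint32max
	needCSize := csize == uint32max
	needHeaderOffset := offset == uint32max

	_ = needUSize // tolerated on its own, as the quoted comment explains
	if needCSize || needHeaderOffset {
		return zip.ErrFormat
	}
	return nil
}

func main() {
	fmt.Println(checkSentinels(uint32max, 100, 0)) // uncompressed size alone is accepted
	fmt.Println(checkSentinels(5, 5, uint32max))   // header offset of 0xffffffff is rejected
}
```

The second call corresponds to the `test.txt` entry in the sample file, whose local file header offset is 0xffffffff.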

AxbB36 · 2019-06-20T02:30:41Z

> no real zip encoder is going to write out 2³²-1 compressed bytes that uncompress to 2³²-1 or fewer bytes (and if it uncompressed to more it would need a zip64 header).

I think you're right about that. Compressing random data could result in a compressed size greater than the uncompressed size, but most encoders will switch to method 0 (Store) whenever that happens, so it's unlikely to be the case that uncompressed_size < compressed_size.

> The needHeaderOffset check is maybe slightly more debatable, but even so it still seems incredibly implausible and far more likely to be an invalid (or malicious) file than an innocently-created actual zip file. I think we should probably leave the code as is.

I don't think this one is so implausible. Any time the input consists of multiple files that total more than 4 GB, Info-ZIP Zip will store all the local file header offsets that are ≤ 0xffffffff without Zip64, and those that are > 0xffffffff with Zip64. It will be up to chance whether a local file header happens to land exactly on 0xffffffff and result in a zip file that archive/zip cannot parse. If the input consists of small files of around 50 bytes, then there's around a 1% chance of that happening.
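
A rough way to arrive at a figure like that: if entries are packed back to back and each occupies about 30 + name length + data length bytes, then roughly one byte position in every entry length is the start of a local file header. The Go sketch below encodes only that back-of-the-envelope model. The 20-byte name length is an assumed, illustrative value, and `hitProbability` is a hypothetical helper.

```go
package main

import "fmt"

// hitProbability estimates the chance that a local file header starts at one
// particular offset, assuming entries are packed back to back and each takes
// 30 + nameLen + dataLen bytes (no extra fields or data descriptors).
func hitProbability(nameLen, dataLen float64) float64 {
	return 1 / (30 + nameLen + dataLen)
}

func main() {
	// nameLen = 20 is an assumption for illustration; dataLen = 50 is the
	// "around 50 bytes" case.
	fmt.Printf("%.1f%%\n", 100*hitProbability(20, 50))
}
```

With those inputs each entry is 100 bytes, so the estimate is 1 in 100. Longer names or larger files lower it proportionally.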

I don't mean to overstate the importance, and I won't be upset if the issue gets closed with no changes. It's only a very small minority of zip files that will ever have an offset of exactly 0xffffffff. (Then again, you could say the same for 0xfffffffe.)

ianlancetaylor added the NeedsInvestigation label Apr 26, 2019

ianlancetaylor added this to the Go1.13 milestone Apr 26, 2019

AxbB36 mentioned this issue May 13, 2019

Cannot parse file header with uncompressed size, compressed size, or local file header offset of 0xffffffff, if not in Zip64 format thejoshwolfe/yauzl#109

Closed

andybons modified the milestones: Go1.13, Go1.14 Jul 8, 2019

rsc modified the milestones: Go1.14, Backlog Oct 9, 2019

gabyhelp mentioned this issue Sep 12, 2024

archive/zip: improve Zip64 compatibility with 7z #69415

Open
