Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go: timestamp of files in module archive unset #34953

Closed
klingtnet opened this issue Oct 17, 2019 · 14 comments
Closed

cmd/go: timestamp of files in module archive unset #34953

klingtnet opened this issue Oct 17, 2019 · 14 comments
Labels
FrozenDueToAge GoCommand cmd/go NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.

Comments

@klingtnet
Copy link

What version of Go are you using (go version)?

$ go version
go version go1.13.1 linux/amd64

Does this issue reproduce with the latest release?

Probably.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/alinz/.cache/go-build"
GOENV="/home/alinz/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/alinz/.go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/lib/go"
GOSUMDB="off"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/alinz/code/gogitlabproxy/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build916870041=/tmp/go-build -gno-record-gcc-switches"

What did you do?

I am writing a custom module proxy at the moment and while testing equality of the zipped module with the output of proxy.golang.org I noticed that the official proxy does not set the timestamp for the files in the archive, i.e. they are all dated Nov 30 1979.
Is this the expected behavior?

What did you expect to see?

The actual file timestamps, i.e. the files in the archive should have the same timestamp as their source in the git repository.

What did you see instead?

$ curl -s https://proxy.golang.org/github.com/pkg/errors/@v/v0.8.1.zip | bsdtar -ztvf-
-rw-rw-r--  0 0      0           0 Nov 30  1979 github.com/pkg/errors@v0.8.1/.gitignore
-rw-rw-r--  0 0      0           0 Nov 30  1979 github.com/pkg/errors@v0.8.1/.travis.yml
-rw-rw-r--  0 0      0           0 Nov 30  1979 github.com/pkg/errors@v0.8.1/LICENSE
-rw-rw-r--  0 0      0           0 Nov 30  1979 github.com/pkg/errors@v0.8.1/README.md
-rw-rw-r--  0 0      0           0 Nov 30  1979 github.com/pkg/errors@v0.8.1/appveyor.yml
-rw-rw-r--  0 0      0           0 Nov 30  1979 github.com/pkg/errors@v0.8.1/bench_test.go
-rw-rw-r--  0 0      0           0 Nov 30  1979 github.com/pkg/errors@v0.8.1/errors.go
-rw-rw-r--  0 0      0           0 Nov 30  1979 github.com/pkg/errors@v0.8.1/errors_test.go
-rw-rw-r--  0 0      0           0 Nov 30  1979 github.com/pkg/errors@v0.8.1/example_test.go
-rw-rw-r--  0 0      0           0 Nov 30  1979 github.com/pkg/errors@v0.8.1/format_test.go
-rw-rw-r--  0 0      0           0 Nov 30  1979 github.com/pkg/errors@v0.8.1/stack.go
-rw-rw-r--  0 0      0           0 Nov 30  1979 github.com/pkg/errors@v0.8.1/stack_test.go
@dmitshur
Copy link
Contributor

Is this the expected behavior?

This is the behavior of go mod download:

$ cd $(mktemp -d)
$ export GOPATH=$(pwd)
$ export GO111MODULE=on
$ export GOPROXY=direct
$ go mod download github.com/pkg/errors@v0.8.1
go: finding github.com/pkg/errors v0.8.1
$ cat $GOPATH/pkg/mod/cache/download/github.com/pkg/errors/\@v/v0.8.1.zip | bsdtar -ztvf-
-rwxrwxrwx  0 0      0           0 Nov 30  1979 github.com/pkg/errors@v0.8.1/.gitignore
-rwxrwxrwx  0 0      0           0 Nov 30  1979 github.com/pkg/errors@v0.8.1/.travis.yml
-rwxrwxrwx  0 0      0           0 Nov 30  1979 github.com/pkg/errors@v0.8.1/LICENSE
-rwxrwxrwx  0 0      0           0 Nov 30  1979 github.com/pkg/errors@v0.8.1/README.md
-rwxrwxrwx  0 0      0           0 Nov 30  1979 github.com/pkg/errors@v0.8.1/appveyor.yml
-rwxrwxrwx  0 0      0           0 Nov 30  1979 github.com/pkg/errors@v0.8.1/bench_test.go
-rwxrwxrwx  0 0      0           0 Nov 30  1979 github.com/pkg/errors@v0.8.1/errors.go
-rwxrwxrwx  0 0      0           0 Nov 30  1979 github.com/pkg/errors@v0.8.1/errors_test.go
-rwxrwxrwx  0 0      0           0 Nov 30  1979 github.com/pkg/errors@v0.8.1/example_test.go
-rwxrwxrwx  0 0      0           0 Nov 30  1979 github.com/pkg/errors@v0.8.1/format_test.go
-rwxrwxrwx  0 0      0           0 Nov 30  1979 github.com/pkg/errors@v0.8.1/stack.go
-rwxrwxrwx  0 0      0           0 Nov 30  1979 github.com/pkg/errors@v0.8.1/stack_test.go

@dmitshur dmitshur changed the title proxy.golang.org: timestamp of files in module archive unset cmd/go: timestamp of files in module archive unset Oct 17, 2019
@dmitshur dmitshur added GoCommand cmd/go NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Oct 17, 2019
@klingtnet
Copy link
Author

I took a quick look in /archive/zip and it looks like 1979-11-30 is the Zero MSDOS date:

https://play.golang.org/p/HOhsuCVXEuP

https://github.com/golang/go/blob/master/src/archive/zip/struct.go#L128

@klingtnet
Copy link
Author

Another small thing I found was that all files in the module from proxy.golang.org have 0664 permissions.

@heschi
Copy link
Contributor

heschi commented Oct 18, 2019

The go mod download command is the only supported mechanism for producing module zips. There are a lot of subtleties involved, and I'm not sure anyone could give you a full list without some research. All of the things you've listed so far are deliberate as far as I know.

I believe @jayconrod has an open bug somewhere to consider making that code available as a library. Other than that, I'm not sure what there is to do here.

@jayconrod
Copy link
Contributor

CL 202042 is a draft CL that moves module zip handling code to x/mod. It documents the restrictions on zip files that the Go command accepts and provides a way to build and extract zip files without go mod download.

We don't currently save any metadata other than file names, and we don't set file modes or dates when we extract files into the module cache. I'm not sure there's any concrete reason for this. Storing and setting timestamps and executability would not change module hashes, which only cover file names and content.

It would change the SHA-256 hash of generated zip files. We haven't promised those are stable, but changing them would probably surprise some people. I would like golang.org/x/mod/zip to produce files with the same hash as cmd/go/internal/modfetch as long as there are two implementations.

@dmitshur
Copy link
Contributor

We haven't promised those are stable, but changing them would probably surprise some people.

I think we should actively document that they're not guaranteed to be stable, so that people don't begin to expect them to be stable. Otherwise, we'd be locking in the exact zip file format encoding, compression level, etc.

I can't find it written down right now unfortunately, but my understanding of the decision to include only the file names and file contents in the hashes have been to prevent this problem. The only reference I'm able to find now is the documentation of dirhash.HashZip, which says:

HashZip returns the hash of the file content in the named zip file. Only the file names and their contents are included in the hash: the exact zip file format encoding, compression method, per-file modification times, and other metadata are ignored.

@klingtnet
Copy link
Author

Storing and setting timestamps and executability would not change module hashes, which only cover file names and content.

@jayconrod This is exactly what it wanted to ask, thanks!

@klingtnet
Copy link
Author

In my module proxy I create the zip file on the fly from a tar.gz stream that I receive from the Gitlab API in order to prevent downloading the full repository snapshot first. I think this use case would not be applicable with the API of https://go-review.googlesource.com/c/mod/+/202042/2/zip/zip.go . It would be really nice if the zip could be built by incrementally adding files, one after another.

@bcmills
Copy link
Contributor

bcmills commented Oct 21, 2019

we don't set file modes or dates when we extract files into the module cache.

#34965 is closely related. (We should probably preserve the execute bit, at least.)

@bcmills
Copy link
Contributor

bcmills commented Oct 21, 2019

It would change the SHA-256 hash of generated zip files. We haven't promised those are stable, but changing them would probably surprise some people.

There is a reason the hash in the go.sum file is on the file contents and not the zipfile itself. 😉

@jayconrod
Copy link
Contributor

@klingtnet

In my module proxy I create the zip file on the fly from a tar.gz stream that I receive from the Gitlab API in order to prevent downloading the full repository snapshot first. I think this use case would not be applicable with the API of https://go-review.googlesource.com/c/mod/+/202042/2/zip/zip.go . It would be really nice if the zip could be built by incrementally adding files, one after another.

In this CL, zip.Create excludes files that are part of a submodule (files that are in a directory that contains a go.mod file, other than the root). If we provide an API that accepts a stream of files, it would be difficult to exclude those files since all the go.mod files might be at the end of the stream. We could either 1) read the whole stream internally before adding files to the zip or 2) remove this logic and require the caller to exclude those files (adding complexity and implicitly requiring the caller to read the whole stream).

Either way, the whole stream needs to be read before we can start adding files to the zip. Given that, I think it's better to make the cost clear to the caller and accept a list of files, rather than a stream.

@klingtnet
Copy link
Author

In this CL, zip.Create excludes files that are part of a submodule (files that are in a directory that contains a go.mod file, other than the root)

Ok, that's a valid point and I haven't thought about filtering submodules, yet.
Is there documentation about what files/folders are excluded from a module, e.g. /vendor should be ignored as well?

@rsc
Copy link
Contributor

rsc commented Oct 24, 2019

Repeating my comment from #34965 (comment):

We defined the Go module representation to be absolutely least common denominator across systems for portability. The Go module zip file is for dependencies that need to be downloaded and built by the go command. It is not meant to support anything more. By design, there are no permission bits, nor special files like symlinks, nor modification times. I still believe this is the right decision. We can go arbitrarily far in "preserving" file metadata, and the result would still miss some kinds of metadata and fail to recreate other metadata on some subset of systems (symlinks, odd file modification times, etc).

Tests that need special files (anything beyond files containing bytes) can write them to a temporary directory and configure them as they wish.

Install scripts can be invoked as bash ./script (or cat script | bash) instead of ./script.

I don't believe this decision was wrong. But even if it were, it is a fundamental design decision made years ago that would be very hard to change now - it would invalidate all the checksums (if this extra information is important enough to preserve it would have to be checksummed too). It would have to be very wrong to be worth the pain of correcting.

@jayconrod
Copy link
Contributor

Closing as working as intended. I'll include this in the module reference documentation and the documentation for golang.org/x/mod/zip when it's ready.

@golang golang locked and limited conversation to collaborators Oct 23, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge GoCommand cmd/go NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

7 participants