proposal: compress/gzip, archive/tar: randomize output some of the time #26378

Closed
neild opened this issue Jul 13, 2018 · 4 comments


@neild
Contributor

neild commented Jul 13, 2018

Every release cycle, we run all of our tests with the upcoming release. And every release cycle, it seems, we discover several new places that expect the output of compress/gzip and archive/tar to be stable for all time.

Perhaps there's some way to introduce deliberate randomness into this output, along the lines of map iteration order randomization and https://go-review.googlesource.com/c/go/+/64451.

This is tricky, however, because it's probably reasonable to depend on the output of these packages to be consistent within any given binary. Perhaps a build stamp could be an input to the randomizer, or randomization could be only in tests, or both.
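For context, the kind of test that breaks across releases usually compares compressed bytes against a stored golden value. A minimal sketch of the alternative, assuming the test only cares that the payload survives: compress, decompress, and compare the decompressed data. The helper names (`compress`, `decompress`, `roundTrips`) are hypothetical and exist only for this illustration.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// compress gzips data with the default compression level.
func compress(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		return nil, err
	}
	// Close flushes remaining data and writes the gzip footer.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decompress reads a gzip stream back into the original bytes.
func decompress(compressed []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

// roundTrips reports whether data survives compress followed by decompress.
// It never inspects the compressed bytes themselves, so it does not depend
// on the exact encoding chosen by a particular Go release.
func roundTrips(data []byte) (bool, error) {
	c, err := compress(data)
	if err != nil {
		return false, err
	}
	d, err := decompress(c)
	if err != nil {
		return false, err
	}
	return bytes.Equal(d, data), nil
}

func main() {
	ok, err := roundTrips([]byte("hello, gopher"))
	if err != nil {
		panic(err)
	}
	fmt.Println("round trip preserved payload:", ok)
}
```

A test written this way checks the property that matters (the data comes back intact) and should keep passing if the compressor's output changes between releases.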

@gopherbot gopherbot added this to the Proposal milestone Jul 13, 2018
@dsnet
Member

dsnet commented Jul 13, 2018

I would expand this to all compress, archive, and most encoding packages.

@bradfitz
Contributor

@dsnet, we definitely don't want archive/zip and archive/tar to start generating random output. We're trying in another bug to have reproducible builds for Go. And what would it mean for encoding? Randomized orders? Another bug is proposing we start sorting more in fmt, for instance.

I think this would cause more pain than it'd solve.

@rsc
Contributor

rsc commented Jul 23, 2018

We used to do this, by putting time stamps in the output, and we took them out precisely to get repeatable output. I don't see why we would make it non-repeatable again. Like you say, it's entirely reasonable to depend on the output of these packages to be consistent within any given binary.

We do agree it can change from release to release. I don't see a nice way to rub that in everyone's faces though. Note also that we decided against #13884 for pretty much the same reasons.
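To illustrate the distinction between "consistent within a binary" and "stable for all time", here is a rough sketch that compresses the same input twice in one process and compares the results. It assumes the default `gzip.Writer` header is left untouched (so no modification time is set); the helper names `gzipOnce` and `stableWithinBinary` are made up for this example.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
)

// gzipOnce compresses data with a fresh writer and an untouched header.
func gzipOnce(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// stableWithinBinary compresses the same input twice in this process
// and reports whether both runs produced identical bytes.
func stableWithinBinary(data []byte) (bool, error) {
	first, err := gzipOnce(data)
	if err != nil {
		return false, err
	}
	second, err := gzipOnce(data)
	if err != nil {
		return false, err
	}
	return bytes.Equal(first, second), nil
}

func main() {
	same, err := stableWithinBinary([]byte("the same input, compressed twice"))
	if err != nil {
		panic(err)
	}
	fmt.Println("identical output within this binary:", same)
}
```

The comparison is only meaningful inside a single build. Hard-coding either byte slice as a golden value in a test is the pattern that can break when the Go release changes.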

@rsc rsc closed this as completed Jul 23, 2018
@cyphar

cyphar commented Jan 20, 2019

And it should be noted that most container image formats are based on Go's archive/tar. I'm trying to move everyone away from it, because it's an awful format for that use case, but to randomise it would be to intentionally break layer caching and reproducibility for most container image formats.
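As a rough illustration of why randomized output would hurt here, the sketch below builds the same single-file tar archive twice and hashes each result. It assumes layers are identified by a SHA-256 digest of the archive bytes, which is how content-addressed image formats generally work; the function names, the file name, and the fixed modification time are all hypothetical choices for this example.

```go
package main

import (
	"archive/tar"
	"bytes"
	"crypto/sha256"
	"fmt"
	"time"
)

// buildLayer writes a one-file tar archive with fully fixed header fields.
func buildLayer(name string, content []byte) ([]byte, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	hdr := &tar.Header{
		Name:    name,
		Mode:    0o644,
		Size:    int64(len(content)),
		ModTime: time.Unix(0, 0), // fixed timestamp so the header is reproducible
	}
	if err := tw.WriteHeader(hdr); err != nil {
		return nil, err
	}
	if _, err := tw.Write(content); err != nil {
		return nil, err
	}
	// Close writes the end-of-archive padding.
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// layerDigest returns a "sha256:<hex>" digest of the archive bytes.
func layerDigest(name string, content []byte) (string, error) {
	layer, err := buildLayer(name, content)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("sha256:%x", sha256.Sum256(layer)), nil
}

func main() {
	d1, err := layerDigest("etc/motd", []byte("hello\n"))
	if err != nil {
		panic(err)
	}
	d2, err := layerDigest("etc/motd", []byte("hello\n"))
	if err != nil {
		panic(err)
	}
	fmt.Println("digests match:", d1 == d2)
}
```

If the writer injected randomness, the two digests would differ even though the file contents are identical, and any cache keyed on the digest would miss.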

@golang golang locked and limited conversation to collaborators Jan 20, 2020