proposal: x/mod/zip: provide a way to ignore files when creating module zip #50225

Closed
dolmen opened this issue Dec 16, 2021 · 5 comments

Comments

@dolmen
Contributor

dolmen commented Dec 16, 2021

Proposal

As a Go module author I would like a way to tell golang.org/x/mod/zip to ignore some files from the VCS repository where my module is published.

I propose adding a file that lists patterns of files to ignore, similar to the well-known .gitignore. That list would be used to filter the files included when creating a module zip.

Rationale

A VCS repository that contains a Go module may also contain many files irrelevant to the use of that module.
Currently, most of the content of the repository is embedded in Go module zip files.
And most of the big files stored in Go modules are irrelevant to the Go build of an end user (fuzz data, videos, images, HTML documentation...).

The current list of restrictions for files in a module zip is quite short: https://go.dev/ref/mod#zip-path-size-constraints

You can check the list of files on your own machine with this command:

find $(go env GOMODCACHE)/*.* -type f ! -name '*.go' ! -name 'go.mod' ! -name 'go.sum' ! -name 'list.lock' ! -name 'v*.mod' ! -name 'v*.info' ! -name 'v*.zip' ! -name 'v*.ziphash' ! -name 'v*.lock' ! -name 'LICENSE*' ! -name 'README*' -print

Check the size (requires GNU find):

find $(go env GOMODCACHE)/*.* -type f ! -name '*.go' ! -name 'go.mod' ! -name 'go.sum' ! -name 'list.lock' ! -name 'v*.mod' ! -name 'v*.info' ! -name 'v*.zip' ! -name 'v*.ziphash' ! -name 'v*.lock' ! -name 'LICENSE*' ! -name 'README*'  -printf "%s\n" | awk '{sum+=$1} END{print sum+0}'

On my machine:

  • 13GB used by module content in module cache (du -hc $(go env GOMODCACHE)/*.* | tail -n 1)
  • 208,430 mostly useless files
  • 3,834,497,841 bytes

This is a waste of resources (network, storage on proxies, storage on build machines, which are often end-user machines).

Implementation ideas

These are just general ideas that would have to be expanded/specified in a design document.

The patterns file would be stored at the root of the module (next to go.mod). If we follow the .gitignore model, ignore files would also be allowed in subdirectories.

Ideas for naming the file:

  • .goignore (like .gitignore)
  • go.ignore (to go with go.mod, go.sum)

Parsing of the file would be handled by new APIs exposed in package golang.org/x/mod/zip.
@gopherbot gopherbot added this to the Proposal milestone Dec 16, 2021
@zx2c4
Contributor

zx2c4 commented Dec 16, 2021

Alternatively, could x/mod/zip figure out the minimal set of files required, by working out the dependency graph from externally reachable entry points, and then adding a few pre-set project files like LICENSE and README?

@seankhliao
Member

Do you want this in order to create your own zips, which you would serve/host yourself, or do you want to change the default set of files selected by go?
If it's the first, then it should only be available via the API.
If it's the second, it's a dup of #30058 or #42965.

@dolmen
Contributor Author

dolmen commented Dec 16, 2021

@zx2c4: this is not about filtering Go files. This is about filtering non-Go files.

@dolmen
Contributor Author

dolmen commented Dec 16, 2021

@seankhliao Yes, we can consider this a duplicate of #42965 or #30058: we need a common solution to those problems.

I hope that my new approach to the problem statement will raise the priority of that issue, because I show that the problem affects every developer's machine and grows as the community grows and time passes (each developer gets more and more modules, and more and more versions of each module).

@seankhliao
Member

In that case I think the conversation can be kept in #42965

@golang golang locked and limited conversation to collaborators Dec 16, 2022