proposal: embed: Allow remote resources to be embedded #56401

Closed
malt3 opened this issue Oct 24, 2022 · 8 comments

Comments

@malt3

malt3 commented Oct 24, 2022

Currently, go embed only allows local files to be embedded in the final binary. While I understand that this is beneficial for security, reproducibility, and running go build in air-gapped environments, it also has a major drawback:
Every file embedded into the go binary has to either be checked in with your VCS or be downloaded dynamically before using go build (and other go commands).

Checking every file into your VCS is not feasible for binary blobs of certain sizes as file sizes are often limited and binary blobs are prone to creating huge diffs.
Adding extra required steps before running go build is problematic as well. I like the fact that I don't need special Makefiles and build tools to use go.

My proposal is as follows:

  • Allow go embed with a url and a hash
  • Can either point to a single file or tar
  • Embedded files can be cached and downloaded in advance (very similar to the handling of go modules)

This has the following advantages:

  • Fully deterministic by specifying expected hash
  • Lightweight (only url and hash are part of the embed directive)
  • Cacheable using hash
  • Optional early download step allows compilation in air-gapped systems
@malt3 malt3 added the Proposal label Oct 24, 2022
@gopherbot gopherbot added this to the Proposal milestone Oct 24, 2022
@seankhliao
Member

This opens up a massive can of worms on remote resource access: protocol, proxies, authentication, caching, as well as all the issues of tar.

@ianlancetaylor
Contributor

I think that people understand that commands like go get access the network. I think people would be very surprised if commands like go build access the network.

@rittneje

You can use go generate instead to download these artifacts (via curl or similar). Then your build process is essentially go generate && go build.

@malt3
Author

malt3 commented Oct 25, 2022

> This opens up a massive can of worms on remote resource access: protocol, proxies, authentication, caching, as well as all the issues of tar.

This is a valid concern and I would understand if a proposal is rejected due to the overall complexity of the implementation.

> I think that people understand that commands like go get access the network. I think people would be very surprised if commands like go build access the network.

I believe (depending on your proxy configuration, vendoring and previous go build and go mod download invocations), go build will in fact access the network already to download go modules.

> You can use go generate instead to download these artifacts (via curl or similar). Then your build process is essentially go generate && go build.

This is a possible workaround if the download should occur in the module I am building directly. This will not work if a module I depend on wants to download + embed a file.
This approach also prevents CI environments and IDEs from working out of the box. Many tools assume the following workflow:
git clone the source (or have it as a dependency in go.mod + go.sum), then directly run commands such as go build, go vet ./..., go test ./..., and more. This is why it is recommended to check generated files into VCS, so that the assumption holds.
Not having to add every embedded file to the VCS is the entire goal of my proposal.

@mvdan
Member

mvdan commented Oct 25, 2022

> Checking every file into your VCS is not feasible for binary blobs of certain sizes as file sizes are often limited and binary blobs are prone to creating huge diffs.

Sounds like you could look into git-lfs then: #47308

Also note that you don't even need to use a VCS to publish a Go module. You can publish versions as zip files on an HTTP server: https://go.dev/ref/mod#serving-from-proxy

Or, more simply, if we're talking about a Go module that you will just build locally without a need to go get it, users could download a zip archive and run go build . directly. Ultimately, the source files for building your Go module belong in one place.

Finally, note that there is currently a limit of about 500 MiB on module sizes, both when compressed as a zip and when uncompressed. I would argue that embedding more than a few tens of megabytes of files is already a bad idea in general, but you'd also run into this limit pretty easily with large files.

@rsc
Contributor

rsc commented Nov 16, 2022

This contradicts most of the design goals for modules. We really really want reproducible, self-contained builds that don't change from day to day. The hash helps, but the build can still fail today when it worked yesterday.

@rsc
Contributor

rsc commented Nov 16, 2022

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc
Contributor

rsc commented Nov 16, 2022

This proposal has been declined as infeasible.
— rsc for the proposal review group

@rsc rsc closed this as completed Nov 16, 2022
@golang golang locked and limited conversation to collaborators Nov 16, 2023
7 participants