cmd/go: do not download “modules” that contain no go.mod or *.go #31866

Open
bcmills opened this issue May 6, 2019 · 19 comments
Labels
early-in-cycle A change that should be done early in the 3 month dev cycle. modules Proposal Proposal-Accepted

@bcmills
Contributor

bcmills commented May 6, 2019

At the moment, go mod download will happily try to extract and download any arbitrary repository as long as it can be resolved by some means (through a hard-coded hosting service such as github.com, or using a distinguished extension like .git), even if it does not contain anything even marginally related to building Go code.

I am not aware of any reasonable use-case for such a repository:

  • It's not useful for storing test data, because we currently provide no mechanism for the tests to actually locate that data. (Modules are not guaranteed to be loaded from the module cache — for example, they might be subject to a replace directive — and since the test itself is run within the directory containing its source code, it has no way to locate the data or run go list within the module that invoked it.)

  • It's not useful for C headers (for use with cgo), for the same reason.

  • It might be useful for fetching non-Go inputs to go generate: in theory, the generator could run go mod download $MODULE to locate the sources at the required version. But the output of go generate is intended to be checked in anyway, which makes the use of modules somewhat spurious: if an explicit version of the non-Go inputs appears in the module's requirements, then everyone using the generated package will have an extra module to fetch that is guaranteed to have no effect on the build, and in most cases the go generate program can just as easily git clone (or similar) the input data at a specific revision.
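To make the go generate scenario above concrete, here is a minimal sketch of how a generator might locate a module's extracted sources by running go mod download -json and reading the Dir field of its output. The struct below declares only a subset of the documented output fields, and the helper names (downloaded, parseDownload, locate) are hypothetical, chosen for illustration only.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
	"os/exec"
)

// downloaded holds the subset of the `go mod download -json` output
// that this sketch needs. The type name is hypothetical.
type downloaded struct {
	Path, Version, Error, Dir string
}

// parseDownload decodes a single JSON object printed by
// `go mod download -json` and returns an error if the go command
// reported one or if no source directory was given.
func parseDownload(data []byte) (downloaded, error) {
	var m downloaded
	if err := json.Unmarshal(data, &m); err != nil {
		return m, err
	}
	if m.Error != "" {
		return m, errors.New(m.Error)
	}
	if m.Dir == "" {
		return m, errors.New("no Dir in go mod download output")
	}
	return m, nil
}

// locate runs the go command for one module@version argument and
// returns the directory holding the extracted sources.
func locate(moduleAtVersion string) (string, error) {
	out, err := exec.Command("go", "mod", "download", "-json", moduleAtVersion).Output()
	if err != nil && len(out) == 0 {
		return "", err
	}
	// If the command failed but still printed JSON, the Error field
	// carries the details, so parse it either way.
	m, err := parseDownload(out)
	if err != nil {
		return "", err
	}
	return m.Dir, nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: locate module@version")
		os.Exit(2)
	}
	dir, err := locate(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(dir)
}
```

A generator could then read its non-Go inputs from the printed directory. As the bullet above notes, cloning the input repository at a fixed revision would usually work just as well.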

Furthermore, if someone did find a way to make modules without Go source code useful (for the above use-cases or others), it's trivial to add a go.mod file to indicate that the repository really is somehow intended for use with Go source code. (We need to support go.mod-only modules anyway, since they can arise naturally when splitting a large root module into smaller nested modules.)


On the other hand, module proxies tend to rely on the go command to decide what is or is not a valid module, and accepting arbitrary non-Go repositories potentially exposes such proxies to a significant amount of additional load.


Therefore, I propose that we change the go command to explicitly reject any “module” that both contains no .go source files and lacks a go.mod file.

CC @rsc @jayconrod @heschik @hyangah @katiehockman @thepudds @marwan-at-work @ianthehat

@bcmills bcmills added modules NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. Proposal labels May 6, 2019
@bcmills bcmills added this to the Go1.14 milestone May 6, 2019
@marwan-at-work
Contributor

Sounds good to me, as long as the Go command only checks for .go files and does not validate the import paths when a proxy runs go mod download $MODULE.

For more on why: #31458

@bcmills
Contributor Author

bcmills commented May 6, 2019

@marwan-at-work

as long as the Go command only checks for .go files and does not validate the import paths

For that do you mean something like #31662?

@marwan-at-work
Contributor

marwan-at-work commented May 6, 2019

@bcmills yes, that issue would guarantee that a proxy can go mod download a module via its VCS path rather than its vanity import path. I believe this already works as expected for my use case.

Thanks!

@vearutop
Contributor

vearutop commented May 7, 2019

Another example use case is delivering supporting scripts and assets with a module. I'm not saying it is the greatest idea, but I included a base Makefile from go/pkg/mod/path@ver/Makefile.

This approach worked with vendor before.

I agree that in such a case it should not be a problem to at least add a go.mod file in the repo root.

@beoran

beoran commented May 7, 2019

Actually, in the proxy I am writing, I download the repository using the VCS, not using go get or go mod download. So I don't know whether the repository is a useful Go module or not; I let the proxy user's go command decide that. A Go module without Go files doesn't bother me at all. I'd rather not have to decide in the proxy whether a repository contains Go files, so I am not in favor of this proposal.

@bcmills bcmills added the early-in-cycle A change that should be done early in the 3 month dev cycle. label May 30, 2019
@Xe

Xe commented May 30, 2019

3.1 GB of data in go/pkg/mod

Got bit by this today while trying to find out why my disk space was gone. The biggest offender seems to be the linux kernel tree. Like the actual upstream linux kernel somehow.

@seebs
Contributor

seebs commented Feb 3, 2021

This is probably a stupid question, but:

If you don't download it, how do you know what it contains?

@seankhliao
Member

I think I remember people getting around the no-reliable-path problem by invoking go mod vendor, so that dependencies can be located in vendor/.

@ianlancetaylor ianlancetaylor added this to Incoming in Proposals (old) Feb 17, 2021
@rsc rsc changed the title cmd/go: do not download “modules” that contain no Go files proposal: cmd/go: do not download “modules” that contain no Go files Jun 22, 2022
@gopherbot gopherbot removed the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Jun 22, 2022
@hyangah
Contributor

hyangah commented Nov 29, 2022

On the other hand, module proxies tend to rely on the go command to decide what is or is not a valid module, and accepting arbitrary non-Go repositories potentially exposes such proxies to a significant amount of additional load.

As @seebs pointed out, unless there is a magic git or VCS command to do this cheaply without cloning, the module proxy has already downloaded the repository by the time go mod download concludes that there is no Go code. (There is still a question of whether it's acceptable for the module proxy to stop serving such modules, but that is a separate issue.)

If this proposal is accepted and implemented, I think module proxies need a way to distinguish this mode of failure from other go command failures (network issue, etc) so they know that they don't need to retry and download the requested versions again in the future.

@bcmills
Contributor Author

bcmills commented Nov 29, 2022

I think the proxy should be able to use the Origin and Reuse fields (added to go mod download -json for #53644) for this case.
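As a rough illustration of the idea, the sketch below reads the JSON objects printed by go mod download -json (for example when invoked with -reuse pointing at the output of an earlier run) and sorts them by their Reuse and Error fields. The Path, Version, Error, Origin and Reuse fields are part of the documented output; the result type, the readResults helper, and the policy printed in main are hypothetical. Whether Reuse would be reported for the rejection proposed here is an assumption, not something this thread confirms.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"os"
)

// result holds the fields of interest from `go mod download -json`.
// Origin is kept as raw JSON because this sketch does not inspect it.
type result struct {
	Path    string
	Version string
	Error   string
	Origin  json.RawMessage
	Reuse   bool
}

// readResults decodes a stream of concatenated JSON objects, which is
// how the go command prints one entry per requested module.
func readResults(data []byte) ([]result, error) {
	dec := json.NewDecoder(bytes.NewReader(data))
	var out []result
	for {
		var m result
		if err := dec.Decode(&m); err == io.EOF {
			return out, nil
		} else if err != nil {
			return nil, err
		}
		out = append(out, m)
	}
}

func main() {
	data, err := io.ReadAll(os.Stdin)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	results, err := readResults(data)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, m := range results {
		switch {
		case m.Reuse:
			// Hypothetical policy: keep serving the previously cached answer.
			fmt.Printf("%s@%s: unchanged since the previous run\n", m.Path, m.Version)
		case m.Error != "":
			fmt.Printf("%s@%s: failed: %s\n", m.Path, m.Version, m.Error)
		default:
			fmt.Printf("%s@%s: downloaded\n", m.Path, m.Version)
		}
	}
}
```

A proxy could pipe the go command's output into a program like this and store each failed entry together with its Origin, then pass the stored file to -reuse on the next request for the same version.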

@rsc
Contributor

rsc commented Nov 30, 2022

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc
Contributor

rsc commented Dec 7, 2022

I agree that Origin and Reuse should provide enough information.

@rsc
Contributor

rsc commented Dec 7, 2022

Does anyone object to doing this? (It's hard to imagine why anyone would object given that there is no Go code in these "modules".)

@bcmills
Contributor Author

bcmills commented Dec 7, 2022

I thought of one possible complication.

If the module at the root of a repo is carved up into nested modules, and the carved-out modules do not leave any packages in the original root module, it could be left with no Go source files but still be needed in order to check for import-path collisions. (That is: we need the module's source code only in order to verify that it does not contain the packages loaded from nested modules.)

The root module must continue to exist, because modules that require versions of it from before the split must be able to upgrade to remove the root-module packages.

Finally, if the root module was at a +incompatible version, its maintainers cannot add a go.mod file to make it a valid module, because that would make the +incompatible versions invalid.

I think it would still be possible to add a .go file with //go:build ignore at the repo root to keep the module valid (the import path at the repo root necessarily cannot conflict with any nested module), but that is kind of a funky workaround and would potentially invalidate empty-repo-root modules that were previously valid.

On a practical note, https://github.com/Azure/azure-sdk-for-go may be very close to that exact situation. 😅

@rsc
Contributor

rsc commented Dec 14, 2022

Having to create a .go file seems like a useful signal in this case.

@rsc rsc changed the title proposal: cmd/go: do not download “modules” that contain no Go files proposal: cmd/go: do not download “modules” that contain no go.mod or *.go Dec 14, 2022
@rsc
Contributor

rsc commented Dec 21, 2022

Does anyone object to accepting this proposal?

@rsc
Contributor

rsc commented Jan 4, 2023

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

@thepudds
Contributor

A couple of people from the Gentoo Foundation were asking some seemingly related questions in #51284, including #51284 (comment).

It might be worthwhile for someone from the core Go team to make a brief comment there, including in the context of this proposal here.

@rsc
Contributor

rsc commented Jan 11, 2023

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

@rsc rsc changed the title proposal: cmd/go: do not download “modules” that contain no go.mod or *.go cmd/go: do not download “modules” that contain no go.mod or *.go Jan 11, 2023
Projects
Status: Accepted