proposal: cmd/go: subcommands to add and remove modules from the module cache #28835

Open
bcmills opened this issue Nov 16, 2018 · 19 comments
Labels
modules NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Proposal Proposal-Hold

@bcmills
Contributor

bcmills commented Nov 16, 2018

For a number of use-cases, it would be helpful to be able to upload modules to the module cache from source code (not just zip files!) in a local directory or repository.

Some examples:

To support those use-cases, I propose the following subcommands:

  • go mod pack [MODULE[@VERSION]] DIR: construct a module in the module cache from the module source code rooted at DIR (at version VERSION). If the MODULE is omitted, it is inferred from DIR/go.mod. If @VERSION is provided, it must be a valid semantic version, and go mod pack fails if that version already exists with different contents. If @VERSION is omitted, DIR must be within a supported version control repository, and go mod pack will attempt to infer the version from the repo state (commits and tags).

  • go mod unpack MODULE[@VERSION] DIR: download the contents of MODULE to DIR. If @VERSION is omitted, use the active version from the main module (if any), or latest if no version is active. (In contrast to go mod vendor, go mod unpack would unpack the entire contents of the module — not just the packages in the import graph of the main module.)

  • go clean -m MODULE[@VERSION]: remove MODULE@VERSION from the module cache. If run within a module, also remove the corresponding entry from its go.sum file. If @VERSION is omitted, remove all versions of MODULE from the module cache (and go.sum file).

CC @hyangah @jadekler @rsc @myitcv @thepudds @rasky @rogpeppe @FiloSottile

@bcmills bcmills added Proposal NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. modules labels Nov 16, 2018
@bcmills bcmills added this to the Go1.13 milestone Nov 16, 2018
@bcmills
Contributor Author

bcmills commented Nov 16, 2018

A way to explicitly populate the module cache from source might also help in cases where the original source path is blocked or unavailable but the code is available from a trusted mirror (as in #28652).

@rsc
Contributor

rsc commented Nov 27, 2018

We have go mod download to add to the module cache.
We have go clean -modcache to clear it.
Do we really need more fine-grained control?
I fear that will make people manage it more.

@rsc
Contributor

rsc commented Nov 27, 2018

Never mind, I didn't understand the problem being solved.

@gopherbot

Change https://golang.org/cl/153819 mentions this issue: cmd/go/internal/modfetch: skip symlinks in (*coderepo).Zip

@rsc
Contributor

rsc commented Dec 12, 2018

Ping @bcmills to summarize our discussion from 2 weeks ago about alternatives to meet the need you are trying to address here.

gopherbot pushed a commit that referenced this issue Dec 12, 2018
Tested manually.

Before:

	$ go mod init golang.org/issue/scratch
	go: creating new go.mod: module golang.org/issue/scratch
	$ go1.11.2 mod download github.com/rogpeppe/test2@latest
	go: finding github.com/rogpeppe/test2 v0.0.11
	$ find $GOPATH -name goodbye
	/tmp/tmp.Y8a8UzX3zD/_gopath/pkg/mod/github.com/rogpeppe/test2@v0.0.11/tests/goodbye
	$ cat $(find $GOPATH -name goodbye)
	hello

After:

	$ go mod init golang.org/issue/scratch
	go: creating new go.mod: module golang.org/issue/scratch
	$ go mod download github.com/rogpeppe/test2@latest
	go: finding github.com/rogpeppe/test2 v0.0.11
	$ find $GOPATH -name goodbye
	$ find $GOPATH -name hello
	/tmp/tmp.Zo0jhfLaRs/_gopath/pkg/mod/github.com/rogpeppe/test2@v0.0.11/tests/hello

A proper regression test would require one of:
• a new entry in the vcs-test server (feasible but tedious, and not easily updated by open-source contributors), or
• a way to set up an HTTPS proxy in a script_test, or
• a way to explicitly populate the module cache from the contents of a local repository (#28835).

Fixes #27093
Updates #28835

Change-Id: I72702a7e791f8815965f0f87c82a30df4d6f0151
Reviewed-on: https://go-review.googlesource.com/c/153819
Run-TryBot: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Jay Conrod <jayconrod@google.com>
@gopherbot

Change https://golang.org/cl/153822 mentions this issue: cmd/go/internal/modfetch: skip symlinks in (*coderepo).Zip

gopherbot pushed a commit that referenced this issue Dec 14, 2018
…coderepo).Zip

Tested manually.

Before:

	$ go mod init golang.org/issue/scratch
	go: creating new go.mod: module golang.org/issue/scratch
	$ go1.11.2 mod download github.com/rogpeppe/test2@latest
	go: finding github.com/rogpeppe/test2 v0.0.11
	$ find $GOPATH -name goodbye
	/tmp/tmp.Y8a8UzX3zD/_gopath/pkg/mod/github.com/rogpeppe/test2@v0.0.11/tests/goodbye
	$ cat $(find $GOPATH -name goodbye)
	hello

After:

	$ go mod init golang.org/issue/scratch
	go: creating new go.mod: module golang.org/issue/scratch
	$ go mod download github.com/rogpeppe/test2@latest
	go: finding github.com/rogpeppe/test2 v0.0.11
	$ find $GOPATH -name goodbye
	$ find $GOPATH -name hello
	/tmp/tmp.Zo0jhfLaRs/_gopath/pkg/mod/github.com/rogpeppe/test2@v0.0.11/tests/hello

A proper regression test would require one of:
• a new entry in the vcs-test server (feasible but tedious, and not easily updated by open-source contributors), or
• a way to set up an HTTPS proxy in a script_test, or
• a way to explicitly populate the module cache from the contents of a local repository (#28835).

Fixes #29191
Updates #28835

Change-Id: I72702a7e791f8815965f0f87c82a30df4d6f0151
Reviewed-on: https://go-review.googlesource.com/c/153819
Run-TryBot: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Jay Conrod <jayconrod@google.com>
(cherry picked from commit 561923f)
Reviewed-on: https://go-review.googlesource.com/c/153822
@nim-nim

nim-nim commented Feb 25, 2019

On the Linux distribution side, you need almost the same thing. The whole process is:

  • prepare/fix/patch/clean up pristine project sources in a project directory (typically nuking vendor dir so the component uses clean audited system modules and not private hacked forks)
  • transform the directory content into something that can be deployed in a specific place on the filesystem (usually by faking a deployment in a pristine staging directory)
  • give the result to the system package manager
  • (eventually, once the system package manager has deployed the result in the actual final target dir) reindex the target directory

So you’d need almost the same, with a little tweak: deployment and indexing need to be separated.

  • one command to deploy the files composing the module in a staging directory:
    go mod pack --version version [PROJECT_DIR] STAGING_DIR (of course it sucks that you need a separate version argument and it is not already present in the go.mod file). PROJECT_DIR defaulting to . and containing a go.mod file.
  • one command to reindex the final target directory once the system package manager has copied the files produced in the first step into the destination system proxy dir (or removed the files belonging to another module):
    go mod reindex PROXYDIR (though since PROXYDIR will usually be standardised, it should be read from a system conf file or an env var, not specified every time a reindexing needs to take place)

There is no concept of cleaning up the module cache, since all files are supposed to be associated with a single system component, so the system manager knows how to clean them up without help. I suspect this part won't go well with the proxy protocol as defined today, since some files are shared between different versions of the same module (but .so file symlinks are pretty much the same mess, so that should be manageable with a few hacks).

Lots of Linux subsystems, from python to fontconfig, behave this way today. It is a proven deployment design pattern that is easy to integrate system-side.

@bcmills
Contributor Author

bcmills commented Feb 25, 2019

@nim-nim, there is no “indexing” step in the module cache. Either the requested version is there, or it isn't.

@nim-nim

nim-nim commented Feb 26, 2019

@bcmills Then how is $GOPROXY/<module>/@v/list supposed to be generated?

You can go mod pack mymodule version x.y.z in system component golang-mymodule-x.y.z, that will contain

$GOPROXY/mymodule/@v/x.y.z.mod
$GOPROXY/mymodule/@v/x.y.z.info
$GOPROXY/mymodule/@v/x.y.z.zip

and then you can go mod pack version a.b.c in another system component golang-mymodule-a.b.c, that will contain

$GOPROXY/mymodule/@v/a.b.c.mod
$GOPROXY/mymodule/@v/a.b.c.info
$GOPROXY/mymodule/@v/a.b.c.zip

So far so good: every file is nicely accounted for and the on-disk representations of the system components do not clash (even though having to manage a separate info file just because the module file does not contain the version is annoying).

But depending on whether the user installs only golang-mymodule-x.y.z, only golang-mymodule-a.b.c, or both, $GOPROXY/mymodule/@v/list is not supposed to have the same content, is it? So you need to reindex $GOPROXY/mymodule/@v/list on installation/uninstallation of anything in $GOPROXY/mymodule/@v/

In rpm tech that would mean adding a %transfiletriggerin and a %transfiletriggerpostun on the $GOPROXY directory that call a go subsystem command to reindex all the stuff inside $GOPROXY every time the system component manager adds or removes things in it (rpm documentation)

@rsc
Contributor

rsc commented Feb 28, 2019

The module cache is a cache. I really do not want the module download cache to have manual maintenance. That was the big problem with $GOPATH/pkg and go install: go install was manual maintenance of $GOPATH/pkg. The new build cache has no maintenance, which simplifies everything and eliminates a lot of awful failure modes. We'd really like the same for the module cache.

The operation being created above is really "pretend this module version has been published, so I can build and test other modules that depend on it". It's not clear to me that that should be scoped to a whole machine (a whole $GOPATH). At the very least it seems like we need two commands:

  1. Fake-publish this module.
  2. Build this other module using the fake-published stuff.

A build should never default to using the fake-published stuff. If it did, you couldn't do two logically separate things in a single GOPATH and we'd be back to manual cache maintenance a la go install. That is, if I'm in the middle of testing one fake-published module 1 against another module 2, and I get interrupted and context-switch to a completely different module 3 that happens to also depend on module 1, I don't want to be left with no way to get back to the real world where there isn't a fake module 1 floating around. That should be the default world I'm in. Otherwise the mental load of managing this automatically-used staging area is much like $GOPATH/pkg and go install.

I can't remember exactly what @bcmills and I discussed in late Nov 2018 but I think it was some other mechanism that wasn't "the module cache" for fake-publishing. You could imagine saying "fake publish to configuration foo" and then "build with configuration foo" and even "list configuration foo". Or maybe there's just one fake-published-world per $GOPATH.

@nim-nim

nim-nim commented Mar 1, 2019

@rsc It's not fake-publish, it's using your own code, only without forcing people to use GitHub or Artifactory in the middle. In the actual "real world" you have lots of situations where roundtripping to GitHub just to use your own code is not acceptable. So please make this use case work cleanly without artificial fake-publish degradation, or people will just reverse engineer how go mod works and write their own tools you won't be happy with (they are already starting to, because modules are being pushed before the tooling is finished and ready).

When you don't own your cloud like Google, when you don't have fat network pipes, when you have restricted networks because $expensive and $dangerous factories are plugged in here, you don't roundtrip to the Internet all the time just because it's cool at home to watch YouTube videos.

As written in the module FAQ:

Rather, the go tooling in 1.11 has added optional proxy support via GOPROXY to enable more enterprise use cases (such as greater control)

Greater control means greater control, and people doing the stuff they want with their code without opaque cloud intermediaries.

Besides, making access to some remote VCS mandatory just to make use of some code would make Go instantly incompatible with every single free software license out there.

@rsc
Contributor

rsc commented Mar 7, 2019

@nim-nim I don't understand your response. I completely sympathize with the use case here and I spelled out a path forward that avoids the network. My use of "fake-publish" was not derogatory. I am referring to the operation of making it look locally like the module has been published even though it has not, hence "fake publish".

@akamensky

akamensky commented Aug 15, 2019

I am not sure about the other two commands in this proposal, but I think go mod pack is something that many developers are really going to need. I know these comments are frowned upon here, but in many long-established tools and ecosystems this functionality is considered a must-have. The first that comes to mind is Maven, where you can publish an artifact to the local cache from local code.

Consider a project A that depends on library B. Often times developers want to develop and publish v1.2 of both A and B at the same time. How can I import module B v1.2 that I am working on locally to my project A that I am also working on locally? As of now (1.13b1) there does not seem to be any mechanism to achieve this without manually hacking into go.mod with replace and subsequently removing it from go.mod (again manually I presume) before publishing both.

@perillo
Contributor

perillo commented Jan 28, 2020

The concept of pre-filling the module cache with a local (unpublished) module, or with a new revision not yet published, can be implemented with an external command.

Here is an implementation: https://github.com/perillo/gomod-pack.
It calls go mod download -json with a custom environment, where git is configured with URL rewriting and go is configured with direct access and a disabled checksum database.

gomod-pack can only be called inside a module, and the user can only specify the version to pack.
It prints to stdout the versioned module path, that the user can use in a go.mod require directive.

The only drawback is that it only works with git.
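As a rough picture of the kind of environment described above, the following sketch builds (but does not run) a go mod download -json command with direct access, the checksum database turned off, and a git URL rewrite. It is not taken from gomod-pack and may differ from what that tool does: the helper names are hypothetical, the module path, URL, and local path are placeholders, and the rewrite is expressed through the GIT_CONFIG_COUNT, GIT_CONFIG_KEY_n, and GIT_CONFIG_VALUE_n variables, which assumes a reasonably recent git.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// packEnv returns extra environment entries that make the go command fetch
// straight from version control, skip the checksum database, and have git
// substitute localRepo for any URL that starts with remoteURL.
func packEnv(remoteURL, localRepo string) []string {
	return []string{
		"GOPROXY=direct", // bypass module proxies
		"GOSUMDB=off",    // do not verify against the checksum database
		"GIT_CONFIG_COUNT=1",
		"GIT_CONFIG_KEY_0=url." + localRepo + ".insteadOf",
		"GIT_CONFIG_VALUE_0=" + remoteURL,
	}
}

// downloadCmd builds the `go mod download -json MODULE@VERSION` command
// with the environment from packEnv appended to the current one.
func downloadCmd(module, version, remoteURL, localRepo string) *exec.Cmd {
	cmd := exec.Command("go", "mod", "download", "-json", module+"@"+version)
	cmd.Env = append(os.Environ(), packEnv(remoteURL, localRepo)...)
	return cmd
}

func main() {
	// Placeholder values; nothing is executed here.
	cmd := downloadCmd("example.com/mylib", "v1.2.0",
		"https://example.com/mylib", "/home/me/src/mylib")
	fmt.Println(cmd.Args)
	for _, kv := range packEnv("https://example.com/mylib", "/home/me/src/mylib") {
		fmt.Println(kv)
	}
}
```

Whether the go command then resolves a given module path to the rewritten URL depends on how that path maps to a repository, so treat this as the shape of the environment rather than a tested recipe.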

@marystern

That was the big problem with $GOPATH/pkg and go install: go install was manual maintenance of $GOPATH/pkg. The new build cache has no maintenance, which simplifies everything and eliminates a lot of awful failure modes. We'd really like the same for the module cache.

Hi, I'm coming from issue #37554 and have just read this. I had no idea that "go install" was going to become deprecated! Maybe this needs clarification in the community?

In my issue, I suggested that "go install" do the same as "go mod pack" in this proposal (and I prefer that way of expressing the command as it's the same as previous go-versions). I agree with @nim-nim as we both seem to want a fairly simple use case (local code using modules, not hitting the network), but the current implementation of modules makes this tricky to say the least.

@ronakg

ronakg commented May 19, 2020

I just finished reading this whole thread because I ran into this same issue while developing a new app for an enterprise product. I'm still very new to Go, but having no provision to import a separate module that I'm developing in parallel seems like a huge oversight.

Let me try to summarize my use-case:

  • I'm developing myapp module, which is reliant on mylib module.
  • mylib is also still under development and not published anywhere.
  • They're both separate modules under same repo because there might be new futureapp that would want to use mylib. All 3 are delivered as part of the same ISO for enterprise customers, so having single repo makes version management much simpler.

As of now, there's no way for myapp to import mylib without adding the replace directive in the go.mod for myapp, which feels very hacky. I have to publish mylib separately without myapp and then update myapp go.mod file to remove the replace directive.

Another use case: when I'm developing a library that's used by multiple modules, I need to run integration tests for the dependent modules to make sure I'm not introducing any regressions. So now I need to change every dependent module's go.mod file and add a replace directive pointing to the local module.

@marystern's idea of go install installing the unpublished module locally in the cache sounds like a really good one. That's how many build management systems work as well. Maven lets you build and install a jar/war file to the local Maven repo for other Maven projects to import.

@Helcaraxan
Contributor

Helcaraxan commented May 19, 2020

@ronakg, this is not really on the topic of this exact issue but the fact that you are using multiple modules in the same repository for your use-case seems to be an anti-pattern. In general multi-module repositories are not a recommended workflow.

In your specific case (based on the information you have provided) there should not be any reason for having multiple modules. Simply put your library and your app in the same module which should be rooted at the root of your repo. And if a new app using your library needs to be created it can live in the same module & repo as well.

Modules are a dependency-management & versioning abstraction, not a feature-level abstraction. Hence if everything (the library and the binaries) is part of the same product and will be shipped and versioned in a common fashion, then it can all be part of the same module without any negative side-effects. Using multiple modules would actually make achieving your goals much harder and your day-to-day development workflows much more complex.

@rsc
Contributor

rsc commented Oct 13, 2020

Based on discussion with @bcmills, @jayconrod, @matloob, putting this on hold because we need to think about the higher-level issue of publishing modules at all first. This issue was primarily intended to address publishing a collection of modules that depend on each other, perhaps in a cycle or perhaps not. That's the problem to solve; reusing the module cache is probably not the right solution.

Placing on hold to come back with a different solution.

@folays

folays commented Mar 23, 2023

May I add that [#44989] and [#32976] have been marked as duplicates of this issue,
but for CGo the impact of not yet having a command to "clean only one module" from the cache is heavy.

Indeed, when modifying a C/C++ source file in CGo outside of the compiled package,
there isn't an easy way to force a rebuild of the compiled package besides cleaning the ENTIRE module cache,
which has a heavy recompile time cost, especially when it's not the only CGo module in the whole project...

If you want to argue that keeping those C/C++ source files outside of the specific compiled package directory is not good practice, please keep in mind that keeping those C source files separate in their own directory allows them to live in a dedicated git-subtree folder, and makes it easy to follow C++ upstreams and emit diff-to-upstream patches.
