cmd/go: go mod tidy ignores go.work file #50750

Open
bozaro opened this issue Jan 21, 2022 · 76 comments
Labels: modules, NeedsDecision, Thinking

@bozaro

bozaro commented Jan 21, 2022

What version of Go are you using (go version)?

$ go version
go version go1.18beta1 linux/amd64

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/bozaro/.cache/go-build"
GOENV="/home/bozaro/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/bozaro/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/bozaro/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/lib/go-1.18"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go-1.18/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.18beta1"
GCCGO="gccgo"
GOAMD64="v1"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/bozaro/github/go-work-play/tools/go.mod"
GOWORK="/home/bozaro/github/go-work-play/go.work"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build4292218461=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Minimal reproducing repository: https://github.com/bozaro/go-work-play/tree/go-mod-tidy

Full script:

$ git clone https://github.com/bozaro/go-work-play.git -b go-mod-tidy
$ cd go-work-play/tools
$ go mod tidy
go: finding module for package github.com/bozaro/go/go-work-play/shared/foo
github.com/bozaro/go/go-work-play/tools/hello imports
	github.com/bozaro/go/go-work-play/shared/foo: module github.com/bozaro/go@latest found (v0.0.0-20200925035954-2333c6299f34), but does not contain package github.com/bozaro/go/go-work-play/shared/foo

What did you expect to see?

I expect go.mod and go.sum to be updated to reflect the current state of the working copy.

What did you see instead?

go mod tidy tries to fetch the shared Go modules from the remote repository, ignoring the contents of go.work.

@bcmills
Contributor

bcmills commented Jan 21, 2022

Tidiness is a property of an individual module, not a workspace: if a module is tidy, then a downstream consumer of the module knows which versions to use for every dependency of every package in that module.

If you don't particularly care about downstream consumers having a package that is provided by the workspace, you can use go mod tidy -e to ignore the error from the missing package.

Otherwise, you either need to publish the workspace dependencies before running go mod tidy (so that they have well-defined upstream versions), or tell the go command what those versions will be (using a replace and require directive in the individual module).
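The last option could look roughly like the sketch below: a small shell function that records a placeholder version and a local path for an unpublished sibling module. The module path example.com/shared, the directory ../shared, the placeholder version v0.0.0, and the helper name pin_local_dep are all hypothetical; only the -require and -replace flags of go mod edit are taken from the go command itself.

```shell
# pin_local_dep MODULE_PATH LOCAL_DIR [VERSION]
# Adds a require and a matching replace directive to the go.mod in the
# current directory, so that `go mod tidy` resolves MODULE_PATH from
# LOCAL_DIR instead of looking upstream.
# The default version v0.0.0 is a placeholder (an assumption), not a
# published release.
pin_local_dep() {
    mod_path=$1
    local_dir=$2
    version=${3:-v0.0.0}
    go mod edit \
        -require="${mod_path}@${version}" \
        -replace="${mod_path}=${local_dir}"
}

# Hypothetical usage from a module that imports example.com/shared:
#   pin_local_dep example.com/shared ../shared
#   go mod tidy
```

The trade-off raised later in the thread still applies: the replace directive puts a local path into go.mod, which is exactly what several commenters want to avoid.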

@bcmills
Contributor

bcmills commented Jan 21, 2022

@matloob, for Go 1.19 I wonder if we should augment the go.work file to be able to declare explicit intended versions for the modules in the workspace. Then go mod tidy could use those versions instead of looking upstream. 🤔

@bcmills changed the title from "go mod tidy ignores go.work file" to "cmd/go: go mod tidy ignores go.work file" Jan 21, 2022
@bcmills added this to the Backlog milestone Jan 21, 2022
@heschi added the NeedsDecision label Jan 21, 2022
@bozaro
Author

bozaro commented Jan 22, 2022

With the reproducing repository (https://github.com/bozaro/go-work-play/tree/go-mod-tidy), I can run:

cd tools/hello
go run .

go run works as if the workspace modules were declared with replace directives in the current tools/go.mod.

But for go mod tidy I need to add the replace directives manually.

This behaviour looks inconsistent: I expect go run and go mod tidy to use the same dependencies.

@liadmord

liadmord commented Jan 24, 2022

What valid use cases are there for go.work if we are still required to fully implement the go.mod in order to work on multi module repo?
@bcmills

@AlmogBaku

AlmogBaku commented Apr 8, 2022

This is also happening for go get ./...

I agree with @liadmord: the only benefit I get from go.work is that my IDE "understands" the project better.
Or maybe I'm doing something wrong here?

@Jictyvoo

I agree with @bozaro @liadmord and @AlmogBaku

I have a private monorepo project with multiple modules, and in each one I need to insert a lot of replace statements for every direct and indirect dependency. With Go workspaces, I really want to be able to run go mod download or go mod tidy without needing to write a replace statement in every go.mod file in my workspace.

@maxsxu

maxsxu commented May 18, 2022

Encountering the same issue with go1.18.2.

Furthermore, some other go mod commands also failed. See sample repo: https://github.com/maxsxu/go-workspace

  • go mod vendor
➜  server   go mod vendor 
atmax.io/go-workspace/server imports
        atmax.io/go-workspace/commons/pkg: no required module provides package atmax.io/go-workspace/commons/pkg; to add it:
        go get atmax.io/go-workspace/commons/pkg
  • go mod verify
➜  server   go mod verify 
atmax.io/go-workspace/server : missing ziphash: open hash: no such file or directory

Expected behaviour: the go mod commands should not try to look up modules that are declared in go.work, which I think would make Go workspaces more meaningful.

@Gbps

Gbps commented May 31, 2022

My expected functionality is that a go mod tidy on a go.work enabled workspace should feel like a bunch of phantom replace statements inside the go.mod without editing the file. If the module is at github.com/Gbps/mymodule and I tell my go.work that the module can be found in /path/to/mymodule, then go mod tidy should pull the version from my local repo instead of going to github at all.

If I don't commit my go.work, then anyone who uses my go.mod afterwards will have to go to GitHub, which is correct! Now I can edit my local copy and test changes, but when I'm ready to push, I can tidy the correct version into the go.mod and push both the mymodule local dev repo and my current project. This makes sense!

Once I must make an edit to go.mod to include a local path string for any feature of the go mod command line to work, all the purpose of go.work is lost on me. I should never have to taint go.mod with local paths, go mod tidy can find what commit I'm working with locally through go.work and update it accordingly. It should be no different than if I pushed a commit to github and ran go mod tidy to pull the newer version into my go.mod.

In other words, if I add this line to go.mod:

replace github.com/Gbps/mymodule => /path/to/mymodule/

Then go mod tidy works as expected. If I have use /path/to/mymodule/ in my go.work, it should accomplish the same thing. The benefit being I don't have to taint go.mod with my local paths, as that is going to need to be committed to remote whereas go.work isn't.

@opengs

opengs commented Jun 17, 2022

This is a very big issue for me when building Docker containers :( I can't use go mod tidy and copy the dependency graph into the Dockerfile, so every time I build a container I have to download the downstream dependencies instead of running go mod download.

Go is considered the language for microservices, and in its current state go work makes it very hard to work with them in a monorepo. Creating a multi-container environment is incredibly hard :( Basically, every time I hit docker build it has to download the downstream dependencies. Currently, I have two ways to fix this:

  1. Push packages to the repo. But if I have ~30 microservices and most of them are one-file programs, it's a very poor experience.
  2. Use replace and require, which create too many relative dependencies and are very hard to maintain.

I expect 4 things from the golang workspaces:

  1. go mod should know about the other modules and not try to download local packages from the internet. Maybe traverse up the folder hierarchy and check whether go.work exists?
  2. A clear dependency graph that can be used for caching in Docker containers.
  3. Commands like go work tidy and go work download, because right now I have to enter every module and type go mod commands. I just want to tidy and download everything from one place, and because go.work already holds the paths to the modules, it should be trivial.
  4. A way to reduce the number of similar dependencies (download one specific version and force every module in the workspace to use it). Tools such as turborepo, lerna, and other monorepo managers already do this.

So right now, the only reason I'm using go work is to have a better coding experience in VS Code. Most of the negative comments that I hear from colleagues about Go are about poor module/workspace management. I think go work is a very important feature to have, and a very important direction to go in.

@HeCorr

HeCorr commented Jul 25, 2022

I've faced this issue yesterday but my case is a bit different.. My replace statements are in the go.work file, which seemed to solve go run errors, but when attempting go mod tidy on any of the modules, it is indeed ignored.

The solution is simply to move the replace statements to the go.mod files as explained by Gbps, but keeping track of what package uses what internal package is a bit of a hassle. Is this considered a bug also?

@bcmills
Contributor

bcmills commented Jul 25, 2022

go mod tidy is intended to update the module's self-contained dependencies. It doesn't use the workspace because in general one may work on multiple independently-maintained modules in the same workspace, and if you're preparing an upstream commit you definitely don't want that commit to rely on unpublished local modifications.

go work sync is intended to update the modules within a workspace to reflect the dependencies selected in that workspace. That is probably what you want if you are working on a set of modules that are all maintained as a single unit.

Folks who are commenting here (and please bear in mind https://go.dev/wiki/NoPlusOne!) — have you considered go work sync for your use-case? If so, what does go work sync do (or not do) that makes it unsuitable for your use-case?

@bcmills added the WaitingForInfo label Jul 25, 2022
@AlmogBaku

@bcmills I'm not sure what the intended workflow is, but I think the feedback here is that the DX is really not intuitive.

@burdiyan

> Folks who are commenting here (and please bear in mind https://go.dev/wiki/NoPlusOne!), have you considered go work sync for your use-case? If so, what does go work sync do (or not do) that makes it unsuitable for your use-case?

I wish there were go work tidy or something similar that would do the same as go mod tidy for each module in the workspace. Or maybe go work sync could have a flag for that, or even do it by default. This way one could easily manage a single repository with many modules, keeping everything in sync and tidy :) without having to switch to every single module directory to keep it tidy.
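Until something like that exists, the per-module loop can be driven from the workspace itself. The following is a minimal sketch, assuming it runs from a directory where go.work is in effect; tidy_workspace is a hypothetical helper name. It relies on go list -m listing the workspace's modules in workspace mode. Each go mod tidy call still ignores go.work, so it will keep failing for dependencies that have neither an upstream version nor a replace directive.

```shell
# tidy_workspace
# Runs `go mod tidy` in every module of the current workspace, stopping
# at the first failure.
tidy_workspace() {
    # In workspace mode, `go list -m` prints one entry per workspace module.
    dirs=$(go list -m -f '{{.Dir}}') || return 1
    while IFS= read -r dir; do
        [ -n "$dir" ] || continue
        echo "tidying $dir"
        (cd "$dir" && go mod tidy) || return 1
    done <<EOF
$dirs
EOF
}
```

This only saves the manual directory switching; it does not make tidy workspace-aware.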

@mlaventure

go work sync does not work for me in that scenario either unfortunately. I have a go.work for a monorepo containing yet to be published packages (the domain also does not exist yet). e.g.:

use (
    ./foo
    ./bar
    ./fubar
)

replace not-yet-created.com/foo/foo v0.0.0-00010101000000-000000000000 => ./foo

However that replace directive seems to be ignored by go work sync

$ go work sync
go: not-yet-created.com/foo/foo@v0.0.0-00010101000000-000000000000: unrecognized import path "not-yet-created.com/foo/foo": https fetch: Get "https://not-yet-created/foo/foo?go-get=1": dial tcp: lookup not-yet-created.com on 127.0.0.53:53: server misbehaving

@dkrieger

dkrieger commented Feb 9, 2024

> @mdepot What's the use case for the client and server being in different modules from each other?

@matloob can we not XY-problem this and take as given that it is a valid use case to have a client and a server be in different modules, especially considering it is commonplace for them to be in entirely different repos?

Otherwise this turns into a tangent of "why are monorepos, especially ones where packages within can have dependencies on other packages' source, beneficial", which is a deeper topic than needs to be explored here; it can just as easily be taken as a given that a non-trivial number of organizations (and therefore Go toolchain users) have decided this has some good answers.

For further clarification, I used the term "package" above in the general sense, not the golang-specific sense; pretend I said "module" if you prefer.

@BourgeoisBear

BourgeoisBear commented Feb 9, 2024

@matloob: i'm in the same situation as @mdepot. In many cases, the client is free / open-source while the server is proprietary / closed-source. There are also cases where one side has far more dependencies than the other (i.e. why force people to pull a fat Electron client when they only want a server side that is strictly files & sockets?).

@Perusae

Perusae commented Feb 13, 2024

Whatever the answer ends up being, it would be good to explain the best way to manage this type of scenario somewhere in the workspace docs as well.

We also use a similar structure for the reason mentioned above. When we ran go mod tidy from the root, it took us by surprise that it ignored go.work.

A quick fix was running:

find . -name go.mod | while IFS= read -r f; do
    (cd "$(dirname "$f")" && go mod tidy)
done

But that is just a temporary solution, for a missing feature.

@bcmills
Contributor

bcmills commented Feb 13, 2024

@dkrieger, generally we expect “dependencies on other packages' source” to mean something that can be fetched as a module, from a (public or private) GOPROXY or version-control host. It isn't obvious why one would have dependencies that can't at least be redirected for go get using an insteadOf directive in the user's .gitconfig.
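For readers unfamiliar with the insteadOf mechanism mentioned here, a minimal sketch follows. The host git.example.com and the helper name redirect_https_to_ssh are hypothetical; url.<base>.insteadOf is a standard git configuration key. Private modules typically also need GOPRIVATE (or similar) set so that the go command skips the public proxy and checksum database, which this sketch does not cover.

```shell
# redirect_https_to_ssh CONFIG_FILE HOST
# Writes a url.<base>.insteadOf rule so that git rewrites https://HOST/
# URLs to ssh://git@HOST/ when fetching.
redirect_https_to_ssh() {
    config_file=$1
    host=$2
    git config --file "$config_file" \
        "url.ssh://git@${host}/.insteadOf" "https://${host}/"
}

# Hypothetical usage against the user's own config:
#   redirect_https_to_ssh "$HOME/.gitconfig" git.example.com
```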

@bcmills
Contributor

bcmills commented Feb 13, 2024

@BourgeoisBear

> There are also cases where one side has far more dependencies than the other (i.e. why force people to pull a fat Electron client when they only want a server side that is strictly files & sockets?).

That was the motivation for adding module graph pruning in Go 1.17. Are there cases of this sort where graph pruning is ineffective? If so, can you give more detail about the concrete problems (ideally as a separate issue)?

@dkrieger

dkrieger commented Feb 13, 2024

> @dkrieger, generally we expect “dependencies on other packages' source” to mean something that can be fetched as a module, from a (public or private) GOPROXY or version-control host. It isn't obvious why one would have dependencies that can't at least be redirected for go get using an insteadOf directive in the user's .gitconfig.

@bcmills The obvious case is when you have a monorepo with multiple go modules, orchestrated with that other Google tool bazel/blaze. Being able to publish updates to two or more modules in your monorepo in the same commit is high on the list of benefits of a monorepo.

@bcmills
Contributor

bcmills commented Feb 14, 2024

@dkrieger, can you give some more detail about why the monorepo is structured that way? (We're trying to better understand the underlying use-cases so that we can address them holistically.)

We're familiar with https://github.com/googleapis/google-cloud-go, which IIUC uses multiple modules so that they can tag its various APIs at different levels of stability (for example, have some APIs stabilized at v1 while others are still at v0). But the discussion on this issue has been mostly focused on dependencies that don't have any versions published at all, which would imply that that's not the same underlying motivation.

@BourgeoisBear

BourgeoisBear commented Feb 14, 2024

@bcmills, "pull" as in git/hg of a repo with large assets (like a GUI with many rasters/videos) when I don't need it. No complaints on compile time or binary sizes.

@bcmills
Contributor

bcmills commented Feb 14, 2024

Hmm, here's a thought. For #50603, we're going to need the ability to inspect the local repo containing a given repository to determine its version information.

In theory, go work sync could do the same thing for every main module in the workspace, so that if some module in the workspace depends on it that dependency could be updated to refer to the version actually found in the workspace.

But there is a bit of a chicken-and-egg problem: the version of a checked-out repo in the workspace depends on its commit hash, which depends on the contents of its files. If there is a cycle in the requirement graph (which is normally allowed), then go work sync would become unstable: each time it writes out a go.mod file in one module, the versions it records in the other go.mod files would be updated to the new hash, causing an additional go.mod write. The only ways to reach a fixed point are:

  1. Decouple the actual versions written to the go.mod files from the versions found in the workspace, such as by resolving upstream commits.
  2. Have go work sync intentionally remove references to the workspace modules from the go.mod files, presumably to be filled in later.
  3. Have the user explicitly specify versions to assign to the workspace modules, instead of trying to infer them from the repo.
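The instability can be illustrated with a toy model. This is only a sketch under stated assumptions: two files stand in for the go.mod files of modules a and b, a cksum of each file stands in for the commit hash that would determine its version, and simulate_sync is a hypothetical helper, not how go work sync is implemented. Passing a second argument models option 3, where the user pins the versions explicitly.

```shell
# simulate_sync ROUNDS [PINNED_VERSION]
# Prints how many file rewrites happen over ROUNDS sync passes.
simulate_sync() {
    rounds=$1
    pinned=${2:-}
    work=$(mktemp -d) || return 1
    printf 'module a\nrequire b none\n' > "$work/a.mod"
    printf 'module b\nrequire a none\n' > "$work/b.mod"
    writes=0
    round=0
    while [ "$round" -lt "$rounds" ]; do
        for pair in a:b b:a; do
            self=${pair%%:*}
            dep=${pair##*:}
            if [ -n "$pinned" ]; then
                # Option 3: the version is given, not derived from content.
                dep_version=$pinned
            else
                # Version derived from the dependency's current content.
                dep_version=$(cksum < "$work/$dep.mod" | cut -d' ' -f1)
            fi
            wanted=$(printf 'module %s\nrequire %s %s' "$self" "$dep" "$dep_version")
            if [ "$(cat "$work/$self.mod")" != "$wanted" ]; then
                printf '%s\n' "$wanted" > "$work/$self.mod"
                writes=$((writes + 1))
            fi
        done
        round=$((round + 1))
    done
    rm -rf "$work"
    echo "$writes"
}
```

With content-derived versions, each write changes the checksum that the other file records, so simulate_sync 5 is expected to keep rewriting in every round. With a pinned version, as in simulate_sync 5 v1.2.3, the files settle after the first round.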

@dkrieger

@bcmills let me start with a language agnostic description of why we would structure a monorepo in this way, then I'm happy to field any go-specific follow-up questions you might have.

At our company (and I'd suggest most companies), most of our code is not public. Our deployable artifacts are mostly containers that we deploy to kubernetes.

For some given deployable slice of the codebase, we may deploy it as one or more services, and a given chunk of code may appear in one or more containers -- in a common scenario where it appears in multiple containers, we may have a fatter container that aggregates multiple domains (let's say each domain is represented by one grpc service, and we pack those into a single networked service, i.e. many:1 domain:service) and a smaller container that is a microservice (i.e. 1:1 domain:service). This enables sophisticated traffic shaping when both are deployed, but in simpler scenarios it means we can periodically move predictable-traffic and/or high-reliability domains into the aggregate service, and move bursty and/or low-reliability domains into their own microservices -- basically, we preserve domain boundaries and the flexibility of deploying as a microservice or in an aggregate/monolith as we see fit, balancing performance, complexity, efficiency, and reliability over time.

At lower levels, we may have various utility libraries that get used by many deployable slices of the codebase.

Now, for all that private code, we don't really care about publishing these for consumption from other repos, whether inside or outside the organization. As such, we don't have to worry about versioning in the traditional sense -- any commit on the trunk branch (or, by convention, any merge commit) can be built and deployed, and the commit hash is the "version" for everything. Because everything we do is in the monorepo, we not only know every consumer for X service, but they share the same build and IDE context. We can make breaking changes to a service and update every consumer in a single commit if we want to (in practice we'd make a backwards compatible change followed by cleaning up the deprecated API when talking about services, but in the case of libraries we can do it all in one shot).

Still, we want strong boundaries between different parts of our codebase, and the universally strongest boundary is to structure it as the unit a given language uses for external dependencies -- in node, this is a package; in go, this is a module. We don't actually want to put package registries in the middle of this though.

When I look at my string util package ("module" in go), I want to be able to look at the go.mod file and see just the dependencies for that, not polluted with a single line that has to do with some consumer of the string util go module, or some other completely unrelated go module. If it doesn't depend on any other go modules in the monorepo, it could be built with just the standard go toolchain.

For higher level go modules in the monorepo (e.g. a networked service), if I don't want them published for consumption outside the monorepo, they will have dependencies on external libs and some other internal modules in the monorepo. I'd like to be able to go mod tidy --workspace and not have it barf, behaving how it currently does for external libs, but for other modules in the workspace (as defined in the go.work file), it will point to those local sources. This allows for making logical changes to module apis in a single commit, updating consumers that would otherwise break, without going through the toil of publishing each update in the dependency graph first. With bazel, every part of this workflow works except for the go mod tidy portion, because go mod tidy doesn't understand how to handle this.

@matloob
Contributor

matloob commented Feb 15, 2024

@dkrieger Could you speak about how you use Bazel together with modules? It was my understanding that Bazel didn't support modules. Is that incorrect? Why do you need modules if you're using Bazel to do your builds?

Do you have a Bazel WORKSPACE for each of your modules? My understanding is that the workspace in Bazel is what maps closest to a Go module.

@dkrieger

dkrieger commented Feb 16, 2024

@matloob Bazel with rules_go and gazelle lets you configure gazelle to source your workspace-level go dependencies from the union of every go.mod in your bazel workspace. A bazel workspace maps to a go.work, if you are using go modules and workspace. The only pain point is the missing ability to have go mod tidy manage a given go.mod for you that uses your unpublished/unversioned sources for other modules in the same go/bazel workspace

@matloob
Contributor

matloob commented Feb 16, 2024

@dkrieger I didn't realize that Bazel supported that.

I want to try to step back a bit. The reason I'm so hesitant about supporting this is that we don't want the tooling to encourage folks to use multiple modules when they can work with a single one. The motivation that I understand for having a multi module monorepo is that the modules are distributed separately from each other. That is: the modules appear in a monorepo, but they are also dependencies of other external modules. For those cases go mod tidy should work fine because each module is able to stand on its own as a dependency of an external project.

So I want to understand your use case better. I see that you mentioned the string util libraries. Are those depended on by other modules outside of your monorepo? Are they built separately from the rest of your monorepo?

@dkrieger

dkrieger commented Feb 16, 2024

@matloob

> The reason I'm so hesitant about supporting this is that we don't want the tooling to encourage folks to use multiple modules when they can work with a single one.

I have a couple responses to this:

  1. an optional cli flag does not encourage its usage, and the docs can express (non-)recommended use cases. In a pnpm monorepo, only when you add the --workspace flag when adding a dependency will it prefer the local workspace sources (vs some pinned version available on the registry) when adding an intra-workspace dep.
  2. I think borrowing the python mantra of "we're all adults" is appropriate here. Ergonomics and defaults are the appropriate ways to signal recommended patterns, not capabilities. In python, prefixing with an underscore signals "this isn't meant for external use", but you can still do it if you feel you have a good reason to; this works out really well in practice

> The motivation that I understand for having a multi module monorepo is that the modules are distributed separately from each other.

For all intents and purposes, I'd say that the use case I'm describing does involve distributing modules separately from one another; whether they're distributed in the conventional go way or not, whether they're published to a public registry (including public github repo) or not -- these are downstream decisions. In all cases, they're possibly (and in practice, often) developed in lock step, and contracts can be changed in backwards incompatible ways in a single atomic commit. Again, the main effect is eliminating the toil of traversing your dependency graph and the logical change you're wanting to make becoming distributed over the dimension of git-commit-time

The result is much cleaner than having a huge module with packages that have no logical relationship with one another. Every module I create tends to consist of several packages, and those packages all pertain to the logic of that module. The 2nd level of ordering that modules provide is incredibly useful, as it (1) makes it easier to understand the transitive closure of dependencies of any piece of code without noise and (2) makes it easier to understand the interrelated packages that exist in that module. This is just as true whether or not I publish my modules in a way that they can be retrieved via go get in some external repo/organization.

In practice, I'd only invest in versioning/publishing for consumption in external repos if I want legacy pre-monorepo code to use them. I don't want to extract a network of packages from a monolithic module when that determination is made, I want to make small changes to my distribution logic (and introduce versioning)

@nemith
Contributor

nemith commented Feb 22, 2024

> That was the motivation for adding module graph pruning in Go 1.17. Are there cases of this sort where graph pruning is ineffective? If so, can you give more detail about the concrete problems (ideally as a separate issue)?

When downloading a module the entire repo is checked out. So if you have a single go.mod for a monorepo you easily run into an issue where you can hit the limits of size of repo (I can't remember what the limits are but it's hardcoded into the go command). Using submodules does inform git to ignore directories.

I am specifically running into this problem with trying to add new modules to an existing repo. New module is created, needs to be referenced by other modules in the monorepo but go get and go mod tidy don't work as the module doesn't exist upstream yet. This is breaking a normal workflow for coding and means we have to commit a dummy mod first (and get code review on it) before hacking further.

Replace works here, but sorta defeats the purpose of go.work.

@spekary

spekary commented Feb 22, 2024

> means we have to commit a dummy mod first (and get code review on it) before hacking further

Yep, that is how I had to work around it. I don't think it has anything to do with limits on the size of a repo though.

@nemith
Contributor

nemith commented Feb 22, 2024

> > means we have to commit a dummy mod first (and get code review on it) before hacking further
>
> Yep, that is how I had to work around it. I don't think it has anything to do with limits on the size of a repo though.

What I mean is that I am using (or rather want to use) multiple modules to get around the size of my monorepo, instead of having a single go.mod for the entire repo.

There are two problems with large monorepos and a single go.mod when it comes to size. One is just download size: it takes a long time for consumers of one project in the monorepo to download and cache the entire thing. The other is hard limits.

For all download types the limit looks like it defaults to 500<<20 bytes (around 524 MB), which is a lot for a repo, but we use git-lfs, which is resolved here and includes any binaries you may want to stuff into your monorepo:

https://github.com/golang/go/blob/master/src/cmd/go/internal/modfetch/codehost/codehost.go#L35
https://github.com/golang/go/blob/master/src/cmd/go/internal/modfetch/proxy.go#L435
https://github.com/golang/go/blob/master/src/cmd/go/internal/modfetch/codehost/vcs.go#L465
https://github.com/golang/go/blob/master/src/cmd/go/internal/modfetch/coderepo.go#L1087
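For scale, a quick check of the shift arithmetic: N << 20 is N mebibytes, so a figure of roughly 524 MB corresponds to 500 << 20, while 10 << 20 would be only about 10 MB. The helper name limit_bytes is purely illustrative.

```shell
# limit_bytes N
# Prints N << 20, i.e. N mebibytes expressed in bytes.
limit_bytes() {
    echo $(( $1 << 20 ))
}

limit_bytes 500   # 524288000 bytes, about 524 MB
limit_bytes 10    # 10485760 bytes, about 10 MB
```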

Submodules is a great way to solve this, but because of resolving versions and the chicken-and-egg problem they are hard to work with and hard for my coworkers to stomach.

@dkrieger

An important underlying theme here is that monorepos are not monolithic codebases, nor are they distributed eventually consistent codebases where you necessarily version and publish each part to some registry. They are no less modular than multi-repo/polyrepo -- to the contrary, they promote mindful modularity by reducing the incidental disincentives to modular design. That they don't encourage/require long-lived support branches in the form of versioning in many circumstances is frankly unrelated to modularity.

If we can agree that this is at least a defensible position, let's try to shift the focus to how we can decouple go module tooling behavior from opinions for/against monorepo adoption without breaking userland.

@rsc
Contributor

rsc commented Feb 28, 2024

"go mod" commands are scoped to modules when making changes; there is no workspace in go mod sub-commands. This is why "go work vendor" is different from "go mod vendor". So "go mod tidy" should not know about go.work any more than the other commands. (Some of the read-only commands like "go mod graph" are workspace-aware, but that's less problematic than read-write commands.)

It seems like "go work sync" is what we should be focusing on. What does it do or not do that is inappropriate?

@brianbraunstein

@rsc Hi Russ, thanks for looking into this.

> What does it do or not do that is inappropriate?

Scenario

Imagine this scenario, you're planning on creating a repo with the following:

root
├── lib
│   └── some_code.go  // package example.com/lib
└── prog
    └── main.go   // package example.com/prog

  • prog/main.go uses a function from lib/some_code.go.
  • prog/main.go also uses a function from thirdparty.com/foolib
  • You want to make lib and prog as independent modules for various reasons (I think these are also mentioned by others above):
    • lib will be useful in other places than just prog, and in those other places, pulling in prog isn't appropriate.
    • Limiting scope to keep things organized and limit cognitive load when focusing on a single module.

Problem

So now you learn about "go workspaces" that seem like they could help with this and you want to quickly try it out. You make the above file tree, then do:

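cd root/lib
go mod init example.com/lib
go mod tidy

cd ../prog
go mod init example.com/prog

# create the workspace once both go.mod files exist,
# otherwise "go work use -r ." finds no modules to add
cd ..
go work init
go work use -r .

# so far so good

cd prog
go mod tidy
# ERROR: ...cannot find module providing package example.com/lib...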

To work around this, you can do:

cd root/prog

go mod edit -replace=example.com/lib=../lib
go mod tidy
go mod edit -dropreplace=example.com/lib

go build

But that's obviously silly because you might as well not use workspaces at that point.
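
If the workaround has to be repeated, it could be wrapped in a small shell function. This is only a sketch: the name tidy_with_local_replace and its arguments are made up for illustration, and it assumes the go command is on PATH and that it is called from inside the module directory.

```shell
# tidy_with_local_replace MODULE_PATH LOCAL_DIR
# Hypothetical helper (not part of the go tool): temporarily points
# MODULE_PATH at LOCAL_DIR, runs go mod tidy, then drops the replace again.
tidy_with_local_replace() {
    mod_path=$1
    local_dir=$2

    go mod edit -replace="${mod_path}=${local_dir}" || return 1

    # Remember whether tidy succeeded, but always remove the replace.
    status=0
    go mod tidy || status=$?
    go mod edit -dropreplace="${mod_path}" || return 1
    return "$status"
}
```

For the scenario above, that would be: cd root/prog, then tidy_with_local_replace example.com/lib ../lib. It does not make the workaround any less silly, it only keeps the temporary replace from being left behind when tidy fails.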

Possible Solution 1: go work modtidy

Make something like go work modtidy that does the job of go mod tidy, but like go build, is aware of the go.work file.

Possible Solution 2: Clarify the purpose of workspaces

One solution might be to just add clarification in the documentation about the intended use cases for workspaces, and more importantly, what use cases are NOT appropriate for workspaces.

https://go.dev/doc/tutorial/workspaces says:

With multi-module workspaces, you can tell the Go command that you’re writing code
in multiple modules at the same time and easily build and run code in those modules.

I think this is leading people astray. In their mind, they read this and think of the scenario I've given here. One could argue that in this scenario you should not use workspace and instead either:

  • just check root into GitHub, etc. so that all dependency modules, even the local ones, are always accessible for go mod tidy to actually work.
  • just use replace in this case.

Tutorial use case maybe shouldn't use workspaces too...

Adding some clarification would be helpful for me because so far I'm failing to see the compelling use case for workspaces. In the example given in https://go.dev/doc/tutorial/workspaces, it seems to me that it would have been better done with replace. At the very end of the tutorial it says:

...Now, to properly release these modules we’d need to make a release of the golang.org/x/example/hello module...Once the release is done, we can increase the requirement on the golang.org/x/example/hello module in hello/go.mod

With replace, if you accidentally forget to submit the change to the dependency first and send the code review out, your reviewer will say "hey, what's with this replace business!? are you running your code against a locally tweaked version of that library!? please don't check this in, it will break for everyone else". Also, if the dependent does accidentally get submitted first and breaks everyone else, the error message will make it pretty clear what happened (replace pointing to some weird ../../foo location). Using workspaces like the tutorial, the reviewer is much less likely to notice, there's no trace of what happened in the code review, and when the breakage happens it will be more confusing to root cause, especially if it's a behavior change to an existing function that wouldn't be caught at compile time.

So maybe it would be helpful to have a different use case in the tutorial that really shows where workspaces shine.

@dkrieger

dkrieger commented Mar 2, 2024

@brianbraunstein

Possible Solution 2: Clarify the purpose of workspaces

Respectfully, this is not a solution, it's a temporary mitigation at best. Not supporting atomic updates spanning multiple workspace modules defies the purpose of a workspace in the general sense of the term, as it is observed across languages and monorepo build tools. Whether we get go mod tidy --workspace, go work modtidy, or some other cli for this use case has no bearing on whether it is a sensible use case to support or not. If we can agree it satisfies a standard of "plausibly sensible", that should be enough to move forward, at which point the finer details of the appropriate cli API can be worked out.

Updating go.mod files appears to be within the purview of go mod. Workspace management is sometimes within the purview of go work, but not exclusively, and in this case, it's not really workspace management, but rather module management that could be extended to be workspace-aware. Unless there is another issue with more discussion on this topic (I have not found one), advancing the discussion here seems appropriate, whether or not it ends up being a go mod subcommand in the end. If there's buy-in for support, then, despite my own misgivings that go work sync is the appropriate domain, that should stay on the table, but gating further consideration by asking this discussion to be reframed as a go work sync deficiency is not constructive IMO @rsc .

@atdiar

atdiar commented Mar 2, 2024

I don't know whether it is related but one issue that I have encountered is, while building something similar to gonew, when downloading a module in a workspace, the go.work file needed to be updated manually.

Perhaps go work sync could also work bottom-up (via a flag?) to add this module/directory to the module list in the go.work file?

If that makes sense.

@dpifke
Contributor

dpifke commented Mar 4, 2024

If a different command should be substituted for go mod tidy when using workspaces, I think we should update the various places where running other commands instruct the user to run it:

https://cs.opensource.google/go/go/+/refs/tags/go1.22.0:src/cmd/go/internal/modget/get.go;l=1781
https://cs.opensource.google/go/go/+/refs/tags/go1.22.0:src/cmd/go/internal/modload/help.go;l=53
https://cs.opensource.google/go/go/+/refs/tags/go1.22.0:src/cmd/go/internal/modload/init.go;l=648
https://cs.opensource.google/go/go/+/refs/tags/go1.22.0:src/cmd/go/internal/modload/init.go;l=651
https://cs.opensource.google/go/go/+/refs/tags/go1.22.0:src/cmd/go/internal/modload/load.go;l=2023
https://cs.opensource.google/go/go/+/refs/tags/go1.22.0:src/cmd/go/internal/modload/load.go;l=2029
https://cs.opensource.google/go/go/+/refs/tags/go1.22.0:src/cmd/go/internal/modload/load.go;l=2032

This might also be a good way to audit go mod tidy use cases and make sure they are covered by whatever is the equivalent workspace command.

Being told by the tooling to run go mod tidy, and having that fail, is what brought me to this issue originally. I've since given up on using workspaces and instead use replace directives in go.mod, along with a hacky Git pre-commit hook to filter those out when committing. (I'm probably holding it wrong.)
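
A hook of that kind needs some way to spot local replace directives. The following is a rough sketch of such a check, not the hook described above: the function name list_local_replaces is made up, and the pattern is an assumption that only recognizes targets starting with ./, ../ or / (so it misses Windows-style paths and would also match such text inside a comment).

```shell
# list_local_replaces GO_MOD_FILE
# Hypothetical helper: prints (with line numbers) every line of a go.mod
# file whose replacement target looks like a filesystem path (./x, ../x
# or /x). Exit status is 0 if any such line was found, 1 otherwise.
list_local_replaces() {
    grep -nE '=>[[:space:]]*(\.\.?/|/)' "$1"
}

# Possible use in a pre-commit hook (rejects the commit instead of filtering):
#   if list_local_replaces go.mod; then
#       echo "go.mod contains local replace directives" >&2
#       exit 1
#   fi
```

Rejecting the commit is a simpler design than rewriting go.mod on the fly, at the cost of having to remove the directives by hand before committing.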

@ohir

ohir commented Mar 17, 2024

@matloob

So I want to understand your use case better.

Many use cases were described by participants here. Mine is that I do not want to publish dummy stub modules before I can start to work. I can buy rsc's explanation of why go mod tidy should not know about go.work. But go work should certainly be able to recursively tidy all the use entries declared in the go.work workspace, akin to go work sync. Thus please consider:

go work tidy

that does not try to reach unpublished modules, i.e. those replaced with => ./local/path

@matloob
Contributor

matloob commented Mar 18, 2024

@dpifke Thanks, that's a useful perspective. We don't want our tools confusing users with the instructions they give. I'm interested if solving the error messages is the primary reason most people want 'tidy' for these use cases.

@brianbraunstein I don't think 'tidiness' is the right concept here. I don't think we should call a go.mod file tidy if it can't be used (alone) to build a module. If go work sync doesn't work, we'll need to think of some other clean concept to express that, although the modules don't work standalone, they can be used in a workspace context.

@ohir I don't know if tidy is the right concept here. Why wouldn't go work sync work for your use case?

@dkrieger I don't think we should frame this discussion around what the concept of a "Workspace" is in other language ecosystems. We want to build what makes the most sense for the Go module system.

@ohir

ohir commented Mar 21, 2024

@matloob

Why wouldn't go work sync work for your use case

It still reaches to the net even if it has everything replaced or internal. It should not.

Repro tree attached (usage: patch -p0 < goworksync_repro.patch)

goworksync_repro.patch

@matloob
Contributor

matloob commented Mar 26, 2024

@ohir

When I disable the network your repro case doesn't produce an error when the two replaces in the go.work are there. But if I remove them I do get an error. Is that what you're seeing too? If not, what version of Go are you running?

@ohir

ohir commented Mar 27, 2024

@matloob

There is no problem with sync if the workstation's (local) network is down. The problem arises when the workstation network is up but the module repo is not exposed, either due to firewall rules or because it hasn't really been set up.

Since sync tries to contact the fairbe.org domain looking for the module repo, it hangs for a long while (until the dial times out).
With a real tree of dozens of modules, the delay is even more noticeable.

This is a wider problem: go get also goes astray on a repo with an unpublished/unreachable module:
It errs with [...] Get "https://fairbe.org/moda?go-get=1": net/http: TLS handshake timeout.

The only working solution for now is to add the respective replaces to all go.mod files, too, which nullifies all, or almost all, benefits of the workspace mode.

go version go1.22.1 darwin/amd64
GOPRIVATE=fairbe.org,example.com
GONOPROXY=fairbe.org,example.com
GONOSUMDB=fairbe.org,example.com

Of course fairbe.org proxy is not reachable from the outside. But even if that were the case, I don't see why the tool reaches for the repo if it has been told that the replacement is on the local drive.


Summary:
A workspace cannot be reliably set up for a project that is even partially isolated, i.e. one that contains unpublished/unreachable module(s). For go work sync and go get to work seamlessly, everything in the project, including isolated parts, MUST be reachable over the network, or be replaced directly in the importer's go.mod file.

@a-pav

a-pav commented Mar 28, 2024

Summary:
A workspace cannot be reliably set up for a project that is even partially isolated, i.e. one that contains unpublished/unreachable module(s). For go work sync and go get to work seamlessly, everything in the project, including isolated parts, MUST be reachable over the network, or be replaced directly in the importer's go.mod file.

I don't understand why this discussion has continued for this long, honestly, and yet there's no progress to be seen. Your summary seems to sum it up perfectly and I think it's the reason I subscribed to this issue a long time ago: My expectation was to use go.work for the scenario where I want to use an unpublished/locally modified module, but I wasn't able to do that. Seems simple enough!

@ohir

ohir commented Apr 5, 2024

@matloob
FYI: you also cannot easily vendor an unpublished module in the workspace.

After adding the vendor/btea module to the workspace (unpublished fairbe.org/btea), trying to import/use it resulted in a few minutes of delay in the IDE (awaiting network timeouts), and then gopls threw a full bucket of errors at me. Interestingly, these were import errors in mostly unrelated modules. Not even stdlib pieces could be found.

Filtered excerpts follow:

[cmd/smth imports fairbe.org/btea module for btea.FromBELE]
: "undefined: FromBELE" <= this should come from vendor/btea, but it did not, until it was replaced explicitly in the cmd/go.mod

[other module - imported by cmd/smth but NOT importing fairbe.org/btea at all]

: "could not import golang.org/x/crypto/chacha20 (missing metadata for import of \"golang.org/x/crypto/chacha20\")",
: "fairbe.org/btea@v0.0.1: unrecognized import path \"fairbe.org/btea\": https fetch: Get \"https://fairbe.org/btea?go-get=1\": net/http: TLS handshake timeout",
: "could not import fmt (missing metadata for import of \"fmt\")",
: "could not import crypto/ecdh (missing metadata for import of \"crypto/ecdh\")",
: "could not import crypto (missing metadata for import of \"crypto\")",
: "could not import golang.org/x/crypto/chacha20 (no required module provides package \"golang.org/x/crypto/chacha20\")",

To sort this mess out, an explicit replace had to be added to the (cmd's) go.mod: replace fairbe.org/btea v0.0.1 => ../../../vendor/btea

I hear "it's a gopls issue". No, it is not just gopls: the problem is that workspaces are still a patchwork on the side, with other tools mostly unaware of the go.work replace directives (@rsc).

I would like my "Summary" warning to be included in the docs of the workspaces for the time being - to save headaches for others trying to add some isolated module to their workspace. Thanks.
