Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proxy.golang.org: stop serving /@v/branchname.info when the branch is deleted #49146

Open
mvdan opened this issue Oct 25, 2021 · 8 comments
Open
Labels
modules NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. proxy.golang.org
Milestone

Comments

@mvdan
Copy link
Member

mvdan commented Oct 25, 2021

I just had a very confusing experience using modules for a solid hour, which was all caused because one of the tools I use renamed their HEAD branch from master to main and I hadn't noticed.

I was happily running go install github.com/kyleconroy/sqlc/cmd/sqlc@master, and it was working. I ran into some weird behaviors which should have been fixed in the latest development version, which was odd. I later found out via go version -m $(which sqlc) that my version was half a year old, even though HEAD had a commit just days ago.

Ah! The HEAD branch is now main. But why is @master still happily giving me an old version, when upstream deleted their branch? Surely, upstream deleting their old HEAD branch is as much as they should do to make things right.

Turns out that the problem is that proxy.golang.org is remembering the last master version it saw, presumably forever:

$ go list -m github.com/kyleconroy/sqlc@master
github.com/kyleconroy/sqlc v1.8.1-0.20210503000056-ec35f225dfed
$ go list -m github.com/kyleconroy/sqlc@main
github.com/kyleconroy/sqlc v1.10.1-0.20211019050806-6ee39cb62526
$ GOPROXY=direct go list -m github.com/kyleconroy/sqlc@master
go: github.com/kyleconroy/sqlc@master: invalid version: unknown revision master
$ GOPROXY=direct go list -m github.com/kyleconroy/sqlc@main
github.com/kyleconroy/sqlc v1.10.1-0.20211019050806-6ee39cb62526

You can see it frozen in time at https://proxy.golang.org/github.com/kyleconroy/sqlc/@v/master.info.

I strongly believe the proxy server should not do this. I realise that tagged versions are effectively treated as read-only once pushed by the proxy and index servers, but I understand that branches aren't meant to work that way. If a branch vanishes from upstream, I think its .info file should reflect that, just like /@v/missing.info gives invalid version: unknown revision missing.

This issue is slightly similar to #39007, but also different - in my case, the module is healthy, but just not at that branch anymore.

@mvdan mvdan added the modules label Oct 25, 2021
@mvdan
Copy link
Member Author

mvdan commented Oct 25, 2021

cc @katiehockman @heschi @hyangah

@aarzilli
Copy link
Contributor

Isn't the whole point of the proxy to keep serving things that have been deleted?

@mvdan
Copy link
Member Author

mvdan commented Oct 25, 2021

For tags and semver versions, I definitely agree. For branch names, I'm not that convinced. A branch is a moving target by definition.

An alternative would be to build special knowledge that, if master has vanished but main still exists, then /@v/master.info should redirect to /@v/main.info. I believe sites like proxy.golang.org and pkg.go.dev already treat master/main branches in a special way, so perhaps this extra rule would be reasonable. Intuitively, I think I'd prefer a fix that doesn't depend on branch names.

@hyangah
Copy link
Contributor

hyangah commented Oct 25, 2021

My personal preference (not go proxy or go command teams' opinion) is not to proxy or cache any version query string (e.g. tags, commit hash, branch name), and let proxy.golang.org just return 404/410 with a long TTL for such queries. I'd also like the go command to just automatically switch to direct mode (if GOPROXY includes it) for tags/branch names - my reasoning is that they are VCS-specific and it's better to let VCS handle. But tags such as `vX.Y.Z' which look like a version string that looks like go's recognized semver, but may be resolved to something else.

@aarzilli
Copy link
Contributor

For tags and semver versions, I definitely agree. For branch names, I'm not that convinced. A branch is a moving target by definition.

This argument could be made if the repository is deleted as well. If the entire repository was deleted should sqlc@master return anything? I think @hyangah argument is more consistent.

@dmitshur dmitshur added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Oct 25, 2021
@dmitshur dmitshur added this to the Backlog milestone Oct 25, 2021
@bcmills
Copy link
Contributor

bcmills commented Oct 26, 2021

I'd also like the go command to just automatically switch to direct mode (if GOPROXY includes it) for tags/branch names

I don't think the go command should automatically switch to direct mode for branch names or non-semver tags — the default proxy behavior is designed so that the proxy can explicitly reject, say, module paths that are known not to comply with the proxy owner's third-party-dependency policies.

(But I do agree that it would be fine for proxy.golang.org in particular to serve 404s or 410s for branches and non-semver tags, since it has no such policy to enforce.)

@bcmills
Copy link
Contributor

bcmills commented Oct 26, 2021

It may also be possible to change cmd/go to more clearly indicate the situation when a module's repository exists but the requested version (or nested module) is not present in that repository. The proxy could use that as an explicit signal to evict the cache entry.

@mvdan
Copy link
Member Author

mvdan commented May 12, 2023

This argument could be made if the repository is deleted as well.

Coming to this over two years later, but: the proxy should continue serving semver versions practically forever, unless there's something serious like a takedown notice or licensing issue. Otherwise, one pillar of reproducible builds falls down. I still think that branch names do not fall under that category, because they're not used by MVS nor commands like go build on a tidied module, as those don't need to resolve branches. It's only direct calls by humans or scripts like go install pkg@master which are affected.

I personally don't mind whether this detection of "branch deleted" happens in the proxy or in cmd/go. Intuitively, I would assume that the proxy can do it more efficiently, as it already needs to keep track of new versions published by the module, as well as cache some branch names. On the other hand, an implementation in cmd/go means that this would be solved for any proxy and not just proxy.golang.org.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
modules NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. proxy.golang.org
Projects
None yet
Development

No branches or pull requests

6 participants