cmd/go: mod meta tag causes infinite loop in GOPROXY #31458

Open
marwan-at-work opened this issue Apr 13, 2019 · 7 comments
Labels
modules, NeedsInvestigation (Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.)
Comments

@marwan-at-work
Contributor

marwan-at-work commented Apr 13, 2019

I'm stumped about how the <meta name="go-import" content="<path> mod <url>"> tag would ever work.

Let's say a user runs go get marwan.io/moddoc, where marwan.io is a vanity import server (which it is right now).

Let's say marwan.io wants to expose its modules via a GOPROXY and not via git.

Therefore, the meta tag that it returns will now be

<meta name="go-import" content="marwan.io/moddoc mod https://someproxy.com">

instead of

<meta name="go-import" content="marwan.io/moddoc git https://github.com/marwan-at-work/moddoc">
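To make the two variants concrete, here is a minimal sketch of splitting a go-import content attribute into its three fields. The names goImport and parseGoImport are made up for illustration and are not the go command's internals.

```go
package main

import (
	"fmt"
	"strings"
)

// goImport holds the three fields of a go-import content attribute.
// (Hypothetical type, for illustration only.)
type goImport struct {
	Prefix string // import path prefix, e.g. "marwan.io/moddoc"
	Kind   string // "git", "mod", ...
	Root   string // repository URL, or proxy base URL when Kind is "mod"
}

// parseGoImport splits content of the form "<prefix> <kind> <url>".
func parseGoImport(content string) (goImport, error) {
	f := strings.Fields(content)
	if len(f) != 3 {
		return goImport{}, fmt.Errorf("go-import content must have 3 fields, got %d", len(f))
	}
	return goImport{Prefix: f[0], Kind: f[1], Root: f[2]}, nil
}

func main() {
	for _, c := range []string{
		"marwan.io/moddoc mod https://someproxy.com",
		"marwan.io/moddoc git https://github.com/marwan-at-work/moddoc",
	} {
		imp, err := parseGoImport(c)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s is served via %s at %s\n", imp.Prefix, imp.Kind, imp.Root)
	}
}
```

The only difference that matters for this issue is the second field: with mod, the third field is treated as the base URL of a module proxy rather than a repository.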

The go command will now start talking to https://someproxy.com through the Download Protocol. And the first thing it will do is:

GET https://someproxy.com/marwan.io/moddoc/@v/list
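For reference, a rough sketch of the URLs the go command can request from a proxy for one module and version. The helper names escapePath and endpoints are hypothetical; the uppercase escaping ('A' becomes "!a") follows the proxy protocol as I understand it.

```go
package main

import (
	"fmt"
	"strings"
)

// escapePath replaces each ASCII uppercase letter with '!' followed by its
// lowercase form, so paths survive case-insensitive file systems.
func escapePath(s string) string {
	var b strings.Builder
	for _, r := range s {
		if r >= 'A' && r <= 'Z' {
			b.WriteByte('!')
			b.WriteRune(r + 'a' - 'A')
		} else {
			b.WriteRune(r)
		}
	}
	return b.String()
}

// endpoints lists the proxy URLs for a module path and version.
// (Hypothetical helper, for illustration only.)
func endpoints(base, modPath, version string) []string {
	prefix := strings.TrimSuffix(base, "/") + "/" + escapePath(modPath)
	v := escapePath(version)
	return []string{
		prefix + "/@v/list",
		prefix + "/@latest",
		prefix + "/@v/" + v + ".info",
		prefix + "/@v/" + v + ".mod",
		prefix + "/@v/" + v + ".zip",
	}
}

func main() {
	for _, u := range endpoints("https://someproxy.com", "marwan.io/moddoc", "v1.2.3") {
		fmt.Println("GET", u)
	}
}
```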

The proxy will now use go list -m -versions -json marwan.io/moddoc@latest to get the versions list (or even latest).

The go command inside the GOPROXY will now ping marwan.io/moddoc?go-get=1 one more time and parse the meta tag again. It will then contact the same https://someproxy.com that it is currently running on, causing an infinite loop.
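The loop can be modeled with a toy simulation. This is only a sketch of the reasoning above, not real resolution code: the tag type, countHops, and the hop limit are all assumptions introduced for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// tag is a toy stand-in for what the vanity server returns for a path.
type tag struct{ kind, root string }

var errLoop = errors.New("gave up: every hop led back to a proxy")

// countHops follows the chain "go command -> meta tag -> proxy -> go command".
// A VCS tag ends the chain; a mod tag hands the request to a proxy whose own
// go command looks up the very same tag again on the next iteration.
func countHops(tags map[string]tag, path string, maxHops int) (int, error) {
	for hops := 0; hops < maxHops; hops++ {
		t, ok := tags[path]
		if !ok {
			return hops, fmt.Errorf("no go-import tag for %q", path)
		}
		if t.kind != "mod" {
			return hops, nil // fetching from the VCS terminates the chain
		}
	}
	return maxHops, errLoop
}

func main() {
	viaGit := map[string]tag{"marwan.io/moddoc": {"git", "https://github.com/marwan-at-work/moddoc"}}
	viaMod := map[string]tag{"marwan.io/moddoc": {"mod", "https://someproxy.com"}}

	fmt.Println(countHops(viaGit, "marwan.io/moddoc", 10))
	fmt.Println(countHops(viaMod, "marwan.io/moddoc", 10))
}
```

Without the artificial maxHops bound, the mod case would never return, which is the behavior described in this issue.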

Even if we hard-code the /list and @latest responses in the GOPROXY server to avoid the infinite loop, the problem returns when the original go command requests a specific version, say GET https://someproxy.com/marwan.io/moddoc/@v/v1.2.3.zip: the GOPROXY must use go mod download to get the module contents, which will also cause an infinite loop.

The only solution so far is to use go mod download marwan.io/moddoc before we deploy the change to the meta tag, then manually upload it to the GOPROXY server storage. But once we deploy the change, we'll never be able to upload the correct zip file of any new versions because those will cause infinite loops. We'll have to revert the meta tag response to be git again, do go mod download locally, and redeploy the meta-tag change along with the new zip file (and friends) in storage.

Even if the "mod meta tag" is not meant for exposing vanity imports, I fail to see where it ever makes sense. The vgo proposal mentions that you can use this feature for directly injecting the storage URL: https://research.swtch.com/vgo-module#publishing

But I still fail to see how this is maintainable for a couple of reasons:

  1. @latest is not compatible with storage.googleapis.com.
  2. Authors have to manually run go mod download against the underlying git repository and upload the result to storage on every new version or every new commit, so that go will never hit @latest.

cc: @bcmills @rsc

@bcmills
Contributor

bcmills commented Apr 19, 2019

This seems like an implementation detail of the module server. The protocol for a server and a proxy is the same, but a server that is not a proxy fundamentally needs to know which paths it's serving.

(Note that one option is for the server to run go mod download, but on a path that differs from the one requested by the user.)

For the @latest path in particular, I suspect that @hyangah, @heschik, and/or @katiehockman can give you more useful advice.

bcmills added the NeedsInvestigation label Apr 19, 2019
bcmills added this to the Unplanned milestone Apr 19, 2019
@hyangah
Contributor

hyangah commented Apr 19, 2019

This is indeed an implementation detail of the module server.
This issue brings up an interesting attack scenario against certain public proxies. That is also a proxy implementation detail, and public proxies should be prepared to handle this type of unexpected use or misuse.

The /@latest endpoint is not crucial for the go command to operate as long as /@v/list returns a non-empty list. If the storage server is implemented as described in the original vgo proposal, not having an /@latest endpoint will not be an issue.
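As a rough illustration of why a non-empty /@v/list is enough, here is a sketch of picking the newest release from a list response. It is deliberately simplified: it only understands plain vMAJOR.MINOR.PATCH lines and skips everything else (pre-releases, pseudo-versions), which is an assumption of this sketch and not a description of the go command's exact selection rules. The names parseRelease and latestFromList are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

type version struct{ major, minor, patch int }

// parseRelease accepts only exact "vX.Y.Z" strings; anything with a
// pre-release or build suffix fails the round-trip check and is rejected.
func parseRelease(s string) (version, bool) {
	var v version
	_, err := fmt.Sscanf(s, "v%d.%d.%d", &v.major, &v.minor, &v.patch)
	if err != nil || fmt.Sprintf("v%d.%d.%d", v.major, v.minor, v.patch) != s {
		return version{}, false
	}
	return v, true
}

// less reports whether a is an older release than b.
func less(a, b version) bool {
	if a.major != b.major {
		return a.major < b.major
	}
	if a.minor != b.minor {
		return a.minor < b.minor
	}
	return a.patch < b.patch
}

// latestFromList returns the highest release in a newline-separated list.
func latestFromList(list string) (string, bool) {
	var best version
	bestStr := ""
	for _, line := range strings.Fields(list) {
		v, ok := parseRelease(line)
		if !ok {
			continue
		}
		if bestStr == "" || less(best, v) {
			best, bestStr = v, line
		}
	}
	return bestStr, bestStr != ""
}

func main() {
	latest, ok := latestFromList("v1.0.0\nv1.10.0\nv1.2.3\n")
	fmt.Println(latest, ok)
}
```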

@marwan-at-work
Contributor Author

marwan-at-work commented Apr 19, 2019

@hyangah In this case it should be 100% safe to say that the only way a proxy server knows how to go mod download a vanity import path with a mod meta tag is by knowing its counterpart VCS import path ahead of time.

For example, if a user did:

go get marwan.io/moddoc

And then my marwan.io server returned something like ~<meta mod https://myproxy.com>

Then myproxy.com must know ahead of time that marwan.io maps to github.com/marwan-at-work, so that it runs the correct go mod download without getting stuck in an infinite loop. This assumes that go mod download github.com/marwan-at-work/moddoc never fails just because the source code's import paths start with marwan.io/moddoc.

The only alternative to the solution above is for the proxy server to have a way to populate that vanity import path outside of go mod download, i.e. by manually uploading its own .zip/.info/.mod files. That runs the risk of checksum mismatches, and the proxy would have to be the only place to ever get that module.

If all of my assumptions are correct, I'm happy closing the issue :)

Thanks!

@hyangah
Contributor

hyangah commented Apr 19, 2019

@marwan-at-work A different approach to protect the public proxy from the infinite loop is to design myproxy.com to avoid duplicate work. Each subsequent request in the chain is then classified as a duplicate of a pending operation: it adds no additional work to the proxy and simply waits for the pending work to complete. Eventually some of the requests in the chain will time out (as in most production services) and the chain will terminate.

@marwan-at-work
Contributor Author

marwan-at-work commented Apr 19, 2019

@hyangah I'm more focused on how it should work as opposed to the security vulnerability side of it.

From a vulnerability standpoint, your suggestion makes sense. Another option is for the proxy server to issue its own GET marwan.io/moddoc?go-get=1 and check that the mod <meta> tag is not pointing to the same host as the proxy itself.
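That host check could look roughly like this. It is a sketch that assumes the proxy has already fetched the page and extracted the go-import content string; pointsBackAtSelf is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// pointsBackAtSelf reports whether a go-import content string is a mod tag
// whose URL has the same host as the proxy doing the check.
func pointsBackAtSelf(content, selfHost string) (bool, error) {
	f := strings.Fields(content)
	if len(f) != 3 {
		return false, fmt.Errorf("go-import content must have 3 fields, got %d", len(f))
	}
	if f[1] != "mod" {
		return false, nil // VCS tags cannot send us back to a proxy
	}
	u, err := url.Parse(f[2])
	if err != nil {
		return false, err
	}
	return strings.EqualFold(u.Hostname(), selfHost), nil
}

func main() {
	loop, err := pointsBackAtSelf("marwan.io/moddoc mod https://someproxy.com", "someproxy.com")
	fmt.Println(loop, err)
}
```

A check like this only catches a direct self-reference; a chain through two or more proxies would need something extra, such as the timeout approach suggested earlier in the thread.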

However, from a feature perspective, I wasn't clear on how go mod download and the mod <meta> tag should work with each other besides your suggestion of go mod download VCS-PATH.

But I'd love confirmation that, by design, the mod <meta> tag feature is indeed intended to be used in the two ways I suggested in my previous comment: go mod download on the VCS path (not the vanity path), or direct upload.

Thanks again :)

@dmitshur
Contributor

dmitshur commented Apr 20, 2019

@marwan-at-work One of the use cases where the mod go-import meta tag feature seems to work well is for publishing modules that do not have a counterpart public VCS.

For example, the module dmitri.shuralyov.com/test/a has a vanity import path and it uses the mod type, but there is no public accompanying VCS. New versions of that module are published directly to the module proxy that is providing it. The module proxy is just a static filesystem being served over HTTP, using http.FileServer. As a result, this problem does not come up in that setup.
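A static proxy of that kind can be sketched in a few lines. The directory layout and the module path example.com/hello below are placeholders made up for this demo, not the actual setup behind dmitri.shuralyov.com.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"os"
	"path/filepath"
)

// newStaticProxy serves a directory laid out as <module>/@v/list,
// <module>/@v/<version>.info, .mod and .zip. No go command is involved.
func newStaticProxy(dir string) http.Handler {
	return http.FileServer(http.Dir(dir))
}

// demo publishes a one-line version list for a placeholder module and
// reads it back over HTTP from a throwaway test server.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "modproxy")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	vdir := filepath.Join(dir, "example.com", "hello", "@v")
	if err := os.MkdirAll(vdir, 0o755); err != nil {
		return "", err
	}
	if err := os.WriteFile(filepath.Join(vdir, "list"), []byte("v0.1.0\n"), 0o644); err != nil {
		return "", err
	}

	srv := httptest.NewServer(newStaticProxy(dir))
	defer srv.Close()

	resp, err := http.Get(srv.URL + "/example.com/hello/@v/list")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	list, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(list)
}
```

Because the server only reads files from disk and never shells out to the go command, there is nothing that could send a request back to the vanity server, so the loop cannot occur.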

I agree that it becomes problematic if the mod type is meant to be used to augment a module with a vanity import path that has an underlying public VCS and new versions of the module are created by pushing to the VCS. I'm not currently sure how that should be handled.

@dominikh
Member

Note that go mod download vcs-path isn't viable, either. The resultant ZIP archive will contain file paths prefixed by the VCS path.
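To illustrate the point: as far as I know, every file in a module zip must sit under a directory named <module>@<version>/. The sketch below checks a list of entry names against the expected prefix; misplacedEntries is a hypothetical helper and the entry names are invented for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// misplacedEntries returns the zip entry names that do not start with
// "<modPath>@<version>/".
func misplacedEntries(names []string, modPath, version string) []string {
	prefix := modPath + "@" + version + "/"
	var bad []string
	for _, n := range names {
		if !strings.HasPrefix(n, prefix) {
			bad = append(bad, n)
		}
	}
	return bad
}

func main() {
	// Entry names as they would appear after downloading via the VCS path.
	names := []string{
		"github.com/marwan-at-work/moddoc@v1.2.3/go.mod",
		"github.com/marwan-at-work/moddoc@v1.2.3/main.go",
	}
	// Checked against the vanity path, every entry has the wrong prefix.
	fmt.Println(misplacedEntries(names, "marwan.io/moddoc", "v1.2.3"))
}
```

So a zip obtained through the VCS path would have to be repacked before it could be served under the vanity path.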

5 participants