proposal: cmd/go: go list: report go-source tag information #59566

Closed
jeffwidman opened this issue Apr 12, 2023 · 13 comments

@jeffwidman
Contributor

jeffwidman commented Apr 12, 2023

👋 Over in Dependabot, we need to map a Go module import path back to its release-related metadata, such as the release notes, changelog, and commit log. We use this metadata when assembling the Dependabot PR.

Currently, we use a third-party library which is an older fork of some internal go code, but we'd much prefer to migrate to official go tooling. More context.

So I was delighted to notice that go list recently added support for mapping a module import path to the root repository URL.

However, when I tried to migrate to that, I ran into problems because go list -m -f '{{.Origin.URL}}' module@version only lists the go-import meta tag content, not the go-source meta tag content, as demonstrated in dependabot/dependabot-core#7045 (comment):

...Take this for example:

$ curl https://golang.org/x/text\?go-get\=1
<!DOCTYPE html>
<html lang="en">
<title>The Go Programming Language</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<meta name="go-import" content="golang.org/x/text git https://go.googlesource.com/text">
<meta name="go-source" content="golang.org/x/text https://github.com/golang/text/ https://github.com/golang/text/tree/master{/dir} https://github.com/golang/text/blob/master{/dir}/{file}#L{line}">
<meta http-equiv="refresh" content="0; url=https://pkg.go.dev/golang.org/x/text">
</head>
<body>
<a href="https://pkg.go.dev/golang.org/x/text">Redirecting to documentation...</a>
</body>
</html>

The Origin only lists the go-import meta tag content:

$ go list -m -f '{{.Origin.URL}}' golang.org/x/text@latest
https://go.googlesource.com/text

[Dependabot's] current solution returns the go-source meta tag content, so we'd lose the PR content for these common dependencies and any other set up in this way.
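
For readers who want to consume this from a program rather than a template, here is a minimal sketch that reads Origin.URL from the JSON form of the same command (go list -m -json). It assumes the Origin object carries VCS and URL fields, as the template above implies; the struct names and the sample input are illustrative, and the version in the sample is a placeholder rather than a real release.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// module mirrors only the fields of `go list -m -json` output used here.
type module struct {
	Path    string
	Version string
	Origin  *origin
}

// origin mirrors only the Origin fields used here (assumed: VCS and URL).
type origin struct {
	VCS string
	URL string
}

// originURL extracts Origin.URL from one JSON object as printed by
// `go list -m -json module@version`.
func originURL(data []byte) (string, error) {
	var m module
	if err := json.Unmarshal(data, &m); err != nil {
		return "", err
	}
	if m.Origin == nil {
		return "", fmt.Errorf("module %s has no Origin", m.Path)
	}
	return m.Origin.URL, nil
}

func main() {
	// Hand-written sample; "v0.0.0" is a placeholder version.
	sample := []byte(`{"Path":"golang.org/x/text","Version":"v0.0.0","Origin":{"VCS":"git","URL":"https://go.googlesource.com/text"}}`)
	u, err := originURL(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(u)
}
```

As in the shell example, this only ever yields the go-import repository URL; nothing in the Origin object reflects the go-source tag.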

After some digging, I'm unclear on the current go core team level of official support for the go-source metadata tag. The primary docs for the format are on an archived repo, which doesn't inspire confidence: https://github.com/golang/gddo/wiki/Source-Code-Links

However, the proposal for an updated tag format was recently rejected: #39559

If the go-source tag remains a supported tag going forward, can go list be enhanced to expose this metadata URL as well?

I'm agnostic on the details of the solution. You folks know the go list API much better than I do; as long as there's a way to extract it, I'd be happy.


More notes:

  • One potential solution is to add an Origin.SourceURL field.
  • To avoid confusion, the existing Origin.URL could be renamed to Origin.ImportURL for consistency with the existing meta tag names... unless this would be a backwards breaking change? Or is that API not considered fully stable yet?
  • Alternatively, the go-source meta tag could be officially deprecated? However, it still seems like there's a valid use case for "download URL" vs "URL where the source code is developed, so users can see the history of PRs, commits, etc"...
  • Another edge case: What is the migration path when a library wishes to change to a different go-source URL for newer versions of the library? From a Dependabot perspective, pragmatically we're fine with only showing whatever the latest value of the URL is, since at the time of the PR it'll generally be the latest package version available. But from a language design standpoint this corner case may need to be considered. 🤷‍♂️
@ianlancetaylor
Contributor

CC @bcmills @matloob

@jeffwidman
Contributor Author

Anything I can do to move this forward?

I assume it needs a decision from the go core team... if accepted, I'm happy to provide a PR iff that'd be helpful.

@rsc
Contributor

rsc commented May 3, 2023

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc
Contributor

rsc commented May 10, 2023

Talked to @bcmills and @matloob. We believe this is beyond the scope of the go command. The go-source line was defined by godoc.org before modules existed, and it provided a template to link to a specific line of a specific file in a go-get-able package. Later, the Go project adopted godoc.org and its code base but then we archived it once its replacement, pkg.go.dev, was ready. Nothing we have initiated or maintain today uses the go-source tags. In particular, neither the Go toolchain nor pkg.go.dev uses them. It seems like a mistake to add new support now, for at least two reasons:

  1. The go command is not concerned with web pages about packages, which is what go-source points to. (The go command is absolutely concerned with where to find actual source files, which is now listed in the go mod download -json output, as noted earlier in the discussion.)

  2. The go-source lines in widespread use provide no way to get to a web page showing the content of a file in a specific version of a module. It's ironic that they show how to get to a specific line, but that line number is almost meaningless if you don't know which version you are going to get.
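
To make point 2 concrete, here is a small sketch that expands the file template from the golang.org/x/text go-source tag quoted in the top comment. The helper name, the package directory, the file name, and the line number are all illustrative assumptions; the placeholder semantics ({/dir} expands to "/" plus the directory, or to nothing for the root) follow the archived gddo wiki as I understand it.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// expandFileTemplate fills in the placeholders of a go-source file template.
// dir is the package directory relative to the tag's prefix ("" for the root).
func expandFileTemplate(tmpl, dir, file string, line int) string {
	slashDir := ""
	if dir != "" {
		slashDir = "/" + dir
	}
	return strings.NewReplacer(
		"{/dir}", slashDir,
		"{dir}", dir,
		"{file}", file,
		"{line}", strconv.Itoa(line),
	).Replace(tmpl)
}

func main() {
	// File template copied from the go-source tag in the top comment.
	tmpl := "https://github.com/golang/text/blob/master{/dir}/{file}#L{line}"
	// Directory, file, and line are made-up inputs for illustration.
	fmt.Println(expandFileTemplate(tmpl, "language", "parse.go", 10))
}
```

The template has no placeholder for a module version, so every expansion is pinned to whatever the template hard-codes (here, the master branch).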

Maybe it would make sense to define a replacement for go-source, but it is not a high priority, since not all module hosts may even have a web page view of every file in a module, and there is honestly limited value in being able to get to a web page about a given file.

For the use case in the top comment, I don't understand how having a web page view for a given file in a module is going to help you derive "release-related metadata, such as the release notes, changelog, commit log, etc." The only way I can see that helping is if you recognize the go-source URL as being one of a set of known URLs, such as "this is GitHub" and then change the URL to find adjacent material. But if so, it seems like using the VCS Origin URL would work just as well.

@hyangah
Contributor

hyangah commented May 11, 2023

I agree that go-source is out of scope for cmd/go (and tricky to be correct in the module world). Just minor correction - pkgsite uses this as one of many hacks to guess the source locations. I think the code for querying/parsing the metatag is straightforward enough to copy and use.

@rsc
Contributor

rsc commented May 17, 2023

Based on the discussion above, this proposal seems like a likely decline.
— rsc for the proposal review group

@jeffwidman
Contributor Author

Thanks for the thoughtful response. I haven't had time to look at this in depth, but I plan to do so in the next few days. Do you mind leaving the issue open through the end of this week?

@rsc
Contributor

rsc commented May 20, 2023

In the absence of new information the issue will be marked Declined on Wed 5/24.

@rsc
Contributor

rsc commented May 24, 2023

No change in consensus, so declined.
— rsc for the proposal review group

@rsc closed this as completed on May 24, 2023
@rsc changed the title from "proposal: cmd/go: allow go list to return the content of the go-source meta tag" to "proposal: cmd/go: go list: report go-source tag information" on May 24, 2023
@jeffwidman
Contributor Author

jeffwidman commented Apr 12, 2024

Following up on this since it's still an issue we're facing in Dependabot:

For the use case in the top comment, I don't understand how having a web page view for a given file in a module is going to help you derive "release-related metadata, such as the release notes, changelog, commit log, etc." The only way I can see that helping is if you recognize the go-source URL as being one of a set of known URLs, such as "this is GitHub" and then change the URL to find adjacent material.

Yes, there are only a handful of popular source code hosts, so once we know a repo URL we have some generic code that extracts that metadata and adds it to the Dependabot PR.

But if so, it seems like using the VCS Origin URL would work just as well.

Unfortunately not. It's relatively common for the VCS Origin URL to be a vanity domain that's useful for go get but not where development actually takes place. See the curl results from this example.

Today, if Dependabot opens a PR to bump golang.org/x/text, we can include the release notes, commit log, etc. because the source tag points us at https://github.com/golang/text/, which is a well-known source code host. I realize it's just a mirror, but regardless, adding that info to Dependabot PRs provides value for code reviewers evaluating whether to upgrade golang.org/x/text.

If we only use the VCS Origin URL, we get pointed at https://go.googlesource.com/text, so we can't grab the release metadata unless we treat the custom domain go.googlesource.com as a well-known source host.
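
A minimal sketch of the kind of check described here, to show why the two URLs behave differently. The allowlist and function name are hypothetical and are not Dependabot's actual code or host list.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// knownHosts is a hypothetical allowlist of hosts whose release pages,
// changelogs, and commit logs a tool knows how to locate.
var knownHosts = map[string]bool{
	"github.com":    true,
	"gitlab.com":    true,
	"bitbucket.org": true,
}

// isKnownHost reports whether rawURL points at a host in knownHosts.
func isKnownHost(rawURL string) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	return knownHosts[strings.ToLower(u.Hostname())]
}

func main() {
	for _, u := range []string{
		"https://github.com/golang/text/",  // from the go-source tag
		"https://go.googlesource.com/text", // from Origin.URL
	} {
		fmt.Printf("%s known=%t\n", u, isKnownHost(u))
	}
}
```

With a list like this, the go-source URL matches and the Origin URL does not, unless go.googlesource.com is added as its own special case.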

We don't intend to dictate to library authors where to host their code--it doesn't have to be GitHub. But even in this example, that's a custom domain for the import... https://go.googlesource.com/text url isn't the actual source repo. According to https://pkg.go.dev/golang.org/x/text the source repo is actually https://cs.opensource.google/go/x/text, but there's no way for the library author to communicate that solely using the VCS Origin URL.

As long as users have the ability to set custom/vanity domains for go get / go.mod imports, then it's also useful to offer a side-channel to say "if you're interested in the release notes and commit log, this other URL is the actual development repo."

Your rationale for why the go-source tag may be the wrong solution makes sense to me. For this use case, the side channel doesn't need to point at specific lines or specific files. I suppose technically it'd be nice to tie it to specific versions if someone changed the development URL, but pragmatically that's very rare so I'm not sure I'd worry about it.

Again, all we really need here is a way for library authors who use a vanity import URL to say "here's the URL of the actual development repo".

@jeffwidman
Contributor Author

Sorry this took so long to circle back on... I moved off the Dependabot team for a bit and only recently rejoined them, plus I'm now on paternity leave, so I have time to circle back on some things I personally care about.

Given that this has already been closed, and that the discussion is moving toward a generic side channel solution which may not be the go-source tag, would it be more helpful if I opened a new proposal instead?

@seankhliao
Member

We don't intend to dictate to library authors where to host their code--it doesn't have to be GitHub. But even in this example, that's a custom domain for the import... https://go.googlesource.com/text url isn't the actual source repo. According to https://pkg.go.dev/golang.org/x/text the source repo is actually https://cs.opensource.google/go/x/text, but there's no way for the library author to communicate that solely using the VCS Origin URL.

https://go.googlesource.com/text is the upstream repo.
https://cs.opensource.google/go/x/text is a different mirror that has better search functionality.

@ianlancetaylor
Contributor

@jeffwidman Yes, if you want to move this forward, a new proposal with a new approach would be the way to go. I think it's clear that we aren't going to change cmd/go for this. It might be more appropriate on pkg.go.dev.

Projects
Status: Declined

6 participants