proposal: x/pkgsite: new source meta tag #39559

Closed
jba opened this issue Jun 12, 2020 · 12 comments

Comments

@jba
Contributor

jba commented Jun 12, 2020

(Edited: now a proposal.)

This issue describes a new HTML meta tag for referring to Go source files in online documentation. It is an official Go proposal, though it doesn't affect the Go language or tools.


For many years, the go-source meta tag has allowed godoc.org and other source-browsing tools to provide links to Go source for import paths that use the go-import meta tag. With the advent of modules, the go-source meta tag in its current form cannot be used, because it does not support versions. While we could just extend go-source to add a parameter to the templates, we could also take this opportunity to improve it in other ways.

We propose a new tag, go-source-v2, with the following properties:

  • Module versions are supported.
  • Source files are described relative to modules rather than packages.
  • Additional information can be provided, so that module-browsing tools like pkg.go.dev can display repo information and render README and similar files.
  • The anomalies listed at the end of the current spec are resolved.

Structure

For certain common code-hosting sites, like GitHub and Bitbucket, no go-source-v2 tag is necessary. See "Implicit Source Information" below for details.

For a module with path M, the tag should appear in the <head> of the page served by GETting https://M?go-get=1. The tag should look like

<meta name="go-source-v2" content="home directory file line-suffix raw">

where:

  • home is the URL of the repo root. If _, then the repo root (third component) of the go-import tag on the same page is used; if there is no go-import tag or the tag’s second component is mod, then no repo is specified. This does not preclude serving source files but it does prevent tools from linking to the repo or providing repo-based signals, like number of stars and forks.
  • directory is a URL template for linking to a directory of files. It supports two parameters:
    • {revision} is replaced by an identifier for the (approximate) VCS revision. See "Revision Parameters" below for more.
    • {dir} is replaced by the directory relative to the module (not repo) root.
  • file is a URL template for linking to an entire file. In addition to {revision} and {dir}, it also supports {file}, the basename of the file.
  • line-suffix will be appended to file to obtain a URL for a file at a particular line. It supports only the parameter {line}, the 1-based integer line number.
  • raw is a URL template for linking to the raw contents of a file. It supports the {revision}, {dir} and {file} parameters as defined above. While file should display a file for people (with line numbers and syntax highlighting, perhaps), raw should serve the raw bytes of the file. It can be used to rewrite links in README files and the like.

After a tool replaces a template’s parameters, it should remove doubled and trailing slashes. This should make go-source’s {/dir} parameter unnecessary. In theory, a site could serve a path differently depending on whether it had a trailing slash, but we are unaware of any code-hosting site that makes this distinction.
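The substitution and slash cleanup described above could be sketched roughly as follows. This is a minimal illustration, not part of the proposal: the function name `expand` is hypothetical, and it assumes that the `//` following the URL scheme must be preserved while every other doubled slash is collapsed.

```go
package main

import (
	"fmt"
	"strings"
)

// expand substitutes {name} parameters in a URL template, then removes
// doubled and trailing slashes. The "//" after the scheme is kept.
// (Hypothetical helper; the proposal does not name such a function.)
func expand(tmpl string, params map[string]string) string {
	pairs := make([]string, 0, 2*len(params))
	for name, value := range params {
		pairs = append(pairs, "{"+name+"}", value)
	}
	s := strings.NewReplacer(pairs...).Replace(tmpl)

	// Split off the scheme so its "//" is not collapsed.
	scheme, rest := "", s
	if i := strings.Index(s, "://"); i >= 0 {
		scheme, rest = s[:i+3], s[i+3:]
	}
	for strings.Contains(rest, "//") {
		rest = strings.ReplaceAll(rest, "//", "/")
	}
	return scheme + strings.TrimRight(rest, "/")
}

func main() {
	dirTmpl := "https://github.com/go-yaml/yaml/tree/{revision}/{dir}"
	// An empty {dir} (the module root) leaves a trailing slash, which is removed.
	fmt.Println(expand(dirTmpl, map[string]string{"revision": "v2.3.0", "dir": ""}))
}
```

With an empty `{dir}`, the directory template yields a URL with no trailing slash, which is why a separate `{/dir}` parameter is not needed.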

Any component of the tag’s contents can be omitted by using an underscore.

Here’s an example of the go-import and go-source-v2 meta tags for the gopkg.in/yaml.v2 module:

<meta name="go-import" content="gopkg.in/yaml.v2 git https://gopkg.in/yaml.v2">

<meta name="go-source-v2" content="
    https://github.com/go-yaml/yaml
    https://github.com/go-yaml/yaml/tree/{revision}/{dir}
    https://github.com/go-yaml/yaml/blob/{revision}/{dir}/{file}
    #L{line}
    https://github.com/go-yaml/yaml/raw/{revision}/{dir}/{file}
">
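A tool reading this tag might split the content attribute into its five whitespace-separated components along these lines. This is only a sketch: the `sourceMeta` type and `parseSourceV2` function are hypothetical names, the example.com URLs are placeholders, and an omitted (`_`) component is simply recorded as an empty string, leaving any fallback to the go-import tag to the caller.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// sourceMeta holds the five components of a go-source-v2 tag's content.
// An empty field means the component was omitted with "_".
// (Hypothetical type, for illustration only.)
type sourceMeta struct {
	Home, Directory, File, LineSuffix, Raw string
}

// parseSourceV2 splits the content attribute on whitespace, so the
// multi-line form of the tag parses the same way as the one-line form.
func parseSourceV2(content string) (sourceMeta, error) {
	f := strings.Fields(content)
	if len(f) != 5 {
		return sourceMeta{}, errors.New("go-source-v2: want 5 components")
	}
	for i := range f {
		if f[i] == "_" {
			f[i] = ""
		}
	}
	return sourceMeta{Home: f[0], Directory: f[1], File: f[2], LineSuffix: f[3], Raw: f[4]}, nil
}

func main() {
	m, err := parseSourceV2(`
	    _
	    https://example.com/repo/tree/{revision}/{dir}
	    https://example.com/repo/blob/{revision}/{dir}/{file}
	    #L{line}
	    _
	`)
	fmt.Printf("%+v %v\n", m, err)
}
```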

Revision Parameters

Tools should derive {revision} from the module version as follows:

  • For pseudo-versions, use the commit hash (the part after the final hyphen).
  • For semantic versions, use the version after removing any +incompatible suffix.
  • Use other version specifiers (like master) as is.
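The rules above might be implemented roughly like this. It is a sketch under assumptions: `revisionFor` is a hypothetical name, and the regular expression only approximates the shape of a pseudo-version (a 14-digit timestamp followed by a hash); a real tool would more likely rely on a module-aware library than on this pattern.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// pseudoVersionRE approximates the shape of a Go pseudo-version, for example
// v0.0.0-20200612123456-abcdef123456 or v1.2.4-0.20200612123456-abcdef123456.
// It is an assumption of this sketch, not a definition from the proposal.
var pseudoVersionRE = regexp.MustCompile(`^v\d+\.(0\.0-|\d+\.\d+-([^+]*\.)?0\.)\d{14}-[A-Za-z0-9]+$`)

// revisionFor derives the {revision} parameter from a module version.
func revisionFor(version string) string {
	version = strings.TrimSuffix(version, "+incompatible")
	if pseudoVersionRE.MatchString(version) {
		// The commit hash is the part after the final hyphen.
		return version[strings.LastIndex(version, "-")+1:]
	}
	// Semantic versions and other specifiers (like master) are used as is.
	return version
}

func main() {
	for _, v := range []string{
		"v0.0.0-20200612123456-abcdef123456",
		"v2.0.0+incompatible",
		"master",
	} {
		fmt.Println(v, "->", revisionFor(v))
	}
}
```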

For a nested module, {revision} is not actually the tag name. A nested module N at version v1.2.3 has tag N/v1.2.3, but {revision} will be v1.2.3. The templates must account for this. For instance, if example.com served directories using GitHub-style URLs, and example.com/mod/nest were a nested module under example.com/mod, then its directory template might be https://example.com/mod/tree/nest/{revision}/nest/{dir}. The first occurrence of nest is part of the tag that identifies the version of example.com/mod/nest.

Implicit Source Information

If the https://M?go-get=1 page for module M has a go-import meta tag that refers to a repo whose domain matches one of the following glob patterns, then no go-source-v2 tag is needed:

  • github.com
  • bitbucket.org
  • *.googlesource.com
  • gitlab.com
  • gitlab.* (if the site behaves like gitlab.com)

The templates for these sites are well-known, and are provided below.

There is one problem: for a major version greater than 1, the templates for the “major branch” and “major subdirectory” conventions differ. (See https://research.swtch.com/vgo-module for a discussion of these conventions.) To determine the right template, make a HEAD request for the go.mod file using each template, and select the one that succeeds. For example, for module github.com/a/b/v2 at version v2.3.4, probe both github.com/a/b/blob/v2.3.4/go.mod (the location of the go.mod file using the “major branch” convention) and github.com/a/b/blob/v2.3.4/v2/go.mod (its location using “major subdirectory”).
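The probe could look something like the sketch below. The names `headOK` and `majorConvention` are hypothetical. The text above does not say which candidate to try first; this sketch tries the subdirectory first, on the assumption that a repo using that convention may also have a go.mod at its root. The existence check is passed in as a function so the example can run without network access.

```go
package main

import (
	"fmt"
	"net/http"
)

// headOK reports whether a HEAD request for url returns 200 OK.
// Pass it as the exists argument of majorConvention for real probing.
func headOK(url string) bool {
	resp, err := http.Head(url)
	if err != nil {
		return false
	}
	resp.Body.Close()
	return resp.StatusCode == http.StatusOK
}

// majorConvention probes for go.mod under both conventions using
// GitHub-style blob URLs. It returns "" if neither probe succeeds.
func majorConvention(repoURL, revision, major string, exists func(string) bool) string {
	if exists(repoURL + "/blob/" + revision + "/" + major + "/go.mod") {
		return "major subdirectory"
	}
	if exists(repoURL + "/blob/" + revision + "/go.mod") {
		return "major branch"
	}
	return ""
}

func main() {
	// Offline stand-in for headOK: only the root go.mod "exists".
	fake := map[string]bool{"https://github.com/a/b/blob/v2.3.4/go.mod": true}
	exists := func(u string) bool { return fake[u] }
	fmt.Println(majorConvention("https://github.com/a/b", "v2.3.4", "v2", exists)) // prints "major branch"
}
```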

Standard Patterns

In these patterns, REPO is the repo URL and MS is the suffix of the module path without the repo prefix. These can be determined from the go-import tag and the path of the go-get=1 URL.

github.com:

  • directory: REPO/tree/{revision}/MS/{dir}
  • file: REPO/blob/{revision}/MS/{dir}/{file}
  • line suffix: #L{line}
  • raw: REPO/raw/{revision}/MS/{dir}/{file}

gitlab.com, gitlab.*:

  • directory: REPO/tree/{revision}/MS/{dir}
  • file: REPO/blob/{revision}/MS/{dir}/{file}
  • line suffix: #L{line}
  • raw: REPO/raw/{revision}/MS/{dir}/{file}

bitbucket.org:

  • directory: REPO/src/{revision}/MS/{dir}
  • file: REPO/src/{revision}/MS/{dir}/{file}
  • line suffix: #lines-{line}
  • raw: REPO/raw/{revision}/MS/{dir}/{file}

*.googlesource.com:

  • directory: REPO/+/{revision}/{dir}
  • file: REPO/+/{revision}/{dir}/{file}
  • line suffix: #{line}
  • raw: not supported
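As a rough illustration of how a tool might apply these patterns, the sketch below matches the repo's host against a glob and fills in REPO and MS. It covers only two of the hosts for brevity. The names `templates`, `knownHosts` and `implicitTemplates` are hypothetical, and the sketch assumes the go-import tag's first component (the import prefix) and third component (the repo URL) have already been extracted.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// templates holds the four URL templates for one hosting site.
type templates struct {
	Directory, File, LineSuffix, Raw string
}

// knownHosts lists a subset of the standard patterns, keyed by host glob.
var knownHosts = []struct {
	glob string
	tmpl templates
}{
	{"github.com", templates{
		Directory:  "REPO/tree/{revision}/MS/{dir}",
		File:       "REPO/blob/{revision}/MS/{dir}/{file}",
		LineSuffix: "#L{line}",
		Raw:        "REPO/raw/{revision}/MS/{dir}/{file}",
	}},
	{"*.googlesource.com", templates{
		Directory:  "REPO/+/{revision}/{dir}",
		File:       "REPO/+/{revision}/{dir}/{file}",
		LineSuffix: "#{line}",
		// Raw is left empty: not supported.
	}},
}

// implicitTemplates returns the templates for a known host with REPO and MS
// filled in. MS is the module path with the import prefix removed.
func implicitTemplates(importPrefix, repoURL, modulePath string) (templates, bool) {
	u, err := url.Parse(repoURL)
	if err != nil {
		return templates{}, false
	}
	ms := strings.TrimPrefix(strings.TrimPrefix(modulePath, importPrefix), "/")
	if ms != "" {
		ms = "/" + ms
	}
	// Replacing "/MS" (with its slash) avoids a doubled slash when MS is empty.
	r := strings.NewReplacer("REPO", strings.TrimSuffix(repoURL, "/"), "/MS", ms)
	for _, k := range knownHosts {
		if ok, _ := path.Match(k.glob, u.Hostname()); ok {
			return templates{
				Directory:  r.Replace(k.tmpl.Directory),
				File:       r.Replace(k.tmpl.File),
				LineSuffix: k.tmpl.LineSuffix,
				Raw:        r.Replace(k.tmpl.Raw),
			}, true
		}
	}
	return templates{}, false
}

func main() {
	t, ok := implicitTemplates("github.com/a/b", "https://github.com/a/b", "github.com/a/b/sub")
	fmt.Println(t.Directory, ok)
}
```

The remaining `{revision}`, `{dir}`, `{file}` and `{line}` parameters are left in place for the usual template expansion.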

Sites that won’t work

Code-hosting sites running Gitea cannot be accommodated by the source linking scheme described above, or indeed by any scheme that has only the information available from the module zip. Gitea source URLs are different for branches, tags and commit hashes, and for the last only the full hash will work. Since revisions for tagged versions are always tags, the templates for a Gitea site can use the tag form of the source URL. But no template will work with the abbreviated hash at the end of a pseudo-version.

While a source-browsing tool could clone the repo and resolve the abbreviated hash locally, that work should be outside the scope of the tool. Instead, we suggest that a gitea.com contributor add URL routes that can work with partial hashes.

The same problem exists for code.dumpstack.io (which appears to be a rebranded Gitea).

Whatever software is used for https://blitiri.com.ar/ has the same issues, and one additional one: there doesn’t seem to be any URL for tags.

@gopherbot gopherbot added this to the Unreleased milestone Jun 12, 2020
@golang golang deleted a comment Jun 14, 2020
@julieqiu julieqiu changed the title go.dev: new source meta tag x/pkgsite: new source meta tag Jun 15, 2020
@mvdan
Member

mvdan commented Jun 16, 2020

@jba @julieqiu I wonder if we should file this as a proposal under cmd/go, since this would primarily affect go get, and we want input from people who work on modules.

@jba
Contributor Author

jba commented Jun 16, 2020

How would it affect go get?

@mvdan
Member

mvdan commented Jun 17, 2020

You're right that it wouldn't need to affect go get's code per se, since it primarily uses the go-import meta tag to locate the VCS, not go-source.

The base documentation for remote import paths, Go meta tags, and how to obtain them via ?go-get=1 also lives under cmd/go: https://golang.org/cmd/go/#hdr-Remote_import_paths

So I think my original wording wasn't right, but I still think this is very relevant to the folks who work on modules such as @bcmills or @jayconrod. What package to file this under would depend on where this would all be documented, I assume.

I also do think we should make this a proposal, since it seems like a pretty big decision to make without the process :)

@jayconrod
Contributor

Agreed this won't affect the go command.

The revision parameters are tricky, but this already addresses the problems that come to mind (nested modules and pseudo-versions), and it seems like this will work for known major providers.

@jba jba added the Proposal label Jun 17, 2020
@jba
Contributor Author

jba commented Jun 17, 2020

OK @mvdan, as per https://github.com/golang/proposal#readme I added the Proposal label and edited the initial comment to match.

@jba
Contributor Author

jba commented Jun 17, 2020

What package to file this under would depend on where this would all be documented, I assume.

The current convention is documented on the gddo repo's wiki. I don't know if the doc needs to live anywhere more prominent or central than that (with s/gddo/pkgsite/ of course).

@jba
Contributor Author

jba commented Jul 29, 2020

Closing because we're going a different way. See #40477.

@jba jba closed this as completed Jul 29, 2020
@jba jba reopened this Jul 29, 2020
@jba
Contributor Author

jba commented Aug 3, 2020

Copy of #40477 (comment):

We identified a few important problems with the go-source-v2 idea, most notably:

  • The only guarantee for a module is a zip file. There may not even be a publicly browseable source code repository available.
  • The extra meta tag is only accessible if you go back to the original redirect page. It's not proxyable like all the other module metadata, nor is it preserved if the origin goes away. (Part of why it's not proxyable is that it's only a godoc.org add-on, not something the Go toolchain has ever defined.)
  • The rules around {revision} are very specific to Git repos and bake in details about pseudo-versions that may change in the future.

This makes us pretty confident that go-source-v2 as proposed in the other issue is not the right long-term solution. It's a little bit more module-friendly in that it knows what a version is, but it's not module-friendly enough. More thought is clearly needed.

And even if we added that go-source-v2 support today, we'd need every go get redirector to be updated before any links would start working. That's a lot to ask for a design that we're not even sure is right, especially when a better design might not require any changes at all. The right answer might be to display the code directly from the zip files, or it might be to put some info in the zip file that helps find a source display, or it might be something else entirely. We don't know.

For now, instead of defining a new tag that will require widespread adoption but still not be completely right, it seems best to get the most common sites working by making changes to pkg.go.dev directly, and then revisit the topic when we've had more time to think about the right path forward.

@lafriks

lafriks commented Nov 25, 2020

Gitea will support partial commit hashes as of version 1.14.0, once it is released.

@gopherbot

Change https://golang.org/cl/274956 mentions this issue: internal/source: update gitea comment

gopherbot pushed a commit to golang/pkgsite that referenced this issue Dec 4, 2020
For golang/go#39559

Change-Id: Id9c2ae0bcac9299565695d79a66f4bf591e60364
Reviewed-on: https://go-review.googlesource.com/c/pkgsite/+/274956
Trust: Jonathan Amsterdam <jba@google.com>
Run-TryBot: Jonathan Amsterdam <jba@google.com>
TryBot-Result: kokoro <noreply+kokoro@google.com>
Reviewed-by: Julie Qiu <julie@golang.org>
@jba
Contributor Author

jba commented Jan 25, 2021

We changed the "known sites" approach to recognize specific URL schemas in the old version-free go-source tags and automatically adjust them to the versioned equivalents. That approach requires only O(number of hosting software packages) cases instead of O(number of hosting domains), and should scale better.

That change, along with our willingness to add custom patterns to the pkg.go.dev source as necessary to match new sites (see #40477), makes this proposal unnecessary. I retract it.

@rsc
Contributor

rsc commented Mar 29, 2023

This proposal has been declined as retracted.
— rsc for the proposal review group
