Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/vgo: consider alternative syntax for major version #24119

Closed
rogpeppe opened this issue Feb 25, 2018 · 24 comments
Closed

x/vgo: consider alternative syntax for major version #24119

rogpeppe opened this issue Feb 25, 2018 · 24 comments
Milestone

Comments

@rogpeppe
Copy link
Contributor

The current syntax for major versions is arguably awkward because extracting the version number depends on the package domain, and there's potential for clashes with actual non-version names ("v8" for example).

Perhaps a syntax where the major version is obviously distinct might work better. The "@" character is used in multiple places in vgo to associate a version with a path, so perhaps it could be used in import paths too.

For example:

"github.com/foo/bar@v3"

instead of:

"github.com/foo/v3/bar"

This has an additional advantage that imports of different major versions of the same package will sort next to one another in the import list.

@gopherbot gopherbot added this to the vgo milestone Feb 25, 2018
@jimmyfrasche
Copy link
Member

This would make the presentation different for synthetic versions from tags and when actually using literal vN directories but that's arguably a good thing.

@fhs
Copy link
Contributor

fhs commented Feb 25, 2018

It might be worth keeping import paths backwards compatible with current go tools. Something like this:

"github.com/foo/bar"  // vgo: v3

A lot of go get commands will start to fail for people using the old go command, once people start migrating to vgo. This can at least be a temporary or additional syntax until the old go command is deprecated (after few years).

@bigwhite
Copy link

vgo design seems still under go1 compatible spec.

the import syntax:

"github.com/foo/v3/bar"

is go1 compatible and it also indicates the package path even if the search path is under $GOPATH/v/cache, not under the $GOPATH/github.com/foo.

@fhs
Copy link
Contributor

fhs commented Feb 26, 2018

The go tools are not covered by the go1compat promise, but that doesn't mean we shouldn't try to avoid breaking stuff if we can. To clarify, here is an example package that works with vgo but not with go tool:

$ go get github.com/fhs/go-get-issue-24119/hello
package github.com/fhs/go-get-issue-24119/v2/msg: cannot find package "github.com/fhs/go-get-issue-24119/v2/msg" in any of:
	/home/big/go/src/github.com/fhs/go-get-issue-24119/v2/msg (from $GOROOT)
	/home/fhs/mygo/src/github.com/fhs/go-get-issue-24119/v2/msg (from $GOPATH)

$ vgo build
vgo: resolving import "github.com/fhs/go-get-issue-24119/v2/hello"
vgo: finding github.com/fhs/go-get-issue-24119/v2 (latest)
vgo: adding github.com/fhs/go-get-issue-24119/v2 v2.0.0

@zeebo
Copy link
Contributor

zeebo commented Feb 26, 2018

The comment syntax makes importing the different major versions of the same import problematic. I don’t see how a non-vgo aware go toolchain would be able to successfully build or analyze it, for example.

@4ad
Copy link
Member

4ad commented Feb 27, 2018

Throwing options around:

import "example.com/foo/v2/bar/baz"
import "example.com/foo@v2/bar/baz"
import "example.com/foo/bar/baz/v2"
import "example.com/foo/bar/baz@v2"
import "example.com/foo/bar/baz @v2"
import "example.com/foo/bar/baz v2"
import "/v2/example.com/foo/bar/baz"
import "v2/example.com/foo/bar/baz"
import "v2:example.com/foo/bar/baz"
import "v2 example.com/foo/bar/baz"

@cznic
Copy link
Contributor

cznic commented Feb 27, 2018

I'd prefer keeping the format of the import path backwards compatible with existing tools, at least wrt lexical grammar conventions. That would reduce the list to

import "example.com/foo/v2/bar/baz"
import "example.com/foo/bar/baz/v2"

Yes, it's basically what vgo came with.

@4ad
Copy link
Member

4ad commented Feb 27, 2018

Relevant as ever: The Hideous Name by @robpike and @pjweinb. The section about Cedar seems directly applicable.

@nicpottier
Copy link

nicpottier commented Feb 27, 2018

While originally liking the idea of:

import "github.com/user/package@v2"

I think it becomes super messy and weird once you are referring to subpackages as most seem to want to: (using version as strictly a suffix to the import)

import "github.com/user/package/subpackage@v2"

This also applies to having a directory suffix, ie github.com/user/package/subpackage/v2, very weird.

I don't mind this form as much:

import "github.com/user/package@v2/subpackage

But it does fail in the way that it isn't backwards compatible.

So ya, I think Russ's original proposal is the right one, it satisfies allowing repos to provide backwards compatible directory structures and puts the versioning in the right part of the path to make subpackage imports intuitive and unambiguous.

@ianlancetaylor
Copy link
Contributor

A minor note, but import paths are permitted to include @ characters and they can be expected to map to file names as usual. If this proposal is updated we should use a character that implementations are permitted to forbid in import paths, as listed at https://golang.org/ref/spec#Import_declarations . I would suggest #.

That said, the arguments against doing this seem compelling.

@nicpottier
Copy link

nicpottier commented Feb 27, 2018

@ianlancetaylor you bring up a good point though I actually find it more compelling that @ is allowed in import paths currently, as that actually puts this format back on the table for me:

import "github.com/user/package@v2/subpackage

In that a repo could have an @v2 subdirectory to allow both go and vgo to work against it. (same as is proposed with /v2

I think the argument to use @v2 is simply that it is somewhat weird to have an import of:

import "github.com/user/package/v2"

And refer to things as package.Foo() since until now that would have been v2.Foo() which obviously doesn't work.

import "github.com/user/package@v2"

Feels slightly cleaner in that respect (and does have the benefit of more closely matching both internal structure of vgo's library caching and go.mod)

[edit: realize I'm off about @v2 being able to be backwards compatible since ya of course @v2 won't be interpreted as a directory as /v2 does.. so ya, that's the big downside with any approach that isn't / in that it becomes harder to support both forms without using branching]

@hosewiejacke
Copy link

Why not use tags for imports? E.g.

import "github.com/foo/bar" `v:"2.3.4"`

@FiloSottile
Copy link
Contributor

I agree not using something that looks like a filesystem path would reduce confusion, immediately suggesting there's a new concept to account for to an unfamiliar developer.

I strongly agree with @ianlancetaylor that using a previously forbidden character would be much better. Naive tools will fail instead of being vulnerable to takeovers by repositories actually called "foo@v2". I do like #.

@jimmyfrasche
Copy link
Member

I like import "path#vn".

It keeps separate information separate and makes it easier to see at a glance what the version is, even when using packages under the root.

Tools that assume it's a path will break but will be much easier to fix than if it's a fake directory somewhere in the middle.

It would also be easier for syntax highlighters to color the path and version separately, if desired.

@nicpottier
Copy link

nicpottier commented Feb 27, 2018

Is adding @ as a reserved character in import paths off the table? How many (are there any) repos use @ in their path? Sure would be nice to use @ consistently in all the places if we decide to sacrifice backwards compatibility by not using /

@jimmyfrasche
Copy link
Member

I was looking through cmd/go/go_test.go and found in TestBadCommandLines that the go command already disallows @ (and -) as the prefix to a directory or file name as part of the fix for #23672

@rsc
Copy link
Contributor

rsc commented Mar 30, 2018

Re import paths, none of the ones on godoc.org use @. This is in large part due to the fact that all the regexps matching well-known import paths (hosting services) disallow @, so it could only possibly be used on a custom domain. So @ is practically available even though it wasn't 100% reserved before. That said, I think we should stick with / for now. Will write a longer comment about that.

@rsc
Copy link
Contributor

rsc commented Apr 2, 2018

One of the main reasons for using /v2 to denote a major version was to create an overlap with “go get” to enable a smooth transition. It’s clear we need to do more than that, so I proposed this morning to issue a patch release for old “go get” to make a minimal change to process vgo-updated code, by stripping the v2 path element. See #24301 (comment).

If we’re going to modify old “go get” to strip the path element, then we do have more flexibility in what it should be, and as Roger points out, my/thing@v2 is the obvious alternative. I spent about a day or two convinced we should do it, but when I tried to write up a justification, I changed my mind.

The first argument for moving to @ is that it, with changes to go.mod, it could unify the import, module, and command-line syntaxes. (I’m going to assume here that “ “ in go.mod are gone, because I think we should do that, separate from this discussion.)

// go.mod
module my/client@v2
require my/thing@v2.3.4

// command line
go get my/thing@v2.3.4/cmd/foo

// x.go
import “my/thing@v2/sub/pkg”

These all use @ but they end up being not particularly uniform: the module path and import must say only @v2 (not @v2.3 or @v2.3.4), while the command line and require typically give a full @v2.3.4; while they can say @v2 or @v2.3 those are equivalent to (and in require will be rewritten to) @v2.0.0 and @v2.3.0.

There are two distinct concepts: (1) identifying the space of versions associated with a given major version major version, and (2) identifying one specific version. Before, (1) used the / syntax and (2) used a space in go.mod or an @ on the command line. The unification blurs that distinction, in what I think is a confusing way.

When you add in v1’s of a module, things get even weirder. Consider:

// go.mod
require (
	my/thing@v1.2.3
	my/thing@v2.3.4
)

The require pair really bothers me: it looks like you are requiring the same thing at two different versions of my/thing, not looking at two different modules my/thing and my/thing@v2. When you see that you’re likely to think it’s also OK to add another my/thing@v1.4.5. Nothing suggests that there’s only one per major version or that the major version is somehow special. This sounds like a small point, but it’s the critical contribution of semantic import versioning. It seems to me a significant mistake to switch to a syntax that blurs that distinction. Compare to:

require (
	my/thing v1.2.3
	my/thing/v2 v2.3.4
)

That stutters a little but makes very clear that my/thing and my/thing/v2 are being treated as distinct, and it doesn’t imply that you could add “my/thing v1.4.5” as a third requirement.

The command line does not fare much better. Today you say:

vgo install my/thing/cmd/bar
vgo get my/thing/cmd/bar@v1.2.3

vgo install my/thing/v2/cmd/foo
vgo get my/thing/v2/cmd/foo@v2.3.4

If we switch to @ for module versions then these would probably become:

vgo install my/thing/cmd/bar
vgo get my/thing@v1.2.3/cmd/bar

vgo install my/thing@v2/cmd/foo
vgo get my/thing@v2.3.4/cmd/foo

It bothers me that the get path is not a substring of the install path. Moving the @v2 to the end of every path would resolve that problem, but with version-at-the-end there’s not a clear distinction between the packages of v1 and the packages of v2 when you do a basic sort.

So the syntax unification, on closer examination, seems not to be a win.

The second argument for moving to @ is that it avoids ambiguity with existing path elements named v2. For example today github.com/gorilla/rpc/v2 indicates the v2 subdirectory of v1.x.x but it looks like instead it might indicate the root (or v2 subdirectory) of v2.x.x. If the syntax for major version two were github.com/gorilla/rpc@v2, then there would be no mistaking a module version for a plain directory, and the boundary in the import path where the module prefix ends would be easy to find, unambiguous.

But the module prefix boundary is still ambiguous for v1 or earlier, and that ambiguity may actually be helpful. Every good abstraction creates ambiguity, by omitting unimportant detail. In this case, the ambiguity about module prefix boundary allows evolving module organization. For example, the golang.org/x/text repo holds a collection of related packages for Unicode-aware text processing. Keeping them in a single repo is the right unit for development, code review, and so on, but the package APIs are at different maturity levels. Some are ready to commit to a a 1.0 API while others are more experimental. We have been discussing marking the overall repo as a single module at 1.0 but simultaneously carving out nested modules that can stay at 0.x while their APIs mature. For example maybe x/text would be at 1.0.0 but x/text/currency would be at 0.1.0. When eventually the currency API is stable enough, we’d want to promote it into the main x/text repo, say at 1.2.0. The ambiguity of not seeing the module boundary lets us move x/text/currency from its own module into a larger module transparently. It also allows splitting a subtree out of a module. Being able to move packages between modules easily seems important for long-term maintenance of a large project, just as being able to move types between packages is important.

So @ does not eliminate ambiguity about module boundaries, and eliminating that ambiguity might not be a win anyway.

The same concerns also suggest that we should, if possible, allow ambiguity about v2 module versions. For example we might have my/thing, my/thing/v2, and my/thing/v3 and then decide that we want to group those import paths together as one module “my” v1.0.0. Being able to have a plain directory named vN would allow that kind of refactoring. It would also avoid invalidating existing paths like “github.com/gorilla/rpc/v2” in older commits.

Obviously the ambiguous vN names are a little more difficult to reason about and implement, but the same can be said about “x + y” being an integer or float or complex or string addition, and yet we’re all comfortable with that abstraction/ambiguity.

I think we should probably continue to use / as the version separator while still allowing non-module-version /vN/ in paths too, and see how far we get. If we hit an impossible obstacle we can always convert over to @'s.

@jimmyfrasche
Copy link
Member

A uniform syntax for both allows switching between synthetic and real directories without interruption as an implementation detail.

If they had different syntax and you have to use real directories at v(N+1) but weren't at vN but you needed to use real directories so that v(N+1) and vN interoperate it's game over.

With @ or whatever syntax for both, then when you're using real directories the import path doesn't coincide with the file path, so that gets lost either way.

The @ syntax still feels nicer—if nothing else it makes it easier to spot at a glance. Overloading / is probably fine, though.

I'm not sure why it has to separate the part of the import path specifying the module from the part specifying the package. I'd be fine with import "some/module/apackage@v2". The @v2 is a modifier of some/module but it's a footnote—some/module/apackage seems more important to me. If I'm concerned about the module root I can look it up. If I care about the version it's easy to find because it's always at the end not buried somewhere in the middle.

@nicpottier
Copy link

All fair points Russ, as another perspective from someone less deeply ingrained in vgo, having just played with converting a few projects to it, I (in contrast to you) find this syntax:

require (
	my/thing v1.2.3
	my/thing/v2 v2.3.4
)

To be less intuitive than:

require (
	my/thing@v1.2.3
	my/thing@v2.3.4
)

The first requires knowing the rule that you leave off v1 for v1/v0 packages. The second only requires you to know that semantic versioning means you can have multiple major versions in a vgo project, which feels like a much more fundamental concept ingrained in vgo that you'll learn pretty quickly and never forget. The specific syntax will never be forgotten because it is always the same. Then again nobody is proposing using @v1 in import paths so I guess you have to learn the "leave off v1/v0" rule anyways.

Perhaps I'm just not smart enough but man, the overloading of paths in modules with v# for using submodules etc sounds crazy confusing and likely to cause head-scratching. If I'm understanding that correctly you are saying you might have a "main" module at v1, that decides to pull in what used to be a submodule at v3 into itself but no longer version that submodule differently? So your only dependency would be the main submodule at say v1.3.4 but it would have a v3 directory in it from what was previously a submodule?

@FiloSottile
Copy link
Contributor

@rsc, since we are going to treat vgo paths differently in 1.9 and 1.10 already per #25069, I think it would significantly reduce confusion to use a character that suggests something is different, and that developers already associate to versioning.

The only precedence in breaking the mapping to filesystem paths was vendor/, which was self-explanatory. / might be cleaner, but @ will take a lot less explaining. I think it's important to cause as little surprise as possible.

Also, not all modules are three paths elements deep, and @ clearly signals where the module cut is.

@as
Copy link
Contributor

as commented Apr 29, 2018

@4ad

Relevant as ever: The Hideous Name by @robpike and @pjweinb. The section about Cedar seems directly applicable.

I was about to reference that particular paper but it looks like you beat me to it. Why are vendor and internal real directories with synthetic treatment by the importer, but major versions have to be a special artifact worthy of its own delimiter?

The only precedence in breaking the mapping to filesystem paths was vendor/

internal

@FiloSottile
Copy link
Contributor

internal changes visibility, not the expected location on disk.

And my argument is that vendor is self-explanatory, while v\d+ is not. Also, major versions would appear in code while vendor doesn't.

@rsc
Copy link
Contributor

rsc commented May 25, 2018

Per #24119 (comment), leaving things alone. Closing.

@rsc rsc closed this as completed May 25, 2018
@golang golang locked and limited conversation to collaborators May 25, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests