Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go: mod download modpath@HEAD erroneously resolves to a pseudo-version when HEAD is subsequently tagged #37336

Open
perillo opened this issue Feb 20, 2020 · 14 comments
Labels
help wanted modules NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@perillo
Copy link
Contributor

perillo commented Feb 20, 2020

What version of Go are you using (go version)?

$ go version
go version go1.14rc1 linux/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE="on"
GOARCH="amd64"
GOBIN="/home/manlio/.local/bin"
GOCACHE="/home/manlio/.cache/go-build"
GOENV="/home/manlio/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/manlio/.local/lib/go:/home/manlio/src/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/home/manlio/sdk/go1.14rc1"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/home/manlio/sdk/go1.14rc1/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/manlio/src/go/src/mperillo.test/issue/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build788093506=/tmp/go-build -gno-record-gcc-switches"
GOROOT/bin/go version: go version go1.14rc1 linux/amd64
GOROOT/bin/go tool compile -V: compile version go1.14rc1
uname -sr: Linux 5.5.4-arch1-1
/usr/lib/libc.so.6: GNU C Library (GNU libc) stable release version 2.31.
gdb --version: GNU gdb (GDB) 9.1

What did you do?

First, I configured git with:

[url "file:///home/manlio/src/go/src/mperillo.test/"]
	insteadOf = https://mperillo.test/

in order to test with local module paths.

In a new git repository with only 1 commit:

$ go1.14rc1 clean -modcache
$ GOPRIVATE="*" go1.14rc1 mod download -json mperillo.test/issue.git@HEAD
{
	"Path": "mperillo.test/issue.git",
	"Version": "v0.0.0-20200220172602-e96c074fea4f",
}

Now I release a version, without a new commit:

$ git tag -a v0.1.0
$ GOPRIVATE="*" go1.14rc1 mod download -json mperillo.test/issue.git@HEAD
{
	"Path": "mperillo.test/issue.git",
	"Version": "v0.1.1-0.20200220172602-e96c074fea4f",
	"Error": "mperillo.test/issue.git@v0.1.1-0.20200220172602-e96c074fea4f: invalid pseudo-version: tag (v0.1.0) found on revision e96c074fea4f is already canonical, so should not be replaced with a pseudo-version derived from that tag",
}

What did you expect to see?

Not sure.

But it is not clear why it fails, and the error message seems a bit confusing. Why is go trying to replace the canonical v0.1.0 if this is the version that it is processing now?

P.S.

The actual module path, as defined in go.mod, is mperillo.test/issue and not mperillo.test/issue.git. Can this be considered a bug?

The .git extension was added to force go to don't try to get https://mperillo.test/issue?go-get=1.

@bcmills bcmills changed the title cmd/go: mod download modpath@HEAD with git has problems when HEAD is tagged later cmd/go: mod download modpath@HEAD erroneously resolves to a pseudo-version when HEAD is a tagged version Feb 20, 2020
@bcmills bcmills added modules NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Feb 20, 2020
@bcmills bcmills added this to the Backlog milestone Feb 20, 2020
@bcmills bcmills changed the title cmd/go: mod download modpath@HEAD erroneously resolves to a pseudo-version when HEAD is a tagged version cmd/go: mod download modpath@HEAD erroneously resolves to a pseudo-version when HEAD is subsequently tagged Feb 20, 2020
@bcmills
Copy link
Contributor

bcmills commented Feb 20, 2020

The originally-derived pseudo-version v0.0.0-20200220172602-e96c074fea4f should remain correct.

The newly-derived pseudo-version v0.1.1-0.20200220172602-e96c074fea4f really is an incorrect name for that commit. So we need to figure out why go mod download is resolving it to a pseudo-version instead of the tagged version.

@perillo
Copy link
Contributor Author

perillo commented Feb 20, 2020

I tested again with a clean module cache and git repository.

I noted that the pseudo-version v0.1.1-0.20200220182651-2e7323fa45eb is in the module cache, even when go mod download failed.

When I invoke GOPRIVATE="*" go1.14rc1 mod download -json mperillo.test/issue.git@HEAD again, it works fine, and all the versions are in the module cache.

When I use the version latest instead of HEAD it works fine.

@bcmills
Copy link
Contributor

bcmills commented Feb 20, 2020

My best guess (and, to be clear, it's just a guess) is that we may have another bug in either the git client binary itself or the go command's interaction with it.

...speaking of which, what git client version are you using?

@perillo
Copy link
Contributor Author

perillo commented Feb 20, 2020

git version 2.25.1

@perillo
Copy link
Contributor Author

perillo commented Feb 20, 2020

In src/cmd/go/internal/modfetch/codehost/git.go, I added some tracing statements:
https://gist.github.com/perillo/789078ef6af7611a34879c9eb301b891

When I invoke go mod download for the first time, the trace is

r.local: false
r.local: false
r.local: false

After releasing version v0.1.0:

r.local: false
r.local: false
r.local: false
found tag tag: v0.1.0,
added tags [v0.1.0]

And after calling go mod download again, without any changes:

r.Local: false
found tag tag: v0.1.0,
added tags [v0.1.0]
r.Local: false
found tag tag: v0.1.0,
added tags [v0.1.0]

@bcmills
Copy link
Contributor

bcmills commented Feb 20, 2020

Ok, I think I understand what's going on.

During the first fetch, the tag doesn't exist, so the statLocal call (correctly) returns an empty tag set.

During the second fetch, statLocal thinks that it has enough information about the commit to resolve the query from the cache. And that's (maybe) arguably correct, but then we have a problem. When we don't find an exact tag on the commit, we call RecentTag to find a pseudo-version base, and RecentTag fetches more history (including the actual tag).

After it fetches that history, RecentTag (correctly) identifies the tag on the commit itself as the most recent one and returns that as the pseudo-version base. Then we go to verify that pseudo-version and notice that it's based on a tag on the commit itself, which is not allowed.

@bcmills
Copy link
Contributor

bcmills commented Feb 20, 2020

I'm not sure what the right fix here is. We certainly need to fetch tags more aggressively, but we might not want to do that for every repo that we Stat, especially if the version we're requesting is already canonical.

@bcmills
Copy link
Contributor

bcmills commented Feb 20, 2020

#27171 is somewhat related, in that we should fetch more data about tags even if we can theoretically produce a valid answer using only local information.

@bcmills
Copy link
Contributor

bcmills commented Feb 20, 2020

CC @jayconrod @matloob

@perillo
Copy link
Contributor Author

perillo commented Feb 20, 2020

Is there a reason why go does not fetch all the tags and heads as soon as possible?
It should, unless the version we're requesting is already canonical, as you wrote.

@bcmills
Copy link
Contributor

bcmills commented Feb 21, 2020

In the past we have tried to make the go command fetch as little of the upstream repositories as possible, so that we don't consume extra disk space on the user's machine without some corresponding benefit.

However, that has exposed a lot of git client bugs, and a lot of subtleties in git client usage that have led to bugs in the go command itself. Especially given the default GOPROXY in Go 1.13, we can probably afford to be a little less parsimonious now in order to have a simpler (and, hopefully, more correct) implementation.

@perillo
Copy link
Contributor Author

perillo commented Feb 21, 2020

In gopath mode, the go command clones the entire repository. In module mode it only fetches the heads and the tags, unless the user explicitly require a commit ID. I don't think the extra disk space is an issue.

@perillo
Copy link
Contributor Author

perillo commented Feb 21, 2020

However I remember a message from Russ where he mentioned that, if possible, he wanted to avoid receiving code from a remote repository due to possible security bugs with the git client.

@bcmills
Copy link
Contributor

bcmills commented Feb 25, 2020

At one point we did only fetch the heads and tags. However, the current version of the code fetches the refs and tags along with all ancestors of those commits:

// fetchRefsLocked fetches all heads and tags from the origin, along with the
// ancestors of those commits.
//
// We only fetch heads and tags, not arbitrary other commits: we don't want to
// pull in off-branch commits (such as rejected GitHub pull requests) that the
// server may be willing to provide. (See the comments within the stat method
// for more detail.)
//
// fetchRefsLocked requires that r.mu remain locked for the duration of the call.
func (r *gitRepo) fetchRefsLocked() error {
if r.fetchLevel < fetchAll {
// NOTE: To work around a bug affecting Git clients up to at least 2.23.0
// (2019-08-16), we must first expand the set of local refs, and only then
// unshallow the repository as a separate fetch operation. (See
// golang.org/issue/34266 and
// https://github.com/git/git/blob/4c86140027f4a0d2caaa3ab4bd8bfc5ce3c11c8a/transport.c#L1303-L1309.)
if _, err := Run(r.dir, "git", "fetch", "-f", r.remote, "refs/heads/*:refs/heads/*", "refs/tags/*:refs/tags/*"); err != nil {
return err
}
if _, err := os.Stat(filepath.Join(r.dir, "shallow")); err == nil {
if _, err := Run(r.dir, "git", "fetch", "--unshallow", "-f", r.remote); err != nil {
return err
}
}
r.fetchLevel = fetchAll
}
return nil
}

That's potentially a lot of data, but given the default GOPROXY usage it's probably fine for non-canonical versions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted modules NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

2 participants