Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go: regression: UTF-8 package import path fails on master, succeeds on go1.15.6 #43035

Closed
maruel opened this issue Dec 6, 2020 · 9 comments
Labels
FrozenDueToAge GoCommand cmd/go NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@maruel
Copy link
Contributor

maruel commented Dec 6, 2020

What version of Go are you using (go version)?

$ go version
go version devel +3a65abfbda Tue Oct 13 20:13:25 2020 +0000 linux/amd64

Does this issue reproduce with the latest release?

Not on go1.15.6.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE="on"
GOARCH="amd64"
GOOS="linux"

What did you do?

Run unit tests for corner cases for import paths.

What did you expect to see?

Non-ascii import path works.

What did you see instead?

Non-ascii import path fails as if it didn't exist.

cmd/panic/main.go:40:2: malformed import path "github.com/maruel/panicparse/v2/cmd/panic/internal/ùtf8": invalid char 'ù'

Source: https://github.com/maruel/panicparse/blob/v2.0.1/cmd/panic/internal/%C3%B9tf8/%C3%B9tf8.go

A bisect points to https://go-review.googlesource.com/c/go/+/258298
Paging author @bcmills

Bisected with golang checked out at $HOME/src-oth/golang and github.com/maruel/panicparse checked out at $HOME/src/panicparse:

$HOME/src-oth/bisect.sh:

#!/bin/bash
set -eu
git bisect reset
git checkout $(git merge-base origin/master go1.15.6)
git bisect start
git bisect good
git checkout origin/master
git bisect bad
git bisect run $HOME/src-oth/test.sh

and $HOME/src-oth/test.sh:

#!/bin/sh
set -eu
cd $HOME/src-oth/golang/src
./make.bash
cd $HOME/src/panicparse
$HOME/src-oth/golang/bin/go test ./cmd/panic

I used panicparse at origin/master but I expect v2.0.1 to have the same outcome.

To be clear, it's a corner case. I'm not sure there's a lot of people that create non-ascii package path name but it'd be sad if it regressed. Should probably have a unit test for this.

@maruel
Copy link
Contributor Author

maruel commented Dec 6, 2020

This comes from golang/mod@5d307ac, where @rsc imported CheckImportPath() from cmd/go/internal/module.

git log -1 -- src/cmd/go/internal/module
git checkout 1fb7d5472e8f46faaa034fe6e16ca66a1e7c766f~1
git blame src/cmd/go/internal/module/module.go | grep 'func CheckImportPath'
203c16592bd (Russ Cox         2018-07-15 22:56:37 -0400 321) func CheckImportPath(path string) error {

which leads to https://go-review.googlesource.com/c/go/+/124378/

I'm fine if support for non-ascii is officially not supported, but I'd vote for documenting this as a regression in the release notes, since it used to work and does work on godoc as shown in issue #43036 .

@ianlancetaylor ianlancetaylor changed the title regression: UTF-8 package import path fails on master, succeeds on go1.15.6 cmd/go: regression: UTF-8 package import path fails on master, succeeds on go1.15.6 Dec 6, 2020
@ianlancetaylor ianlancetaylor added GoCommand cmd/go NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Dec 6, 2020
@ianlancetaylor ianlancetaylor added this to the Go1.16 milestone Dec 6, 2020
@ianlancetaylor
Copy link
Contributor

CC @bcmills @jayconrod

@bcmills
Copy link
Contributor

bcmills commented Dec 7, 2020

See previously #29101, particularly #29101 (comment).

@bcmills
Copy link
Contributor

bcmills commented Dec 7, 2020

FWIW, I also see an error here when using go1.15.6. It might not be reported by all commands, and might not be reported for packages in the main module, but this path definitely was not in a fully-working state as of 1.15.6.

example.com$ go1.15.6 get -d github.com/maruel/panicparse/v2/cmd/panic/internal/ùtf8
go: downloading github.com/maruel/panicparse/v2 v2.0.1
go: downloading github.com/maruel/panicparse v1.5.1
go get github.com/maruel/panicparse/v2/cmd/panic/internal/ùtf8: malformed module path "github.com/maruel/panicparse/v2/cmd/panic/internal/ùtf8": invalid char 'ù'

@bcmills
Copy link
Contributor

bcmills commented Dec 7, 2020

I agree that we should have a 1.16 release note for this, though. I've filed that as #43052.

@maruel
Copy link
Contributor Author

maruel commented Dec 7, 2020

Thanks, I'll remove the non-ascii path in this test and will make a release.
You can close this issue since issue #43052 supersedes this one since it's WAI.

Side note: I really wonder what I did wrong for panicparse v2.0.1 in go module mode to fetch v1.5.1 but that's a separate story. Will ask on the slack channel.

@bcmills
Copy link
Contributor

bcmills commented Dec 7, 2020

Side note: I really wonder what I did wrong for panicparse v2.0.1 in go module mode to fetch v1.5.1 but that's a separate story. Will ask on the slack channel.

That's just an idiosyncrasy of cmd/go itself. (It doesn't know whether /v2 is a subdirectory within the v1 module or a suffix for a v2 module, so it checks them both in parallel.)

@bcmills
Copy link
Contributor

bcmills commented Dec 7, 2020

Duplicate of #43052

@kostich
Copy link

kostich commented Mar 27, 2021

What are the users of packages which are using Cyrillic names (or other non-ASCII scripts) supposed to do now?

Using older compilers isn't a good long-term plan. Renaming all packages to use English names is also a lot of work. :(

@golang golang locked and limited conversation to collaborators Mar 27, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge GoCommand cmd/go NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

5 participants