Navigation Menu

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/url: Parse does not permit percent escapes in host #30844

Closed
zombiezen opened this issue Mar 14, 2019 · 8 comments
Closed

net/url: Parse does not permit percent escapes in host #30844

zombiezen opened this issue Mar 14, 2019 · 8 comments
Labels
FrozenDueToAge help wanted NeedsFix The path to resolution is known, but the work has not been done. Suggested Issues that may be good for new contributors looking for work to do.
Milestone

Comments

@zombiezen
Copy link
Contributor

RFC 3986 permits percent-encoded characters in the host part of a URL. *net/url.URL.String will percent-encode the net/URL.Host field, but confusingly, url.Parse does not accept URLs produced in this form.

Applications like PostgreSQL that use net/url.Parse to parse URLs that have this form produce an error, see lib/pq#796. The prior #16127 notes some mismatch between net/url.Parse and RFC 3986. This issue is around the specific issue of percent-encoded characters in the host, not any larger scope.

What version of Go are you using (go version)?

$ go version
go version go1.12 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE=[redacted]
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH=[redacted]
GOPROXY=""
GORACE=""
GOROOT=[redacted]
GOTMPDIR=""
GOTOOLDIR=[redacted]
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=[redacted]
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS=[redacted]

What did you do?

https://play.golang.org/p/l9oOmcNkYUd

What did you expect to see?

URL1: postgres://foo@%2Fvar%2Frun%2Fpostgres/mydb
Host: /var/run/postgres
URL2: postgres://foo@%2Fvar%2Frun%2Fpostgres/mydb
Host: /var/run/postgres

What did you see instead?

URL1: postgres://foo@%2Fvar%2Frun%2Fpostgres/mydb
Host: /var/run/postgres
Parse: parse postgres://foo@%2Fvar%2Frun%2Fpostgres/mydb: invalid URL escape "%2F"
@zombiezen zombiezen added the Suggested Issues that may be good for new contributors looking for work to do. label Mar 14, 2019
@alin04
Copy link

alin04 commented Apr 2, 2019

In addition, both "<" and ">" are allowed in host (specifically registered name) when they shouldn't be as per: https://tools.ietf.org/html/rfc3986#section-3.2.2

https://golang.org/src/net/url/url.go?s=2590:2650#L100

This should pct-encode the ">", but doesn't:
https://play.golang.org/p/aVV6I6q-mPV

@bcmills bcmills added help wanted NeedsFix The path to resolution is known, but the work has not been done. labels Apr 12, 2019
@bcmills bcmills added this to the Unplanned milestone Apr 12, 2019
@thoeni
Copy link
Contributor

thoeni commented Apr 13, 2019

@zombiezen I wrote a test with your expected result, and I got it passing by removing a check here:
https://github.com/golang/go/blob/master/src/net/url/url.go#L221-L223
Those lines appear to explicitly exclude ASCII (but for the %25) from escaping on parse

The note on the if branch mentions page 21 of that same RFC that you've linked: https://github.com/golang/go/blob/master/src/net/url/url.go#L215-L220

I think this is the relevant bit:

The reg-name syntax allows percent-encoded octets in order to
represent non-ASCII registered names in a uniform way that is
independent of the underlying name resolution technology. Non-ASCII
characters must first be encoded according to UTF-8 [STD63], and then
each octet of the corresponding UTF-8 sequence must be percent-
encoded to be represented as URI characters.

Also, I can see that the mention about RFC implementation by the package has been "relaxed" a while ago: https://go-review.googlesource.com/c/go/+/22859/
And here: #16127 (comment)

Not sure how to interpret the RFC, but happy to hear your thoughts.

Also worth noticing that some tests are actually checking the opposite behaviour:
https://github.com/golang/go/blob/master/src/net/url/url_test.go#L1432

@zombiezen
Copy link
Contributor Author

@bradfitz, any thoughts on this issue? I remember there being some subtleties about what sort of changes we want to permit around net/url.

@bradfitz
Copy link
Contributor

Given that we String-ify such Hosts, I'm not opposed to trying to Parse them too. We'll see what breaks. But please send a change soon, as the freeze is coming, and net/url changes are notorious for breaking things & getting reverted.

@thoeni
Copy link
Contributor

thoeni commented Apr 15, 2019

What's not entirely clear to me is:
The RFC says:

The reg-name syntax allows percent-encoded octets in order to
represent non-ASCII registered names in a uniform way that is
independent of the underlying name resolution technology.

The example of / is an "ASCII" character after all. 🤔

I've made a change, and I'll open a CL, but there were tests explicitly asserting the opposite of what this issue says, see:

Also the current comment on the if statement mentions:

// Per https://tools.ietf.org/html/rfc3986#page-21
// in the host component %-encoding can only be used
// for non-ASCII bytes.
// But https://tools.ietf.org/html/rfc6874#section-2
// introduces %25 being allowed to escape a percent sign
// in IPv6 scoped-address literals. Yay.

Have a look at the changes.

@gopherbot
Copy link

Change https://golang.org/cl/172157 mentions this issue: net/url: Parse allow ASCII percent-encoded chars in host

@r5d
Copy link
Contributor

r5d commented Sep 11, 2021

@bcmills @zombiezen RFC3986 says %-encoding is allowed in reg-name for only non-ASCII characters; / is an ASCII character.

If I'm interpreting this correctly, Parse should not permit %-encoded ASCII characters.

Should this issue be closed? If no, I would like to start working on a patch improving upon @thoeni's CL 172157 .

@markokr
Copy link

markokr commented Nov 22, 2021

In that case the encoding "/" as %2F is wrong? What code is responsible for that? (See testcase in opening msg)

@golang golang locked and limited conversation to collaborators Nov 22, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge help wanted NeedsFix The path to resolution is known, but the work has not been done. Suggested Issues that may be good for new contributors looking for work to do.
Projects
None yet
Development

No branches or pull requests

9 participants