net/url: encoding inconsistency between 1.4 and 1.5 for unicode domain names #12719
I assume the conversion is done to maximize interoperability with legacy URI resolvers, but RFC 3987 specifically recommends replacing non-ASCII characters with dashes primarily. See http://www.ietf.org/rfc/rfc3987.txt.
Well, there's already http://godoc.org/golang.org/x/net/idna#ToASCII for Punycode conversions.
In any case, this is a bug. We should never over-escape the host name, because url.Parse currently rejects percent-encoded host names.
Parsing the escaped URL will panic with "panic: parse http://www.%C5%BElu%C5%A5ou%C4%8Dk%C3%BD-k%C5%AF%C5%88.cz: percent-encoded characters in host".
RFC 3986 is clear about the need for percent-encoding the host when creating the URL. The parser is a bit lax in accepting the UTF-8 to begin with, but it's probably a mistake to reject it at this point. I sent a CL to accept the %-encoded form so that we can round-trip the URL. I'm not sure this is a great idea, but we'll see I guess. The main argument is for non-HTTP uses of URLs.
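For illustration, here is a minimal sketch of the round trip being discussed. It assumes a Go release in which url.Parse accepts percent-encoded non-ASCII bytes in the host. On a release that still rejects them (the behavior reported for 1.5 in this thread), the second parse returns an error instead. The helper name roundTripHost is hypothetical and not part of net/url. The host is taken from the panic message quoted in this thread.

```go
package main

import (
	"fmt"
	"net/url"
)

// roundTripHost is a hypothetical helper, not part of net/url.
// It parses raw, serializes the result with String, and parses the
// serialized form again. It returns the serialized URL and the Host
// field observed after the second parse.
func roundTripHost(raw string) (serialized, host string, err error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", err
	}
	// String percent-encodes the non-ASCII bytes of the host.
	serialized = u.String()

	// Re-parsing only succeeds if the parser accepts the encoded host.
	u2, err := url.Parse(serialized)
	if err != nil {
		return serialized, "", err
	}
	return serialized, u2.Host, nil
}

func main() {
	serialized, host, err := roundTripHost("http://www.žluťoučký-kůň.cz")
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	fmt.Println("serialized:", serialized)
	fmt.Println("host after re-parse:", host)
}
```

If the round trip works, the host after the second parse should equal the original UTF-8 host, which is the property the CL is meant to restore.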
CL https://golang.org/cl/17385 mentions this issue.
Given this code (Go Playground):
I get different results in Go 1.4 and Go 1.5:
Is this intended (but undocumented) behavior - or is it a bug?