idna.ToASCII should normalize its Unicode input before encoding it to Punycode (https://en.wikipedia.org/wiki/Internationalized_domain_name, section "ToASCII and ToUnicode": "ToASCII will apply the Nameprep algorithm, which converts the label to lowercase and performs other normalization, and will then translate the result to ASCII using Punycode").
The golang.org/x/net/idna package does not seem to perform that normalization step, while e.g. the userspace github.com/DanielOaks/go-idn package does.
So running idna.ToASCII on www.état.com and on www.e\u0301tat.com should (if I understand IDNA correctly) return the same punycode form: www.xn--tat-9la.com.
What did you see instead?
The userspace package correctly returns www.xn--tat-9la.com for both inputs, but x/net/idna returns "www.xn--tat-9la.com" and "www.xn--etat-vvc.com".
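To illustrate where the two results come from, here is a minimal stdlib-only sketch of the RFC 3492 Punycode encoding step applied to each label exactly as given, with no mapping or normalization. The helpers `encodeLabel` and `toASCII` are hypothetical names for this illustration, not the x/net/idna API, and overflow and label-length checks are omitted.

```go
package main

import (
	"fmt"
	"strings"
)

// Punycode parameters from RFC 3492, section 5.
const (
	base, tmin, tmax       = 36, 1, 26
	skew, damp             = 38, 700
	initialBias, initialN  = 72, 128
)

// digit maps a value in [0, 35] to 'a'..'z' then '0'..'9'.
func digit(d int) byte {
	if d < 26 {
		return byte('a' + d)
	}
	return byte('0' + d - 26)
}

// adapt is the bias adaptation function from RFC 3492, section 6.1.
func adapt(delta, numPoints int, first bool) int {
	if first {
		delta /= damp
	} else {
		delta /= 2
	}
	delta += delta / numPoints
	k := 0
	for delta > ((base-tmin)*tmax)/2 {
		delta /= base - tmin
		k += base
	}
	return k + (base-tmin+1)*delta/(delta+skew)
}

// encodeLabel (hypothetical helper) encodes one label as-is:
// no case folding, no Unicode normalization.
func encodeLabel(label string) string {
	runes := []rune(label)
	var out []byte
	for _, r := range runes {
		if r < initialN {
			out = append(out, byte(r)) // copy basic (ASCII) code points
		}
	}
	b := len(out)
	if b == len(runes) {
		return label // all ASCII, nothing to encode
	}
	if b > 0 {
		out = append(out, '-')
	}
	n, delta, bias := initialN, 0, initialBias
	for h := b; h < len(runes); {
		m := 0x110000 // smallest remaining code point >= n
		for _, r := range runes {
			if int(r) >= n && int(r) < m {
				m = int(r)
			}
		}
		delta += (m - n) * (h + 1)
		n = m
		for _, r := range runes {
			if int(r) < n {
				delta++
			}
			if int(r) != n {
				continue
			}
			// Emit delta as a generalized variable-length integer.
			q := delta
			for k := base; ; k += base {
				t := k - bias
				if t < tmin {
					t = tmin
				} else if t > tmax {
					t = tmax
				}
				if q < t {
					break
				}
				out = append(out, digit(t+(q-t)%(base-t)))
				q = (q - t) / (base - t)
			}
			out = append(out, digit(q))
			bias = adapt(delta, h+1, h == b)
			delta = 0
			h++
		}
		delta++
		n++
	}
	return "xn--" + string(out)
}

// toASCII (hypothetical helper) encodes each dot-separated label.
func toASCII(domain string) string {
	labels := strings.Split(domain, ".")
	for i, l := range labels {
		labels[i] = encodeLabel(l)
	}
	return strings.Join(labels, ".")
}

func main() {
	fmt.Println(toASCII("www.\u00e9tat.com"))  // www.xn--tat-9la.com
	fmt.Println(toASCII("www.e\u0301tat.com")) // www.xn--etat-vvc.com
}
```

The precomposed input contains a single non-ASCII code point (U+00E9), while the decomposed input contains a plain "e" followed by U+0301, so an encoder that does not normalize first sees two different labels and produces two different outputs.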
My bad. It seems that RFC 5891 ("Internationalized Domain Names in Applications (IDNA): Protocol") obsoletes the Nameprep specification, RFC 3491 ("Nameprep: A Stringprep Profile for IDN"), and states in "Appendix A. Summary of Major Changes from IDNA2003":
Remove the mapping and normalization steps from the protocol and have them, instead, done by the applications themselves, possibly in a local fashion, before invoking the protocol.
So I guess x/net/idna does the right thing, and it is up to the caller to normalize or not. That does mean the caller needs to know whether a domain in non-normalized form is equivalent to the same domain in normalized form, and I have no idea whether it is. It may be inconsistent in the wild: www.\u00e9tat.com and www.e\u0301tat.com may or may not be registered as separate domains.
If anyone knows about that last part, I'd love to know (it would be very helpful for the purell normalization package that I maintain), but otherwise this is not an issue for the idna package, so I'll close it.
Runnable reproduction: https://play.golang.org/p/zS-UR4WhIx