Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/net/publicsuffix: PublicSuffix() is case sensitive #25254

Open
OrBaruk opened this issue May 4, 2018 · 5 comments
Open

x/net/publicsuffix: PublicSuffix() is case sensitive #25254

OrBaruk opened this issue May 4, 2018 · 5 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@OrBaruk
Copy link

OrBaruk commented May 4, 2018

What version of Go are you using (go version)?

go version go1.10.1 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOOS="linux"

What did you do?

PublicSuffix() returns incorrect values when the domain contains letters with uppercase

According to RFC4343 I believe PublicSuffix() should be case insensitive.

package main

import (
	"fmt"
	"golang.org/x/net/publicsuffix"
)

func main() {
	suffix, icann := publicsuffix.PublicSuffix("abc.CoM.Br")
	fmt.Printf("suffix: %s\nicann: %v\n", suffix, icann)
}

What did you expect to see?

suffix: com.br
icann: true

What did you see instead?

suffix: Br
icann: false
@gopherbot gopherbot added this to the Unreleased milestone May 4, 2018
@OrBaruk OrBaruk changed the title x/net/publicsuffix: x/net/publicsuffix: PublicSuffix() is case sensitive May 4, 2018
@ALTree ALTree added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label May 4, 2018
@meirf
Copy link
Contributor

meirf commented May 5, 2018

Test cases from the official PSL repo:

// Mixed case.
checkPublicSuffix('COM', null);
checkPublicSuffix('example.COM', 'example.com');
checkPublicSuffix('WwW.example.COM', 'example.com');

This indicates that PublicSuffix should be case insensitive.

Note these test cases are missing from x/net/publicsuffix. I wonder if they were omitted intentionally. @vdobler @nigeltao

@vdobler
Copy link
Contributor

vdobler commented May 5, 2018

Yes this is deliberate because the tests would not pass.

EffectiveTLDPlusOne returns the eTLD+1 of its argument
in the original input capitalisation, it doesn't lowercase
the input.

EffectiveTLDPlus1("b.DOmaiN.BiZ") == "DoMaiN.BiZ"

@OrBaruk
Copy link
Author

OrBaruk commented May 5, 2018

I think the EffectiveTLDPlusOne should preserve the original capitalisation, but it should not give different results depending on how the input was capitalised for example:

Current:

publicsuffix.EffectiveTLDPlusOne("SuBdOmAiN.aBc.BlOgSpOt.cOm") == "BlOgSpOt.cOm"
publicsuffix.EffectiveTLDPlusOne("subdomain.abc.blogspot.com") == "abc.blogspot.com"

Expected:

publicsuffix.EffectiveTLDPlusOne("SuBdOmAiN.aBc.BlOgSpOt.cOm") == "aBc.BlOgSpOt.cOm"
publicsuffix.EffectiveTLDPlusOne("subdomain.abc.blogspot.com") == "abc.blogspot.com"

These different results was how I originally found the issue.

@vdobler
Copy link
Contributor

vdobler commented May 7, 2018

@OrBaruk

The unexpected result you see with mixed case inputs is addressed in
two TODOs

The current implementation of package publicsuffix requires arguments
to PublicSuffix and EffectiveTLDPlus1 to be: Lowercases, free from leading
and trailing dots and punycoded.
As the main use of package publicsuffix is in package cookiejar and cookiejar
calls PublicSuffix only with lowercased, dot-free, punycoded arguments and
ensuring this once more in package publicsuffix is wasteful.

But of course this is unintuitive and not documented.

You demonstrated un-intuitive behavior with uppercase domain names.
How should EffectiveTLDPlusOne behave on IDNs?
How on full qualified domain names (trailing dot)?
Should it strip leading dots (still common in the domain
attribute of a cookie)=

publicsuffix.EffectiveTLDPlusOne("世界.食狮.中国") = ??
publicsuffix.EffectiveTLDPlusOne("www.foo.com.") = ??
publicsuffix.EffectiveTLDPlusOne(".example.com") = ??

@nigeltao
I think a natural expectation would be that package publicsuffix

  • works well with uppercase/mixed case domains
  • works well with IDNs.
    With "works well" in the sense of produces the naturally expected value
    like suggested by OrBaruk and by handling IDNs transparently as this
    package seems to be used not only as an Option to cookiejar.Jar.
    (Leading/trailing dots must be removed by the caller.)

What do you think?

@nigeltao
Copy link
Contributor

I understand the "works well" expectation, but I think that there's a bigger question of how does Go generally handle IDNs (and upper/mixed case domains)? I think any answer needs to be consistent across net, net/http, net/http/cookiejar, net/url, x/net/publicsuffix, etc, and not just a tactical modification only to x/net/publicsuffix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

6 participants