x/net/publicsuffix: EffectiveTLDPlusOne for subdomain of amazonaws.com yields wrong value #51510

Closed
leandrosansilva opened this issue Mar 6, 2022 · 10 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@leandrosansilva

When calling publicsuffix.EffectiveTLDPlusOne("inbound-smtp.us-east-1.amazonaws.com"), I'd expect to obtain amazonaws.com, but instead I obtain the unmodified value inbound-smtp.us-east-1.amazonaws.com.

Tested version: latest as on 2022-03-06.

Some code reproducing the issue (on playground: https://go.dev/play/p/VAYufCl2LDY):

package main

import (
	"fmt"

	"golang.org/x/net/publicsuffix"
)

func main() {
	domains := []string{
		"amazon.co.uk",
		"books.amazon.co.uk",
		"www.books.amazon.co.uk",
		"amazon.com",
		"inbound-smtp.us-east-1.amazonaws.com",
	}

	for _, domain := range domains {
		eTLDPlusOne, _ := publicsuffix.EffectiveTLDPlusOne(domain)

		fmt.Printf("%s -> %s\n", domain, eTLDPlusOne)
	}
}

The result is:

amazon.co.uk -> amazon.co.uk
books.amazon.co.uk -> amazon.co.uk
www.books.amazon.co.uk -> amazon.co.uk
amazon.com -> amazon.com
inbound-smtp.us-east-1.amazonaws.com -> inbound-smtp.us-east-1.amazonaws.com
@gopherbot gopherbot added this to the Unreleased milestone Mar 6, 2022
@mengzhuo
Contributor

mengzhuo commented Mar 7, 2022

cc @neild

@mengzhuo mengzhuo added the NeedsInvestigation label Mar 7, 2022
@rentiansheng

rentiansheng commented Mar 7, 2022

I also ran into this issue.
The reason: psl-maintainers@amazon.com submitted the "us-east-1.amazonaws.com" domain to the public suffix list; see https://github.com/publicsuffix/list/blob/master/public_suffix_list.dat#L10743

publicsuffix.EffectiveTLDPlusOne("inbound-smtp.us-east-1.amazonaws.com")

Result: "inbound-smtp.us-east-1.amazonaws.com", not the expected "amazonaws.com".

@leandrosansilva
Author

Hi @rentiansheng, it seems that the publicsuffix database is quite messy regarding those domains.

For instance, publicsuffix.EffectiveTLDPlusOne("inbound-smtp.eu-west-1.amazonaws.com") properly yields amazonaws.com, and the only difference is the region label: eu-west-1 instead of us-east-1.

An updated version of the sample code is at: https://go.dev/play/p/McXmywFKGG0

The solution is probably either to add the eu-west-1 variant (and maybe all the other variations) to the database or to remove these entries from it altogether.

Should I close the current issue and open one at the publicsuffix repository?

For context, my current use case is to obtain a company's main domain from an SMTP relay domain name. Please let me know if anyone thinks publicsuffix is not suitable for that.

@jub0bs

jub0bs commented Nov 8, 2022

@leandrosansilva This is expected behaviour.

us-east-1.amazonaws.com is (at the time of writing) the most specific suffix in the PSL that matches domain inbound-smtp.us-east-1.amazonaws.com.

Therefore, the eTLD+1 is us-east-1.amazonaws.com plus the label immediately to the left, i.e. inbound-smtp.us-east-1.amazonaws.com.
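
The rule above can be sketched with a toy matcher. This is only an illustration under stated assumptions, not how x/net/publicsuffix is implemented: the `rules` map is a hand-picked subset that mirrors the situation described in this thread (us-east-1.amazonaws.com listed, eu-west-1.amazonaws.com not), `longestSuffix` and `eTLDPlusOne` are hypothetical helper names, and wildcard and exception rules are ignored.

```go
package main

import (
	"fmt"
	"strings"
)

// rules is a tiny, hand-picked stand-in for the public suffix list.
// Assumption: it only mirrors the entries discussed in this thread.
var rules = map[string]bool{
	"com":                     true,
	"co.uk":                   true,
	"us-east-1.amazonaws.com": true,
}

// longestSuffix returns the longest rule that matches domain on a label
// boundary. If no rule matches, it falls back to the last label.
func longestSuffix(domain string) string {
	labels := strings.Split(domain, ".")
	// Starting at i == 0 tries the longest candidate first.
	for i := range labels {
		candidate := strings.Join(labels[i:], ".")
		if rules[candidate] {
			return candidate
		}
	}
	return labels[len(labels)-1]
}

// eTLDPlusOne returns the matched suffix plus the label immediately to its
// left, or "" when domain is itself a suffix.
func eTLDPlusOne(domain string) string {
	suffix := longestSuffix(domain)
	if domain == suffix {
		return ""
	}
	rest := strings.TrimSuffix(domain, "."+suffix)
	labels := strings.Split(rest, ".")
	return labels[len(labels)-1] + "." + suffix
}

func main() {
	for _, d := range []string{
		"inbound-smtp.us-east-1.amazonaws.com",
		"inbound-smtp.eu-west-1.amazonaws.com",
		"www.books.amazon.co.uk",
	} {
		fmt.Printf("%s -> %s\n", d, eTLDPlusOne(d))
	}
}
```

With these toy rules, the us-east-1 host comes back unchanged while the eu-west-1 host reduces to amazonaws.com, which is the asymmetry reported earlier in the thread.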

@dpanic

dpanic commented Aug 28, 2023

I have another situation with amazonaws.com. This call:
domain, err := publicsuffix.EffectiveTLDPlusOne("ec2-1-1-1-1.compute-1.amazonaws.com")
returns the error:
publicsuffix: cannot derive eTLD+1 for domain "ec2-1-1-1-1.compute-1.amazonaws.com"

While others work:
test.test.amazonaws.com -> amazonaws.com
test.123.amazonaws.com -> amazonaws.com

@jub0bs

jub0bs commented Aug 28, 2023

@dpanic

Working as expected. The public-suffix list contains the entry *.compute-1.amazonaws.com. Therefore, according to the spec, ec2-1-1-1-1.compute-1.amazonaws.com is an eTLD. Therefore, you cannot derive its eTLD+1; you'd need one more label on the left for that.
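
To illustrate why the wildcard entry makes the whole hostname a suffix, here is a minimal sketch of label-by-label wildcard matching. It is an assumption-laden toy, not the library's code: `matchesRule` and `coversWholeDomain` are hypothetical names, and exception rules (those starting with `!`) are not handled.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesRule reports whether the rightmost labels of domain match rule.
// A "*" label in the rule matches exactly one arbitrary label.
func matchesRule(rule, domain string) bool {
	r := strings.Split(rule, ".")
	d := strings.Split(domain, ".")
	if len(d) < len(r) {
		return false
	}
	d = d[len(d)-len(r):] // compare only the rightmost len(r) labels
	for i := range r {
		if r[i] != "*" && r[i] != d[i] {
			return false
		}
	}
	return true
}

// coversWholeDomain reports whether rule matches domain and has the same
// number of labels, meaning the domain itself is the suffix and no label is
// left over to form an eTLD+1.
func coversWholeDomain(rule, domain string) bool {
	return matchesRule(rule, domain) &&
		strings.Count(rule, ".") == strings.Count(domain, ".")
}

func main() {
	const rule = "*.compute-1.amazonaws.com"
	for _, d := range []string{
		"compute-1.amazonaws.com",
		"ec2-1-1-1-1.compute-1.amazonaws.com",
		"www.ec2-1-1-1-1.compute-1.amazonaws.com",
	} {
		fmt.Printf("%s: matches=%t wholeDomainIsSuffix=%t\n",
			d, matchesRule(rule, d), coversWholeDomain(rule, d))
	}
}
```

Only the middle hostname is covered entirely by the rule, so it has no label to spare; the third hostname has one extra label on the left, which is what an eTLD+1 needs.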

@dpanic

dpanic commented Aug 28, 2023 via email

@jub0bs

jub0bs commented Aug 28, 2023

@dpanic x/net/publicsuffix does, as expected, strictly adhere to the semantics of the public-suffix list. If you need a different behaviour, you can always implement it yourself, I guess.
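
One possible way to implement a different behaviour is to skip privately managed suffixes, using the `icann` flag that `publicsuffix.PublicSuffix` returns alongside the suffix. The sketch below is an assumption, not an endorsed approach: `icannETLDPlusOne` and `lookupFunc` are hypothetical names, and `stubLookup` is a stand-in that only imitates the us-east-1 behaviour described in this thread so the example runs without external dependencies. In real code one would pass `publicsuffix.PublicSuffix` as the lookup.

```go
package main

import (
	"fmt"
	"strings"
)

// lookupFunc has the same shape as publicsuffix.PublicSuffix.
type lookupFunc func(domain string) (suffix string, icann bool)

// icannETLDPlusOne (hypothetical) derives an eTLD+1 while ignoring private
// suffixes: whenever the matched suffix is not ICANN-managed, it drops the
// suffix's leftmost label and looks up the remainder again.
func icannETLDPlusOne(domain string, lookup lookupFunc) (string, error) {
	suffix, icann := lookup(domain)
	for !icann {
		i := strings.IndexByte(suffix, '.')
		if i < 0 {
			break // single-label, unmanaged suffix: nothing left to strip
		}
		suffix, icann = lookup(suffix[i+1:])
	}
	if len(domain) <= len(suffix) {
		return "", fmt.Errorf("cannot derive eTLD+1 for %q", domain)
	}
	i := len(domain) - len(suffix) - 1 // index of the dot before the suffix
	if domain[i] != '.' {
		return "", fmt.Errorf("invalid suffix %q for %q", suffix, domain)
	}
	// Keep one label to the left of the suffix.
	return domain[1+strings.LastIndexByte(domain[:i], '.'):], nil
}

// stubLookup imitates a suffix lookup for illustration only: it treats
// us-east-1.amazonaws.com as a private suffix and otherwise returns the
// last label as an ICANN suffix.
func stubLookup(domain string) (string, bool) {
	const private = "us-east-1.amazonaws.com"
	if domain == private || strings.HasSuffix(domain, "."+private) {
		return private, false
	}
	return domain[strings.LastIndexByte(domain, '.')+1:], true
}

func main() {
	d, err := icannETLDPlusOne("inbound-smtp.us-east-1.amazonaws.com", stubLookup)
	fmt.Println(d, err)
}
```

Whether ignoring private suffixes is appropriate depends on the use case; for cookie or security boundaries the strict list semantics are usually what is wanted.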

@dpanic

dpanic commented Aug 28, 2023

@jub0bs sure, thanks for the heads-up. I implemented a workaround.

https://go.dev/play/p/1xKKMF0RzV-

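import (
	"errors"
	"sort"
	"strings"

	"golang.org/x/net/publicsuffix"
)

// ErrDomainNotExtracted is returned when no eTLD+1 could be derived.
// It is declared here only so that the snippet compiles on its own.
var ErrDomainNotExtracted = errors.New("domain not extracted")

// ExtractDomain tries the hostname and each of its shorter label-aligned
// suffixes, and returns one of the eTLD+1 values found for them.
func ExtractDomain(hostname string) (domain string, err error) {
	labels := strings.Split(hostname, ".")
	var results []string

	for i := range labels {
		candidate := strings.Join(labels[i:], ".")
		// Candidates that are themselves public suffixes return an error
		// and are skipped.
		d, lookupErr := publicsuffix.EffectiveTLDPlusOne(candidate)
		if lookupErr == nil && d != "" {
			results = append(results, d)
		}
	}

	if len(results) == 0 {
		return "", ErrDomainNotExtracted
	}

	// As in the original workaround, pick the lexicographically greatest
	// result. Note that this is not necessarily the shortest one.
	sort.Strings(results)
	return results[len(results)-1], nil
}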

@jub0bs

jub0bs commented Aug 29, 2023

@leandrosansilva If you're satisfied with the answers you've got, could you please self-close this issue? Thanks.


6 participants