net/mail: AddressList doesn't decode rfc2047 encoded words inside quotes #23140

Open
sfilargi opened this issue Dec 14, 2017 · 10 comments
Labels: help wanted, NeedsInvestigation

@sfilargi
sfilargi commented Dec 14, 2017

net/mail's AddressList doesn't decode RFC 2047 encoded-words if they are inside quotes.

RFC 2047, section 5, states: "An 'encoded-word' MUST NOT appear within a 'quoted-string'."

Now before you close this bug, saying it is working as intended, let me try to convince you otherwise.

A lot of clients break the rule above, and most services/libraries are programmed to work around it.

For example Gmail will happily decode the string even if it is inside quotes.

The way I see it, there are two paths we can follow:

  1. Stick to RFC
  2. Try to be compatible with the majority of the libraries/services out there.

We can be strict and choose 1, in which case the library will not be of much use, since users of the software will complain.

Or we can be pragmatic and choose 2. There is little risk in decoding Q-encoded words inside quotes.

I hope you choose (2).

Repro:

https://play.golang.org/p/etkJkTfs3Q

What version of Go are you using (go version)?

1.9

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

darwin/amd64

What did you do?

https://play.golang.org/p/etkJkTfs3Q

What did you expect to see?

Decoded name

What did you see instead?

Undecoded name

@bradfitz bradfitz changed the title from "net/mail AddressList doesn't decode rfc2047 encoded words inside quotes" to "net/mail: AddressList doesn't decode rfc2047 encoded words inside quotes" Dec 14, 2017
@bradfitz bradfitz added the NeedsInvestigation label Dec 14, 2017
@bradfitz bradfitz added this to the Go1.11 milestone Dec 14, 2017
@minaevmike
Contributor

Thunderbird does the same as Gmail.

@gopherbot

Change https://golang.org/cl/139177 mentions this issue: net/mail: Decode RFC 2047 encoded strings within quotes.

@RalphCorderoy

Gmail deciding to consciously (hopefully) violate the RFC for input isn't as important as whether Gmail, being a major player in sending emails, produces corrupt output that violates the RFC. I'm assuming not?

What are the producers of the corrupt emails? If it were a major producer of emails, e.g. a 'MailChimp', or an open-source library, then it can probably be persuaded to fix things for future emails. (I know of several successes in this area for mail RFC violations.) Some 'spam assassins' use RFC violations as one measure; it catches PHP scripters but by letting a kosher mail producer off the hook, this signal is being weakened.

Continuing to stick with the RFC doesn't stop Go processing the email, e.g. receiving and sending, so I don't think Go violating the RFC, and encouraging others to do so, is a good idea. Postel might have been right then, but would be wrong in the modern era. https://en.wikipedia.org/wiki/Postel%27s_law#Criticism has references, including https://tools.ietf.org/html/draft-thomson-postel-was-wrong-02
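To make the "RFC violation as a signal" idea concrete, here is a minimal sketch of how a caller might flag a header whose quoted-string contains something that looks like an encoded-word. The helpers `quotedStrings` and `violatesSection5` are hypothetical names introduced for illustration, the regular expression only approximates the encoded-word grammar, and the scanner ignores parenthesised comments, so treat it as a rough check rather than a full RFC 5322 parser.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// encodedWord approximates the RFC 2047 encoded-word syntax:
// "=?" charset "?" encoding "?" encoded-text "?=".
var encodedWord = regexp.MustCompile(`=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?=`)

// quotedStrings (hypothetical helper) returns the contents of each
// double-quoted run in s, honouring backslash escapes inside quotes.
// It is a simplified scanner and does not handle comments in parentheses.
func quotedStrings(s string) []string {
	var out []string
	var cur strings.Builder
	inQuote := false
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch {
		case inQuote && c == '\\' && i+1 < len(s):
			// Escaped character: keep it literally and skip the backslash.
			i++
			cur.WriteByte(s[i])
		case c == '"':
			if inQuote {
				out = append(out, cur.String())
				cur.Reset()
			}
			inQuote = !inQuote
		case inQuote:
			cur.WriteByte(c)
		}
	}
	return out
}

// violatesSection5 (hypothetical helper) reports whether any quoted-string
// in the header value contains text that looks like an encoded-word.
func violatesSection5(header string) bool {
	for _, q := range quotedStrings(header) {
		if encodedWord.MatchString(q) {
			return true
		}
	}
	return false
}

func main() {
	headers := []string{
		`"=?UTF-8?Q?J=C3=B6rg?=" <joerg@example.com>`,
		`=?UTF-8?Q?J=C3=B6rg?= <joerg@example.com>`,
	}
	for _, h := range headers {
		fmt.Printf("%v\t%s\n", violatesSection5(h), h)
	}
}
```

The example addresses are made up. A check like this leaves the decision about what to do with a violating header to the caller.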

@sfilargi
Author

sfilargi commented Oct 3, 2018

> Some 'spam assassins' use RFC violations as one measure; it catches PHP scripters but by letting a kosher mail producer off the hook, this signal is being weakened.

It doesn't matter. The suggestion is not to make Go send emails that break RFC, but to correctly parse them when it receives them, even if they break the RFC.

> Continuing to stick with the RFC doesn't stop Go processing the email, e.g. receiving and sending, so I don't think Go violating the RFC, and encouraging others to do so, is a good idea.

No, it doesn't stop sending or receiving, but it looks terrible for the end user, so this library will be useless for those cases where there is end-user interaction. Not fixing it in Go is kind of conformism, since pretty much every other MTA out there breaks this rule.

@RalphCorderoy

> > Some 'spam assassins' use RFC violations as one measure; it catches PHP scripters but by letting a kosher mail producer off the hook, this signal is being weakened.
>
> It doesn't matter. The suggestion is not to make Go send emails that break RFC, but to correctly parse them when it receives them, even if they break the RFC.

Go would not be 'correctly parsing them'. It does that now. Go would be weakening the signal for spam detectors by no longer discouraging buggy email producers.

> No, it doesn't stop sending or receiving, but it looks terrible for the end user, so this library will be useless for those cases where there is end-user interaction. Not fixing it in Go is kind of conformism, since pretty much every other MTA out there breaks this rule.

But Go is not an MTA; net/mail is part of the stdlib. Working around buggy emails is a policy decision, not a mechanism, and it is for the MTA written in Go to make and implement, not for the stdlib to impose on all callers. (And as a user of that MTA, I'd want it off by default so I get to see the true email that's been sent.)

@dmitshur
Contributor

As a data point, I got a notification mail from Gerrit with the following From header:

Subject: [go] time: fix parse month error message
From: "=?UTF-8?Q?=E7=B4=98=E5=A3=AB_=E5=85=AB=E5=B7=BB_=28Gerrit=29?=" <noreply-gerritcodereview-abcdef123456@google.com>

I'm not sure if that's violating the RFC, but if so, I plan to report it to them.

Removing the quotes seems to make it parse with net/mail:

https://play.golang.org/p/0j4_QXh0EeK
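
The difference between the two forms can be checked locally with a short program. This is a minimal sketch that uses only the standard net/mail package; `displayName` is a hypothetical helper name introduced here, and it is not necessarily what the playground link contains. The result for the quoted form depends on the Go version in use, so no output is shown.

```go
package main

import (
	"fmt"
	"net/mail"
)

// The From value from the Gerrit notification above, and the same value
// with the quotes around the display name removed.
const (
	quoted   = `"=?UTF-8?Q?=E7=B4=98=E5=A3=AB_=E5=85=AB=E5=B7=BB_=28Gerrit=29?=" <noreply-gerritcodereview-abcdef123456@google.com>`
	unquoted = `=?UTF-8?Q?=E7=B4=98=E5=A3=AB_=E5=85=AB=E5=B7=BB_=28Gerrit=29?= <noreply-gerritcodereview-abcdef123456@google.com>`
)

// displayName (hypothetical helper) parses a single address and returns
// the display name exactly as net/mail reports it.
func displayName(header string) (string, error) {
	addr, err := mail.ParseAddress(header)
	if err != nil {
		return "", err
	}
	return addr.Name, nil
}

func main() {
	for _, h := range []string{quoted, unquoted} {
		name, err := displayName(h)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		// %q makes it easy to see whether the name is still encoded.
		fmt.Printf("%q\n", name)
	}
}
```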

@stavrospen

stavrospen commented Feb 24, 2019

> As a data point, I got a notification mail from Gerrit with the following From header:
>
> Subject: [go] time: fix parse month error message
> From: "=?UTF-8?Q?=E7=B4=98=E5=A3=AB_=E5=85=AB=E5=B7=BB_=28Gerrit=29?=" <noreply-gerritcodereview-abcdef123456@google.com>
>
> I'm not sure if that's violating the RFC, but if so, I plan to report it to them.
>
> Removing the quotes seems to make it parse with net/mail:
>
> https://play.golang.org/p/0j4_QXh0EeK

They may be violating the RFC, but there are so many clients out there that do it, that you cannot just take a hard stance if you are building a product for end users.

At the end of the day our end-users will see that OUR product is not working, while other products have no problem.

Go is a pragmatic language and I was hoping the pragmatic approach would have been followed here.

But this is just academic for me at the moment, as I switched to another language because of this.

@dmitshur
Contributor

dmitshur commented Feb 24, 2019

One approach may be to have two packages. The RFC can be followed strictly in the standard library package, but another package can implement more lax parsing outside of the standard library.

That way, users who are looking for strict RFC behavior can continue to use net/mail, but those interested in building an email product for end users can implement custom behavior for their needs.
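
As a rough illustration of what such lax parsing outside the standard library could look like, here is a minimal sketch. `ParseAddressListLenient` is a hypothetical function, not part of net/mail or of any existing package. It parses with net/mail first, then runs each display name through `mime.WordDecoder.DecodeHeader` to decode any encoded-words that were left as-is.

```go
package main

import (
	"fmt"
	"mime"
	"net/mail"
)

// ParseAddressListLenient is a hypothetical helper, not part of net/mail.
// It parses strictly with net/mail, then makes a second pass over each
// display name to decode any RFC 2047 encoded-words left undecoded,
// such as those that arrived inside a quoted-string.
func ParseAddressListLenient(list string) ([]*mail.Address, error) {
	addrs, err := mail.ParseAddressList(list)
	if err != nil {
		return nil, err
	}
	dec := new(mime.WordDecoder)
	for _, a := range addrs {
		// If decoding fails (for example, an unsupported charset),
		// keep the name exactly as net/mail returned it.
		if name, err := dec.DecodeHeader(a.Name); err == nil {
			a.Name = name
		}
	}
	return addrs, nil
}

func main() {
	const from = `"=?UTF-8?Q?=E7=B4=98=E5=A3=AB_=E5=85=AB=E5=B7=BB_=28Gerrit=29?=" <noreply-gerritcodereview-abcdef123456@google.com>`
	addrs, err := ParseAddressListLenient(from)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, a := range addrs {
		fmt.Printf("%q <%s>\n", a.Name, a.Address)
	}
}
```

One trade-off of this approach is that a display name which legitimately contains literal text shaped like an encoded-word is decoded as well, which is exactly the kind of policy choice a caller may want to make for itself.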

@RalphCorderoy

Hi @dmitshur, yes, that header in the Gerrit email is faulty, so please do report it to the creator, probably Gerrit or a library they use. It's https://tools.ietf.org/html/rfc2047#section-5 that says "An 'encoded-word' MUST NOT appear within a 'quoted-string'." Their shouting, not mine. :-)

@dmitshur
Contributor

I've reported it to Gerrit at https://bugs.chromium.org/p/gerrit/issues/detail?id=10519.

8 participants