Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mime: FormatMediaType fails to detect non-ASCII bytes #28849

Closed
andrius4669 opened this issue Nov 17, 2018 · 5 comments
Closed

mime: FormatMediaType fails to detect non-ASCII bytes #28849

andrius4669 opened this issue Nov 17, 2018 · 5 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@andrius4669
Copy link
Contributor

What version of Go are you using (go version)?

go version go1.11.2 linux/amd64

Does this issue reproduce with the latest release?

yeah.

What operating system and processor architecture are you using (go env)?

irrelevant.

What did you do?

https://play.golang.org/p/mkEHhue4cKm
Tried FormatMediaType with different unicode characters in parameter values.

What did you expect to see?

One of:

  1. FormatMediaType encoding parameters with any unicode characters as per RFC 2231 section 4.
  2. FormatMediaType consistently accepting all unicode characters. That's probably not really compliant with current standards, but works with few email clients I tried. I like RFC 6532, but it doesn't have any mention of Content-Type or Content-Disposition though, so this option can probably be ignored.
  3. FormatMediaType consistently rejecting all characters impossible to represent by US-ASCII.

What did you see instead?

Only characters covered by ISO-8859-1 are detected as unicode and cause encoding failure.
Characters whose rune values (actual unicode values, not UTF-8 encoded) aren't & 0x80 pass test and are included in output.

Additional info & reporter's opinion

I noticed this looking at source code of FormatMediaType and noticing that it contains if character&0x80 != 0 check, however it iterates thru string with range value (not range []byte(value) which would be correct).
I really don't like that it doesn't at least support RFC 2231 encoding, now I will either need to reimplement this function myself, or encode filename parameter using either mime.BEncoding or mime.QEncoding.
Adding proper check there and failing to encode could cause some issues if implementations relied on old broken behavior, I guess.

@bcmills bcmills added the NeedsFix The path to resolution is known, but the work has not been done. label Nov 19, 2018
@bcmills bcmills changed the title mime: FormatMediaType fails to detect unicode characters properly mime: FormatMediaType fails to detect non-ASCII bytes Nov 19, 2018
@bcmills
Copy link
Contributor

bcmills commented Nov 19, 2018

This issue seems like it is really two separate issues:

  1. The current validation checks runes instead of bytes.
  2. Ideally you'd like RFC 2231 encoding.

(1) is a bug-fix. (2) is a feature request; please file it separately.

@bcmills
Copy link
Contributor

bcmills commented Nov 19, 2018

CC @mpvl @bradfitz

@bcmills bcmills added this to the Go1.13 milestone Nov 19, 2018
@andrius4669
Copy link
Contributor Author

andrius4669 commented Nov 19, 2018 via email

@gopherbot
Copy link

Change https://golang.org/cl/150417 mentions this issue: mime: correctly detect non-ASCII characters in FormatMediaType

@gopherbot
Copy link

Change https://golang.org/cl/150498 mentions this issue: mime: remove allocation introduced in recent fix

gopherbot pushed a commit that referenced this issue Nov 20, 2018
CL 150417 was submitted before I could recommend this change to remove
an unnecessary allocation.

Updates #28849

Change-Id: I4cd655f62bb3d00eda6c997f074785385bceee0c
Reviewed-on: https://go-review.googlesource.com/c/150498
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
bradfitz pushed a commit that referenced this issue Nov 21, 2018
FormatMediaType used rune&0x80==0 to check if parameter values consisted
of valid ascii charaters. Comparing strings using their runes instead of
their bytes leads to some non-ascii strings to pass as valid.

E.g. the rune for 'Ą' is 0x104, 0x104 & 0x80 => 0. Its byte
representation is 0xc4 0x84, both of which result in non zero values
when masked with 0x80

Fixes #28849

Change-Id: Ib9fb4968bcbbec0197d81136f380d40a2a56c14b
Reviewed-on: https://go-review.googlesource.com/c/150417
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
bradfitz added a commit that referenced this issue Nov 21, 2018
CL 150417 was submitted before I could recommend this change to remove
an unnecessary allocation.

Updates #28849

Change-Id: I4cd655f62bb3d00eda6c997f074785385bceee0c
Reviewed-on: https://go-review.googlesource.com/c/150498
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
@golang golang locked and limited conversation to collaborators Nov 20, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

3 participants