mime: use "charset=UTF-8" instead of "charset=utf-8" #19430

Closed
jamshid opened this issue Mar 7, 2017 · 5 comments
Labels
FrozenDueToAge WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.

Comments

jamshid commented Mar 7, 2017

What version of Go are you using (go version)?

go version go1.7 linux/amd64

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN="/go/bin"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/root/go"
GORACE=""
GOROOT="/go"
GOTOOLDIR="/go/pkg/tool/linux_amd64"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build903236581=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"

What did you do?

Not a big deal, the current code is not wrong, but please uppercase "UTF-8" in golang source.

See https://golang.org/src/mime/type.go: the map builtinTypesLower holds default Content-Type values for file extensions, such as:

var builtinTypesLower = map[string]string{
	// ...
	".xml":  "text/xml; charset=utf-8",
}

What did you expect to see?

                 ...
		".xml":  "text/xml; charset=UTF-8",

What did you see instead?

The charsets with "utf-8" should be spelled "UTF-8". The current lowercase version isn't necessarily wrong, but uppercase is more correct according to:
https://blog.codingoutloud.com/2009/04/08/is-utf-8-case-sensitive-in-xml-declaration/

This came up because the Java server Jetty uppercases the "utf-8" in the incoming request's Content-Type charset before the servlet runs. Normally this isn't a problem, but it breaks S3-like signature verification, where client and server sign properties of a request (e.g. the Content-type header).

@bradfitz bradfitz changed the title Use "charset=UTF-8" instead of "charset=utf-8" in mime/types.go mime: use "charset=UTF-8" instead of "charset=utf-8" Mar 7, 2017
Contributor

bradfitz commented Mar 7, 2017

Normally this isn't a problem, but it breaks S3-like signature verification, where client and server sign properties of a request (e.g. the Content-type header).

Elaborate, please.

Changing this arbitrarily seems just as likely to break other people (and lots of tests!). We would need a very good reason to change this.

@bradfitz bradfitz added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Mar 7, 2017
Author

jamshid commented Mar 7, 2017

The scenario is that a golang client (e.g. https://github.com/ncw/rclone/) makes a request to a Jetty server. Both use AWS S3-like signatures to authenticate the request, i.e. both the client and the server compute a hash over various request headers and sign the hash with a shared "secret" to produce a signature.

The client signature includes Content-type "text/xml; charset=utf-8" but Jetty (wrongly, IMO) normalizes the charset so the servlet sees "text/xml; charset=UTF-8". This causes the signatures to not match.
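To make the mismatch concrete, here is a minimal sketch of how a case change in a signed header value changes the signature. It is not the real S3 algorithm: the `sign` helper, the `"content-type:"` prefix, and the secret are all assumptions made for illustration, and a real S3-like scheme signs many more request properties.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sign is a hypothetical, heavily simplified stand-in for an S3-like
// signature: an HMAC-SHA256 over the Content-Type header value only.
func sign(secret, contentType string) string {
	mac := hmac.New(sha256.New, []byte(secret))
	mac.Write([]byte("content-type:" + contentType))
	return hex.EncodeToString(mac.Sum(nil))
}

func main() {
	const secret = "shared-secret" // illustrative value only

	// The value the Go client signed and sent.
	clientSig := sign(secret, "text/xml; charset=utf-8")
	// The value the servlet sees after the charset was uppercased.
	serverSig := sign(secret, "text/xml; charset=UTF-8")

	fmt.Println("client:", clientSig)
	fmt.Println("server:", serverSig)
	fmt.Println("match: ", hmac.Equal([]byte(clientSig), []byte(serverSig)))
}
```

Because the HMAC is computed over raw bytes, the two inputs are simply different messages, even though they name the same charset.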

This can be worked around with some Jetty hacking, e.g. to preserve the original Content-type, but as the link above shows, the official spelling of the charset is "UTF-8", not "utf-8", so it seems preferable to use that.

But you could argue that http://www.iana.org/assignments/character-sets/character-sets.xhtml does say "However, no distinction is made between use of upper and lower case letters", so if you really don't want to change golang's type.go, I guess that's not unreasonable.
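As a rough illustration of the IANA point, the two spellings can be treated as equivalent by parsing the header instead of comparing raw strings. The `sameContentType` helper below is hypothetical; it relies only on `mime.ParseMediaType` and `strings.EqualFold` from the standard library, and it deliberately ignores every parameter other than charset.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// sameContentType is a hypothetical helper: it reports whether two
// Content-Type values name the same media type and the same charset,
// comparing the charset without regard to case.
func sameContentType(a, b string) (bool, error) {
	typeA, paramsA, err := mime.ParseMediaType(a)
	if err != nil {
		return false, err
	}
	typeB, paramsB, err := mime.ParseMediaType(b)
	if err != nil {
		return false, err
	}
	// ParseMediaType lowercases the media type and parameter names,
	// but leaves parameter values such as "UTF-8" untouched.
	sameCharset := strings.EqualFold(paramsA["charset"], paramsB["charset"])
	return typeA == typeB && sameCharset, nil
}

func main() {
	same, err := sameContentType("text/xml; charset=utf-8", "text/xml; charset=UTF-8")
	fmt.Println(same, err)
}
```

This kind of comparison does not help with signature verification by itself, since the signature is computed over the exact bytes of the header value.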

Contributor

bradfitz commented Mar 7, 2017

@ncw, can you elaborate? What's at fault, here? Is the "S3-like" signature algorithm just poorly designed? Is this a specified protocol, or something that rclone invented?

Author

jamshid commented Mar 8, 2017

Just to be clear, I'm not saying anything is at fault. Well, except maybe Jetty, which should not normalize the "charset" in the request header "Content-type".

The S3 authentication protocol (http://docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-header-based-auth.html) requires the client and server to use the exact same Content-type value when computing the signature.

That's not possible when a golang client sends ...;charset=utf-8 to an S3-compatible server based on Jetty, which quietly changes the header value to ...;charset=UTF-8.

Of course this can (and should) be fixed or worked around in the Jetty server. I'm not saying golang is wrong to use "utf-8", just that it might as well use the official charset spelling "UTF-8" instead. Of course, changing to that spelling now might not be worth the trouble.

Contributor

bradfitz commented Mar 8, 2017

Okay, then I'll close this. Changing it would break people's tests unnecessarily if the bug is in Jetty.

@bradfitz bradfitz closed this as completed Mar 8, 2017
@golang golang locked and limited conversation to collaborators Mar 8, 2018