net/http: sniff.go DetectContentType failing to detect text/html correctly #16275
How do you know it isn't an XML document?
A valid HTML, XHTML, XML, or HTML4 document must not start with anything
OK, you can take the spec / standard high road on this, or we could improve the code. Right now, '<!--' is considered text/html (from sniff.go):
The issue is in this line of htmlSig:
My HTML files do not have a space (' ') after '<!--'; they have a newline instead.
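To make the space-versus-newline difference concrete, here is a minimal sketch that passes two comment-led inputs to http.DetectContentType. The helper name sniff and the sample strings are illustrative assumptions, not taken from the original report or from sniff.go.

```go
package main

import (
	"fmt"
	"net/http"
)

// sniff is a hypothetical helper that wraps http.DetectContentType
// so it can be called with a string.
func sniff(s string) string {
	return http.DetectContentType([]byte(s))
}

func main() {
	// "<!--" followed by a space: matches the HTML comment signature.
	withSpace := "<!-- comment -->\n<html></html>"

	// "<!--" followed by a newline: the shape described in this thread,
	// which the reporter saw detected as text/plain on go1.6.2.
	withNewline := "<!--\ncomment\n-->\n<html></html>"

	fmt.Printf("space:   %s\n", sniff(withSpace))
	fmt.Printf("newline: %s\n", sniff(withNewline))
}
```

Comparing the two printed lines shows whether the byte immediately after '<!--' changes the detected type on the Go version in use.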
We're going to follow the spec. If we don't draw the line somewhere then DetectContentType will be under constant churn and feature requests. Please file a bug at https://github.com/whatwg/mimesniff/issues if you disagree with the spec. Once fixed upstream, we'll fix Go. Until then, unless there's some place where we're in violation of the mimesniff spec, I'm going to close this bug.
That's fair, I'll follow up with mimesniff! Thanks for the immediate response, all.
Please answer these questions before submitting your issue. Thanks!
What version of Go are you using (go version)?
go version go1.6.2 darwin/amd64
What operating system and processor architecture are you using (go env)?
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH=""
GORACE=""
GOROOT="/usr/local/go"
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GO15VENDOREXPERIMENT="1"
CC="clang"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fno-common"
CXX="clang++"
CGO_ENABLED="1"
With test.html being:
output: Content-type text/plain; charset=utf-8
output: Content-type text/html; charset=utf-8
Basically, DetectContentType should recognize the '<!--' at the start of the .html file as HTML, but it does not. As soon as I remove the '<!--' comment, the file is correctly detected as text/html.
Content-type text/plain; charset=utf-8