net/http: sniff.go DetectContentType failing to detect text/html correctly #16275

Closed
turbodonkey opened this issue Jul 6, 2016 · 5 comments

@turbodonkey

turbodonkey commented Jul 6, 2016

Please answer these questions before submitting your issue. Thanks!

  1. What version of Go are you using (go version)?
    go version go1.6.2 darwin/amd64
  2. What operating system and processor architecture are you using (go env)?
    GOARCH="amd64"
    GOBIN=""
    GOEXE=""
    GOHOSTARCH="amd64"
    GOHOSTOS="darwin"
    GOOS="darwin"
    GOPATH=""
    GORACE=""
    GOROOT="/usr/local/go"
    GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
    GO15VENDOREXPERIMENT="1"
    CC="clang"
    GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fno-common"
    CXX="clang++"
    CGO_ENABLED="1"
  3. What did you do?
package main

import (
    "fmt"
    "net/http"
    "io/ioutil"
)

func main() {
    fp, err := ioutil.ReadFile("test.html")
    if err != nil {
        fmt.Printf("error opening file %s\n", err)
        return
    }
    fmt.Printf("Content-type %s\n", http.DetectContentType(fp))
}

With test.html being:

<!--
this is a dummy comment
-->

<html>
        this is a simple html page
</html>

output: Content-type text/plain; charset=utf-8

  4. What did you expect to see?

output: Content-type text/html; charset=utf-8

Basically, DetectContentType should recognize the '<!--' at the start of the .html file, but it does not. As soon as I remove the '<!--' comment, the file is correctly detected as text/html.

  5. What did you see instead?

Content-type text/plain; charset=utf-8

@ianlancetaylor ianlancetaylor added this to the Go1.8 milestone Jul 6, 2016
@fraenkel
Contributor

fraenkel commented Jul 6, 2016

How do you know it isn't an XML document?

@davecheney
Contributor

A valid HTML, XHTML, XML, or HTML4 document must not start with anything
except an opening element, ie , , etc. <!-- is a comment, it
is not permitted at the top level.




@turbodonkey
Author

ok you can take the spec / standard high road on this, or we could improve the code. right now, '<!--' is considered text/html (from sniff.go):

// Data matching the table in section 6.
var sniffSignatures = []sniffSig{
    htmlSig("<!DOCTYPE HTML"),
    htmlSig("<HTML"),
    htmlSig("<HEAD"),
    htmlSig("<SCRIPT"),
    htmlSig("<IFRAME"),
    htmlSig("<H1"),
    htmlSig("<DIV"),
    htmlSig("<FONT"),
    htmlSig("<TABLE"),
    htmlSig("<A"),
    htmlSig("<STYLE"),
    htmlSig("<TITLE"),
    htmlSig("<B"),
    htmlSig("<BODY"),
    htmlSig("<BR"),
    htmlSig("<P"),
    htmlSig("<!--"),

The issue is in this line of htmlSig:

    // Next byte must be space or right angle bracket.
    if db := data[len(h)]; db != ' ' && db != '>' {
        return ""
    }

My HTML files do not have a ' ' immediately after '<!--'; they have a newline instead.
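
The effect of that check can be reproduced without a file on disk. What follows is a minimal sketch that assumes only the standard net/http package; the sniff helper and the sample strings are made up for illustration and are not part of sniff.go. The two inputs differ only in the byte that follows '<!--'.

```go
package main

import (
	"fmt"
	"net/http"
)

// sniff is a hypothetical helper (not part of net/http) that wraps
// http.DetectContentType so the samples can be written as strings.
func sniff(s string) string {
	return http.DetectContentType([]byte(s))
}

func main() {
	// "<!--" followed by a newline: the byte after the signature is
	// neither ' ' nor '>', so the HTML signature is not matched.
	fmt.Println(sniff("<!--\nthis is a dummy comment\n-->\n<html></html>"))

	// "<!--" followed by a space: the HTML signature is matched.
	fmt.Println(sniff("<!-- this is a dummy comment -->\n<html></html>"))
}
```

With the check quoted above, the first call should report text/plain; charset=utf-8 (the result from the report) and the second text/html; charset=utf-8.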

@bradfitz
Contributor

bradfitz commented Jul 6, 2016

ok you can take the spec / standard high road on this, or we could improve the code

We're going to follow the spec. If we don't draw the line somewhere then DetectContentType will be under constant churn and feature requests.

Please file a bug at https://github.com/whatwg/mimesniff/issues if you disagree with the spec. Once fixed upstream, we'll fix Go.

Until then, unless there's some place where we're in violation of the mimesniff spec, I'm going to close this bug.
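
Since DetectContentType stays with the spec, a caller that wants the looser behavior can apply it on its own side. The sketch below is only one possible workaround and deliberately deviates from mimesniff; detectLenient is a hypothetical name, and the choice to accept a newline, carriage return or tab after '<!--' is an assumption made for illustration, not anything net/http provides.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// detectLenient is a hypothetical caller-side wrapper, not part of net/http.
// It reports text/html when the data, after leading whitespace, starts with
// "<!--" followed by a newline, carriage return or tab. In every other case
// it defers to http.DetectContentType.
func detectLenient(data []byte) string {
	// Skip the same leading whitespace bytes the mimesniff spec ignores.
	trimmed := bytes.TrimLeft(data, "\t\n\x0c\r ")
	sig := []byte("<!--")
	if bytes.HasPrefix(trimmed, sig) && len(trimmed) > len(sig) {
		switch trimmed[len(sig)] {
		case '\n', '\r', '\t':
			return "text/html; charset=utf-8"
		}
	}
	return http.DetectContentType(data)
}

func main() {
	page := []byte("<!--\nthis is a dummy comment\n-->\n<html></html>")
	fmt.Println(detectLenient(page))
	fmt.Println(detectLenient([]byte("just some text")))
}
```

The wrapper guesses HTML from the comment opener alone, so an XML document that begins with a comment would also be labelled text/html; that ambiguity is the one raised earlier in the thread.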

@bradfitz bradfitz closed this as completed Jul 6, 2016
@turbodonkey
Author

That's fair, I'll chase with mimesniff! Thanks for the immediate response, all.

@golang golang locked and limited conversation to collaborators Jul 7, 2017