mime: add ".txt": "text/plain; charset=utf-8" to builtinTypesLower #46578

Closed
mbucc opened this issue Jun 4, 2021 · 5 comments
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.

Comments

@mbucc
Contributor

mbucc commented Jun 4, 2021

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

$ go version
go version go1.15.10 linux/386

What did you do?

Run the following program on Alpine Linux.

package main

import (
	"fmt"
	"mime"
	"path/filepath"
)

func main() {
	name := "x.txt"
	ext := filepath.Ext(name) // ".txt"
	// TypeByExtension returns "" when it has no mapping for ext.
	fmt.Printf("mime type for extension '%s' = %v\n", ext, mime.TypeByExtension(ext))
}

What did you expect to see?

mime type for extension '.txt' = text/plain; charset=utf-8

What did you see instead?

mime type for extension '.txt' =

I suggest adding .txt to the list of built-in MIME types.

Before I built the small program above to isolate the problem, the behavior I saw was:

  1. Built a Go web server that used http.ServeContent to serve static files.
  2. On my development box (OSX), the MIME type was set correctly and the browser rendered the content as plain text.
  3. On the production box, the MIME type was blank and the browser rendered text as HTML.

At first, I assumed the Nginx proxy was misbehaving. Eventually, I figured out that Alpine base does not provide any of the four MIME files that mime/type_unix.go looks for, and mime/type.go does not provide a built-in default for the .txt extension.

I opened a bug with Alpine, but they did not think this corner case was worth adding /etc/mime.types to the Alpine base. I can see their point: this is a corner case, and the mime-type file used depends on the application more than the host. Mail, Apache, and Nginx all have different ideas about where the mime-type file lives, and the latter two even disagree on the file format. There does not seem to be a standard location.

Does this issue reproduce with the latest release (go1.16.5)?

Based on my read of these two sources, yes.

System details

GO111MODULE=""
GOARCH="386"
GOBIN=""
GOCACHE="/home/mark/.cache/go-build"
GOENV="/home/mark/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="386"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/mark/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/mark/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/lib/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go/pkg/tool/linux_386"
GCCGO="gccgo"
GO386="387"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m32 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build108867270=/tmp/go-build -gno-record-gcc-switches"
GOROOT/bin/go version: go version go1.15.10 linux/386
GOROOT/bin/go tool compile -V: compile version go1.15.10
uname -sr: Linux 4.4.68-0-grsec
@seankhliao seankhliao changed the title mime/types.go: Add ".txt": "text/plain; charset=utf-8" to builtinTypesLower mime: add ".txt": "text/plain; charset=utf-8" to builtinTypesLower Jun 4, 2021
@seankhliao seankhliao added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Jun 4, 2021
@seankhliao
Member

cc @neild

@neild
Contributor

neild commented Jun 4, 2021

Go 1.17 adds support on Unix-like systems for reading the FreeDesktop Shared MIME-info Database, which is a common specification supported by many Linux variants. There is a Shared MIME-Info Alpine package.

While .txt is certainly a common extension, using the system database seems strictly better than adding to the built-in list.

@mbucc
Contributor Author

mbucc commented Jun 6, 2021

I was curious how other languages handle this. All the languages I checked (other than Erlang and Java) take the approach suggested in this bug report. The Erlang HTTP code raises an error if it can't find a mime.types file, and Go takes the same approach as Java. (How often can you say that? :) )

Note the change you mention does not address this issue. There were already four such mime type files loaded in mime/type_unix.go, none of which are provided in Alpine base. This is an undocumented, run-time, external package dependency that changes how the same Go code behaves. Can you think of another instance where go library behaves this way?

The nuclear option is to have mime.TypeByExtension and http.ServeContent return an error if there is no mime type file found, and zap the list of defaults (the Erlang approach). But that could break existing code, and does not seem worth it for this corner case.

You could also simply update the docstrings in these two methods to document this dependency. But that "fix" depends on people reading the docs. :|

I think the safest way forward is to just expand the array of default values, perhaps following Haskell's lead and autogenerating from Nginx and Apache configs. Or just copy and paste the list from them.

Here are the links to how other languages handle this:

The Haskell mime-types package contains a long list of default mime-types, autogenerated "from the Apache and nginx mime.types files": https://www.stackage.org/haddock/nightly-2021-06-01/mime-types-0.1.0.9/src/Network.Mime.html#MimeMap

The Rust mime crate also has a long list of defaults: https://docs.rs/mime/0.3.16/mime/

Erlang is a bit different, requiring you to drop a mime.types file in a conf directory: http://erlang.org/documentation/doc-4.8.2/lib/inets-2.3.1/doc/html/httpd_core.html#mime. It raises an error if it cannot find the mime-types files: https://github.com/erlang/otp/blob/master/lib/inets/src/http_server/httpd_conf.erl#L70-L76

Python has a long list of defaults: https://github.com/python/cpython/blob/8e2c0fd7ada79107f7e0d9c465e77fb36a9486e5/Lib/mimetypes.py#L414-L561

Ruby's mime library has a long list of defaults, but they add value by taking the simple mime file format and turning that into a set of 13 yaml files: https://github.com/mime-types/mime-types-data/tree/master/types, where each distinct mimetype is represented as eight lines of yaml.

It looks like Java takes a similar approach to Go: look for a set of "magic" filenames and fall back to a small set of built-in defaults: https://docs.oracle.com/javase/8/docs/api/javax/activation/MimetypesFileTypeMap.html, and

mark@Marks-MBP-3 tmp % unzip /Users/mark/src/other/jboss-eap-6.1/modules/system/layers/base/javax/activation/api/main/activation-1.1.1-redhat-2.jar META-INF/mimetypes.default
Archive:  /Users/mark/src/other/jboss-eap-6.1/modules/system/layers/base/javax/activation/api/main/activation-1.1.1-redhat-2.jar
  inflating: META-INF/mimetypes.default  
mark@Marks-MBP-3 tmp % cat META-INF/mimetypes.default 
#
# A simple, old format, mime.types file
#
text/html		html htm HTML HTM
text/plain		txt text TXT TEXT
image/gif		gif GIF
image/ief		ief
image/jpeg		jpeg jpg jpe JPG
image/tiff		tiff tif
image/png		png PNG
image/x-xwindowdump	xwd
application/postscript	ai eps ps
application/rtf		rtf
application/x-tex	tex
application/x-texinfo	texinfo texi
application/x-troff	t tr roff
audio/basic		au
audio/midi		midi mid
audio/x-aifc		aifc
audio/x-aiff            aif aiff
audio/x-mpeg		mpeg mpg
audio/x-wav             wav
video/mpeg		mpeg mpg mpe
video/quicktime		qt mov
video/x-msvideo		avi
mark@Marks-MBP-3 tmp % 

@seankhliao
Member

> undocumented

It's here: https://pkg.go.dev/mime#TypeByExtension

> Can you think of another instance where go library behaves this way?

tzdata, system certificates

@mbucc
Contributor Author

mbucc commented Jun 6, 2021

Fair enough. Closing.

@mbucc mbucc closed this as completed Jun 6, 2021
@golang golang locked and limited conversation to collaborators Jun 6, 2022