Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: embed: embed and expose also media/mime type #66121

Open
mitar opened this issue Mar 5, 2024 · 8 comments
Open

proposal: embed: embed and expose also media/mime type #66121

mitar opened this issue Mar 5, 2024 · 8 comments
Labels
Milestone

Comments

@mitar
Copy link
Contributor

mitar commented Mar 5, 2024

Proposal Details

I think this is a common use case: we use embed to embed some files at compile time and then we serve them at runtime using HTTP. But to serve files at runtime, one has to also know the media/mime type of the file, to set Content-Type correctly. One way to do that is to use standard mime.TypeByExtension on the embedded filenames. This works well if filenames have known file extensions and developer can generally predict the outcome if they notice a file extension is missing or if the file extension is an obscure one.

But what is hard to observe by a developer is that mime.TypeByExtension requires system installation of a corresponding database. If that database is missing, then behavior on their local machine might differ from the production machine. Even worse, the behavior might be different if the database is different (e.g., locally one uses Ubuntu and production uses Alpine).

To make production deployment work well, we use static builds. But sadly, because of the dependency on this database, even static build can behave differently.

Proposal

I propose that embedded files with embed.FS detect the media/mime type at compile time and expose it in some way to the developer (I am open to how exactly this is implemented). The critical point here is that this is done at compile where developer has easier time controlling the environment, so that the behavior is the same wherever the binary is later on deployed/ran.

Alternative proposal

Maybe there should be a way to embed currently available mime/media type database into the binary at compile time.

@mitar mitar added the Proposal label Mar 5, 2024
@gopherbot gopherbot added this to the Proposal milestone Mar 5, 2024
@Jorropo
Copy link
Member

Jorropo commented Mar 6, 2024

What happen if I build a program using this feature on two different machine with two different databases ? Does that means theses builds are not byte for byte reproducible anymore ?

That sounds like this could be solved by vendoring a generic enough database in the std and using that for builds, however then if we do that, there is no particular reason why this has to be done at build time for the usecase you describe to work.

@mitar
Copy link
Contributor Author

mitar commented Mar 6, 2024

What happen if I build a program using this feature on two different machine with two different databases ? Does that means theses builds are not byte for byte reproducible anymore ?

Yes, the database becomes then your compilation time dependency. So they are reproducible if you have all dependencies the same.

@Jorropo
Copy link
Member

Jorropo commented Mar 6, 2024

Pure go programs don't currently do this they are clear about what the input to the build (modules and your go toolchain). And I personally really dislike this about most C toolchains.

That a huge 👎 for me.

@mitar
Copy link
Contributor Author

mitar commented Mar 6, 2024

That sounds like this could be solved by vendoring a generic enough database in the std and using that for builds, however then if we do that, there is no particular reason why this has to be done at build time for the usecase you describe to work.

That could work. In a similar way time/tzdata is done.

@Jorropo
Copy link
Member

Jorropo commented Mar 6, 2024

I think this is a better solution, after checking there already exists a small table:

go/src/mime/type.go

Lines 60 to 77 in e8b5bc6

var builtinTypesLower = map[string]string{
".avif": "image/avif",
".css": "text/css; charset=utf-8",
".gif": "image/gif",
".htm": "text/html; charset=utf-8",
".html": "text/html; charset=utf-8",
".jpeg": "image/jpeg",
".jpg": "image/jpeg",
".js": "text/javascript; charset=utf-8",
".json": "application/json",
".mjs": "text/javascript; charset=utf-8",
".pdf": "application/pdf",
".png": "image/png",
".svg": "image/svg+xml",
".wasm": "application/wasm",
".webp": "image/webp",
".xml": "text/xml; charset=utf-8",
}

I think you should make a new proposal to extend this one (maybe seed it from some existing third party authority like the unicode tables are found).

@mitar
Copy link
Contributor Author

mitar commented Mar 6, 2024

I think you should make a new proposal to extend this one (maybe seed it from some existing third party authority like the unicode tables are found).

I do not think we should extend it for everyone, but allow opt-in by importing something like mime/data.

@Jorropo
Copy link
Member

Jorropo commented Mar 6, 2024

Maybe. I think you need an argument to why one more configuration knob is needed. If it could just be done, that better imo.

@mitar
Copy link
Contributor Author

mitar commented Mar 6, 2024

/etc/mime.types is 24 KB on my machine. I think including that in all builds might be disliked by some?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Incoming
Development

No branches or pull requests

3 participants