proposal: mime: handling duplicate media parameters #28618

Closed
neganovalexey opened this issue Nov 6, 2018 · 7 comments
Labels
FrozenDueToAge Proposal WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
@neganovalexey
Contributor

It is possible to receive an email that does not follow the specification, i.e. an email with a Content-Type header such as

text/plain; charset=UTF-8; charset=UTF-8; format=flowed

The Go standard library rejects duplicate media parameters, so it cannot parse such a header. Here, however, both instances of the 'charset' parameter have the same value, so the charset can be determined unambiguously.
I suggest making the mime.ParseMediaType function more tolerant of such errors: it should not stop parsing when the duplicate parameters it detects have the same value. To avoid changing the behavior of existing Go programs, it could return a special error value along with the parsed media type and parameters.

@gopherbot gopherbot added this to the Proposal milestone Nov 6, 2018
@ianlancetaylor
Contributor

It would help if you could point to packages that generate this invalid information, and if you could point to how other MIME parsing code, in other languages, handles this case. That is, we want to accept data that is out there in the wild, but we want to avoid being unnecessarily loose. Thanks.

@neganovalexey
Contributor Author

@ianlancetaylor I have found the following examples in Python, Java and C#. All of them return the charset as "UTF-8" correctly; none throws an error.

Example in Python 3:

>>> import cgi
>>> mimetype, options = cgi.parse_header("text/plain; charset=UTF-8; charset=UTF-8; format=flowed")
>>> print(mimetype)
text/plain
>>> print(options)
{'charset': 'UTF-8', 'format': 'flowed'}

Example in Java:

import org.apache.http.entity.ContentType;
import java.nio.charset.Charset;

class TestContentTypeParsing {
    public static void main(String[] args) {
        ContentType contentType = ContentType.parse("text/plain; charset=UTF-8; charset=UTF-8; format=flowed");
        Charset charset = contentType.getCharset();
        System.out.println(charset);
    }
}

prints "UTF-8"

Example in C#:

using System;
using System.Net.Mime;

public class TestContentTypeParsing
{
    static public void Main ()
    {
        var contentType = new ContentType("text/plain; charset=UTF-8; charset=UTF-8; format=flowed");
        Console.WriteLine("{0} ({1})", contentType.MediaType, contentType.CharSet);
    }
}

prints "text/plain (UTF-8)"

@neganovalexey
Contributor Author

The example in Go:

package main

import (
	"fmt"
	"mime"
)

func main() {
	mediaType, params, err := mime.ParseMediaType("text/plain; charset=UTF-8; charset=UTF-8; format=flowed")
	fmt.Printf("mediaType = '%s', params = %+v, err = %v\n", mediaType, params, err)
}

prints "mediaType = '', params = map[], err = mime: duplicate parameter name"
https://play.golang.org/p/kEMGv60ElWz

@ianlancetaylor
Contributor

Thanks, that wasn't quite what I was asking. Can you describe which programs generate the duplicate information? I'm trying to understand why we should make this change. If no program generates duplicates, then it seems to me that stricter is better, as it avoids any confusion about which parameter applies.

@rsc rsc added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Dec 12, 2018
@rsc
Contributor

rsc commented Dec 12, 2018

@neganovalexey you wrote "It is possible to receive an email ...". The important question is "is it likely?" Do you have instances of this happening in real use cases? If not, then we are unlikely to add what might end up being some kind of security hole (by picking one or the other differently from other software) for purely speculative motivations.

@gopherbot

Timed out in state WaitingForInfo. Closing.

(I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)

@xeruf

xeruf commented Jun 9, 2019

I actually have such an email in my Inbox. The header field is this:

Content-Type: text/html; charset=utf-8; charset=UTF-8

@golang golang locked and limited conversation to collaborators Jun 8, 2020

5 participants