Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mime: FormatMediaType should quote control characters #7668

Closed
rui314 opened this issue Mar 31, 2014 · 13 comments
Closed

mime: FormatMediaType should quote control characters #7668

rui314 opened this issue Mar 31, 2014 · 13 comments

Comments

@rui314
Copy link
Member

rui314 commented Mar 31, 2014

What does 'go version' print?
go version devel +1afdecdd77c1 Sat Mar 29 17:10:25 2014 -0400 linux/amd64

What steps reproduce the problem?
http://play.golang.org/p/LcQhZpJkX2

1. Call mime.FormatMediaType with parameters containing an ASCII control character
2. See the return value

What happened?
It returns a formatted string containing a control character as is.

What should have happened instead?
It should quote a control character with \ or return the empty string. Here is the
definition of quoted-string (http://www.ietf.org/rfc/rfc2616.txt). A control character
can appear as quoted-pair but not as qdtext.

       quoted-string  = ( <"> *(qdtext | quoted-pair ) <"> )
       quoted-pair    = "\" CHAR
       CHAR           = <any US-ASCII character (octets 0 - 127)>
       qdtext         = <any TEXT except <">>
       TEXT           = <any OCTET except CTLs,
                        but including LWS>
       CTL            = <any US-ASCII control character
                        (octets 0 - 31) and DEL (127)>

(It's actually surprising to me that it allows even a NUL after a backslash, but it's
what the standard says. It may be a spec bug?)
@bradfitz
Copy link
Contributor

bradfitz commented Apr 1, 2014

Comment 1:

Status changed to Accepted.

@rsc
Copy link
Contributor

rsc commented May 21, 2014

Comment 2:

Labels changed: added release-go1.4.

@rsc
Copy link
Contributor

rsc commented Sep 15, 2014

Comment 3:

Brad, interested?

@rui314
Copy link
Member Author

rui314 commented Sep 24, 2014

Comment 4:

Owner changed to @rui314.

@jonathanpittman
Copy link

Comment 5:

I created https://golang.org/cl/150070043 to fix this issue, but I have some
questions about how we want to treat some of the LWS characters.
The definition for LWS is:
LWS            = [CRLF] 1*( SP | HT )
HT, CR, and LF all fall in the 0 - 31 CTL range.  I am currently not escaping HT, but
still letting it escape CR and LF.  It seems the definition of LWS, where it includes
CRLF, intends for CRLF to mean CR followed by LF not as individual characters.  Does
that interpretation seem right?  What is the desired behavior here for CR, LF, and HT?

@rui314
Copy link
Member Author

rui314 commented Sep 26, 2014

Comment 6:

The manual of the function says "When any of the arguments result in a standard
violation then FormatMediaType returns the empty string." If we want to strictly follow
that, I think we should return the empty string if we find a stray CR or LF.  CR needs
to be followed by LF and a space or a tab.
One thing I'm not sure if if we really want to be that strict about the standard.
Brad, what do you think?

@bradfitz
Copy link
Contributor

Comment 7:

This is so corner-case that it's hard for me to care one way or another. We could also
just return "" from FormatMediaType if we detect any byte from 0x00 - 0x19 and don't
even try to escape them. That's a very aggressive option, but I don't think anybody will
care or notice, either.
I'll let you decide.

@griesemer
Copy link
Contributor

Comment 8:

Labels changed: added repo-main.

@jonathanpittman
Copy link

Comment 9:

ruiu, how would you like to proceed with this?

@rui314
Copy link
Member Author

rui314 commented Oct 13, 2014

Comment 10:

Sorry for the belated response. I've been thinking about this. More I see this issue,
more I feel the spec is wrong or pointless at least.
For example, in order to include DEL character in a quoted string, one needs to add a
backslash before the DEL. But the DEL is transmitted as-is. And, of course, DEL cannot
be confused with closing '"'. So what's the point of adding a backslash before the DEL?
It's getting seem like a corner case that even the spec doesn't handle that correctly.
I'm leaning to leave it alone.

@bradfitz
Copy link
Contributor

Comment 11:

Labels changed: added release-none, removed release-go1.4.

Owner changed to ---.

Status changed to Thinking.

@andrius4669
Copy link
Contributor

Content-Type and Content-Transfer-Encoding are defined in RFC 2045:

     content := "Content-Type" ":" type "/" subtype
                *(";" parameter)
                ; Matching of media type and subtype
                ; is ALWAYS case-insensitive.
     parameter := attribute "=" value
     attribute := token
                  ; Matching of attributes
                  ; is ALWAYS case-insensitive.
     value := token / quoted-string
     token := 1*<any (US-ASCII) CHAR except SPACE, CTLs,
                 or tspecials>
     tspecials :=  "(" / ")" / "<" / ">" / "@" /
                   "," / ";" / ":" / "\" / <">
                   "/" / "[" / "]" / "?" / "="
                   ; Must be in quoted-string,
                   ; to use within parameter values

quoted-string as defined in RFC 2822:

NO-WS-CTL       =       %d1-8 /         ; US-ASCII control characters
                        %d11 /          ;  that do not include the
                        %d12 /          ;  carriage return, line feed,
                        %d14-31 /       ;  and white space characters
                        %d127
qtext           =       NO-WS-CTL /     ; Non white space controls
                        %d33 /          ; The rest of the US-ASCII
                        %d35-91 /       ;  characters not including "\"
                        %d93-126        ;  or the quote character
qcontent        =       qtext / quoted-pair
quoted-string   =       [CFWS]
                        DQUOTE *([FWS] qcontent) [FWS] DQUOTE
                        [CFWS]

RFC 5322 obsoletes RFC 2822 and defines it in different way:

   qtext           =   %d33 /             ; Printable US-ASCII
                       %d35-91 /          ;  characters not including
                       %d93-126 /         ;  "\" or the quote character
                       obs-qtext
   qcontent        =   qtext / quoted-pair
   quoted-string   =   [CFWS]
                       DQUOTE *([FWS] qcontent) [FWS] DQUOTE
                       [CFWS]
   quoted-pair     =   ("\" (VCHAR / WSP)) / obs-qp
; from RFC 5234
         VCHAR          =  %x21-7E
                                ; visible (printing) characters

so as I understand, it basically discourages any non-FWS control characters now.

@gopherbot
Copy link

Change https://golang.org/cl/154760 mentions this issue: mime: encode CTL and non-US-ASCII characters in FormatMediaType

@golang golang locked and limited conversation to collaborators May 22, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants