mime: BEncoding and QEncoding don't respect the 75 character limit in RFC2047 #12300

joegrasse · 2015-08-24T20:03:47Z

The new mime.BEncoding.Encode and mime.QEncoding.Encode functions (documented here) don't respect the 75-character limit on encoded-words.

Excerpt from RFC2047:

An 'encoded-word' may not be more than 75 characters long, including
'charset', 'encoding', 'encoded-text', and delimiters. If it is
desirable to encode more text than will fit in an 'encoded-word' of
75 characters, multiple 'encoded-word's (separated by CRLF SPACE) may
be used.

alexcesaro · 2015-08-24T21:08:03Z

Are you noticing bugs because of it?

Automatically breaking encoded-words is complicated because of this:

Each 'encoded-word' MUST represent an integral number of characters.
A multi-octet character may not be split across adjacent 'encoded-word's.

Users can use the charset they want so we cannot be sure where to break encoded-words.

Also:

1. The 75-char limit is optional according to the RFC.
2. Users can manually break words where they want.
3. Popular services like Gmail do not respect this 75-char limit.

That is why the 75-char limit was not implemented.
However if it is really causing bugs in email clients we could automatically break encoded-words when the charset is UTF-8 or when there is a space character for example.

joegrasse · 2015-08-25T12:46:12Z

1. I am not seeing where the 75-char limit is optional in RFC 2047. I could be overlooking it, though.
2. That is true; however, if the functions are said to encode according to RFC 2047, and the 75-char limit isn't optional, then they should do the breaking for you.
3. What makes you think Gmail doesn't respect the 75-char limit?

alexcesaro · 2015-08-25T13:07:51Z

The word "may" usually indicates that a rule is optional. See RFC 2119.
Just try sending an email with Gmail with a long subject containing special characters.

Again, are you getting bugs with an email client because of long encoded-words?

joegrasse · 2015-08-25T14:56:35Z

Good to know.
I have tested gmail and it was doing line folding.

I have not done extensive testing yet.

I think at the very least the functions should do folding for UTF-8.
Also, is the header field value included or excluded from the char limit? I have seen some popular languages allow for this as a parameter.

alexcesaro · 2015-08-25T15:30:22Z

I don't understand your last question. What do you mean by "header field value"?

joegrasse · 2015-08-25T15:53:56Z

For example:

To: "Test" <test@test.com>
Subject: This is the subject

"To" and "Subject" would be what I was referring to. Some of the popular languages allow for passing of 4 for the "To" header and 9 for the "Subject" header. So they can take that into account when folding to 75 chars.

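The offsets mentioned above (4 for "To", 9 for "Subject") are the header name plus the colon and the space that follow it. A small sketch of that arithmetic, assuming the 76-character limit that RFC 2047 places on header lines containing encoded-words; `headerOffset` and `firstLineBudget` are hypothetical names, not functions from any library.

```go
package main

import "fmt"

// lineLimit is the RFC 2047 limit for a header line that contains
// one or more encoded-words.
const lineLimit = 76

// headerOffset is a hypothetical helper: the number of characters taken
// on the first line by the header name, the colon, and one space.
func headerOffset(name string) int {
	return len(name) + len(": ")
}

// firstLineBudget is a hypothetical helper: the characters left for the
// field body on the first line of the header.
func firstLineBudget(name string) int {
	return lineLimit - headerOffset(name)
}

func main() {
	for _, name := range []string{"To", "Subject"} {
		fmt.Printf("%s: offset %d, first-line budget %d\n",
			name, headerOffset(name), firstLineBudget(name))
	}
}
```
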
joegrasse · 2015-08-28T13:43:15Z

After looking at this a little more, I am not sure that line folding is optional. You seem to think it is optional because of the use of "may" in the excerpt above. However, I believe that whenever they indicate requirement levels, they capitalize the keywords. If you look at RFC 2047, there are several instances where MAY is capitalized (along with other keywords). In RFC 2119 the keywords are also capitalized (even though it does say "These words are often capitalized"). So it is suspect that the word isn't capitalized in the instance referred to.

akavel · 2015-09-09T17:35:15Z

I may be wrong, but I believe that in English, while "may" is a "soft" qualifier, "may not" is a "hard" qualifier (as with "can" and "can not"). RFC 2119 mentions MAY, but it doesn't mention MAY NOT, while it does mention both SHOULD and SHOULD NOT, etc. Also, in the specific RFC 2047 discussed here, there's (among others) another fragment with "may not", which I believe hints at the "hard"/"can not" meaning (emphasis added by me):

Each 'encoded-word' MUST encode an integral number of octets. The
'encoded-text' in each 'encoded-word' must be well-formed according
to the encoding specified; the 'encoded-text' may not be continued in
the next 'encoded-word'. (For example, "=?charset?Q?=?=
=?charset?Q?AB?=" would be illegal, because the two hex digits "AB"
must follow the "=" in the same 'encoded-word'.)

and another similar one just below it:

Each 'encoded-word' MUST represent an integral number of characters.
A multi-octet character may not be split across adjacent
'encoded-word's.

alexcesaro · 2015-09-10T08:48:18Z

I MAY do a CL to break words in UTF-8 😄
I had it done in a previous CL, I will try to find it.

gopherbot · 2015-09-24T22:01:31Z

CL https://golang.org/cl/14957 mentions this issue.

joegrasse · 2015-10-19T16:42:27Z

@alexcesaro or @bradfitz, do either of you know which release this fix will be included in?

bradfitz · 2015-10-19T16:47:13Z

1.6

joegrasse · 2015-10-19T17:35:45Z

Thanks.

mikioh changed the title ~~The mime BEncoding and QEncoding don't respect the 75 character limit in RFC2047~~ mime: BEncoding and QEncoding don't respect the 75 character limit in RFC2047 Aug 24, 2015

ianlancetaylor added this to the Unplanned milestone Aug 25, 2015

bradfitz closed this as completed in 65fc379 Oct 15, 2015

golang locked and limited conversation to collaborators Oct 24, 2016

gopherbot added the FrozenDueToAge label Oct 24, 2016
