New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
mime/multipart: quoted printable decoding fails with invalid empty body #6507
Comments
Here is the complete example: MIME-Version: 1.0 Content-Type: multipart/related; charset="utf-8"; boundary="===============9099738862860321038==" Content-Transfer-Encoding: quoted-printable This is a multi-part message in MIME format. --===============9099738862860321038== MIME-Version: 1.0 Content-Type: multipart/alternative; charset="utf-8"; boundary="===============3362771883058111967==" Content-Transfer-Encoding: quoted-printable --===============3362771883058111967== Content-Type: text/html; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable <p>Hello</p> =20 --===============3362771883058111967==-- --===============9099738862860321038==-- |
I'm not quite sure what the appropriate "fix" for this is, since the input is obviously invalid, but there certainly has to be a way to prevent/circumvent this, either by: 1) Disabling the automatic QP decoding with a flag on multipart.Reader (preferred) 2) Patching line 109 `if bp.Header.Get(cte) == "quoted-printable" {` in mime/multipart/multipart.go to check for a multipart content type. |
Github seems to have mangled whitespace in the original examples. I have I think reproduced this here: https://play.golang.org/p/Kbw25pNXt_ but the code in mime/multipart has changed in the last few years, it seems the error message has changed to ErrUnexpectedEOF and is now on line 179 in Read. The original suggestion was to add a flag to allow disabling transparent decoding of quoted printable encoding in parts which sounds reasonable. I'll put up a change for review adding a boolean flag to Reader. |
Change https://golang.org/cl/74191 mentions this issue: |
@ianlancetaylor re CL 74191, I should probably have checked here first what needs fix and help wanted means - I took it to mean that mime/multipart should allow disabling the automatic decoding of parts. The initial issue here is about an invalid message, but it also points out that there is no way to disable automatic decoding of quoted-printable within a mime message. This means if a service receives mime messages which may be malformed like the ones above (or in some other way within a part where the quoted printable fails), but which they'd like to be able to decode into parts, they'd have to munge them first to fix them up, or parse them without the standard library. Hence the suggestion above of adding a flag that lets the user set whether multipart decodes quoted-printable on parts which seems a more general fix than handling this specific malformed mime with a special case when parsing, but I see why you might not want to expand the API surface. There are two possible solutions offered above and perhaps there are other fixes possible. Happy to look at a different fix if you have a preference, or leave this for now pending a decision. |
There are lots of different ways for messages to be invalid. If you know ahead of time that the input is formatted incorrectly in a particular way, you can apply a correction to it as it arrives. If you don't know ahead of time, then a new API knob won't help, because you won't know that you need to set it. I don't know what the right answer is here, but I would need to be convinced that a new API knob is the right approach. |
I spent a while digging into this issue and I don't understand why it is even an open issue. As far as I can see, the multipart parser correctly rejects an invalid multipart message. What is the argument for not rejecting an invalid multipart message? The usual argument would be that some popular piece of software generates these inputs, so we must be able to deal with them. But I don't see that argument being made: I see nothing about how this might arise in practice. Short of a compelling reason to accept invalid input, we should probably continue to reject invalid input. The full message in the example (with one annotation from me) is:
The problem with this message is not an "invalid empty body" but the fact that both the lines line I marked are present. RFC 2046 says:
Content-Transfer-Encoding: quoted-printable is not allowed in multipart/ headers. It's arguably a bug that the processing even reaches the second, when the first is wrong too. Maybe we should put in an error check like this:
but I think that will catch the first line, which we're currently allowing. The right answer here depends on why anyone cares about this case, which we need more information about. Moving to Go 1.11. |
Closing per previous comment. |
The text was updated successfully, but these errors were encountered: