mime/multipart: parsing error with embedded multipart #46042

Closed
oszika opened this issue May 7, 2021 · 7 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.

@oszika

oszika commented May 7, 2021

What version of Go are you using (go version)?

$ go version
go version go1.16.3 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/oszika/.cache/go-build"
GOENV="/home/oszika/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/oszika/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/oszika/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/lib/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.16.3"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/tmp/ex/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build1080126920=/tmp/go-build -gno-record-gcc-switches"

What did you do?

I defined an embedded multipart inside the main multipart. The embedded boundary is the outer boundary followed by a dash and a suffix ("foo" and "foo-sub"), so the two boundaries share a prefix.

https://play.golang.org/p/HlZiGseSb4b

What did you expect to see?

All parts of the main multipart.

Part "A": "--foo-sub\nFoo: A1\n\nSubcontent 1\n\n--foo-sub\nFoo: A2\n\nSubcontent 2\n\n--foo-sub--\n"
Part "B": "Content 2\n"

What did you see instead?

Part "A": ""
2009/11/10 23:00:00 multipart: unexpected line in Next(): "--foo-sub\n"
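
For readers without access to the playground link, here is a minimal sketch of a reproduction. The input is a reconstruction based on the expected output quoted above, not the original playground program: the header values "A" and "B", the LF line endings, and the helper name readParts are assumptions. The result depends on the Go release in use, so no output is shown.

```go
package main

import (
	"fmt"
	"io"
	"mime/multipart"
	"strings"
)

// body is an assumed reconstruction of the reported input:
// outer boundary "foo", nested boundary "foo-sub", LF line endings.
const body = "--foo\n" +
	"Foo: A\n\n" +
	"--foo-sub\nFoo: A1\n\nSubcontent 1\n\n" +
	"--foo-sub\nFoo: A2\n\nSubcontent 2\n\n" +
	"--foo-sub--\n\n" +
	"--foo\n" +
	"Foo: B\n\n" +
	"Content 2\n\n" +
	"--foo--\n"

// readParts (hypothetical helper) returns the bodies of the outer parts
// that were read before the end of input or the first error.
func readParts(input, boundary string) ([]string, error) {
	r := multipart.NewReader(strings.NewReader(input), boundary)
	var parts []string
	for {
		p, err := r.NextPart()
		if err == io.EOF {
			return parts, nil
		}
		if err != nil {
			return parts, err
		}
		data, err := io.ReadAll(p)
		if err != nil {
			return parts, err
		}
		parts = append(parts, string(data))
	}
}

func main() {
	parts, err := readParts(body, "foo")
	for i, p := range parts {
		fmt.Printf("part %d: %q\n", i, p)
	}
	if err != nil {
		fmt.Println("error:", err)
	}
}
```

On a release with the behavior reported here, the first part comes back empty and the loop stops with the "unexpected line in Next()" error.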

Note

In the func matchAfterPrefix at https://github.com/golang/go/blob/master/src/mime/multipart/multipart.go#L281, maybe a single '-' after the boundary should not be enough to treat the line as a boundary, since the closing dash-boundary-dash marker ends with two dashes?

@seankhliao added the NeedsFix label May 8, 2021
@seankhliao
Member

cc @bradfitz @minux

@gopherbot

Change https://golang.org/cl/338549 mentions this issue: mime/multipart: nested boundary can have the outer boundary as prefix followed by a dash

@neild
Contributor

neild commented Aug 5, 2021

Summarizing comment on https://golang.org/cl/338549:

By my reading of RFC 2046 section 5.1.1 a boundary delimiter must consist of the boundary parameter, either the string "--" or 0-2 characters of linear whitespace, and a newline. Currently, mime/multipart checks for the boundary parameter and a character in the set " \t\r\n-", which seems wrong to me.
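
The difference between the two checks can be sketched as follows. This is an illustration only, not the mime/multipart source: looseMatch and strictMatch are hypothetical names, looseMatch follows the description of the current check given above, and strictMatch encodes one possible reading of the RFC (boundary, optional "--", optional spaces or tabs, then a newline).

```go
package main

import (
	"bytes"
	"fmt"
)

// looseMatch: the line starts with "--"+boundary and the next byte,
// if any, is one of " \t\r\n-".
func looseMatch(line []byte, boundary string) bool {
	prefix := []byte("--" + boundary)
	if !bytes.HasPrefix(line, prefix) {
		return false
	}
	if len(line) == len(prefix) {
		return true
	}
	return bytes.IndexByte([]byte(" \t\r\n-"), line[len(prefix)]) >= 0
}

// strictMatch: "--"+boundary, an optional "--", optional spaces or
// tabs, and then nothing but an optional line ending.
func strictMatch(line []byte, boundary string) bool {
	prefix := []byte("--" + boundary)
	if !bytes.HasPrefix(line, prefix) {
		return false
	}
	rest := bytes.TrimPrefix(line[len(prefix):], []byte("--"))
	rest = bytes.TrimLeft(rest, " \t")
	return len(rest) == 0 || string(rest) == "\n" || string(rest) == "\r\n"
}

func main() {
	for _, l := range []string{"--foo\n", "--foo--\n", "--foo-sub\n"} {
		fmt.Printf("%q loose=%v strict=%v\n",
			l, looseMatch([]byte(l), "foo"), strictMatch([]byte(l), "foo"))
	}
}
```

Only the last line, "--foo-sub\n", is treated differently: the loose check takes it for an outer boundary because of the single dash, while the strict check does not.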

@neild
Contributor

neild commented Aug 5, 2021

Reading RFC 2046 more closely, in particular the BNF in section 5.1.1, I'm now unconvinced that it is valid for an embedded multipart part to use a boundary that has the outer boundary as a prefix. (e.g., outer part has boundary="foo", embedded part has boundary "foo-sub".)

The relevant part of the BNF is:

     dash-boundary := "--" boundary
                      ; boundary taken from the value of
                      ; boundary parameter of the
                      ; Content-Type field.

     delimiter := CRLF dash-boundary

     body-part := MIME-part-headers [CRLF *OCTET]
                  ; Lines in a body-part must not start
                  ; with the specified dash-boundary and
                  ; the delimiter must not appear anywhere
                  ; in the body part.  Note that the
                  ; semantics of a body-part differ from
                  ; the semantics of a message, as
                  ; described in the text.

This clearly states that the delimiter must not appear in a body part. If the delimiter is "\r\n--foo", then an embedded multipart part cannot use "\r\n--foo-sub" as a delimiter without violating this restriction.
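
As a small illustration of that restriction, the check below looks for the outer delimiter inside a body part. The helper name containsDelimiter and the sample body are assumptions made for this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// containsDelimiter reports whether body contains CRLF + "--" + boundary,
// i.e. the delimiter from the BNF above, anywhere in its content.
func containsDelimiter(body, boundary string) bool {
	return strings.Contains(body, "\r\n--"+boundary)
}

func main() {
	// Body of an outer part that embeds a multipart with boundary "foo-sub".
	nested := "Foo: A\r\n\r\n--foo-sub\r\nFoo: A1\r\n\r\nSubcontent 1\r\n--foo-sub--\r\n"

	fmt.Println(containsDelimiter(nested, "foo")) // true: "\r\n--foo" is a prefix of "\r\n--foo-sub"
	fmt.Println(containsDelimiter(nested, "bar")) // false
}
```

With an outer boundary of "foo", every nested "foo-sub" delimiter line also contains the outer delimiter, which is exactly what the BNF comment forbids.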

@seankhliao
Member

The supporting text in 5.1 says

As stated previously, each body part is preceded by a boundary
delimiter line that contains the boundary delimiter. The boundary
delimiter MUST NOT appear inside any of the encapsulated parts, on a
line by itself or as the prefix of any line

and 5.1.1

Boundary delimiters must not appear within the encapsulated material

NOTE TO IMPLEMENTORS: Boundary string comparisons must compare the
boundary value with the beginning of each candidate line. An exact
match of the entire candidate line is not required; it is sufficient
that the boundary appear in its entirety following the CRLF.

NOTE: Because boundary delimiters must not appear in the body parts
being encapsulated, a user agent must exercise care to choose a
unique boundary parameter value.
...
Alternate algorithms might result in more "readable" boundary
delimiters for a recipient with an old user agent, but would require
more attention to the possibility that the boundary delimiter might
appear at the beginning of some line in the encapsulated part.

This makes it pretty clear that an embedded boundary that has the outer boundary as a prefix is invalid.

@oszika
Author

oszika commented Aug 5, 2021

Indeed I missed that in the RFC.

That said, this part of the RFC does not seem to be strictly followed in practice, as in issue #10616.
The mime/multipart implementation already accepts an embedded boundary that has the outer boundary as a prefix, except when the next character is a dash.
This example is inspired by real received emails.

@neild
Contributor

neild commented Aug 30, 2021

As a data point, the Python 3 MIME parser accepts invalid multipart bodies in which the boundary delimiter appears within the body. I haven't checked any other implementations, but it wouldn't surprise me if others accept this as well.

#10616 is another case where mime/multipart correctly rejected an invalid message, but we relaxed the parser to accept messages seen in the wild. It seems reasonable to do the same here as well.

@dmitshur added this to the Go1.19 milestone Apr 1, 2022
@golang locked and limited conversation to collaborators Apr 1, 2023
5 participants