mime: WordDecoder#DecodeHeader accepts space as a part of 'encoded-word' #19417

Closed
hirochachacha opened this issue Mar 6, 2017 · 2 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.

@hirochachacha
Contributor

Please answer these questions before submitting your issue. Thanks!

What did you do?

If possible, provide a recipe for reproducing the error.
A complete runnable program is good.
A link on play.golang.org is best.

package main

import (
	"fmt"
	"io"
	"mime"
)

type charsetError string

func (e charsetError) Error() string {
	return fmt.Sprintf("charset not supported: %q", string(e))
}

var rfc2047Decoder = mime.WordDecoder{
	CharsetReader: func(charset string, input io.Reader) (io.Reader, error) {
		return nil, charsetError(charset)
	},
}

func main() {
	s, err := rfc2047Decoder.DecodeHeader("=?UTF-8?Q?aaa =C2=A1Hola,_se=C3=B1or! bbb?=")
	if err != nil {
		fmt.Println(err)
	} else {
		fmt.Println(s)
	}
}

What did you expect to see?

An error, or the input returned unchanged: "=?UTF-8?Q?aaa =C2=A1Hola,_se=C3=B1or! bbb?="

What did you see instead?

aaa ¡Hola, señor! bbb

Does this issue reproduce with the latest release (go1.8)?

I think so

System details

go version devel +ccb319f Mon Mar 6 10:03:49 2017 +0900 darwin/amd64
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/hiro/.go"
GORACE=""
GOROOT="/Users/hiro/go"
GOTOOLDIR="/Users/hiro/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/wq/dwn8hs0x7njbzty9f68y61700000gn/T/go-build864028496=/tmp/go-build -gno-record-gcc-switches -fno-common"
CXX="clang++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOROOT/bin/go version: go version devel +ccb319f Mon Mar 6 10:03:49 2017 +0900 darwin/amd64
GOROOT/bin/go tool compile -V: compile version devel +ccb319f Mon Mar 6 10:03:49 2017 +0900 X:framepointer
uname -v: Darwin Kernel Version 16.4.0: Thu Dec 22 22:53:21 PST 2016; root:xnu-3789.41.3~3/RELEASE_X86_64
ProductName:	Mac OS X
ProductVersion:	10.12.3
BuildVersion:	16D32
lldb --version: lldb-360.1.70
gdb --version: GNU gdb (GDB) 7.12.1
@bradfitz bradfitz added this to the Go1.9Maybe milestone Mar 21, 2017
@bradfitz bradfitz added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jun 28, 2017
@bradfitz bradfitz modified the milestones: Go1.10, Go1.9Maybe Jun 28, 2017
@stapelberg
Contributor

Your CharsetReader is never called. This is because your example uses UTF-8, which is handled by mime.WordDecoder internally:

go/src/mime/encodedword.go

Lines 327 to 354 in a025277

	case strings.EqualFold("utf-8", charset):
		buf.Write(content)
	case strings.EqualFold("iso-8859-1", charset):
		for _, c := range content {
			buf.WriteRune(rune(c))
		}
	case strings.EqualFold("us-ascii", charset):
		for _, c := range content {
			if c >= utf8.RuneSelf {
				buf.WriteRune(unicode.ReplacementChar)
			} else {
				buf.WriteByte(c)
			}
		}
	default:
		if d.CharsetReader == nil {
			return fmt.Errorf("mime: unhandled charset %q", charset)
		}
		r, err := d.CharsetReader(strings.ToLower(charset), bytes.NewReader(content))
		if err != nil {
			return err
		}
		if _, err = buf.ReadFrom(r); err != nil {
			return err
		}
	}
	return nil
}

mime.WordDecoder’s documentation alludes to this, but perhaps isn’t clear enough:

        // Charsets are always lower-case. utf-8, iso-8859-1 and us-ascii charsets
        // are handled by default.

@hirochachacha
Contributor Author

@stapelberg Ah, true. Thank you for pointing that out. However, that's not relevant to the problem.
https://tools.ietf.org/html/rfc2047#section-2 says:

An 'encoded-word' is defined by the following ABNF grammar. The
notation of RFC 822 is used, with the exception that white space
characters MUST NOT appear between components of an 'encoded-word'.

I guess that's what I tried to fix before.
However, I changed my mind after filing the issue.
I'm not sure that just making the decoder stricter is worth doing.
That's why I didn't tackle this. Maybe we can just close this issue.
Someone can re-file it if they need it.

Thanks.

@golang golang locked and limited conversation to collaborators Nov 18, 2018