Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/json: string option (struct tag) on string field with SetEscapeHTML(false) escapes anyway #34154

Closed
breml opened this issue Sep 6, 2019 · 1 comment
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@breml
Copy link
Contributor

breml commented Sep 6, 2019

If using encoding/json.NewEncoder with SetEscapeHTML(false) an a field, that has the string option in the struct tags set, the HTML content gets escaped anyways.

What version of Go are you using (go version)?

$ go version
go version go1.12.5 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/lubr/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/lubr/go"
GOPROXY=""
GORACE=""
GOROOT="/home/lubr/.gvm/versions/go1.12.5.linux.amd64"
GOTMPDIR=""
GOTOOLDIR="/home/lubr/.gvm/versions/go1.12.5.linux.amd64/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build368255501=/tmp/go-build -gno-record-gcc-switches"

What did you do?

package main

import (
	"encoding/json"
	"os"
)

type Foo struct {
	Bar string `json:"bar,string"`
}

func main() {
	j := Foo{
		Bar: `<html>foobar</html>`,
	}
	e := json.NewEncoder(os.Stdout)
	e.SetEscapeHTML(false)
	_ = e.Encode(j)
}

What did you expect to see?

$  go run main.go
{"bar":"\"<html>foobar</html>\""}

What did you see instead?

$  go run main.go
{"bar":"\"\\u003chtml\\u003efoobar\\u003c/html\\u003e\""}

#34127 does fix this issue

breml added a commit to breml/go that referenced this issue Sep 6, 2019
Since Go 1.2 every string can be marshaled to JSON without error even if it
contains invalid UTF-8 byte sequences. Therefore there is no need to use
Marshal again for the only reason of enclosing the string in double quotes.
Not using Marshal here also removes the error check as there has not been a
way for Marshal to fail anyway.

Additionally, this code runs significantly faster for quoted string fields
(30 - 45% in my measurements, depending on the length of the string that is
processed).

Fixes: golang#34154
@gopherbot
Copy link

Change https://golang.org/cl/193604 mentions this issue: encoding/json: optimize marshal for quoted string

breml added a commit to breml/go that referenced this issue Sep 9, 2019
Since Go 1.2 every string can be marshaled to JSON without error even if it
contains invalid UTF-8 byte sequences. Therefore there is no need to use
Marshal again for the only reason of enclosing the string in double quotes.
Not using Marshal here also removes the error check as there has not been a
way for Marshal to fail anyway.

Additionally, this code runs significantly faster for quoted string fields
(30 - 45% in my measurements, depending on the length of the string that is
processed).

Fixes: golang#34154
breml added a commit to breml/go that referenced this issue Sep 9, 2019
Since Go 1.2 every string can be marshaled to JSON without error even if it
contains invalid UTF-8 byte sequences. Therefore there is no need to use
Marshal again for the only reason of enclosing the string in double quotes.
Not using Marshal here also removes the error check as there has not been a
way for Marshal to fail anyway.

Additionally, this code runs significantly faster for quoted string fields
(30 - 45% in my measurements, depending on the length of the string that is
processed).

Fixes: golang#34154
breml added a commit to breml/go that referenced this issue Sep 10, 2019
Since Go 1.2 every string can be marshaled to JSON without error even if it
contains invalid UTF-8 byte sequences. Therefore there is no need to use
Marshal again for the only reason of enclosing the string in double quotes.
Not using Marshal here also removes the error check as there has not been a
way for Marshal to fail anyway.

name old time/op new time/op delta
Issue34127-4 360ns ± 3% 200ns ± 3% -44.56% (p=0.008 n=5+5)

name old alloc/op new alloc/op delta
Issue34127-4 56.0B ± 0% 40.0B ± 0% -28.57% (p=0.008 n=5+5)

name old allocs/op new allocs/op delta
Issue34127-4 3.00 ± 0% 2.00 ± 0% -33.33% (p=0.008 n=5+5)

Fixes golang#34154
@FiloSottile FiloSottile added the NeedsFix The path to resolution is known, but the work has not been done. label Sep 10, 2019
@FiloSottile FiloSottile added this to the Go1.14 milestone Sep 10, 2019
@golang golang locked and limited conversation to collaborators Sep 10, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants