encoding/json: No way to avoid HTMLEscape when Marshal()-ing #8592

Closed
gopherbot opened this issue Aug 26, 2014 · 5 comments

Comments

@gopherbot

by surtri:

Before filing a bug, please check whether it has been fixed since the
latest release. Search the issue tracker and check that you're running the
latest version of Go:

Run "go version" and compare against
http://golang.org/doc/devel/release.html  If a newer version of Go exists,
install it and retry what you did to reproduce the problem.

Thanks.

What does 'go version' print?

What steps reproduce the problem?
If possible, include a link to a program on play.golang.org.

1. http://play.golang.org/p/SQ2j7c2mpQ
2. There used to be a distinction between Marshal() and MarshalForHTML(), but that was
removed because it was deemed that HTMLEscaping < and > would always be better.
For my use-case, I want to keep those characters because:

a) It makes the resulting JSON file smaller
b) It makes the raw JSON more readable

What happened?

"\u003chtml\u003e"

What should have happened instead?

"<html>"

Please provide any additional information below.

I know this issue has been discussed before
(https://golang.org/issue/3127) but I've been unable to find a way
to disable the HTMLEscape() method. Is there any way to get around that? It makes
everything much less readable and bigger (I transfer terabytes of JSON a month so these
small changes add up). I think there would be value in offering a method for opting out
of this automatic HTML escape feature.
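
For readers without access to the playground link, the reported behaviour can be reproduced with a minimal sketch like the one below. The helper name `marshalString` is hypothetical and is used only for illustration; the only library call involved is `json.Marshal`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// marshalString is a hypothetical helper that returns the JSON
// encoding of s exactly as json.Marshal produces it.
func marshalString(s string) string {
	b, err := json.Marshal(s)
	if err != nil {
		panic(err) // marshalling a plain string is not expected to fail
	}
	return string(b)
}

func main() {
	// json.Marshal escapes '<', '>' and '&' as \u003c, \u003e and \u0026.
	fmt.Println(marshalString("<html>")) // prints "\u003chtml\u003e"
}
```

Each escaped character takes six bytes instead of one, which is the size overhead the report refers to.
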
@bradfitz
Contributor

Comment 1:

Perhaps a bool on the Encoder struct.

Labels changed: added repo-main.
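
Editorial note: an option along these lines did not exist when this thread was written. Go 1.7 and later provide `SetEscapeHTML` on `json.Encoder`, which is close to the suggestion in Comment 1. The sketch below assumes Go 1.7 or newer; the helper name `encodeNoEscape` is hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// encodeNoEscape is a hypothetical helper that encodes v as JSON
// without escaping '<', '>' and '&'. It relies on
// (*json.Encoder).SetEscapeHTML, available from Go 1.7 onwards.
func encodeNoEscape(v interface{}) (string, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(false)
	if err := enc.Encode(v); err != nil {
		return "", err
	}
	// Encode terminates each value with a newline; it is kept here.
	return buf.String(), nil
}

func main() {
	out, err := encodeNoEscape("<html>")
	if err != nil {
		panic(err)
	}
	fmt.Print(out) // prints "<html>" followed by a newline
}
```

Note that `json.Marshal` itself still escapes these characters; the setting applies only to values written through an `Encoder`.
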

@gopherbot
Author

Comment 2 by wangz@google.com:

Automatic HTML-escaping is irrelevant to JSON encoding. Therefore the feature should be
turned off by default.
The "auto-escaping is safer" argument in the original thread
(https://golang.org/issue/3127) is bogus because it all depends on
the context where the serialized data is used.
Before this gets fixed, one quick workaround is to fork the encoding/json module and
remove "&& b != '<' && b != '>' && b != '&' " from the following two functions in
encode.go:
func (e *encodeState) string(s string) (int, error)
func (e *encodeState) stringBytes(s []byte) (int, error)

@rsc
Contributor

rsc commented Sep 16, 2014

Comment 3:

The result is valid JSON. The details of how it gets encoded are not guaranteed and
don't need to be under user control. The only guarantee is that the result is valid and
semantically correct JSON, which it is. It is not worth making the API more complex to
control whether < > get escaped. If you really care, do a global search and
replace on the output to put them back.

Status changed to WorkingAsIntended.

@gopherbot
Author

Comment 4 by wangz@google.com:

The result is valid JSON but also unnecessarily inflated in many contexts where '<',
'>', and '&' abound.
Doing string replacement in a second pass is an inefficient and cumbersome workaround.
Please consider Brad's suggestion to add a bool to the Encoder struct.

@gopherbot
Author

Comment 5 by karim.nassar@vervemobile.com:

+1 for a flag. In my context, I am generating HTML in response to an API that,
in some cases, mishandles the escaped text. I can't control that API, and I trust my own output.
Forking the json package seems pretty heavyweight, and string replacement is woefully
inefficient when milliseconds matter.

This issue was closed.