Navigation Menu

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/json: Compact should be consistent with escaping #34070

Closed
tmthrgd opened this issue Sep 4, 2019 · 3 comments
Closed

encoding/json: Compact should be consistent with escaping #34070

tmthrgd opened this issue Sep 4, 2019 · 3 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@tmthrgd
Copy link
Contributor

tmthrgd commented Sep 4, 2019

As requested by @mvdan I'm filing an issue to decide on the escaping behaviour of Compact for go1.14.

#30357 was originally filed as Compact escapes U+2028 and U+2029 which increases the output size without being documented. The fix for that issue was documentation (CL 173417) that turned out to be incorrect and was reverted by CL 188717.

Compact currently escapes U+2028 and U+2029, but it doesn't escape <, > or &. Compact is thus only performing a subset of HTML escaping and it's output is not safe to embed inside HTML <script> tags as CL 173417 had documented.

If I understand correctly, the escaping of U+2028 and U+2029 was added not for HTML but rather for JavaScript interpreters that might consume a directly embedded JSON literal.

The options seem to be:

  • Stop performing any escaping in Compact as the original issue requested. The escaping of escaping U+2028 and U+2029 was added in CL 10883045, and reading over the comments, it does seem like it was intentional to have Compact escape U+2028 and U+2029.

  • Have Compact properly perform HTML escaping, as HTMLEscape, does and be documented as such. This is likely the safest option, as the only consequence to overly aggressive escaping is larger encoded JSON.

  • Document that Compact escapes U+2028 and U+2029, but not <, > or & and is thus not safe to embed in HTML without separately calling HTMLEscape.

  • Do nothing~

/cc @PhilipBorgesen @rsc @bradfitz

@PhilipBorgesen
Copy link
Contributor

PhilipBorgesen commented Sep 4, 2019

I think each of the options has merit, except doing nothing. In #30357 I complained that Compact behaved surprisingly, doing more than what was documented, and returning a less compact result in contrast to what the function name implied. Doing nothing means that issue persists.

As for the three other options I think:

  • Making Compact do proper HTML escaping may be the safest.
  • Making Compact do no escaping is the most useful. Far from all JSON is intended consumption by browsers, and depending on the application, character escaping may be seen as a corruption. Users wanting compacted and “HTML escaped” JSON can always apply both of Compact and HTMLEscape on their data, which is what the current documentation implies one should do. I note that this behavior was what the Go 1 compatibility guarantee originally promised going forward.
  • Simply documenting the current behavior is the lowest effort solution. It is also the best bet that no programs will break, which both of the above options in practice could, although unlikely.

@katiehockman katiehockman added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Sep 4, 2019
@katiehockman katiehockman added this to the Go1.14 milestone Sep 4, 2019
@mvdan
Copy link
Member

mvdan commented Sep 22, 2019

I agree that we need to do something here.

I think the best solution is to stop doing any escaping at all. Like others have mentioned in this thread, that's what is documented, HTMLEscape already exists, and Compact has never done any complete form of escaping. Moreover, compacting has nothing to do with escaping to begin with.

I doubt that any programs would break. If any was depending on undocumented behavior, one could argue that their program should have been using HTMLEscape to begin with, to do proper escaping.

My opinion is that we can give this a try in the current cycle for 1.14, and reconsider or revert if a large amount of users complains once it lands in master or in any of the pre-releases. I'm happy to review a CL.

@gopherbot
Copy link

Change https://golang.org/cl/200217 mentions this issue: encoding/json: stop escaping U+2028 and U+2029 in Compact

@golang golang locked and limited conversation to collaborators Oct 9, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants