html/template: treatment of CDATA sections in foreign content diverges from browsers #62617

Open
rolandshoemaker opened this issue Sep 13, 2023 · 0 comments
Labels
NeedsFix The path to resolution is known, but the work has not been done. Security

rolandshoemaker commented Sep 13, 2023

CDATA sections are only valid in foreign content (i.e. <math> or <svg>), and have various different meanings within that content (love to mix XML and HTML).

Per the HTML specification (https://html.spec.whatwg.org/#cdata-sections and https://html.spec.whatwg.org/#cdata-section-state), the contents of CDATA sections are treated as plain text for the purposes of parsing. As such, the following two fragments are treated differently:

<svg>
<script>
</script>
</script>
</svg>

---

<svg>
<script>
<![CDATA[
</script>
]]>
</script>
</svg>

In the first fragment the first </script> tag terminates the script section, but in the second fragment, since the first </script> tag is inside of a CDATA section, it is treated as plaintext and doesn't terminate the script block.

This can cause some strange interactions, since there are ways to structure something that looks like an end tag but is valid JS (e.g. depending on the context, </script/> may be interpreted as a regex literal by a JS engine, while an HTML parser interprets it as an end tag).

In an extremely cursed template, this can cause html/template to derive the wrong context for contextual auto-escaping, leading to (extremely unlikely) XSS.

The following template illustrates this; since the parser does not believe the action is within the JS context (since it thinks </script/> terminated the script block), it will allow JS to be inserted unescaped.

<svg>
<script>
<![CDATA[
console.log(1</script/>`{{.}}`);
]]>
</script>
</svg>

(For those interested, this works because the JS interpreter treats this as an inequality 1 < /regex lit/ > "string" which is obviously majorly broken, but will be happily executed. You can then use the JS template literal syntax to inject code using string interpolation syntax, i.e. ${alert(1)}.)

This is obviously an incredibly contrived example, and is unlikely to ever actually impact any real-world implementations, but should probably still be fixed. How to fix it is not immediately obvious, other than increasing the complexity of the parser to understand (a) foreign content sections and (b) CDATA (but only for the purposes of HTML parsing semantics).

golang.org/x/net/html handles this correctly, since it applies the proper CDATA rules (and it doesn't need to understand JS semantics).

Thanks to @arkark for reporting this issue.

cc @golang/security @nigeltao

@heschi heschi added Security NeedsFix The path to resolution is known, but the work has not been done. labels Sep 13, 2023
@heschi heschi added this to the Go1.22 milestone Sep 13, 2023
@gopherbot gopherbot modified the milestones: Go1.22, Go1.23 Feb 6, 2024