x/net/html: unescape doesn't handle double quotes in attributes #60864
Comments
@Nasfame shouldn't the expected output be - as our input was -
see #52911 (comment)
I think that is normal legal encoding and is correct. In general, unescaping HTML and expecting valid HTML out the other side is not correct.
@Nasfame Once the HTML in https://html.spec.whatwg.org/multipage/syntax.html#attributes-2 At the logical level a tag's attribute has only two strings for state (
<div style=font-family:arial>
<div style='font-family:arial'>
<div style="font-family:arial">
Notice in the Serializing HTML fragments section: https://html.spec.whatwg.org/multipage/parsing.html#serialising-html-fragments
Under 13.3 4. 2. says:
note in particular that attributes always serialize with a double quote:
34 vs 39
This is also why the As the attribute
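To make the 34 vs 39 point concrete, here is a minimal sketch of a serializer that always writes attributes with double quotes. It assumes the standard library's html.EscapeString as a stand-in for the escaping applied by x/net/html, and serializeAttr is a hypothetical helper, not a function in either package.

```go
package main

import (
	"fmt"
	"html"
)

// serializeAttr is a hypothetical helper, not part of x/net/html.
// It always delimits the value with double quotes and escapes the value
// first, so a literal " (byte 34) inside it is written as &#34;.
func serializeAttr(name, value string) string {
	return name + `="` + html.EscapeString(value) + `"`
}

func main() {
	// The logical value is the same whether the source used no quotes,
	// single quotes, or double quotes around it.
	value := `font-family:arial, "helvetica neue", helvetica, sans-serif;`
	fmt.Println(serializeAttr("style", value))
	// prints: style="font-family:arial, &#34;helvetica neue&#34;, helvetica, sans-serif;"
}
```

Because the delimiter is fixed, the inner quotes have to be escaped; the original single-quote delimiters are not part of the attribute's state and cannot be recovered.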
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
yes
What operating system and processor architecture are you using (go env)?
macOS, M1 chip
Issue-
<div style='font-family:arial, "helvetica neue", helvetica, sans-serif;'>Mehr Informationen</div>
When tokenizing the above markup with the net/html package and converting the token back to a string using
tokenizer.Token().String()
we get -
<div style="font-family:arial, &#34;helvetica neue&#34;, helvetica, sans-serif;">Mehr Informationen</div>
which on unescaping is -
<div style="font-family:arial, "helvetica neue", helvetica, sans-serif;">Mehr Informationen</div>
which is incorrect, as it contains nested double quotes.
This is causing an issue while previewing the HTML on the Outlook app.
link to code- "golang.org/x/net/html"
What did you expect to see?
<div style='font-family:arial, "helvetica neue", helvetica, sans-serif;'>Mehr Informationen</div>
What did you see instead?
<div style="font-family:arial, "helvetica neue", helvetica, sans-serif;">Mehr Informationen</div>