x/net/html: Self-closing element results in wrong parsing #22834
Comments
I know nothing about this package, and I believe @tombergan doesn't either.
Well, I know a little bit... If I run the following program:

package main

import (
	"os"
	"strings"

	"golang.org/x/net/html"
)

func main() {
	node, _ := html.Parse(strings.NewReader(
		`<html>
<head>
</head>
<body>
<div class="foo">
<!-- CANVAS 1 -->
<canvas></canvas>
<div>BAR</div>
<!-- CANVAS 2 -->
<canvas>
<p>BAZ</p>
</div>
</body>
</html>`))
	html.Render(os.Stdout, node)
}

I get:
If I save your original HTML to a local file, load that file in Chrome, then run
The |
@tombergan Please reopen. I am unable to reopen. Your testing code above misses a detail. Note that the input has no closing </canvas> tag at the end.

func main() {
	node, _ := html.Parse(strings.NewReader(
		`<html>
<head>
</head>
<body>
<div class="foo">
<!-- CANVAS 1 -->
<canvas></canvas>
<div>BAR</div>
<!-- CANVAS 2 -->
<canvas>
<p>BAZ</p> <!-- </CANVAS> is not present -->
</div>
</body>
</html>`))
	html.Render(os.Stdout, node)
}

Note the output:

<html><head>
</head>
<body>
<div class="foo">
<!-- CANVAS 1 -->
<canvas></canvas>
<div>BAR</div>
<!-- CANVAS 2 -->
<canvas>
<p>BAZ</p>
</canvas></div> <!-- </CANVAS> reappears!, enclosing <P> -->
</body></html>

The issue is that the parser is expecting a closing
It magically appears because the second
If you can quote section and verse from the spec showing why the above behavior is wrong, we'll reopen this issue.
These are separate issues. #22298 is about a self-closing element being translated to separate start/end elements. This issue is about a closing element being added for an element that was never closed.
Here's an example out in the wild, by Evernote (Chrome 62 on macOS):
Notice the canvas is self-closing, without a closing </canvas> tag.
Here is a real example: https://www.evernote.com/shard/s5/sh/d53a3747-d849-4b10-b17e-dbc9dee0383c/7417220218a9f310
I note that the canvas was likely JS-injected.
Please answer these questions before submitting your issue. Thanks!
What version of Go are you using (go version)?
1.9.2
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env)?
What did you do?
Okay. Browsers are very forgiving when it comes to rendering HTML. A canvas element should be declared as <canvas width="100" height="100"></canvas>. However, if it is rendered as just <canvas width="100" height="100">, the browser would be fine with it as well. But the html.Parse() function is not as forgiving and parses this wrongly.
What did you expect to see?
The Node for CANVAS 1 (above) behaves as expected: it has a NextSibling != nil.
What did you see instead?
The Node for CANVAS 2 (above) does not behave as expected: it has a NextSibling == nil. Instead, it has a FirstChild != nil and a FirstChild.NextSibling != nil that holds the value of the correct NextSibling.
I believe this is also related to #22298