x/net/html: NoFrames parsed as generic text #47071

Closed
Lazyshot opened this issue Jul 6, 2021 · 5 comments
Labels: FrozenDueToAge, NeedsInvestigation


Lazyshot commented Jul 6, 2021

What version of Go are you using (go version)?

$ go version
go version go1.15.6 darwin/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/bpeterson/Library/Caches/go-build"
GOENV="/Users/bpeterson/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/bpeterson/go/pkg/mod"
GONOPROXY="bitbucket.phishlabs.com"
GONOSUMDB="bitbucket.phishlabs.com"
GOOS="darwin"
GOPATH="/Users/bpeterson/go"
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/Cellar/go/1.15.6/libexec"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/Cellar/go/1.15.6/libexec/pkg/tool/darwin_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/_y/n4rvk2bn6x94k17h3cc_4y400000gn/T/go-build175312321=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

Parse HTML content containing a <noframes> tag. html.Parse does not parse the inner contents into element nodes; instead, the content is handled as a single TextNode.

https://play.golang.org/p/5h-YK8MR-dA

What did you expect to see?

2009/11/10 23:00:00 p Type: 3 Data: p

What did you see instead?

2009/11/10 23:00:00 p Type: 1 Data: <p>Some text</p>
gopherbot added this to the Unreleased milestone Jul 6, 2021
Contributor

mknyszek commented Jul 7, 2021

Looks like we don't have an owner for this, but @neild is an owner of the next directory up, so CC @neild.

mknyszek added the NeedsInvestigation label Jul 7, 2021
Contributor

neild commented Jul 7, 2021

I haven't looked at this in great depth and may be missing something, but golang.org/x/net/html implements https://html.spec.whatwg.org/multipage/syntax.html#tokenization, which defines special handling for <noframes>:

  • A start tag whose tag name is one of: "noframes", "style"
    Follow the generic raw text element parsing algorithm.

Contributor

neild commented Jul 7, 2021

Obsolete elements still need to be parsed correctly.

Author

Lazyshot commented Jul 7, 2021

You are correct that <noframes> is now considered obsolete by the standard: https://html.spec.whatwg.org/multipage/obsolete.html#noframes. I'm trying to port some library functionality over from Perl, whose parser behaves as I described. I have a workaround for my specific use case, so I will close this.

Lazyshot closed this as completed Jul 7, 2021
Contributor

neild commented Jul 7, 2021

Plenty of reasons someone might need to parse obsolete elements, too. (Try not to use <noframes> in new code, though.)

I do think golang.org/x/net/html is behaving according to the spec here. (Although it's a big spec, and I might be missing something.)

golang locked and limited conversation to collaborators Jul 7, 2022