1. Compile the attached tryx.go
2. Run with input redirected to the attached some.xml
What is the expected output?
A single `directive` object with nested !ENTITYs
What do you see instead?
An incomplete DOCTYPE directive, four ENTITY directives,
and trailing text with the final ]>.
Which compiler are you using (5g, 6g, 8g, gccgo)?
Which operating system are you using?
Which revision are you using? (hg identify)
8g, Fedora Core 9, ef61c195edc3+ tip
code from xml.go:
	// Probably a directive: <!DOCTYPE ...>, <!ENTITY ...>, etc.
	// We don't care, but accumulate for caller.
	p.buf.Reset()
	p.buf.WriteByte(b)
	for {
		if b, ok = p.mustgetc(); !ok {
			return nil, p.err
		}
		if b == '>' {
			break
		}
		p.buf.WriteByte(b)
	}
	return Directive(p.buf.Bytes()), nil
This cuts the !DOCTYPE off at the closing > of the first nested
!ENTITY (and discards the >). So the caller doesn't get the entire
!DOCTYPE, nor does it get all of the text necessary for
reconstructing the !DOCTYPE. Unless, I suppose, it has to
keep pulling Directives and joining them with > until it gets
the ]>? In which case that's pretty horrid.
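To make the truncation concrete, here is a minimal standalone sketch of the same "stop at the first >" behaviour. The function name naiveDirective and the sample input are hypothetical (the attached some.xml is not reproduced here), and unlike the parser it works on a string and does not report an error at end of input.

```go
package main

import (
	"fmt"
	"strings"
)

// naiveDirective mimics the quoted loop: given the text that follows "<!",
// it keeps everything up to the first '>' and drops that '>'.
// It returns the accumulated text and the input left over.
// (Hypothetical helper; the real parser reads bytes via p.mustgetc.)
func naiveDirective(s string) (directive, rest string) {
	i := strings.IndexByte(s, '>')
	if i < 0 {
		return s, ""
	}
	return s[:i], s[i+1:]
}

func main() {
	// Assumed input: a DOCTYPE with one nested ENTITY, minus the leading "<!".
	in := `DOCTYPE doc [<!ENTITY e "v">]>`
	d, rest := naiveDirective(in)
	// The directive ends inside the nested ENTITY; "]>" is left over
	// and would be handed to the caller as ordinary text.
	fmt.Printf("directive: %q\nleft over: %q\n", d, rest)
}
```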
Looking at the XML spec
http://www.w3.org/TR/REC-xml/
it looks like a directive can have other directives nested
inside it, and as far as I can see, < and > may also appear
inside quoted attribute values.
So a revised version of that code could count <> and
only finish accumulation of the text when it hits a
properly balanced >, not counting < or > if they are
inside '...' or "..." strings.
That would mean that the entire outermost directive, and all
its nested directives, would be available to the program
using the parser. As is currently the case, what it chooses
to do with the contents is up to it.
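The counting approach suggested above could look roughly like the following. This is only a sketch under stated assumptions, not a patch against xml.go: scanDirective is a hypothetical name, it works on a string rather than on the parser's byte reader, and it does not treat <!-- ... --> comments inside the directive specially.

```go
package main

import (
	"errors"
	"fmt"
)

// scanDirective expects input positioned just after the opening '<' of a
// directive. It returns the directive text up to (not including) the '>'
// that balances that opening '<', plus the remaining input.
// '<' and '>' inside '...' or "..." strings are not counted.
// (Hypothetical helper illustrating the proposal.)
func scanDirective(s string) (directive, rest string, err error) {
	depth := 1     // the opening '<' has already been consumed
	var quote byte // 0 outside a quoted string, otherwise the quote character
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch {
		case quote != 0:
			if c == quote {
				quote = 0 // closing quote
			}
		case c == '\'' || c == '"':
			quote = c // opening quote
		case c == '<':
			depth++
		case c == '>':
			depth--
			if depth == 0 {
				return s[:i], s[i+1:], nil
			}
		}
	}
	return "", "", errors.New("unexpected EOF in directive")
}

func main() {
	// Assumed input: nested ENTITY declarations, one with a quoted '>'.
	in := `!DOCTYPE doc [<!ENTITY gt ">"><!ENTITY e 'v'>]>tail`
	d, rest, err := scanDirective(in)
	fmt.Printf("directive: %q\nrest: %q\nerr: %v\n", d, rest, err)
}
```

With this kind of scan the caller receives the whole outermost directive, nested declarations included, in one piece.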
mikioh changed the title from "xml parser does not accumulate nested directives properly" to "encoding/xml: parser does not accumulate nested directives properly" on Jan 9, 2015
by ehog.hedge:
Attachments: