encoding/xml: no way to extract value of tag when namespaced tag with same name occurs later #3703

Closed
evmar opened this issue Jun 4, 2012 · 10 comments

evmar commented Jun 4, 2012

Briefly, there doesn't appear to be a way to get "bar" out of the following
XML:

<x>
<foo>bar</foo>
<unrelated:foo>...</unrelated:foo>
</x>

Matching on `xml:"foo"` picks up the later, unrelated tag instead.

Here's a runnable example exhibiting the problem:
  http://play.golang.org/p/m-17dw-fDn

This problem affects gdata feeds (Google APIs) which make heavy use of XML namespaces.

rsc commented Jun 4, 2012

Comment 1:

Labels changed: added priority-later, go1.1, removed priority-triage.

Status changed to Accepted.

rsc commented Sep 12, 2012

Comment 2:

Labels changed: added go1.1maybe.

rsc commented Sep 14, 2012

Comment 3:

Labels changed: removed go1.1maybe.

rsc commented Dec 10, 2012

Comment 4:

Labels changed: added size-l.

tianon commented Jan 16, 2013

Comment 5:

I'm hitting this same issue trying to parse RSS from Slashdot. They include <link> as
per the standard, then later include <atom10:link>, with no default xmlns on the
main <rss> tag to help disambiguate the two, so the first element has no
namespace at all.
I tried several workarounds, including `xml:" link"` to explicitly match a blank
namespace (a long shot, obviously). I then found that the relevant code is in
src/pkg/encoding/xml/read.go on line 242, which (rightfully) ignores an empty namespace
in the field tag, so that callers who don't care about namespaces don't have to specify
them.
I'm not sure what a solid fix for this might be, but figured I would add my couple of
cents here.
My only idea is to add a special-case namespace string that intentionally matches only
an empty namespace (thus a tag like `xml:"_ link"`), but that might conflict
with something like xmlns:_="http://example.com" or xmlns:something="_" (I freely
admit I don't know whether those are valid, and so whether they are real problems).
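
The matching rule described in this comment can be restated as a small sketch. This is not the code from read.go; `fieldMatches` is a hypothetical function written only to show why a tag without a namespace also accepts the namespaced element. The Atom namespace URL is used as an example value.

```go
package main

import "fmt"

// fieldMatches is a hypothetical restatement of the rule described above,
// not the actual encoding/xml implementation: a field tag with no namespace
// accepts an element in any namespace, while a tag that names a namespace
// requires an exact match.
func fieldMatches(tagSpace, tagLocal, elemSpace, elemLocal string) bool {
	if tagLocal != elemLocal {
		return false
	}
	return tagSpace == "" || tagSpace == elemSpace
}

func main() {
	const atom = "http://www.w3.org/2005/Atom" // example namespace

	fmt.Println(fieldMatches("", "link", "", "link"))     // true: plain <link>
	fmt.Println(fieldMatches("", "link", atom, "link"))   // true: also matches the Atom link
	fmt.Println(fieldMatches(atom, "link", "", "link"))   // false: plain <link> is not in Atom
	fmt.Println(fieldMatches(atom, "link", atom, "link")) // true: exact namespace match
}
```

Under this rule there is no tag value that selects only the element with an empty namespace, which is the gap the proposed special-case string is meant to fill.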

tianon commented Feb 5, 2013

Comment 6:

I did a little digging and found that, technically speaking,
xmlns:_="http://example.com" is valid (see
http://www.w3.org/TR/REC-xml/#NT-NameStartChar).
I'm not sure what simple alternative could be provided, but figured I'd add a
follow-up to my earlier idea. I'm also not sure whether that specific example
has enough real-world use to have an appreciable negative impact.
This cursory Google search (https://google.com/search?q="xmlns%3A_") suggests that at
least one person is using it, but there really aren't many hits, for what it's
worth.

tianon commented Feb 5, 2013

Comment 7:

Also for what it's worth, a namespace name can't start with tilde or period, so those
could be acceptable alternatives (especially since hyphen is already taken for ignoring
a field).
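
A quick way to check which characters may begin an XML name is sketched below. It covers only the ASCII part of the NameStartChar production from the XML 1.0 recommendation linked earlier (":", "_", and the letters A-Z and a-z); the non-ASCII ranges are deliberately omitted, and `asciiNameStart` is a hypothetical helper name.

```go
package main

import "fmt"

// asciiNameStart reports whether c may begin an XML 1.0 Name, considering
// only the ASCII part of the NameStartChar production:
// ":" | [A-Z] | "_" | [a-z]. Non-ASCII ranges are left out of this sketch.
func asciiNameStart(c byte) bool {
	switch {
	case c == ':' || c == '_':
		return true
	case 'A' <= c && c <= 'Z', 'a' <= c && c <= 'z':
		return true
	}
	return false
}

func main() {
	// Underscore and letters qualify; tilde, period, and hyphen do not.
	for _, c := range []byte("_~.-a") {
		fmt.Printf("%q %v\n", c, asciiNameStart(c))
	}
}
```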

Comment 8 by cornelius.howl:

I'm having this issue with an appcast.xml file (which has sparkle:version etc.
attributes): it cannot be parsed.

rsc commented Mar 12, 2013

Comment 9:

https://golang.org/cl/7227056/

rsc commented Mar 12, 2013

Comment 10:

This issue was closed by revision 4dd3e1e.

Status changed to Fixed.

@rsc rsc added this to the Go1.1 milestone Apr 14, 2015
@rsc rsc removed the go1.1 label Apr 14, 2015
@golang golang locked and limited conversation to collaborators Jun 24, 2016