Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/xml: unmarshaller and whitespace #22146

Closed
md2perpe opened this issue Oct 5, 2017 · 6 comments
Closed

encoding/xml: unmarshaller and whitespace #22146

md2perpe opened this issue Oct 5, 2017 · 6 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@md2perpe
Copy link

md2perpe commented Oct 5, 2017

The issue

I had a file with space padded numbers (NumberOfPoints=" 266703") but the XML unmarshaller couldn't handle the spaces.

Booleans are trimmed, so why not numbers?

Does this issue reproduce with the latest release?

The issue exists in the currently latest commit in the GitHub repository:
https://github.com/golang/go/blob/99da873/src/encoding/xml/read.go

What operating system and processor architecture are you using (go env)?

Not relevant as the issue is in architecture-independent library code.

Examples

Good to know

I first asked about this and was given a workaround on Google Groups:
https://groups.google.com/forum/#!topic/golang-nuts/RtKU05cE9PY

@mvdan mvdan changed the title XML unmarshaller and whitespace encoding/xml: unmarshaller and whitespace Oct 5, 2017
@ianlancetaylor ianlancetaylor added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Oct 5, 2017
@ianlancetaylor ianlancetaylor added this to the Go1.10 milestone Oct 5, 2017
@leighmcculloch
Copy link
Contributor

According to the git history 1a37656 bools have been trimmed since they were first added by @rsc. @rsc: What was the motivation for trimming bools? Was it convenience?

If the reasoning was convenience I think it makes sense for there to be consistency across types where whitespace has no meaning since consistency will result in less wrong assumptions, but I think if it was to be added for consistency it'd need to be added to unsigned ints and floats too.

The no error on an invalid integer isn't intuitive and I think it should be considered a bug if trimming isn't introduced.

@kostix
Copy link

kostix commented Oct 6, 2017

Not to weigh on any side of the scales, I'd still point out a bit from my response to @md2perpe over there on the ML:

The question of whether the string " 266703" represents a valid
integer is hardly decidable in my opinion. For instance, would the
string "\v\v\v\n\n\n\t\t266703\n\n\n\v\v\v\v\t\x20\t" represent a valid
integer as well?

That is, while " 266703" looks innocent enough, though mildly annoying as it hints at sloppy
implementation of whatever generated that document, a string containing lots of "weird"
characters considered to be whitespace by Unicode (or Go's strings / bytes packages)
is, IMO, a bit less easily decidable to represent "valid" data.

@ianlancetaylor
Copy link
Contributor

@leighmcculloch Since we don't do our code review on Github it sometimes helps to look over there to get more background on a change. In this case, https://golang.org/cl/218050 . In this case it looks like the purpose of the CL was to accept more strings for bool values. It's not clear that anybody really considered the call to strings.TrimSpace in there.

@rsc
Copy link
Contributor

rsc commented Oct 23, 2017

Clearly we didn't think much of adding strings.TrimSpace in CL 218050 (which Michael Hoisie wrote and I reviewed), but it seems fine to apply strings.TrimSpace to all those numeric fields. Please make sure to update the docs and add a test. CLs welcome. Thanks.

@rsc rsc added the NeedsFix The path to resolution is known, but the work has not been done. label Oct 23, 2017
@gopherbot gopherbot removed the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Oct 23, 2017
@leighmcculloch
Copy link
Contributor

If no one else is already on it I'd like to submit a CL to make the change. 🖐️

@gopherbot
Copy link

Change https://golang.org/cl/73891 mentions this issue: encoding/xml: ignore whitespace in values and attrs

@golang golang locked and limited conversation to collaborators Nov 1, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

6 participants