Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/csv: TrimLeadingSpace when Comma is a space #4089

Closed
gopherbot opened this issue Sep 16, 2012 · 5 comments
Closed

encoding/csv: TrimLeadingSpace when Comma is a space #4089

gopherbot opened this issue Sep 16, 2012 · 5 comments

Comments

@gopherbot
Copy link

by accipiter:

What steps will reproduce the problem?
Run the following code: http://play.golang.org/p/p9hY_jVsOc

What is the expected output?
[]string{"Col1", "", "Col3", "Col4"}

What do you see instead?
[]string{"Col1", "Col3", "Col4"}

Which operating system are you using?
Windows

Which version are you using?  (run 'go version')
go1.0.2

Please provide any additional information below.
TrimLeadingSpace=true makes the reader trim all space characters determined by
unicode.IsSpace(r rune) . If a space character is used as delimiter (Reader.Comma), it
will be removed as well.

My suggestion is to change the following line:
http://golang.org/src/pkg/encoding/csv/reader.go?s=7544:7583#L257
from
  for r1 != '\n' && unicode.IsSpace(r1) {
to
  for r1 != '\n' && unicode.IsSpace(r1) && r1 != r.Comma {
@rsc
Copy link
Contributor

rsc commented Sep 17, 2012

Comment 1:

It's unclear to me what the right semantics are.

Labels changed: added priority-later, removed priority-triage.

Status changed to Accepted.

@gopherbot
Copy link
Author

Comment 2 by borman@google.com:

This is unfortunately working correctly.  You input: "Col1\t \tCol3\tCol4" with
TrimLeadingSpace set to false produces 4 fields, the second one being " ", but with
TrimLeadingSpace set to true the the input for the second field is " \tCol3\tCol4".  The
" \t" should be removed as they are both space characters so the second field is "Col3"
and only 3 fields are produced.
What should the following input be if TrimLeadingSpace is true and Comma is a space?  (.
== space)  "a.b..c...d"?
Using a white space character as a comma and setting TrimLeadingSpace is not well
defined.  It would be better for you to call strings.TrimLeft(field, unicode.IsSpace) on
your fields after having CSV parse them without trimming the leading space.

@gopherbot
Copy link
Author

Comment 3 by accipiter:

There is no standard* defining neither usage of delimiter characters other than comma,
nor how to trim leading spaces, so I think saying "working correctly" is just to say "It
is doing what the code tells it to do", and that is true.
However, it is common to use other delimiters (as well as enclosures). In some country
other characters are more or less standard (; in Sweden). In LibreOffice there 5
standard choices for delimiters when saving to CSV, two of them white space characters:
, ; : {tab} {space}
In these cases, non-enclosed delimiters are always considered delimiters, never a white
space.
Your example a.b..c...d should of course result in:
a b {empty} c {empty} {empty} d
* Reffering to: http://www.ietf.org/rfc/rfc4180.txt

@gopherbot
Copy link
Author

Comment 4 by accipiter:

My suggestion to solution is wrong though.
It would result in a bug where an enclosed white space character the same as the
delimiter would not be trimmed when it should be.
Example of input string:
"Col1\t\"\tLeadingTab\"\tCol3"

@rsc
Copy link
Contributor

rsc commented Sep 18, 2012

Comment 5:

I think we should probably leave this as is. Trying to mix
TrimLeadingSpace with Comma = a space has multiple conflicting
definitions, so we have to pick one. We might as well pick the one
that Go 1 used and avoid having different behaviors in different
versions.
You can always turn off TrimLeadingSpace and then apply it to the
fields yourself after the parse.

Status changed to Unfortunate.

@golang golang locked and limited conversation to collaborators Jun 24, 2016
This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants