encoding/csv: Provide a mean to change the quote character #8458

Closed
gopherbot opened this issue Jul 31, 2014 · 14 comments
Labels
FrozenDueToAge, Suggested

Comments

@gopherbot

by fuzxxl:

The package encoding/csv recognizes quoted fields. Sadly, it's not possible to change
the quote character to something different, as required for some use cases [1].

I request that encoding/csv be expanded to allow users to select a custom quote
character.

[1]: http://stackoverflow.com/questions/25062281/golang-enclosure-rule-for-csv-parsing
@ianlancetaylor
Contributor

Comment 1:

Labels changed: added repo-main, release-none.

@adg
Contributor

adg commented Aug 7, 2014

Comment 2:

This could be done by adding a SingleQuote boolean field to Reader and Writer. False,
its default, would indicate the usual double-quoted behaviour.

Labels changed: added suggested.

Status changed to Accepted.

@gopherbot
Author

Comment 3 by fuzxxl:

I think this would be a bad fix. The next time someone with an entirely different quote
character appears, this approach doesn't work anymore.
Why not add a field
    QuoteCharacter rune
Which defaults to double-quotes if set to 0?
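
The zero-value default proposed here can be sketched as follows. This is only an illustration under stated assumptions: `Options`, its `Quote` and `Comma` fields, and `SplitLine` are hypothetical names and are not part of encoding/csv. The sketch handles a single line and does not support newlines inside fields.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Options is a hypothetical configuration type; encoding/csv has no such type.
type Options struct {
	Comma rune // field delimiter; 0 means ','
	Quote rune // quote character; 0 means '"'
}

func (o Options) comma() rune {
	if o.Comma == 0 {
		return ','
	}
	return o.Comma
}

func (o Options) quote() rune {
	if o.Quote == 0 {
		return '"'
	}
	return o.Quote
}

// SplitLine splits one line into fields. A field that starts with the quote
// character runs until the closing quote; a doubled quote inside it stands
// for one literal quote character.
func SplitLine(line string, o Options) ([]string, error) {
	comma, quote := o.comma(), o.quote()
	rs := []rune(line)
	var fields []string
	var b strings.Builder
	i := 0
	for {
		b.Reset()
		if i < len(rs) && rs[i] == quote {
			i++ // skip the opening quote
			closed := false
			for i < len(rs) {
				if rs[i] != quote {
					b.WriteRune(rs[i])
					i++
					continue
				}
				if i+1 < len(rs) && rs[i+1] == quote {
					b.WriteRune(quote) // doubled quote: one literal quote
					i += 2
					continue
				}
				i++ // skip the closing quote
				closed = true
				break
			}
			if !closed {
				return nil, errors.New("unterminated quoted field")
			}
			if i < len(rs) && rs[i] != comma {
				return nil, errors.New("unexpected character after closing quote")
			}
		} else {
			for i < len(rs) && rs[i] != comma {
				b.WriteRune(rs[i])
				i++
			}
		}
		fields = append(fields, b.String())
		if i >= len(rs) {
			return fields, nil
		}
		i++ // skip the delimiter
	}
}

func main() {
	fields, err := SplitLine("a,|b,c|,d", Options{Quote: '|'})
	fmt.Printf("%q %v\n", fields, err)
}
```

Leaving `Quote` unset keeps the usual double-quote behaviour, so existing callers of such an API would not need to change.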

@minux
Member

minux commented Aug 7, 2014

Comment 4:

What if people then want different quote characters for start and end
quotes?

@gopherbot added the accepted and Suggested labels Aug 7, 2014
@satran

satran commented Dec 22, 2014

I've submitted a fix: https://go-review.googlesource.com/#/c/1576/
The fix is only for the reader. I was wondering if the writer should also have an option to specify the quote character.

@bradfitz
Contributor

Why?

Is there a standard that says the quote character can differ?

If you have a file format that uses a different quote character, is it really CSV?

I don't think so.

I'm compelled to close this without action. The encoding/csv package is small and easily forkable elsewhere for PQCSV ("pipe-quoted CSV") or whatever.

@satran

satran commented Dec 22, 2014

I believe the standard does not specify it. You are right in pointing out that the package is small and easily forkable. Just saw this issue open and thought of fixing it :)

@bradfitz
Contributor

The standard specifies it as double quote. It's not a tweakable parameter:

https://tools.ietf.org/html/rfc4180 says

   escaped = DQUOTE *(TEXTDATA / COMMA / CR / LF / 2DQUOTE) DQUOTE
...
   DQUOTE =  %x22 ;as per section 6.1 of RFC 2234 [2]

We'll follow the standard.

People needing wacky formats can use wacky packages.
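
For reference, the `escaped` rule quoted above corresponds to the following behaviour in encoding/csv. This is a minimal sketch; `parseRecord` is a helper name introduced here for illustration, and the input string is made up.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// parseRecord reads the first record from line using the default Reader
// settings: comma as the delimiter and double quote as the quote character.
func parseRecord(line string) ([]string, error) {
	return csv.NewReader(strings.NewReader(line)).Read()
}

func main() {
	// The second field is DQUOTE-enclosed; it contains a comma and a
	// doubled quote (2DQUOTE), which decodes to one literal double quote.
	rec, err := parseRecord(`a,"b ""quoted"", c",d`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q\n", rec) // ["a" "b \"quoted\", c" "d"]
}
```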

@clausecker

CSV is well known for being a format every program implements somewhat differently, like HTML in the early years of the internet. Yes, the standard says that the quote character is ", but there are many programs out there that expect differently formatted data. Having a CSV package that is flexible in the way it generates output is a very useful thing. I am not sure that telling every program that needs to generate CSV files which do not exactly match the standard to just fork the encoding/csv package is a good idea, in terms of either code reuse or reliability.

@bradfitz
Contributor

Can you provide a few examples of such programs, ideally popular ones?

Absent a compelling reason, I see no reason to introduce complexity for theoretical uses.

Even with examples, I'm tempted to say no just as a minor encouragement to those programs' authors and users to do something more normal.

@khuderm

khuderm commented Jan 13, 2016

I know this is old, but I would like to add an example of data that desperately needs this change in order to be read. I am currently working on some sentiment analysis that uses the SentiWordNet list. A row from the list:

a
00001740
0.125
0
able#1
(usually followed by `to') having the necessary means or skill or know-how or authority to do something; "able to swim"; "she was able to program her computer"; "we were at last able to buy a car"; "able to get a grant for the project"

I put each column on its own line so that the columns are easy to distinguish. The columns are tab (\t) delimited, and the long text has no enclosure. Because the long text already contains double quotes, the csv package returns an error when it tries to parse them. Hope what I wrote makes sense lol.

@gopherbot
Author

CL https://golang.org/cl/23401 mentions this issue.

gopherbot pushed a commit that referenced this issue May 25, 2016
The intent of this comment is to reduce the number of issues opened
against the package to add support for new kinds of CSV formats, such as
issues #3150, #8458, #12372, #12755.

Change-Id: I452c0b748e4ca9ebde3e6cea188bf7774372148e
Reviewed-on: https://go-review.googlesource.com/23401
Reviewed-by: Andrew Gerrand <adg@golang.org>
@hasnickl

hasnickl commented Aug 5, 2016

RFC 4180 also says the delimiter has to be a comma (","), yet encoding/csv supports changing it.
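
The configurable delimiter referred to here is the `Comma` field on both Reader and Writer. A minimal sketch of the Writer side follows; `writeRecord` is a helper name introduced for illustration. Note that the quote character in the output remains the double quote.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// writeRecord encodes one record using comma as the field delimiter.
func writeRecord(record []string, comma rune) (string, error) {
	var sb strings.Builder
	w := csv.NewWriter(&sb)
	w.Comma = comma
	if err := w.Write(record); err != nil {
		return "", err
	}
	w.Flush()
	return sb.String(), w.Error()
}

func main() {
	out, err := writeRecord([]string{"a", "b c", "d;e"}, ';')
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// The field containing the delimiter is enclosed in double quotes.
	fmt.Printf("%q\n", out) // "a;b c;\"d;e\"\n"
}
```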

@ianlancetaylor
Contributor

@hasnickl This issue is closed. If you want to discuss this, please use the golang-dev mailing list. Thanks.

@golang locked and limited conversation to collaborators Aug 5, 2017