|
|
Created:
10 years ago by importre Modified:
9 years, 6 months ago CC:
golang-codereviews, iant Visibility:
Public. |
Descriptionencoding/csv: Don't need to enclose double quotes of empty fields
rfc4180 # 5 says that
> If fields are not enclosed with double quotes, then
> double quotes may not appear inside the fields.
Google Drive and MS Excel also don't have double quotes of empty field.
I think it's reasonable.
Fixes issue 7586.
Patch Set 1 #Patch Set 2 : diff -r b3405f9c2e32 https://code.google.com/p/go/ #Patch Set 3 : diff -r 63408464ba37 https://code.google.com/p/go/ #
MessagesTotal messages: 10
Hello golang-codereviews@googlegroups.com (cc: golang-codereviews@googlegroups.com, iant@golang.org), I'd like you to review this change to https://code.google.com/p/go/
Sign in to reply to this message.
Too late for Go 1.3. We're in critical fixes-only mode now. Please ping this thread after Go 1.3 is released around Jun 1st. On Sun, Apr 13, 2014 at 5:59 PM, <jaeweheo@gmail.com> wrote: > Reviewers: golang-codereviews, > > Message: > Hello golang-codereviews@googlegroups.com (cc: > golang-codereviews@googlegroups.com, iant@golang.org), > > I'd like you to review this change to > https://code.google.com/p/go/ > > > Description: > encoding/csv: Don't need to enclose double quotes of empty fields > > rfc4180 # 5 says that > >> If fields are not enclosed with double quotes, then >> double quotes may not appear inside the fields. >> > > Google Drive and MS Excel also don't have double quotes of empty field. > I think it's reasonable. > Fixes issue 7586. > > Please review this at https://codereview.appspot.com/86980044/ > > Affected files (+3, -2 lines): > M src/pkg/encoding/csv/writer.go > M src/pkg/encoding/csv/writer_test.go > > > Index: src/pkg/encoding/csv/writer.go > =================================================================== > --- a/src/pkg/encoding/csv/writer.go > +++ b/src/pkg/encoding/csv/writer.go > @@ -115,10 +115,10 @@ > } > > // fieldNeedsQuotes returns true if our field must be enclosed in quotes. > -// Empty fields, files with a Comma, fields with a quote or newline, and > +// Files with a Comma, fields with a quote or newline, and > // fields which start with a space must be enclosed in quotes. > func (w *Writer) fieldNeedsQuotes(field string) bool { > - if len(field) == 0 || strings.IndexRune(field, w.Comma) >= 0 || > strings.IndexAny(field, "\"\r\n") >= 0 { > + if strings.IndexRune(field, w.Comma) >= 0 || > strings.IndexAny(field, "\"\r\n") >= 0 { > return true > } > > Index: src/pkg/encoding/csv/writer_test.go > =================================================================== > --- a/src/pkg/encoding/csv/writer_test.go > +++ b/src/pkg/encoding/csv/writer_test.go > @@ -28,6 +28,7 @@ > {Input: [][]string{{"abc\ndef"}}, Output: "\"abc\r\ndef\"\r\n", > UseCRLF: true}, > {Input: [][]string{{"abc\rdef"}}, Output: "\"abcdef\"\r\n", > UseCRLF: true}, > {Input: [][]string{{"abc\rdef"}}, Output: "\"abc\rdef\"\n", > UseCRLF: false}, > + {Input: [][]string{{""}}, Output: "\n"}, > } > > func TestWrite(t *testing.T) { > > > -- > You received this message because you are subscribed to the Google Groups > "golang-codereviews" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to golang-codereviews+unsubscribe@googlegroups.com. > For more options, visit https://groups.google.com/d/optout. >
Sign in to reply to this message.
On 2014/04/14 20:09:02, bradfitz wrote: > Too late for Go 1.3. We're in critical fixes-only mode now. Please ping > this thread after Go 1.3 is released around Jun 1st. > > > > On Sun, Apr 13, 2014 at 5:59 PM, <mailto:jaeweheo@gmail.com> wrote: > > > Reviewers: golang-codereviews, > > > > Message: > > Hello mailto:golang-codereviews@googlegroups.com (cc: > > mailto:golang-codereviews@googlegroups.com, mailto:iant@golang.org), > > > > I'd like you to review this change to > > https://code.google.com/p/go/ > > > > > > Description: > > encoding/csv: Don't need to enclose double quotes of empty fields > > > > rfc4180 # 5 says that > > > >> If fields are not enclosed with double quotes, then > >> double quotes may not appear inside the fields. > >> > > > > Google Drive and MS Excel also don't have double quotes of empty field. > > I think it's reasonable. > > Fixes issue 7586. > > > > Please review this at https://codereview.appspot.com/86980044/ > > > > Affected files (+3, -2 lines): > > M src/pkg/encoding/csv/writer.go > > M src/pkg/encoding/csv/writer_test.go > > > > > > Index: src/pkg/encoding/csv/writer.go > > =================================================================== > > --- a/src/pkg/encoding/csv/writer.go > > +++ b/src/pkg/encoding/csv/writer.go > > @@ -115,10 +115,10 @@ > > } > > > > // fieldNeedsQuotes returns true if our field must be enclosed in quotes. > > -// Empty fields, files with a Comma, fields with a quote or newline, and > > +// Files with a Comma, fields with a quote or newline, and > > // fields which start with a space must be enclosed in quotes. > > func (w *Writer) fieldNeedsQuotes(field string) bool { > > - if len(field) == 0 || strings.IndexRune(field, w.Comma) >= 0 || > > strings.IndexAny(field, "\"\r\n") >= 0 { > > + if strings.IndexRune(field, w.Comma) >= 0 || > > strings.IndexAny(field, "\"\r\n") >= 0 { > > return true > > } > > > > Index: src/pkg/encoding/csv/writer_test.go > > =================================================================== > > --- a/src/pkg/encoding/csv/writer_test.go > > +++ b/src/pkg/encoding/csv/writer_test.go > > @@ -28,6 +28,7 @@ > > {Input: [][]string{{"abc\ndef"}}, Output: "\"abc\r\ndef\"\r\n", > > UseCRLF: true}, > > {Input: [][]string{{"abc\rdef"}}, Output: "\"abcdef\"\r\n", > > UseCRLF: true}, > > {Input: [][]string{{"abc\rdef"}}, Output: "\"abc\rdef\"\n", > > UseCRLF: false}, > > + {Input: [][]string{{""}}, Output: "\n"}, > > } > > > > func TestWrite(t *testing.T) { > > > > > > -- > > You received this message because you are subscribed to the Google Groups > > "golang-codereviews" group. > > To unsubscribe from this group and stop receiving emails from it, send an > > email to mailto:golang-codereviews+unsubscribe@googlegroups.com. > > For more options, visit https://groups.google.com/d/optout. > > Yes, I will
Sign in to reply to this message.
R=close Until Go 1.4.
Sign in to reply to this message.
Now would be a good time to revisit this.
Sign in to reply to this message.
It seems to me that encoding/csv is emitting perfectly valid CSV already. I am inclined to say no.
Sign in to reply to this message.
On 2014/09/15 16:25:27, rsc wrote: > It seems to me that encoding/csv is emitting perfectly valid CSV already. I am > inclined to say no. This is causing me grief with PostgreSql. We have a process that generates several CSV files and then uses the COPY FROM command to bulk load them into the database. PG will not accept integer fields that are quoted: pq: invalid input syntax for integer: """" Wile I agree that the current behavior is correct, there are circumstances (including mine) where it is not useful. It would be nice to at least be able to control this if you didn't want to change the default behavior.
Sign in to reply to this message.
What does this have to do with Postgres or integers? An empty string isn't a valid integer either. An integer is already encoded as-is, without quotes, no? On Mon, Oct 20, 2014 at 2:20 PM, <biggernoise@gmail.com> wrote: > On 2014/09/15 16:25:27, rsc wrote: > >> It seems to me that encoding/csv is emitting perfectly valid CSV >> > already. I am > >> inclined to say no. >> > > This is causing me grief with PostgreSql. We have a process that > generates several CSV files and then uses the COPY FROM command to bulk > load them into the database. > > PG will not accept integer fields that are quoted: pq: invalid input > syntax for integer: """" > > Wile I agree that the current behavior is correct, there are > circumstances (including mine) where it is not useful. It would be nice > to at least be able to control this if you didn't want to change the > default behavior. > > https://codereview.appspot.com/86980044/ >
Sign in to reply to this message.
TL;DR: The relationship between this issue and Postgres is that I cannot use the files generated by encoding/csv with Postgres's bulk load command (COPY FROM). PG treats empty quoted strings as empty strings and will not convert these to NULL. Fields that are simply missing are treated as NULL. We are transforming quite a bit of data and loading the transformed results into PG. In order to improve performance, we generate clean CSV files from our Go process and then use the bulk load command to actually load the database (it's not cycle shaving, the performance gains can easily be orders of magnitude). PG requires these files to be very clean. This is true of other databases that I have worked with as well. When PG reads a field that is quoted, it interprets it as a string, even if the ultimate destination is an integer field. PG will not automatically convert strings to integers during this process. When PG reads an unquoted, empty field, it treats it as a NULL. A NULL can, of course, be placed into an integer column (assuming that it allows them). So, if I am in Go, and I have the following records (of type map[string]interface{}): {name: "Bob", score: int(11), school: "Jefferson"}, {name: "Mary", score: nil, school: "State"} The CSV will be: Bob,11,Jefferson Mary,"",State Which PG cannot understand. I could be wrong, but I don't think there is any combination of options that will make PG understand this format. PG understands: Bob,11,Jefferson Mary,,State Which makes the tables mirror the source records. But there is no way to make the csv/Writer generate this output. We're using some sed duct tape to work around this issue, but it would be nice if we didn't have to. I hope I answered your questions, please let me know if I still need to fill in some blanks. Andy On 2014/10/23 22:15:19, bradfitz wrote: > What does this have to do with Postgres or integers? > > An empty string isn't a valid integer either. > > An integer is already encoded as-is, without quotes, no? > > > On Mon, Oct 20, 2014 at 2:20 PM, <mailto:biggernoise@gmail.com> wrote: > > > On 2014/09/15 16:25:27, rsc wrote: > > > >> It seems to me that encoding/csv is emitting perfectly valid CSV > >> > > already. I am > > > >> inclined to say no. > >> > > > > This is causing me grief with PostgreSql. We have a process that > > generates several CSV files and then uses the COPY FROM command to bulk > > load them into the database. > > > > PG will not accept integer fields that are quoted: pq: invalid input > > syntax for integer: """" > > > > Wile I agree that the current behavior is correct, there are > > circumstances (including mine) where it is not useful. It would be nice > > to at least be able to control this if you didn't want to change the > > default behavior. > > > > https://codereview.appspot.com/86980044/ > >
Sign in to reply to this message.
On Thu, Oct 23, 2014 at 8:08 PM, <biggernoise@gmail.com> wrote: > TL;DR: The relationship between this issue and Postgres is that I cannot > use the files generated by encoding/csv with Postgres's bulk load > command (COPY FROM). PG treats empty quoted strings as empty strings > and will not convert these to NULL. Fields that are simply missing are > treated as NULL. > Thank you for this explanation. See https://codereview.appspot.com/164760043
Sign in to reply to this message.
|