Rietveld Code Review Tool
Help | Bug tracker | Discussion group | Source code | Sign in
(381)

Issue 13659043: code review 13659043: encoding/csv: Fix handling of quoted substrings at star...

Can't Edit
Can't Publish+Mail
Start Review
Created:
10 years, 7 months ago by Sudhir
Modified:
9 years, 4 months ago
Reviewers:
CC:
golang-codereviews, rsc
Visibility:
Public.

Description

encoding/csv: Fix handling of quoted substrings at start of field in a CSV file. The CSV reader could not properly parse a field that began with a quoted substring even when LazyQuotes was turned on. The field needed to be completely quoted. For example, the string |"Quoted Substring" other chars| when parsed would process the second quote as a bare quote and then consume characters (including newlines)until a closing quote before a Comma was encountered. This could sometimes result in memory exhaustion in the case of a large file. The change here fixes this behaviour if (and only if) LazyQuotes is true. If the character following the second quote character in a field is not a Comma or NewLine,the rest of the field is processed as a normal unquoted field until a Comma, Newline or EOF are reached (quotes are treated as BareQuotes and inserted in the field). If the last character in the field is a quote character, it is dropped (to match the quote at the beginning). If not, the leading quote is inserted as part of the field value. Change in behaviour: Input: `abc,"def"ghi,jkl\nMNO,PQR",STU` Old Output: `abc`, `def"ghi,jkl\nMNO,PQR`, `STU` New Output: `abc`, `"def"ghi`, `jkl` `MNO`, `PQR"`, `STU` Note how the previous behaviour sucks in a whole bunch of characters before terminating at the end of the second field on line 2. The new (fixed) handling terminates processing at the comma after 'i' and returns two records instead of one. Two new tests are added in the test package to describe this behaviour (all existing tests pass unchanged). Fixes issue 6352. Fixes issue 3150.

Patch Set 1 #

Patch Set 2 : diff -r 77a4d225cc7e https://code.google.com/p/go #

Patch Set 3 : diff -r 77a4d225cc7e https://code.google.com/p/go #

Total comments: 2

Patch Set 4 : diff -r d71a954bca35 https://code.google.com/p/go #

Patch Set 5 : diff -r d71a954bca35 https://code.google.com/p/go #

Unified diffs Side-by-side diffs Delta from patch set Stats (+44 lines, -1 line) Patch
M src/pkg/encoding/csv/reader.go View 1 2 3 1 chunk +29 lines, -1 line 0 comments Download
M src/pkg/encoding/csv/reader_test.go View 1 2 3 1 chunk +15 lines, -0 lines 0 comments Download

Messages

Total messages: 10
Sudhir
Hello golang-dev@googlegroups.com (cc: golang-dev@googlegroups.com), I'd like you to review this change to https://code.google.com/p/go
10 years, 7 months ago (2013-09-11 05:26:20 UTC) #1
rsc
Please find a way to preserve the current behavior for the tests listed. I think ...
10 years, 7 months ago (2013-09-11 19:07:24 UTC) #2
Sudhir
Thanks for the comment. The current behaviour on the tests cannot be preserved without leading ...
10 years, 7 months ago (2013-09-11 22:43:14 UTC) #3
rsc
You don't have to rewrite the bytes.Buffer. Just put the " in always, and then ...
10 years, 7 months ago (2013-09-11 23:47:28 UTC) #4
Sudhir
Hello golang-dev@googlegroups.com, rsc@golang.org (cc: golang-dev@googlegroups.com), Please take another look.
10 years, 7 months ago (2013-09-12 11:40:44 UTC) #5
Sudhir
I have uploaded a new version of the code. It now preserves current behavior for ...
10 years, 7 months ago (2013-09-12 11:40:45 UTC) #6
gobot
Replacing golang-dev with golang-codereviews.
10 years, 4 months ago (2013-12-20 16:26:04 UTC) #7
gobot
Replacing golang-dev with golang-codereviews.
10 years, 4 months ago (2013-12-20 16:26:10 UTC) #8
gobot
Replacing golang-dev with golang-codereviews.
10 years, 4 months ago (2013-12-20 16:26:17 UTC) #9
gobot
9 years, 4 months ago (2014-12-19 05:12:50 UTC) #10
R=close

To the author of this CL:

The Go project has moved to Gerrit Code Review.

If this CL should be continued, please see the latest version of
https://golang.org/doc/contribute.html for instructions on
how to set up Git and the Go project's Gerrit codereview plugin,
and then create a new change with your current code.

If there has been discussion on this CL, please give a link to it
(golang.org/cl/13659043 is best) in the description in your
new CL.

Thanks very much.
Sign in to reply to this message.

Powered by Google App Engine
RSS Feeds Recent Issues | This issue
This is Rietveld f62528b