encoding/csv: Lazy quotes reader does not properly parse quoted fields that end in a bare quote #56329
Comments
I didn't try to understand every case. I just note that you say that for row 2 you expect If I'm correct--and I may not be--then we aren't going to change this because it would change the existing behavior of this package. In general we want to change this package as little as possible because experience tells us that any change will break current users. The fix for your case may be to edit the input or the output, or to copy the package (it's pretty small) and change it to suit your purposes.
Hey @ianlancetaylor, sorry for the confusion. I had some copy/paste errors in the expected output and have fixed them. The second row in the first example should, as you have stated, get returned as
Per RFC 4180, CSV permits a
The first
This is weird but it seems consistent.
Ah I see, yes, according to the spec it would require a preceding I wrote the following to a file
This seems to handle the double quotes more correctly. I also imported it to a Google Sheet and got the same output:
We've learned over the years that there is a vast variety of CSV formats out there. Rather than add knobs for all of them, we've decided that the standard library package will focus only on the RFC 4180 format. We added the configuration knob For people who need a different format, we suggest copying the package--it's only a few hundred lines of code--and modifying it for your purpose.
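For readers following along, here is a minimal sketch of the difference between strict RFC 4180 parsing and the lazy-quotes mode named in the issue title. It assumes the setting under discussion is the `LazyQuotes` field of `csv.Reader`. The `parse` helper and the sample inputs are illustrative and do not come from the issue.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// parse reads every record from input; lazy toggles Reader.LazyQuotes.
// (Hypothetical helper, written only for this example.)
func parse(input string, lazy bool) ([][]string, error) {
	r := csv.NewReader(strings.NewReader(input))
	r.LazyQuotes = lazy
	return r.ReadAll()
}

func main() {
	// Illustrative input: a bare quote inside an unquoted field.
	const bare = "a \"quoted\" word,b\n"

	// Strict mode reports a parse error for the bare quote.
	_, err := parse(bare, false)
	fmt.Println("strict error:", err != nil)

	// Lazy mode keeps the quotes as literal characters of the field.
	recs, err := parse(bare, true)
	fmt.Printf("lazy: %q %v\n", recs, err)

	// RFC 4180 escaping: a doubled quote inside a quoted field
	// stands for one literal quote, in either mode.
	recs, err = parse("\"say \"\"hi\"\"\",b\n", false)
	fmt.Printf("escaped: %q %v\n", recs, err)
}
```

The point of the sketch is that lazy mode only relaxes where a lone quote may appear. A doubled quote inside a quoted field is still read as an escaped quote.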
Closing as working as intended for
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Yes. I also saw the same issue on 1.18.
What operating system and processor architecture are you using (go env)?
What did you do?
What did you expect to see?
What did you see instead?
Notes
"*"*"*",
""*"*",
"*"",
""*"", (this is really a subset of the previous pattern)
I looked into the csv library code and the culprit seems to be these two lines. When finding a closing ", the line position should not increment, similar to how the lazy quotes case does not increment the line position when it finds an opening " in a quoted field. The removal of quotes should always be handled by this block of code, never in the subsequent switch statement. The premature increment of the line position causes one of two things: (1) the next iteration of the loop hits this clause in the if/else statement because there are no more quotes in the line, and it assumes the end of the line has been reached (squashing any subsequent fields in the line and, potentially, the following lines); or (2) the reader reads until the next quote in the line, squashing all of the fields in the line together (which can lead to case (1) if the next quote is the last rune of the field).