Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/csv: cut \r before EOF in input #22937

Closed
mjgarton opened this issue Nov 30, 2017 · 10 comments
Closed

encoding/csv: cut \r before EOF in input #22937

mjgarton opened this issue Nov 30, 2017 · 10 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. release-blocker
Milestone

Comments

@mjgarton
Copy link
Contributor

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go version devel +3a395e2283 Fri Nov 17 19:00:41 2017 +0000 linux/amd64

Does this issue reproduce with the latest release?

no

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOCACHE="/home/mgarton/.cache/go-build"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/mgarton/gopath"
GORACE=""
GOROOT="/home/mgarton/go"
GOTMPDIR=""
GOTOOLDIR="/home/mgarton/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build420727529=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Tried to read a csv with Go tip that worked with Go1.9

If possible, provide a recipe for reproducing the error.
A complete runnable program is good.
A link on play.golang.org is best.

https://play.golang.org/p/AfQJBQ-2JC

What did you expect to see?

The same behaviour as in 1.9

What did you see instead?

parse error on line 1, column 2: extraneous or missing " in quoted-field

@mvdan
Copy link
Member

mvdan commented Nov 30, 2017

Dup of #22746?

@mvdan
Copy link
Member

mvdan commented Nov 30, 2017

Nope, not an exact duplicate - can reproduce on go version devel +4435fcfd6c Thu Nov 30 14:39:19 2017 +0000 linux/amd64. But note that the program above is missing a check for the error - this is the program that reproduces the output:

https://play.golang.org/p/U75TI4ouaG

$ go run f.go
2017/11/30 16:11:03 parse error on line 1, column 2: extraneous or missing " in quoted-field
exit status 1

And, as expected, on 1.9.2:

$ go1 run f.go
2017/11/30 15:54:46 things worked as I expected!

@bradfitz bradfitz changed the title encoding/csv change in \r handling encoding/csv: change in \r handling Nov 30, 2017
@bradfitz bradfitz added the NeedsFix The path to resolution is known, but the work has not been done. label Nov 30, 2017
@bradfitz bradfitz added this to the Go1.10 milestone Nov 30, 2017
@dsnet
Copy link
Member

dsnet commented Nov 30, 2017

It's not obvious that this is valid since endline terminators should be "\r\n" or "\n". However, this string abruptly ends on "\r". However, it shouldn't be hard preserving the old behavior.

@bradfitz
Copy link
Contributor

I naively agree with @dsnet, but I don't know the CSV landscape enough to know what's commonly produced or accepted.

@dsnet
Copy link
Member

dsnet commented Nov 30, 2017

Essentially, it comes down to the old behavior being equivalent to:

line = strings.TrimSuffix(strings.TrimSuffix(line, "\n"), "\r")

While the new behavior is equivalent to:

if strings.HasSuffix(line, "\r\n") {
    line = strings.TrimSuffix(line, "\r\n")
} else {
    line = strings.TrimSuffix(line, "\n")
}

The old behavior is simpler, but the new behavior is a more correct interpretation of "\r\n" and "\n" as valid newline separators. Note also that we don't treat bare "\r" as newline separators, so it doesn't make sense to me why we should allow it as the trailing marker at EOF.

Furthermore, note that the RFC actually says that newline separators should be CRLF, so the fact that we allow bare LF is technically non-compliant.

Marking this as "Needs Decision". I'm slightly in favor of saying the new behavior is correct.

@dsnet dsnet added NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. and removed NeedsFix The path to resolution is known, but the work has not been done. labels Nov 30, 2017
@mjgarton
Copy link
Contributor Author

FWIW I don't have a particular problem with the new behaviour. The reason I reported it was to be sure it wasn't changed by mistake and I didn't know if it was correct.

Whatever is decided, would an additional unit test be welcome to cover it?

@dsnet
Copy link
Member

dsnet commented Nov 30, 2017

Interestingly, I actually added a test when this behavior changed:
89ccfe4#diff-2890b6f0873dbda3ec47edcbc69f6f2aR295

@rsc
Copy link
Contributor

rsc commented Dec 1, 2017

We should probably restore the old behavior: a \r before EOF gets cut just like a \r before \n. I think it is debatable whether the old behavior or current behavior is more "correct", and therefore we should err on the side of preserving the behavior from the previous release.

@rsc rsc added the NeedsFix The path to resolution is known, but the work has not been done. label Dec 1, 2017
@gopherbot gopherbot removed the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Dec 1, 2017
@rsc rsc changed the title encoding/csv: change in \r handling encoding/csv: cut \r before EOF in input Dec 1, 2017
@rsc
Copy link
Contributor

rsc commented Dec 1, 2017

@dsnet, interested in working on a fix?

@dsnet dsnet self-assigned this Dec 1, 2017
@gopherbot
Copy link

Change https://golang.org/cl/81577 mentions this issue: encoding/csv: fold carriage return into line feeds

@golang golang locked and limited conversation to collaborators Dec 5, 2018
@rsc rsc unassigned dsnet Jun 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. release-blocker
Projects
None yet
Development

No branches or pull requests

6 participants