New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
encoding/csv: add the true number of lines to ErrFieldCount #6770
Comments
Why is a bad precedent? I can see that we don't want to encourage blind changing of numbers to pass error checks. However, in this particular case, I don't think this error message change would tell you what is right, it would just be more specific about what is wrong. I understand the criticism with the word "expected", but the intent of the issue is to help the user understand what went wrong. |
As an addendum, I don't see the difference between this case (returning the number of fields set and the number of fields read) and, for example, returning the filename that couldn't be opened in os.Open. os.Open could just return "no such file or directory", but the filename is added so it's easier to tell if a file is unexpectedly not there, or if the filename passed to os.Open was incorrect. |
I think the easiest fix here is to add a new field "Field" to ParseError, which would contain the number of the field where the error occured (e.g. /cc @bradfitz |
So Field would be 1-based and ParseError.Field == 0 would mean no information? How does that solve the original problem when r.FieldsPerRecord is 10 but 15 rows were read. You would report Field == 11? or 15? |
Both 11 and 15 could make sense, although I think 11 makes more sense especially since in the ErrFieldCount case Read also returns the record along with the error, so the user can get the field count using len(record) even now. 15 would make sense as the code always reads the whole record before checking the field count. Although this could be seen as an implementation detail it is exposed as the returned slice. I'm personally fine with both. Field could also be 0-based. This basically only matters for the ErrFieldCount case as it's the only case where the user has an error and the records read. It would make accessing the fields more simpler (no -1 required), but I think that's all it does. |
The problem with 0-based (even if it makes more sense) is then you either have to always use it, or document on on the new ParseError.Field docs when it's defined and when it's not. Saying "when it's non-zero" is much easier than enumerating a list of reasons. |
I agree with you that the 1-based variant is easier. I'm still unsure about the value of Field for ErrFieldCount, but am leaning towards FieldsPerRecord+1. What's your opinion on this? |
If the user can already get 15 and they set 10, I think this whole bug is kinda useless. Who cares if we report 11, 12, 13, 14, or 15? The user set 10 and we returned len() of 15 and ErrFieldCount. Can't they draw their own conclusions? |
Yes. It would probably make more sense for the non-ErrFieldCount cases where the field could be more important and the user has just the column (rune index) of the error, although I can't currently think of a real use case where just the column isn't enough. I only thought about the possibility of using len(record) when writing my second comment. So yeah, this whole bug is really kinda useless in retrospective. |
The text was updated successfully, but these errors were encountered: