Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/csv: Add a "field reader" type #21370

Closed
earthboundkid opened this issue Aug 9, 2017 · 6 comments
Closed

encoding/csv: Add a "field reader" type #21370

earthboundkid opened this issue Aug 9, 2017 · 6 comments
Labels
FeatureRequest FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Milestone

Comments

@earthboundkid
Copy link
Contributor

Python has a nice type called csv.DictReader. For those not familiar, here are its docs:

class csv.DictReader(f, fieldnames=None, restkey=None, restval=None, dialect='excel', *args, **kwds)

Create an object that operates like a regular reader but maps the information in each row to an OrderedDict whose keys are given by the optional fieldnames parameter.

The fieldnames parameter is a sequence. If fieldnames is omitted, the values in the first row of file f will be used as the fieldnames. Regardless of how the fieldnames are determined, the ordered dictionary preserves their original ordering.

If a row has more fields than fieldnames, the remaining data is put in a list and stored with the fieldname specified by restkey (which defaults to None). If a non-blank row has fewer fields than fieldnames, the missing values are filled-in with None.

All other optional or keyword arguments are passed to the underlying reader instance.

So, it's just a convenience wrapper around their CSV reader that puts rows into dictionaries/maps with the keys being either supplied or from the file header.

I wrote a similar wrapper for Go: https://godoc.org/github.com/carlmjohnson/csv. I'd like it if it or something like it could be included in the standard library. The dictionary/map part is not so interesting to me, and I'd consider dropping it, but the important part is being able to refer to a column by a name instead of a number. Without that feature, you can make constants for the column numbers, but that's a bit awkward, and you mostly just end up littering your code with magic numbers. Also, the handling of EOF in the standard CSV reader is a little clunky.

@bradfitz bradfitz added FeatureRequest NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. labels Aug 9, 2017
@bradfitz bradfitz added this to the Go1.10 milestone Aug 9, 2017
@ianlancetaylor
Copy link
Contributor

https://golang.org/doc/faq#x_in_std

Personally I'd rather encoding/csv got moved out of the standard library rather than giving it more features.

@earthboundkid
Copy link
Contributor Author

Could be a good Go 2 proposal: remove encoding/csv. 😄

The main reasons I want my additions in the stdlib are: I can slightly simplify the code if I could access encoding/csv's package internals and it's more convenient to use a stdlib package than an external package. In theory, go dep should resolve the second issue.

@josharian
Copy link
Contributor

I'd be happy to see encoding/csv move out of the standard library. It might be a good trial run for moving more things out of std, and maybe how it works in practice to break backwards compatibility as we do so. (Do people take the code upgrade hit or stay with the deprecated stdlib version anyway? What kind of docs and tools are needed in practice? Etc.)

I'd like to see it move to an x repo. Even go4.org feels pretty dependent on one person. And moving it to x would I think help adoption/migration and help avoid unnecessary fragmentation.

Maybe @nussjustin would be interested in spearheading the fork and maintaining the package going forward? I doubt it'd require all that much ongoing maintenance, perhaps after an initial first burst of activity.

@nussjustin
Copy link
Contributor

I know this is something commonly needed/wanted (last time I needed/wanted something like this was today...), but I'm not convinced that we should add this to the stdlib.

encoding/csv is a small package which a small API that can be easily extended via other packages. Adding a whole new type like the proposed field reader seems rather heavy to me, especially when the type itself doesn't really depend on encoding/csv itself, as in this case. If we decide to do this, I'd prefer a new method or two on Reader instead of a new type.

Considering this I'd be ok with having something like this outside the stdlib e.g. in an golang.org/x repo, but not really in the stdlib. At this end I agree with @ianlancetaylor and @josharian.

Speaking of @josharian ...

I'd be happy to see encoding/csv move out of the standard library.

Not just you.

It might be a good trial run for moving more things out of std, and maybe how it works in practice to break backwards compatibility as we do so.

I like the idea. If there is anything I can help with, just tell me.

Maybe @nussjustin would be interested in spearheading the fork and maintaining the package going forward? I doubt it'd require all that much ongoing maintenance, perhaps after an initial first burst of activity.

See my previous sentence :-)

@rsc
Copy link
Contributor

rsc commented Aug 21, 2017

I like having encoding/csv in the standard library. I'm not one of the standard library minimalists.

But this is easy to build on top of the encoding/csv in the standard library. Why put it in? Also, it's far less efficient when reading big files than the slice.

@earthboundkid
Copy link
Contributor Author

Closing the issue because there’s not enough interest in it, and it’s not hard to do this as a third party.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FeatureRequest FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Projects
None yet
Development

No branches or pull requests

7 participants