encoding/csv: add Reader.InputOffset method #43401

Closed
ghost opened this issue Dec 28, 2020 · 6 comments

Comments

@ghost

ghost commented Dec 28, 2020

// InputOffset returns the input stream byte offset of the current reader
// position. The offset gives the location of the end of the most recently
// read row and the beginning of the next row. 
func (r *Reader) InputOffset() int64

encoding/json.Decoder already has a method like this:

https://golang.org/pkg/encoding/json/#Decoder.InputOffset

@gopherbot gopherbot added this to the Proposal milestone Dec 28, 2020
@ianlancetaylor ianlancetaylor added this to Incoming in Proposals (old) Dec 28, 2020
@ianlancetaylor
Contributor

For the encoding/json package it made sense to add InputOffset, because there was no straightforward way for the code calling encoding/json and providing the io.Reader to know which token was read last. That is a little less convincing for encoding/csv, which is a line-oriented format. The io.Reader can keep track of which line is being processed by returning one line at a time.

Still, symmetry with the other packages is a reasonable argument.

@ghost
Author

ghost commented Dec 28, 2020

It makes sense here, too. It is sometimes necessary to have random access to the rows, so that you can read some of them without reading the entire (possibly huge) file.

I've implemented a simple inverted index, where the documents are rows from CSV files. Every document ID is mapped to a filename and a row offset. When I query the index, I can read only the rows that match by seeking to the offsets.

There's also this question on StackOverflow: https://stackoverflow.com/questions/22875018/read-random-lines-off-a-text-file-in-go

@rsc
Contributor

rsc commented Jul 28, 2021

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc rsc moved this from Incoming to Active in Proposals (old) Jul 28, 2021
@rsc
Contributor

rsc commented Aug 4, 2021

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

@rsc rsc moved this from Active to Likely Accept in Proposals (old) Aug 4, 2021
@rsc rsc moved this from Likely Accept to Accepted in Proposals (old) Aug 11, 2021
@rsc
Contributor

rsc commented Aug 11, 2021

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

@rsc rsc changed the title proposal: encoding/csv: add Reader.InputOffset method encoding/csv: add Reader.InputOffset method Aug 11, 2021
@rsc rsc modified the milestones: Proposal, Backlog Aug 11, 2021
@gopherbot

Change https://go.dev/cl/405675 mentions this issue: encoding/csv: add Reader.InputOffset method

@dmitshur dmitshur modified the milestones: Backlog, Go1.19 May 14, 2022
@golang golang locked and limited conversation to collaborators May 14, 2023