proposal: bufio.Scanner make maxConsecutiveEmptyReads overridable #43201

Closed
stergiotis opened this issue Dec 15, 2020 · 5 comments

@stergiotis

Allow the default value of maxConsecutiveEmptyReads to be overridden by adding a setter method

func (s *Scanner) MaxConsecutiveEmptyReads(max int)

analogous to the existing setter for the buffer size, Scanner.Buffer.

Using this, one could easily implement length-prefixed formats like netstring using a simple split function.

@gopherbot gopherbot added this to the Proposal milestone Dec 15, 2020
@ianlancetaylor ianlancetaylor added this to Incoming in Proposals (old) Dec 15, 2020
@ianlancetaylor
Contributor

Can you give an example of how you would use this?

@stergiotis
Author

Sure. Consider this example for reading a stream of lines with highly variable line lengths (think of the JSON Lines format).

scan := bufio.NewScanner(input)
scan.Buffer(make([]byte, largeNum), largeNum)
// Up to here we don't know how long the maximum accepted line of text may be,
// because the chunk size of the underlying io.Reader is unknown.
scan.MaxConsecutiveEmptyReads(8192) // proposed method; not part of bufio
// With this setting the scanner would be able to process lines of at least 8192 bytes
// with a worst-case chunk size of 1 byte, and much longer lines with a bigger chunk size.

This is particularly important when a split function like the one below is used:

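// Requires the "encoding/binary" import.
func lengthPrefixedUint32BESplitFunc(data []byte, atEOF bool) (advance int, token []byte, err error) {
	const prefixBytes = 4
	// Try to extract a frame before looking at atEOF: a Reader may deliver the
	// final bytes together with io.EOF, and those frames must not be dropped.
	if len(data) >= prefixBytes {
		// The 4-byte big-endian prefix holds the payload length.
		// Note: on 32-bit platforms this conversion overflows for very large lengths.
		bytesToRead := int(binary.BigEndian.Uint32(data))
		if len(data)-prefixBytes >= bytesToRead {
			// A complete frame is buffered: consume prefix and payload, return the payload.
			return prefixBytes + bytesToRead, data[prefixBytes : prefixBytes+bytesToRead], nil
		}
	}
	if atEOF {
		// No complete frame is left; any trailing partial frame is discarded.
		return 0, nil, nil
	}
	// Request more data.
	return 0, nil, nil
}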
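As a way to check whether a split function like the one above is affected by the Reader's chunk size, the following sketch feeds length-prefixed frames to a Scanner one byte per Read, using iotest.OneByteReader from the standard library. The names splitFrames, frame, and scanLengths are hypothetical helpers introduced for this example, and the 64 KiB buffer limit is an arbitrary choice.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/binary"
	"fmt"
	"testing/iotest"
)

// splitFrames splits a stream of frames with a 4-byte big-endian length prefix.
func splitFrames(data []byte, atEOF bool) (int, []byte, error) {
	const prefixBytes = 4
	if len(data) < prefixBytes {
		return 0, nil, nil // request more data (or stop at EOF)
	}
	n := int(binary.BigEndian.Uint32(data))
	if len(data)-prefixBytes < n {
		return 0, nil, nil // request more data (or stop at EOF)
	}
	return prefixBytes + n, data[prefixBytes : prefixBytes+n], nil
}

// frame prepends a 4-byte big-endian length to payload.
func frame(payload []byte) []byte {
	out := make([]byte, 4, 4+len(payload))
	binary.BigEndian.PutUint32(out, uint32(len(payload)))
	return append(out, payload...)
}

// scanLengths builds one frame per requested payload size, delivers the
// stream one byte per Read, and returns the length of every token scanned.
func scanLengths(payloadSizes ...int) ([]int, error) {
	var stream bytes.Buffer
	for _, size := range payloadSizes {
		stream.Write(frame(bytes.Repeat([]byte{'x'}, size)))
	}
	scan := bufio.NewScanner(iotest.OneByteReader(&stream))
	scan.Buffer(make([]byte, 0, 64*1024), 64*1024)
	scan.Split(splitFrames)
	var lengths []int
	for scan.Scan() {
		lengths = append(lengths, len(scan.Bytes()))
	}
	return lengths, scan.Err()
}

func main() {
	lengths, err := scanLengths(3, 8192, 0)
	fmt.Println(lengths, err) // prints [3 8192 0] <nil>
}
```

The 8192-byte frame takes thousands of split calls that return advance == 0 with a nil token, and no error is reported: each Read still delivers a byte, so the scanner keeps making progress.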

@rsc rsc moved this from Incoming to Active in Proposals (old) Dec 16, 2020
@rsc
Contributor

rsc commented Dec 16, 2020

@stergiotis Can you please post a complete program (perhaps on play.golang.org) that gets the "max consecutive empty reads" error, which you think should not be getting the error?

I fail to see how that split function would trigger the error at all. The error is only triggered when a split function repeatedly returns advance == 0, token != nil, err == nil. Your split function never does that.
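For reference, a minimal sketch of a split function that does hit the limit. In the current bufio source, the scanner counts tokens that are returned without advancing only once the input is exhausted, and Scan panics when that count exceeds maxConsecutiveEmptyReads. The names stuckAtEOF and countTokensUntilPanic are hypothetical and exist only for this demonstration.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// stuckAtEOF discards all input and then, at EOF, keeps returning a non-nil
// empty token with advance == 0 and err == nil.
func stuckAtEOF(data []byte, atEOF bool) (int, []byte, error) {
	if !atEOF {
		return len(data), nil, nil // consume buffered input without producing a token
	}
	return 0, []byte{}, nil
}

// countTokensUntilPanic scans input with stuckAtEOF and reports how many
// tokens Scan delivered, plus the recovered panic value (nil if none).
func countTokensUntilPanic(input string) (tokens int, panicValue interface{}) {
	defer func() { panicValue = recover() }()
	scan := bufio.NewScanner(strings.NewReader(input))
	scan.Split(stuckAtEOF)
	for scan.Scan() {
		tokens++
	}
	return tokens, nil
}

func main() {
	tokens, panicValue := countTokensUntilPanic("some input")
	fmt.Println("tokens before panic:", tokens)
	fmt.Println("panic value:", panicValue)
}
```

A split function that returns a nil token while waiting for more data, like the length-prefixed one earlier in this thread, never increments that counter.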

@stergiotis
Author

Oops, mea culpa. The error popped up in an edge case (at EOF), and I didn't read the implementation of bufio.Scanner carefully enough. I am sorry for the inconvenience.

@rsc rsc moved this from Active to Declined in Proposals (old) Jan 6, 2021
@rsc
Contributor

rsc commented Jan 6, 2021

No change in consensus, so declined.
— rsc for the proposal review group

@golang golang locked and limited conversation to collaborators Jan 6, 2022