I have confirmed this for
go version devel +fcff3ba Mon Jan 12 02:09:50 2015 +0000 linux/amd64
go version go1.4 linux/amd64
I know that there is a 64kb limit to the Scanner buffer. My issue mainly deals with successive calls to Scan() with a scanner that has encountered a token that is too long. When a Scanner encounters a token that exceeds 64kb, a call to Scan() returns false and its token field is empty. However, it seems that if Scan() is called a second time on the same Scanner, this then populates the token field of the Scanner up to 64kb and returns true. If a third, fourth, ..., Nth call to Scan() is made, the token field is empty and returns false.
Here is an example:
...
file, _ := os.Open("line.txt") // file has a single line that exceeds the 64kb limit
scanner := bufio.NewScanner(file)
var ret bool
ret = scanner.Scan() // ret is false, scanner.Text() is empty, scanner.Err() reports that the token is too long
ret = scanner.Scan() // ret is true, scanner.Text() holds the first 64kb of the line, scanner.Err() still reports that the token is too long
ret = scanner.Scan() // ret is false, scanner.Text() is empty again, scanner.Err() still reports that the token is too long
...
I would argue that successive calls to Scan() in these situations should give a consistent result, or else the documentation should state that a second call to Scan() returns the first 64kb of the token. I would also like to argue for making the 64kb token limit in bufio.Scanner clear in the documentation, as it may save some headaches.
I know this issue is low priority, so I would be happy to take it on.
The documentation says that Scan "returns false when the scan stops". Expecting to scan meaningfully after the scan has stopped seems wishful thinking at best. As the doctor says, don't do that.
I would rather not document the buffer size since it may change and I don't want people writing code that depends on the specific value. This is a convenience API, after all.