Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bufio: small buffer size and Unicode character trigger flush at wrong time #22883

Closed
ooghry opened this issue Nov 26, 2017 · 12 comments
Closed
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@ooghry
Copy link

ooghry commented Nov 26, 2017

What version of Go are you using (go version)?

go version go1.9.2 windows/amd64

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

set GOARCH=amd64
set GOBIN=
set GOEXE=.exe
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOOS=windows
set GOPATH=D:\gopath
set GORACE=
set GOROOT=C:\Go
set GOTOOLDIR=C:\Go\pkg\tool\windows_amd64
set GCCGO=gccgo
set CC=gcc
set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0
set CXX=g++
set CGO_ENABLED=1
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config

What did you do?

https://play.golang.org/p/wR1TryoaVK
Hi there
If create new buffer writer with a size of 3. then write 1byte character and then write 2byte character.
golang trigger flush only the first(1byte) character and put second character in buffer.
I expect flush all 2 characters.

What did you expect to see?

Buffered 0
Buffered 1
Buffered 3
Available 0
writer: "aب"
Buffered 1
Buffered 2
writer: "cd"

What did you see instead?

Buffered 0
Buffered 1
writer: "a"
Buffered 2
Available 1
Buffered 3
writer: "بc"
Buffered 1
writer: "d"

@robpike robpike changed the title small buffer size and Unicode character trigger flush at wrong time bufio: small buffer size and Unicode character trigger flush at wrong time Nov 27, 2017
@robpike
Copy link
Contributor

robpike commented Nov 27, 2017

I admit the behavior is a little odd but so is the program. Also, there is no violation here of any contract, as the package does not specify when the writes will occur.

I see no actual problem here. It is a complaint about internal details of a package that is working according to the documented behavior.

@ooghry
Copy link
Author

ooghry commented Nov 27, 2017

@bradfitz
Copy link
Contributor

Closing, as this is working as intended. It's a bytes.Buffer, not a runes.Buffer. It is totally unaware of the encoding of things you put inside of it, whether that's UTF-8 or anything else.

@bradfitz
Copy link
Contributor

Sorry, I too quickly assumed this was about UTF-8 boundaries.

@bradfitz bradfitz reopened this Nov 27, 2017
@bradfitz bradfitz added this to the Go1.11 milestone Nov 27, 2017
@bradfitz bradfitz added the NeedsFix The path to resolution is known, but the work has not been done. label Nov 27, 2017
@ooghry
Copy link
Author

ooghry commented Nov 27, 2017

@bradfitz could I push this change https://github.com/ooghry/go/commit/40c953593febd83b0f07c7ea8ae2259a03781aa4 to repository?
(by the following rules in golang.org/doc/contribute.html)

@bradfitz
Copy link
Contributor

Yes you may. No promises it'd be accepted but that'd be the next step to get a review. Be sure to include a test.

@gopherbot
Copy link

Change https://golang.org/cl/79920 mentions this issue: bufio: small buffer size and Unicode character trigger flush at wrong time

@gopherbot
Copy link

Change https://golang.org/cl/80095 mentions this issue: bufio: small buffer size and Unicode character trigger flush at wrong time

@ianlancetaylor
Copy link
Contributor

The code in CL 79920 is slightly less efficient than the existing code, and I don't see any reason that the existing code is wrong. The package doesn't make any promise that the buffering is exact, and indeed even with CL 79920 it remains inexact in that it doesn't break multi-byte runes across buffer boundaries.

Why should we change this?

@bradfitz
Copy link
Contributor

The old code was over-aggressively flushing on runes that encoded to fewer than 4 bytes. For instance, ASCII rune writes would flush 3 bytes too early.

@ianlancetaylor
Copy link
Contributor

The code seems to have a special case for ASCII rune writes. I think the only premature flush is for 2 and 3 byte runes.

@bradfitz
Copy link
Contributor

@ianlancetaylor, oh, I didn't see that. Gerrit folded that context away and I didn't expand it.

Okay, I've come around to your opinion. This isn't really worth it. Closing.

@golang golang locked and limited conversation to collaborators May 29, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

5 participants