Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: io: require io.Reader to return either n > 0 or an error #27531

Closed
crawshaw opened this issue Sep 6, 2018 · 26 comments
Closed

proposal: io: require io.Reader to return either n > 0 or an error #27531

crawshaw opened this issue Sep 6, 2018 · 26 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Proposal
Milestone

Comments

@crawshaw
Copy link
Member

crawshaw commented Sep 6, 2018

Today the Read method of an io.Reader is allowed to return 0, nil. It is uncommon and discouraged:

Implementations of Read are discouraged from returning a zero byte
count with a nil error, except when len(p) == 0. Callers should treat a
return of 0 and nil as indicating that nothing happened; in particular
it does not indicate EOF.

It is so uncommon that it is hard to remember to deal with this case when using an io.Reader. All correct calls to Read need to be wrapped in a loop that tries multiple times, and after some ill-defined cutoff, report an error like io.ErrNoProgress. Consider the correct example of reading from bufio.Reader:

// Read new data: try a limited number of times.
for i := maxConsecutiveEmptyReads; i > 0; i-- {
	n, err := b.rd.Read(b.buf[b.w:])
	if n < 0 {
		panic(errNegativeRead)
	}
	b.w += n
	if err != nil {
		b.err = err
		return
	}
	if n > 0 {
		return
	}
}
b.err = io.ErrNoProgress

For Go 2 I propose we change the contract of io.Reader so implementations are required to return either n > 0 or a non-nil error.

@crawshaw crawshaw added the v2 A language change or incompatible library change label Sep 6, 2018
@gopherbot gopherbot added this to the Proposal milestone Sep 6, 2018
@bcmills
Copy link
Contributor

bcmills commented Sep 6, 2018

I think if we wanted to do that we would also need to change the name of the method.

We want Go 1 and Go 2 APIs to be able to coexist in the same program, and per https://research.swtch.com/vgo-import, that requires that “when you change a function’s behavior, you also change its name.”

@ianlancetaylor ianlancetaylor changed the title proposal: io: require io.Reader to return either n > 0 or an error proposal: Go 2: io: require io.Reader to return either n > 0 or an error Sep 6, 2018
@ianlancetaylor
Copy link
Contributor

It would be useful to know whether there are any existing implementations of io.Reader that can return zero with no error. If there are any, why do they exist?

See #5309 and #5310. In particular #5309 suggests that (*tls.Conn).Read() could return (0, nil). Is that still the case?

@bcmills
Copy link
Contributor

bcmills commented Sep 6, 2018

It would be useful to know whether there are any existing implementations of io.Reader that can return zero with no error. If there are any, why do they exist?

I think the net.Conns returned by net.Pipe can. I believe the reason is so that Write can unblock a pending Read without closing the pipe, but I'm not really sure.

(I didn't find a compelling reason the last time I checked: #24205 (comment).)

@dsnet
Copy link
Member

dsnet commented Sep 6, 2018

Though undocumented, the flate.Reader can return (0, nil) when it encounters successive Z_SYNC_FLUSH markers. There are tests to preserve this behavior and some network protocols depend on this.

EDIT: I just checked. flate.Reader actually only returns on SYNC_FLUSH if there at least some number of bytes read (example). That said, I got an email recently complaining that the reader did not return on all SYNC_FLUSH, which would imply that it may return (0, nil).

@ianlancetaylor ianlancetaylor added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Oct 10, 2018
@splace
Copy link

splace commented Jun 28, 2019

surely the contract, put as a conversation, is...

reader.Read == 'i want some bytes in this buffer, let me know how many you have put in it and if there is a problem that effects me repeating this action'.

the reader Reads without knowledge of how many are available, isn't this isolation a fundamental point of a reader.

@nhooyr
Copy link
Contributor

nhooyr commented Jan 20, 2021

It is interesting that https://golang.org/pkg/io/#Reader discourages this but allows it.

Implementations of Read are discouraged from returning a zero byte count with a nil error, except when len(p) == 0.

It appears this was considered when designing the io package. Perhaps to allow readers where Read can be used to poll for data to be read instead of blocking? It's unidiomatic for sure but that's the only reason I can think of. And anyone who definitely wants to block uses io.ReadFull anyway.

@ianlancetaylor
Copy link
Contributor

@nhooyr The guidelines for io.Reader have evolved over time. We've come to understand more clearly how a Read method should behave, and we've tried to capture that in the io.Reader documentation, but at the same time we don't want to break existing types that already have Read methods.

@ianlancetaylor ianlancetaylor changed the title proposal: Go 2: io: require io.Reader to return either n > 0 or an error proposal: io: require io.Reader to return either n > 0 or an error Jun 21, 2023
@ianlancetaylor ianlancetaylor removed the v2 A language change or incompatible library change label Jun 21, 2023
@rsc
Copy link
Contributor

rsc commented Jun 21, 2023

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc
Copy link
Contributor

rsc commented Jun 28, 2023

I don't see a way to do this compatibly. There exist implementations that can return zero-length successful reads. Any reader that is reading and stopping on packet boundaries might do that, for example. This is another example of something that we can't easily change without invalidating all the implementations in the wild, which means we can't really change it.

@rsc
Copy link
Contributor

rsc commented Jul 5, 2023

Based on the discussion above, this proposal seems like a likely decline.
— rsc for the proposal review group

@rsc
Copy link
Contributor

rsc commented Jul 12, 2023

No change in consensus, so declined.
— rsc for the proposal review group

@slonopotamus
Copy link

If returning 0, nil is a valid Reader implementation, shouldn't bufio.maxConsecutiveEmptyReads be removed because it doesn't combine well with such readers?

@ianlancetaylor
Copy link
Contributor

A reader that returns 0, nil is valid, but a reader that returns it 100 times in a row probably has something wrong with it.

@ethan-gallant
Copy link

ethan-gallant commented Feb 12, 2024

A reader that returns 0, nil is valid, but a reader that returns it 100 times in a row probably has something wrong with it.

@ianlancetaylor what about when working with CGO and compression libraries. I was under the impression you want to avoid blocking the OS thread and I'd see this as a valid use-case where it could return 0, nil 100+ times and it remain valid.

For example this library https://github.com/DataDog/zstd

Edit: Wanted to add if there's a better approach of some sort, I'm more than open to implementing it. Just from my understanding the C library itself does not have a way to safely block and hook that with Go.

@slonopotamus
Copy link

slonopotamus commented Feb 12, 2024

@ethan-gallant You possibly should create an issue in DataDog/zstd (ideally, with a repro). I agree with @ianlancetaylor, returning 0-bytes 100 times in a row is suspicious. Normally, compressing library reads uncompressed data from input stream, does its compression magic and writes result to the output. Such logic is not supposed to produce many 0-bytes results.

@dsnet
Copy link
Member

dsnet commented Feb 12, 2024

Some compression formats have the concept of a "sync flush" where a special block is a signal to the decompressor that it should report all uncompressed data to the application. If a decompression stream contains a series of back-to-back sync flush blocks without any data in-between, I would expect it to still read (0, nil) a number of times repeatedly.

@ethan-gallant
Copy link

Some compression formats have the concept of a "sync flush" where a special block is a signal to the decompressor that it should report all uncompressed data to the application. If a decompression stream contains a series of back-to-back sync flush blocks without any data in-between, I would expect it to still read (0, nil) a number of times repeatedly.

Agreed, additionally 100 is a very ambiguous catch-all case. What makes 100 so significant here, why not 1000? Who's to say how frequently a consumer may call read or the frequency at which they do it.

@ianlancetaylor
Copy link
Contributor

My opinion is that we should leave bufio as is unless and until somebody finds a real world problem. The current code was worked out in the discussion on https://go.dev/cl/76400048.

@ethan-gallant
Copy link

@ianlancetaylor the real-world problem is that using bufio.Reader with implementations that return 0, nil results in an unrecoverable error.

While the documentation says "this is discouraged" it does not state it is unsupported.

@ianlancetaylor
Copy link
Contributor

@ethan-gallant What I mean by a real-world problem is a real existing Go program that breaks because of how bufio currently behaves, and would be fixed if bufio's behavior changes. I would also be willing to accept a real existing Go program that has extra code to work around the current bufio behavior.

A Reader that returns 0, nil upon occasion will not lead to an unrecoverable error when using bufio. A problem will only arise with a Reader that returns 0, nil 100 times in a row. Do any such implementations of Reader actually exist in real Go programs outside of tests? If not, it's not worth worrying about.

@ethan-gallant
Copy link

@ethan-gallant What I mean by a real-world problem is a real existing Go program that breaks because of how bufio currently behaves, and would be fixed if bufio's behavior changes. I would also be willing to accept a real existing Go program that has extra code to work around the current bufio behavior.

A Reader that returns 0, nil upon occasion will not lead to an unrecoverable error when using bufio. A problem will only arise with a Reader that returns 0, nil 100 times in a row. Do any such implementations of Reader actually exist in real Go programs outside of tests? If not, it's not worth worrying about.

This library when compressing large images and reading them with bufio.Reader produces the error. From what I gather it's related to how the library interacts with CGO and the ZStd library.

Not sure if there's an alternate way, but the zstd library does have a similar implementation that doesn't translate well into Go bindings for this reason.

Link: https://github.com/DataDog/zstd

@ianlancetaylor
Copy link
Contributor

When in actual practice are those packages going to return 0, nil 100 times in a row?

@ethan-gallant
Copy link

When in actual practice are those packages going to return 0, nil 100 times in a row?

Whenever you're compressing a large object, like a database backup or container image that exceeds 600MB on a slower CPU.

@ianlancetaylor
Copy link
Contributor

I'm sorry, I don't understand how that could happen. Can you show an example? Thanks.

@ethan-gallant
Copy link

An example is a reader that returns 0, nil without blocking. I'm a bit short on time and making a short example is a little tedious (writer, buffer, reader, chunk size, cgo) but I can guarantee it is an issue.

Maybe it's isolated to https://github.com/DataDog/zstd but regardless, it's documented that returning 0, nil is a valid reader implementation, it is simply "discouraged" so I'd assume internals packages like this would work alongside that assumption.

@slonopotamus
Copy link

As I understand, @ianlancetaylor wants a real-world usecase, not an artificially crafted reader that would infinitely return 0, nil.

If you say that https://github.com/DataDog/zstd is capable of doing that, do you have a repro code?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Proposal
Projects
Status: Declined
Development

No branches or pull requests

10 participants