Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

io: ReadFull should not forward io.ErrUnexpectedEOF from underlying reader #47677

Open
dsnet opened this issue Aug 12, 2021 · 4 comments
Open
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@dsnet
Copy link
Member

dsnet commented Aug 12, 2021

The logic for io.ReadFull returns errors from the underlying io.Reader verbatim.

In the case where io.Reader returns io.ErrUnexpectedEOF, it is impossible for the caller of io.ReadFull to distinguish between whether:

  • the underlying io.Reader abruptly ended and is in a truncated state, or
  • the underlying io.Reader ended correctly and there is no more data.

To avoid this ambiguity, we should wrap any io.ErrUnexpectedEOF from the underlying io.Reader.

However, I suspect that this is a breaking change and the best that we can hope for is to just document this.

\cc @josharian @catzkorn

@dmitshur dmitshur added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Aug 14, 2021
@dmitshur dmitshur added this to the Backlog milestone Aug 14, 2021
@dmitshur
Copy link
Contributor

CC @ianlancetaylor, @rsc.

@rsc
Copy link
Contributor

rsc commented Aug 14, 2021

I agree that we can't change this at this point:
underlying reader errors are never wrapped in these utility functions.

I'm curious how you are using io.ReadFull that this matters, though.
The point of ReadFull(n) is to say "I am certain that there are n bytes to be read, and I want to read them."
If the underlying reader only has, say, n-1 bytes,
it doesn't really matter whether the underlying reader thinks
there's a "good EOF" or "bad EOF" after those n-1 bytes.
Either way, ReadFull knows it is a bad EOF.

Looking at this from a different perspective, it's really only ever expected that
ReadFull, ReadAtLeast, and their ilk will return ErrUnexpectedEOF.
A general reader implementation should probably not return
a plain ErrUnexpectedEOF from an ordinary Read.
It should instead return a more detailed error.

So maybe the problem is in the specific Reader you are using:
it shouldn't be returning a bare ErrUnexpectedEOF from Read in the first place.

@josharian
Copy link
Contributor

I'm curious how you are using io.ReadFull that this matters, though.

We had a (pooled) buffer to read into, and a data source of unknown size. We wanted to read as much as possible into the buffer, until either the buffer was full (oops, grab a bigger buffer) or we reached EOF.

I believe the code in question was using ReadFull because it was the most obvious convenience function in package io that let you bring a buffer. And it worked fine, modulo this issue.

The fix was to write our own custom read loop. That was probably always the right answer, but it was a surprise to find a little footgun lurking in ReadFull, thus this issue. It might be a nice addition to package io. (The semantics of Read means that read helpers are required for ~all uses of io.Reader.)

So maybe the problem is in the specific Reader you are using:
it shouldn't be returning a bare ErrUnexpectedEOF from Read in the first place.

In the case that inspired this issue, the underlying reader is net/http.Request.Body. If the connection is closed during a read, it returns ErrUnexpectedEOF. We can’t change that either.

@dsnet
Copy link
Member Author

dsnet commented Aug 14, 2021

The code that @josharian is describing looked something like:

var buf []byte
switch n1, err := io.ReadFull(r, smallBuf); err {
case io.EOF, io.ErrUnexpectedEOF:
	buf = smallBuf[:n]
case nil:
	switch n2, err := io.ReadFull(r, bigBuf[n1:]); err {
	case io.EOF, io.ErrUnexpectedEOF:
		copy(bigBuf[:n1], smallBuf[:n1])
		buf = bigBuf[:n1+n2]
	case nil:
		return errors.New("input too large")
	default:
		return err
	}
default:
	return err
}

This code is really weird because the error conditions are entirely flipped (i.e., nil is an error, while io.EOF and io.ErrUnexpectedEOF are success). We had a truncation bug where if the underlying reader returned io.ErrUnexpectedEOF, then we treated that as success (oops).

We ended up writing our own io.ReadFull-like function with inverted semantics:

// ReadUntilEOF continually reads r into b until io.EOF.
// It returns nil if io.EOF is encountered.
// It returns io.ErrShortBuffer if the entirety of the buffer is filled,
// which indicates that a larger buffer is needed.
func ReadUntilEOF(r io.Reader, b []byte) (n int, err error) {
	for n < len(b) && err == nil {
		var nn int
		nn, err = r.Read(b[n:])
		n += nn
	}
	switch {
	case err == io.EOF:
		return n, nil
	case n == len(buf):
		return n, io.ErrShortBuffer
	default:
		return n, err
	}
}

Using ReadUntilEOF, our earlier snippet became much more readable:

var buf []byte
switch n1, err := ReadUntilEOF(r, smallBuf); err {
case nil:
	buf = smallBuf[:n]
case io.ErrShortBuffer:
	switch n2, err := io.ReadFull(bigBuf[n1:]); err {
	case nil:
		copy(bigBuf[:n1], smallBuf[:n1])
		buf = bigBuf[:n1+n2]
	case io.ErrShortBuffer:
		return errors.New("input too large")
	default:
		return err
	}
default:
	return err
}

I doubt our use-case of needing to read as much of r into a buffer is unique. I wonder if there is a justified use-case for a io.ReadFull-like function with inverted semantics (i.e., io.ReadUntilEOF).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

4 participants