Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fmt, encoding/gob: fix misuse of Read #3472

Closed
minux opened this issue Apr 4, 2012 · 25 comments
Closed

fmt, encoding/gob: fix misuse of Read #3472

minux opened this issue Apr 4, 2012 · 25 comments
Labels
FrozenDueToAge Suggested Issues that may be good for new contributors looking for work to do.
Milestone

Comments

@minux
Copy link
Member

minux commented Apr 4, 2012

http://groups.google.com/group/golang-nuts/browse_thread/thread/9ecc8f21b65a0061
@rsc
Copy link
Contributor

rsc commented Apr 4, 2012

Comment 1:

The documentation is correct.  Read can return both 0,nil and len(buf),err.

Labels changed: removed documentation.

@minux
Copy link
Member Author

minux commented Apr 4, 2012

Comment 2:

also encoding/gob:
  63 func decodeUintReader(r io.Reader, buf []byte) (x uint64, width int, err error) {
  64     width = 1
  65     _, err = r.Read(buf[0:width])
  66     if err != nil {
Should we also explicitly mention that in documentation?

@rsc
Copy link
Contributor

rsc commented Apr 4, 2012

Comment 3:

Mention what?  I think the documentation on io.Reader is very clear.

@minux
Copy link
Member Author

minux commented Apr 4, 2012

Comment 4:

Mention that the caller of Read() should prepare for (0, nil) and (len(buf), err).
(I think although these two cases can be inferred from the doc, they are pretty easy to
overlook.)

@gopherbot
Copy link

Comment 5 by germanium@gmx.us:

I agree, there is confusion about it and we found code not prepared for these cases.
They are somewhat special cases and explicitly mentioning them would avoid any doubt. I
for one lost some time into it ;)
The doc says:
"If _some_ data is available but not len(p) bytes, Read conventionally returns what is
available instead of waiting for more."
It could make someone think that if _no_ data is available, Read just wait until it can
return either some bytes or an error (I know it's not saying this, but it's easy to take
like this).
I think it's safe to use fmt.ReadFull() for reading to a one-byte slice without checking
the count:
_, err = r.reader.Read(r.pendBuf[0:1])  //wrong
_, err = fmt.ReadFull(r.reader, r.pendBuf[0:1])  //ok

@gopherbot
Copy link

Comment 6 by germanium@gmx.us:

s/fmt.ReadFull/io.ReadFull/

@gopherbot
Copy link

Comment 7 by germanium@gmx.us:

I agree, they are somewhat edge cases and I think they are worth mentioning even if they
don't behave specially. People have the bad habit to assume too much, and it can happen
to readers as well as writers.
"If _some_ data is available but not len(p) bytes, Read conventionally returns what is
available instead of waiting for more."
I know this sentence is not expressing a requirement, but it may make you think that if
_no_ data is available, Read just wait until it can return either some bytes or an error.
io.ReadFull() should never return (0, nil) if len(slice) > 0, but it can return
(len(slice), some_error).
(looking at the source you can see it never returns (len(slice), io.EOF), but the
comment doesn't say that)

@minux
Copy link
Member Author

minux commented Jun 4, 2012

Comment 8:

i really think allowing Read to return len(buf),err is a bad idea.
For example, according to io.ReadFull docs:
"It returns the number of bytes copied and an error if fewer bytes were read."
what if the reader returns the required number of bytes and also an error?
the current implementation of io.ReadFull will return len(buf),err (unless
err == io.EOF) but i think this doesn't agree with the docs.
http://play.golang.org/p/zibK2130bQ

@rsc
Copy link
Contributor

rsc commented Jun 4, 2012

Comment 9:

This would be a significant backwards-incompatible API change (it
would impose extra constraints on implementers of io.Reader). The doc
comment on io.Reader explains the rationale for Read.
ReadFull is different.
Russ

@minux
Copy link
Member Author

minux commented Jun 4, 2012

Comment 10:

so ReadFull is buggy?
I guess there are still places with this kind of false assumptions.

@rsc
Copy link
Contributor

rsc commented Jun 5, 2012

Comment 11:

It is possible - I am not promising - that it would be okay to put
if n >= min {
    err = nil
}
at the end of io.ReadAtLeast. I think that is consistent with the
comments, but I am still uncomfortable throwing away the error
information. I am undecided.

@gopherbot
Copy link

Comment 12 by kballard:

I see no reason that ReadFull cannot return (len(p),nil). It says "It returns the number
of bytes copied and an error if fewer bytes were read". That's "if", not "if and only
if". It may be worth clarifying that this can return (len(p),err), but I don't think the
functionality needs to change.
As for (0,nil), that seems to be to be bad behavior. What Read() implementation in its
right mind would ever do that? I think we should make sure no Read() implementation
right now can return (0,nil) and then adjust the documentation to point out that (0,nil)
is not a valid return value. As an example of how (0,nil) is bad, trying to use
ReadFull() with a Read() that returns (0,nil) will spin forever, and it would not be
unexpected to find other code that has similarly pathological behavior in the face of
(0,nil).

@minux
Copy link
Member Author

minux commented Jun 5, 2012

Comment 13:

though it doesn't say "if and only if", i think people will be thinking that way.
(i really think it should mean "iff", because otherwise it's almost useless)
you can grep the standard library for '_, err := io.ReadFull'. imo, the main
reason people use io.ReadFull is to:
1. avoid do the Read() in a loop
2. don't need to bother with partial reads (i.e., doesn't want to check n)
i think the problem is that, although we don't document the requirements
for Read(), in practice, people can and will make some assumption about
this behavior. And what's worse, different people might take different
assumptions. It's far worse than we say explicitly Read() shouldn't do
this and that. your 0,nil example just demonstrates this point; we can't
use Read() without any implicit assumption about its behavior.

@gopherbot
Copy link

Comment 14 by kballard:

I think it's fair to say ReadFull should never return (len(p),io.EOF), but it should
certainly feel free to return (len(p),SomeOtherError). If you care about the ability to
both handle an error that happens after reading n bytes, as well as processing those n
bytes anyway, then you should be checking the first return value from the function. The
problem with discarding the error when the entire buffer is read is this is assuming
that subsequent calls to Read() will return the same error. While I believe subsequent
calls to Read() should in fact return the same error, I don't believe this is actually
required (or even necessarily should be required, in case someone comes up with a
transient error value).
That said, a Read() implementation that returns (len(p), ErrThatsNotIoEOF) might be
considered broken anyway. It's certainly unexpected to be handed an error (besides
io.EOF) at the same time as you're handed all the data you asked for. If we really want
to fix ReadFull to never return (len(p),err) then this may be done best by specifying
that Read() should not return (len(p),errBesidesIoEOF).
I agree that we shouldn't make implicit assumptions, which is why we should explicitly
document the acceptable return values from Read(). Any implementation which returns
(0,nil) will break code (e.g. cause ReadFull() to go into an infinite loop, etc). I
suspect that most gophers right now assume Read() cannot return (0,nil), and are
therefore writing code that assumes that either n > 0 or err != nil. This assumption
should be codified in the documentation so nobody decides to implement a non-blocking
reader that returns (0,nil) (this is actually the perfect example of a transient
error—such a non-blocking reader should return (0,ErrThatIndicatesNoData), but
subsequent Read()s could then return (>0,nil), although this may be better-implemented
as a separate function than Read()).

@gopherbot
Copy link

Comment 15 by germanium@gmx.us:

In current code, Read loops stop at the first error (or the second, but that would be a
mistake) and Readers built on top of other Readers usually save the first error from the
underlying reader and never Read it again.
Therefore if someone comes up with a transient error, that error would be treated as
fatal (to the read) by all current code.
I think the only advantage of not requiring that subsequent Reads return the same error
is so that Readers don't have to make efforts to guarantee it, but the requirement would
make life easier for callers.

@gopherbot
Copy link

Comment 16 by germanium@gmx.us:

In existing code, Read loops stop at the first error (or the second, but that would be a
mistake) and Readers built on top of other Readers usually save the first error from the
underlying Reader and never Read it again.
Therefore if someone comes up with a transient error, that error would be treated as
fatal (to the process of reading) by all existing code.
I think the only advantage of not requiring that subsequent Reads return the same error
is so that Readers don't have to make efforts to guarantee it, but the requirement would
make life easier for callers.

@gopherbot
Copy link

Comment 17 by germanium@gmx.us:

A reader that returns (0,nil) doesn't necessarily break ReadFull or other code: in the
worst case it introduces busy-waiting.
The usual way of getting data from a generic io.Reader in a loop, combined with
non-blocking Readers means busy-waiting, is that acceptable?

@gopherbot
Copy link

Comment 18 by germanium@gmx.us:

A reader that returns (0, nil) doesn't necessarily break ReadFull or similar code: in
the worst case it introduces busy-waiting.
As I said in my previous comment, transient errors would require changes in all that's
currently built on top of a generic io.Reader. If we are not going to do that, the only
way to do non-blocking is by returning (0, nil) when not ready, which means busy-waiting.
If busy-waiting is not acceptable we may as well require that io.Readers are blocking.

@gopherbot
Copy link

Comment 19 by germanium@gmx.us:

A lot of existing Read callers make one big assumption: that if err != nil you have got
all the data you can possibly get from there (maybe you really can get more data, or
maybe the Reader can be reset someway, but if you are dealing with a generic io.Reader
you can't know that).
I think this assumption should be in the docs.
We can't have unspecified transient errors and expect generic io.Reader callers to
handle them correctly. Even with specification I don't think the added complexity would
be worth it, and a lot of code would need fixing. E.g. each composite Reader would need
to manage transient errors from the underlying Reader.
A reader that returns (0, nil) doesn't necessarily break ReadFull or similar code: in
the worst case it introduces busy-waiting.
Currently the only way for a non-blocking reader to be used as an io.Reader is by
returning (0, nil) when not ready, which means busy-waiting in Read loops.
If busy-waiting is not acceptable we may as well require that io.Readers are blocking. I
think you can abstract Readers more effectively if you abstract blocking and
non-blocking separately or if the abstraction always behave as blocking.

@gopherbot
Copy link

Comment 20 by germanium@gmx.us:

A lot of existing Read callers make one big assumption: that if err != nil you have got
all the data you can possibly get from there (maybe you really can get more data, or
maybe the Reader can be reset someway, but if you are dealing with a generic io.Reader
you can't know that).
I think this assumption should be in the docs.
We can't have unspecified transient errors and expect io.Reader users to handle them
correctly. Even with specification I don't think the added complexity would be worth it,
and a lot of code would need fixing. E.g. each composite Reader would need to manage
transient errors from the underlying Reader.
A reader that returns (0, nil) doesn't necessarily break ReadFull or similar code: in
the worst case it introduces busy-waiting.
Currently the only way for a non-blocking reader to be used as an io.Reader is by
returning (0, nil) when not ready, which means busy-waiting in Read loops.
If busy-waiting is not acceptable we may as well require that io.Readers are blocking. I
think you can abstract Readers more effectively if you abstract blocking and
non-blocking separately or if the abstraction always behave as blocking.

@rsc
Copy link
Contributor

rsc commented Sep 12, 2012

Comment 21:

Labels changed: added go1.1.

@rsc
Copy link
Contributor

rsc commented Dec 9, 2012

Comment 22:

Redesigning Read is not an option.
Code that says:
    _, err := r.Read(buf[:1])
    if err != nil {
        ...
    }
should instead say
    n, err := r.Read(buf[:1])
    if n < 1 {
        ...
    }
and code with larger buffers should use ReadFull.

Owner changed to ---.

@rsc
Copy link
Contributor

rsc commented Dec 10, 2012

Comment 23:

Labels changed: added size-m.

@rsc
Copy link
Contributor

rsc commented Dec 10, 2012

Comment 24:

Labels changed: added suggested.

@minux
Copy link
Member Author

minux commented Dec 17, 2012

Comment 25:

This issue was closed by revision 5a2c275.

Status changed to Fixed.

@minux minux added fixed Suggested Issues that may be good for new contributors looking for work to do. labels Dec 17, 2012
@rsc rsc added this to the Go1.1 milestone Apr 14, 2015
@rsc rsc removed the go1.1 label Apr 14, 2015
@golang golang locked and limited conversation to collaborators Jun 24, 2016
This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge Suggested Issues that may be good for new contributors looking for work to do.
Projects
None yet
Development

No branches or pull requests

3 participants