Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

io: MultiReader should be more efficient when chained multiReaders contain few actual readers #13558

Closed
ajwerner opened this issue Dec 9, 2015 · 2 comments
Milestone

Comments

@ajwerner
Copy link

ajwerner commented Dec 9, 2015

The specific motivating case here is when doing decoding from a stream of json. In some cases I need to make two passes to determine the appropriate struct into which to decode based on some content. The json.Decoder offers buffered in an io.Reader. Thus I have access to the in-tact stream by taking:

io.MultiReader(dec.Buffered(), r) // where r is the underlying reader.

The problem with this is that each time this process happens, a new multiReader is created to wrap these two readers. In general, the next call to read will Read the entire Reader from Buffered leaving just the underlying reader. The problem is that the last reader remains wrapped in an underlying *multiReader. So after doing this 100 times, there may be just one actual reader with data but the call to read will have to call through the multiReader.Read method 100 times.

The current constructor looks like this:

func MultiReader(readers ...Reader) Reader {
    r := make([]Reader, len(readers))
    copy(r, readers)
    return &multiReader{r}
}

I propose that we do something like this:

func MultiReader(readers ...Reader) Reader {
    r := make([]Reader, 0, len(readers))
    for _, reader := range readers {
        if mr, ok := reader.(*multiReader); ok {
            r = append(r, mr.readers...)
        } else {
            r = append(r, reader)
        }
    } 
    return &multiReader{r}
}
@bradfitz bradfitz changed the title io.MultiReader should flatten multiReaders at construction time io: MultiReader should flatten multiReaders at construction time Dec 9, 2015
@bradfitz bradfitz added this to the Go1.7 milestone Dec 9, 2015
@ajwerner
Copy link
Author

An alternative would be to pay the penalty at Read time
Below is the current source code with a proposed addition.

func (mr *multiReader) Read(p []byte) (n int, err error) {
    for len(mr.readers) > 0 {

Proposed addition:

        if len(mr.readers) == 1 {
            if r, ok := mr.readers[0].(*multiReader); ok {
                mr.readers = r.readers
                continue
            }
        }
        n, err = mr.readers[0].Read(p)
        if n > 0 || err != EOF {
            if err == EOF {
                // Don't return EOF yet. There may be more bytes
                // in the remaining readers.
                err = nil
            }
            return
        }
        mr.readers = mr.readers[1:]
    }
    return 0, EOF
}

This is probably better because it doesn't change the runtime of the function. The previous solution could make a sequence of calls to MultiReader with non-MultiReaders expensive, like when combining many short buffers. It would become O(readers contained) instead of O(readers passed), the latter of which is probably more expected. It also eliminates the possibly bad runtime from the successive calls to nested multiReaders that contain few actual Readers that originally prompted the proposal.

@gopherbot
Copy link

CL https://golang.org/cl/17873 mentions this issue.

@ajwerner ajwerner changed the title io: MultiReader should flatten multiReaders at construction time io: MultiReader should be more efficient when chained multiReaders contain few actual readers Dec 15, 2015
@golang golang locked and limited conversation to collaborators May 16, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants