
io: ReadAll buffer resizing is inefficient #50774

Open · bboreham opened this issue Jan 24, 2022 · 11 comments

Labels: Performance; NeedsInvestigation (someone must examine and confirm this is a valid issue and not a duplicate of an existing one)
Milestone: Backlog

Comments

@bboreham (Contributor)

What version of Go are you using (go version)?

$ go version
go version go1.17.5 linux/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/vagrant/.cache/go-build"
GOENV="/home/vagrant/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/vagrant/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/vagrant"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.17.5"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/vagrant/src/github.com/cortexproject/cortex/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build2696308846=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Called ioutil.ReadAll to read everything from an http response.

What did you expect to see?

Reasonable memory and CPU usage.

What did you see instead?

~60% more memory is allocated when using ioutil.ReadAll compared with bytes.Buffer.ReadFrom.

See prometheus/client_golang#976.
#46564 made the same complaint, although it was solved in a different way.

The underlying reason is that ReadAll() uses append to resize its buffer, which grows it by a factor of about 1.25 each time (slightly more in the latest Go, but asymptotically the same).
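
For reference, io.ReadAll as of Go 1.17 is essentially the following (a close paraphrase of the standard-library source, not the verbatim code); the append(b, 0)[:len(b)] trick delegates the growth policy to the runtime:

func readAll(r io.Reader) ([]byte, error) {
	b := make([]byte, 0, 512)
	for {
		if len(b) == cap(b) {
			// Buffer full: append one byte so the runtime picks a larger
			// capacity (~1.25x for big slices), then truncate back.
			b = append(b, 0)[:len(b)]
		}
		n, err := r.Read(b[len(b):cap(b)])
		b = b[:len(b)+n]
		if err != nil {
			if err == io.EOF {
				err = nil
			}
			return b, err
		}
	}
}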

One might measure these alternatives on various axes:

  • number of bytes allocated and freed
  • max size of heap required to read a given size
  • amount of copying while resizing
  • percentage waste in returned buffer

The current implementation is better on the last one, and much worse on all the others.

@mvdan added the Performance and NeedsInvestigation labels Jan 24, 2022
@ianlancetaylor changed the title from "ioutil: ReadAll buffer resizing is inefficient" to "os: ReadAll buffer resizing is inefficient" Jan 25, 2022
@ianlancetaylor added this to the Backlog milestone Jan 25, 2022

@randall77 changed the title from "os: ReadAll buffer resizing is inefficient" to "io: ReadAll buffer resizing is inefficient" Jan 25, 2022
@andig (Contributor) commented Feb 10, 2022

Is the underlying question here whether append should resize by 2x? It feels as if "memory-growing, slice-like structures" should behave similarly throughout Go.
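
One quick way to observe the current growth factor empirically (a throwaway sketch; the exact capacities printed depend on the runtime version):

package main

import "fmt"

func main() {
	// Fill the slice to capacity and append one more byte, forcing a
	// reallocation; print the capacity the runtime chose each time.
	b := make([]byte, 0, 512)
	for cap(b) < 1<<20 {
		b = append(b[:cap(b)], 0)
		fmt.Println(cap(b))
	}
}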

@earthboundkid (Contributor)

bytes.Buffer.grow(int) uses a different algorithm: it grows to 2*currentCap plus the requested n. It looks like this ends up being more efficient in certain cases.
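
That rule, paraphrased (the real bytes.Buffer.grow also tries to reslice or slide data in place before allocating):

// On reallocation, bytes.Buffer moves to a slice of capacity
// 2*cap(buf)+n, so its capacity at least doubles each time.
func grow(buf []byte, n int) []byte {
	newBuf := make([]byte, len(buf), 2*cap(buf)+n)
	copy(newBuf, buf)
	return newBuf
}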

@bboreham (Contributor, Author)

Consistency is a fine thing, but I would expect blogs, books, etc. to advise "never call ReadAll" if it remains as is.

Stretching my imagination, some cases can avoid memory-growing: for instance when reading the body of an http.Request with ContentLength, the implementation of ReadAll could use a special interface to obtain the expected size. Or, more simply, there could be a ReadAllWithExpectedSize() helper.
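
A sketch of what such a helper could look like (ReadAllWithExpectedSize is hypothetical; the name and signature are made up here for illustration):

// ReadAllWithExpectedSize is hypothetical: the caller supplies the size
// it expects (e.g. http.Response.ContentLength) so the buffer can be
// allocated once up front instead of grown incrementally.
func ReadAllWithExpectedSize(r io.Reader, expected int) ([]byte, error) {
	if expected < 0 {
		expected = 512 // same initial capacity io.ReadAll uses today
	}
	buf := bytes.NewBuffer(make([]byte, 0, expected))
	_, err := buf.ReadFrom(r)
	return buf.Bytes(), err
}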

@CAFxX (Contributor) commented May 18, 2022

We could maybe make ReadAll forward to Copy with something like this?

func ReadAll(r Reader) ([]byte, error) {
	sw := new(sliceWriter)
	_, err := Copy(sw, r)
	return *sw, err
}

type sliceWriter []byte

func (sw *sliceWriter) Write(b []byte) (int, error) {
	*sw = append(*sw, b...)
	return len(b), nil
}

This would also add support for the WriterTo logic implemented in Copy (so something like a source bytes.Buffer or bufio.Reader should be able to just write directly to the destination buffer).

Alternatively, if forwarding to Copy is undesirable because of the larger temporary buffer¹, we could just add WriterTo forwarding in the current logic:

func ReadAll(r Reader) ([]byte, error) {
	if wt, ok := r.(WriterTo); ok {
		sw := new(sliceWriter)
		_, err := wt.WriteTo(sw)
		return *sw, err
	}
	// ...existing ReadAll logic...
}

Footnotes

¹ ReadAll does not really make any guarantee in this sense, so maybe this is not really a valid concern.

@bboreham (Contributor, Author) commented Aug 5, 2022

@CAFxX can you clarify how this helps with the resizing of the destination buffer?

@CAFxX (Contributor) commented Aug 6, 2022

@bboreham sure, it boils down to io.Copy being able to optimize the copy by forwarding to io.ReaderFrom on the destination io.Writer, or to io.WriterTo on the source io.Reader.

In this specific case, the source io.Reader implements io.WriterTo, so the copy is optimized - and this completely sidesteps the resizing issue in the "reading the whole response body" case. Incidentally, this is almost the same thing that you did manually by calling bytes.Buffer.ReadFrom.
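
The forwarding inside io.Copy looks roughly like this (a paraphrase of io.copyBuffer, not the exact source):

func copySketch(dst io.Writer, src io.Reader) (written int64, err error) {
	// Copy delegates to these interfaces when either end implements them,
	// letting that end manage the buffer itself.
	if wt, ok := src.(io.WriterTo); ok {
		return wt.WriteTo(dst)
	}
	if rf, ok := dst.(io.ReaderFrom); ok {
		return rf.ReadFrom(src)
	}
	// Otherwise fall back to a read/write loop through a temporary
	// buffer (32 KiB by default in io.Copy).
	buf := make([]byte, 32*1024)
	for {
		n, rerr := src.Read(buf)
		if n > 0 {
			wn, werr := dst.Write(buf[:n])
			written += int64(wn)
			if werr != nil {
				return written, werr
			}
		}
		if rerr == io.EOF {
			return written, nil
		}
		if rerr != nil {
			return written, rerr
		}
	}
}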

name         time/op
Original-16  236µs ± 1%
Patched-16   172µs ± 2%

name         alloc/op
Original-16  291kB ± 0%
Patched-16   153kB ± 0%

name         allocs/op
Original-16   78.3 ± 1%
Patched-16    65.0 ± 0%
Benchmark source code
package issue50774

import (
	"bytes"
	"io"
	"net/http"
	"net/http/httptest"
	"testing"
)

func benchmark(b *testing.B, readall func(io.Reader) ([]byte, error)) {
	body := bytes.Repeat([]byte("x"), 1<<16)
	s := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write(body)
	}))
	defer s.Close()

	var buf []byte
	for i := 0; i < b.N; i++ {
		res, err := http.Get(s.URL)
		if err != nil {
			b.Fatal(err)
		}
		buf, err = readall(res.Body)
		if err != nil {
			b.Fatal(err)
		}
		res.Body.Close()
	}
	_ = buf
}

func BenchmarkOriginal(b *testing.B) {
	benchmark(b, io.ReadAll)
}

func BenchmarkPatched(b *testing.B) {
	benchmark(b, ReadAll)
}

func ReadAll(r io.Reader) ([]byte, error) {
	sw := new(sliceWriter)
	_, err := io.Copy(sw, r)
	return *sw, err
}

type sliceWriter []byte

func (sw *sliceWriter) Write(b []byte) (int, error) {
	*sw = append(*sw, b...)
	return len(b), nil
}

Update: just for confirmation, I also tried bytes.Buffer.ReadFrom, and it turns out my approach is even faster than that:

name               time/op
Original-16        248µs ± 4%
Patched-16         173µs ± 2%
BufferReadFrom-16  215µs ± 3%

name               alloc/op
Original-16        291kB ± 0%
Patched-16         153kB ± 0%
BufferReadFrom-16  266kB ± 0%

name               allocs/op
Original-16         79.0 ± 0%
Patched-16          65.0 ± 0%
BufferReadFrom-16   70.0 ± 0%
func BenchmarkBufferReadFrom(b *testing.B) {
	benchmark(b, func(r io.Reader) ([]byte, error) {
		// Named buf so the buffer doesn't shadow the *testing.B above.
		buf := &bytes.Buffer{}
		_, err := buf.ReadFrom(r)
		if err != nil {
			return nil, err
		}
		return buf.Bytes(), nil
	})
}

@bboreham (Contributor, Author) commented Aug 6, 2022

I think your approach is taking advantage of being able to read the entire buffer in one go.
This is not possible when reading from a socket, as in the problem that spawned this issue.

I pasted your ReadAll into that code here

Before:

% go test -run xxx -bench . -benchmem ./api            
goos: darwin
goarch: arm64
pkg: github.com/prometheus/client_golang/api
BenchmarkClient/4KB-8              28846             35189 ns/op           20543 B/op         73 allocs/op
BenchmarkClient/50KB-8             17594             68666 ns/op          135782 B/op         99 allocs/op
BenchmarkClient/1000KB-8            1515            960378 ns/op         2111125 B/op        593 allocs/op
BenchmarkClient/2000KB-8             894           1343809 ns/op         4212548 B/op       1095 allocs/op
PASS
ok      github.com/prometheus/client_golang/api 6.333s

After:

% go test -run xxx -bench . -benchmem ./api
goos: darwin
goarch: arm64
pkg: github.com/prometheus/client_golang/api
BenchmarkClient/4KB-8              30478             36265 ns/op           43158 B/op         72 allocs/op
BenchmarkClient/50KB-8             16678             67617 ns/op          129620 B/op         96 allocs/op
BenchmarkClient/1000KB-8            1338            871225 ns/op         5466073 B/op        604 allocs/op
BenchmarkClient/2000KB-8             724           1783090 ns/op        11197990 B/op       1113 allocs/op
PASS
ok      github.com/prometheus/client_golang/api 6.187s

@CAFxX (Contributor) commented Aug 6, 2022

@bboreham My benchmark is using httptest.NewServer to simulate reading the http response body from a real socket. You can also confirm this by inspecting the benchmark CPU profile and noticing how it's spending most of its time in read syscalls deep into the net package.

[screenshot: benchmark CPU profile showing most time spent in read syscalls inside the net package]

On the other hand, your benchmark does not seem to make sense: you are instantiating a bytes.Buffer but not using it, and you're throwing away the actual results of ReadAll...

@bboreham (Contributor, Author) commented Aug 6, 2022

Sorry, going a bit too fast. I corrected the Prometheus client-library version to return the body, and the benchmark results are the same.

Maybe it's just that your test response is quite small at 64K (if I read that right)?

@CAFxX (Contributor) commented Aug 6, 2022

Yup. And you're right: with significantly larger response sizes (1 MB or more), the larger buffer growth of bytes.Buffer makes bytes.Buffer.ReadFrom faster. I'll see in the next few days whether something can be done.
