Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: append is slower than make+assignments for small slices with known length #26734

Open
ghost opened this issue Aug 1, 2018 · 8 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Milestone

Comments

@ghost
Copy link

ghost commented Aug 1, 2018

What version of Go are you using (go version)?

go1.10.3 linux/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOCACHE="/home/opennota/.cache/go-build"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/opennota/gocode"
GORACE=""
GOROOT="/home/opennota/go"
GOTMPDIR=""
GOTOOLDIR="/home/opennota/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build289966161=/tmp/go-build -gno-record-gcc-switches"

What did you do?

https://play.golang.org/p/e34FVX-sqQx

go test -bench=.

What did you expect to see?

The ns/op numbers in the two tests are close.

What did you see instead?

append is 24% slower:

BenchmarkAppend-4               20000000                89.9 ns/op
BenchmarkMakeAssign-4           20000000                73.1 ns/op
@ghost
Copy link

ghost commented Aug 1, 2018

please run with go test -bench=. -benchmem

@martisch
Copy link
Contributor

martisch commented Aug 1, 2018

Note that it could be that append allocates and zeroes more memory than make+assign as append rounds up to the next best allocation bucket size. In addition s = append(s, ...) will copy the contents of the previous iteration in addition to the new work if s needs to be resized. Please also run benchmarks on a quiet system and with -count=20 as noise in the ns range when allocation are involved can be quiet high.

@ghost
Copy link
Author

ghost commented Aug 1, 2018

With -benchmem:

BenchmarkAppend-4               20000000                89.8 ns/op            80 B/op          1 allocs/op
BenchmarkMakeAssign-4           20000000                75.5 ns/op            80 B/op          1 allocs/op

With -count=20:

BenchmarkAppend-4               20000000                88.3 ns/op
BenchmarkAppend-4               20000000                84.9 ns/op
BenchmarkAppend-4               20000000                86.8 ns/op
BenchmarkAppend-4               20000000                84.8 ns/op
BenchmarkAppend-4               20000000                84.7 ns/op
BenchmarkAppend-4               20000000                86.2 ns/op
BenchmarkAppend-4               20000000                84.7 ns/op
BenchmarkAppend-4               20000000                94.7 ns/op
BenchmarkAppend-4               20000000                88.1 ns/op
BenchmarkAppend-4               20000000                85.4 ns/op
BenchmarkAppend-4               20000000                85.4 ns/op
BenchmarkAppend-4               20000000                86.0 ns/op
BenchmarkAppend-4               20000000                84.8 ns/op
BenchmarkAppend-4               20000000                87.1 ns/op
BenchmarkAppend-4               20000000                85.1 ns/op
BenchmarkAppend-4               20000000                85.4 ns/op
BenchmarkAppend-4               20000000                86.5 ns/op
BenchmarkAppend-4               20000000                85.1 ns/op                                                                    
BenchmarkAppend-4               20000000                84.8 ns/op
BenchmarkAppend-4               20000000                86.7 ns/op
BenchmarkMakeAssign-4           20000000                71.5 ns/op
BenchmarkMakeAssign-4           20000000                72.4 ns/op
BenchmarkMakeAssign-4           20000000                73.8 ns/op
BenchmarkMakeAssign-4           20000000                71.9 ns/op
BenchmarkMakeAssign-4           20000000                71.3 ns/op
BenchmarkMakeAssign-4           20000000                71.5 ns/op
BenchmarkMakeAssign-4           20000000                71.7 ns/op
BenchmarkMakeAssign-4           20000000                74.4 ns/op
BenchmarkMakeAssign-4           20000000                71.6 ns/op
BenchmarkMakeAssign-4           20000000                73.2 ns/op
BenchmarkMakeAssign-4           20000000                71.3 ns/op
BenchmarkMakeAssign-4           20000000                71.9 ns/op
BenchmarkMakeAssign-4           20000000                72.8 ns/op
BenchmarkMakeAssign-4           20000000                71.3 ns/op
BenchmarkMakeAssign-4           20000000                71.2 ns/op
BenchmarkMakeAssign-4           20000000                73.6 ns/op
BenchmarkMakeAssign-4           20000000                71.4 ns/op
BenchmarkMakeAssign-4           20000000                71.7 ns/op
BenchmarkMakeAssign-4           20000000                71.5 ns/op
BenchmarkMakeAssign-4           20000000                74.5 ns/op

benchstat:

name      old time/op  new time/op  delta
Test-4*  72.2ns ± 3%  85.8ns ± 3%  +18.84%  (p=0.000 n=20+19)

* I had to make the test names the same.

@martisch martisch added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Aug 1, 2018
@martisch
Copy link
Contributor

martisch commented Aug 1, 2018

I modified the benchmarks a bit to make more similar:
https://play.golang.org/p/2gWjX7-5D28
and also get that makeassign is faster for this case:
name time/op
MakeAssign 83.8ns ± 4%
Append 118ns ± 5%

@meirf
Copy link
Contributor

meirf commented Aug 1, 2018

@martisch should the next step be comparing the assembly of the two implementations? (I don't think most people expect append to be that close to make+assign since make+assign seems optimal.)

@martisch
Copy link
Contributor

martisch commented Aug 1, 2018

Yes looking at the assemblies could reveal if make+assign does an optimization (e.g. combining writes, avoiding some write barriers) that append does not. But it could also come down to runtime.growslice just having more overhead due to being a more general function and not being specialized for compile time known length of the appended slice (if that is the case maybe it could be detected and optimized away by the compiler).

When increasing the size of the string array append/created and number of values copied by 8x append seems to become faster:
name time/op
MakeAssign 263ns ± 3%
Append 246ns ± 3%

Some thoughts:

  • It can be looked into if the compiler could be better optimizing appends when the length of the append slice/arguments list/string is known
  • possibly (special in this case) it suffices to check only the bounds of the largest index of string slice element that is copied in. not sure how often this comes up to warrant such an optimization

@martisch martisch changed the title runtime: append is 24% slower than make+assign runtime: append is slower than make+assignments for small slices with known length Aug 1, 2018
@martisch martisch added this to the Unplanned milestone Aug 1, 2018
@go101
Copy link

go101 commented Aug 1, 2018

As @martisch suggests, this might be related to write barriers.

https://github.com/go101/go-benchmarks/tree/master/append-vs-make

related issue: #23306

@agnivade
Copy link
Contributor

The gap is closing in now (go version devel +acbfbc634e Mon Nov 11 13:53:02 2019 +0530 linux/amd64):

name            time/op
MakeAssign-4  108ns ± 2%
Append-4      118ns ± 2%

This is with the benchmarks at #26734 (comment).

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Projects
None yet
Development

No branches or pull requests

5 participants