Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: reordering struct field accesses can alter performance #22479

Open
mvdan opened this issue Oct 28, 2017 · 1 comment
Open

cmd/compile: reordering struct field accesses can alter performance #22479

mvdan opened this issue Oct 28, 2017 · 1 comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Milestone

Comments

@mvdan
Copy link
Member

mvdan commented Oct 28, 2017

What version of Go are you using (go version)?

go version devel +47c868dc1c Sat Oct 28 11:53:49 2017 +0000 linux/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/mvdan/go/land:/home/mvdan/go"
GORACE=""
GOROOT="/home/mvdan/tip"
GOTOOLDIR="/home/mvdan/tip/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build268906853=/tmp/go-build -gno-record-gcc-switches"

What did you do?

https://play.golang.org/p/d5g7tuaxHW

go test -bench=.

What did you expect to see?

Both of them performing equally.

What did you see instead?

With 6 runs of each on an idle machine:

name          time/op
Separate-4    2.48ns ± 1%
Contiguous-4  2.15ns ± 2%

This is a minifiedin version of a performance issue I had in a very hot function. In particular, the function is the ASCII fast path of a rune advance method.

Here's the assembly for the two funcs:

"".(*T).separate STEXT nosplit size=26 args=0x10 locals=0x0
        0x0000 00000 (f_test.go:9)      TEXT    "".(*T).separate(SB), NOSPLIT, $0-16
        0x0000 00000 (f_test.go:9)      MOVQ    "".t+8(SP), AX
        0x0005 00005 (f_test.go:10)     INCQ    8(AX)
        0x0009 00009 (f_test.go:11)     MOVQ    $0, (AX)
        0x0010 00016 (f_test.go:12)     MOVQ    8(AX), AX
        0x0014 00020 (f_test.go:12)     MOVQ    AX, "".~r0+16(SP)
        0x0019 00025 (f_test.go:12)     RET
"".(*T).contiguous STEXT nosplit size=29 args=0x10 locals=0x0
        0x0000 00000 (f_test.go:15)     TEXT    "".(*T).contiguous(SB), NOSPLIT, $0-16
        0x0000 00000 (f_test.go:15)     MOVQ    "".t+8(SP), AX
        0x0005 00005 (f_test.go:16)     MOVQ    $0, (AX)
        0x000c 00012 (f_test.go:17)     MOVQ    8(AX), CX
        0x0010 00016 (f_test.go:17)     INCQ    CX
        0x0013 00019 (f_test.go:17)     MOVQ    CX, 8(AX)
        0x0017 00023 (f_test.go:18)     MOVQ    CX, "".~r0+16(SP)
        0x001c 00028 (f_test.go:18)     RET

Funnily enough, the faster one results in an extra instruction. How that makes sense is beyond me. Perhaps it's because the first one accesses 8(AX) three times, and the second only twice?

My understanding of the compiler and assembly are limited, so any pointers welcome.

/cc @randall77 @philhofer

@randall77
Copy link
Contributor

Pasting code here:

func (t *T) separate() int {
	t.b++
	t.a = 0
	return t.b
}

func (t *T) contiguous() int {
	t.a = 0
	t.b++
	return t.b
}

separate needs an additional load, because the compiler currently thinks that the write to t.a invalidates the t.b value that it already loaded. So it needs to reload the t.b value instead of using the registerized value.

The extra instruction doesn't really matter - it's the extra load that matters.

This is a tricky problem to solve in general, because it requires some alias analysis. Perhaps something could be done for this simpler situation (different constant offsets from the same pointer).

@ianlancetaylor ianlancetaylor added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Mar 29, 2018
@ianlancetaylor ianlancetaylor added this to the Unplanned milestone Mar 29, 2018
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Projects
None yet
Development

No branches or pull requests

5 participants