Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: slight reorder of conditions in loop results in much slower code #60611

Closed
eliben opened this issue Jun 5, 2023 · 4 comments
Closed
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Milestone

Comments

@eliben
Copy link
Member

eliben commented Jun 5, 2023

What version of Go are you using (go version)?

$ gotip version
go version devel go1.21-26a90e4e Mon Jun 5 19:18:13 2023 +0000 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/eliben/.cache/go-build"
GOENV="/home/eliben/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/eliben/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/eliben/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.20.1"
GCCGO="gccgo"
GOAMD64="v1"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-O2 -g"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-O2 -g"
CGO_FFLAGS="-O2 -g"
CGO_LDFLAGS="-O2 -g"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build1665273903=/tmp/go-build -gno-record-gcc-switches"

What did you do?

These two functions are equivalent:

func countFor1(s string) int {
	result := 0
	for i := 0; i < len(s); {
		if s[i]&1 == 1 {
			result += 1
		}
		i++
	}

	return result
}

func countFor2(s string) int {
	result := 0
	for i := 0; i < len(s); {
		if s[i]&1 == 1 {
			result += 1
			i++
		} else {
			i++
		}
	}

	return result
}

However, the second is ~2x slower on AMD64 (benchmark: https://go.dev/play/p/IlYydhYWUGl)

Looking at the disassembly, it seems like for countFor2 a useless bounds check is generated on the hot loop path - it's missing in the disasm of countFor1. Moreover, countFor1 generates a conditonal move instruction to update result, while counteFor2 uses another branch.

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jun 5, 2023
@eliben
Copy link
Member Author

eliben commented Jun 5, 2023

I just checked this on arm64 and there countFor2 is 3.5x slower than countFor1.

@dr2chase dr2chase added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jun 6, 2023
@mknyszek mknyszek added this to the Backlog milestone Jun 7, 2023
@cuonglm
Copy link
Member

cuonglm commented Jun 7, 2023

This seems to be a failure in induction variable finding.

Currently, the induction variable is being detected as (ind min next), where next is (Add inc ind). In countFor2, next is another Phi, with two defs (Add inc ind) for if-else branches.

I tried a quick hack to detect this case, and the bound check is removed, but countFor2 is still 1.5x slower than countFor1 on amd64.

I haven't had time to dig deeper.

Kindly cc @randall77 , (IIRC) the last person who touch findIndVar.

@randall77
Copy link
Contributor

@eliben Where did you come across this example code? I'm wondering how often this occurs in practice. I think we'd want to know how often this happens in real code before we commit to complicating the induction variable pass to handle it.

@eliben
Copy link
Member Author

eliben commented Aug 25, 2023

Came back to this to evaluate, and unfortunately this is another instance of #27400 (comment)

Having added a sink to both, the results are very different:

goos: linux
goarch: amd64
cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
BenchmarkFor1-8   	    2180	    539757 ns/op	1852.69 MB/s
BenchmarkFor2-8   	    2434	    480561 ns/op	2080.90 MB/s

Seems like this issue can be closed

@eliben eliben closed this as completed Aug 25, 2023
@dmitshur dmitshur closed this as not planned Won't fix, can't repro, duplicate, stale Aug 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Projects
None yet
Development

No branches or pull requests

8 participants