Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: optimize xor to complement #39710

Open
niaow opened this issue Jun 19, 2020 · 4 comments
Open

cmd/compile: optimize xor to complement #39710

niaow opened this issue Jun 19, 2020 · 4 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Milestone

Comments

@niaow
Copy link

niaow commented Jun 19, 2020

What version of Go are you using (go version)?

$ go version
go version go1.14.4 linux/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/jadenw/.cache/go-build"
GOENV="/home/jadenw/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/jadenw/GOPATH"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/lib/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build102234605=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Tried to flip bits in a large section of a bitmap using ^= ^uint64(0) in a loop.

What did you expect to see?

The compiler treats this as a bitwise complement operation.

What did you see instead?

The compiler emits an xor against a constant, and treats it differently than a bitwise complement.

https://godbolt.org/z/vxQyD3

If I understand correctly, this could be solved by adding the following rule to generic.rules:

(Xor(64|32|16|8) (Const(64|32|16|8) [-1]) x) => (Com(64|32|16|8) x)
@randall77
Copy link
Contributor

Sure, that would work.
For some architectures it won't matter, as XOR -1, R and NOT R are, as far as the chip is concerned, the same instruction. It helps a bit on x86 for code size.

@randall77 randall77 added this to the Unplanned milestone Jun 19, 2020
@cagedmantis cagedmantis added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jun 22, 2020
@shawndx
Copy link
Contributor

shawndx commented Jun 29, 2020

Would like to suggest evaluating expressions that have more than two operands as well, like:
res := -1 ^ x ^ const_c
as per our recent investigation on arm64 for rewriting 'MVN XOR' to EON, (https://go-review.googlesource.com/c/go/+/239638), reducing '-1 ^ x' too early may result in non-optimized code for aforementioned scenarios (NOT + XOR vs. XOR), since current rules in generic.rules prefer to associating constants first.

@niaow
Copy link
Author

niaow commented Jun 29, 2020

Huh. If we wanted to make them interchangeable for optimization purposes, we would probably need some rules like these:

(Xor(64|32|16|8) (Const(64|32|16|8) [c]) (Com(64|32|16|8) x)) => (Xor(64|32|16|8) (Const(64|32|16|8) [^c]) x)
(Com(64|32|16|8) (Xor(64|32|16|8) (Const(64|32|16|8) [c]) x)) => (Xor(64|32|16|8) (Const(64|32|16|8) [^c]) x)

At which point I am left wondering whether the existence of Com at this phase of compilation is actually worthwhile. Maybe this should be instead always represented as an Xor op and converted to a complement by the arch rules on platforms with a separate complement instruction?

@randall77
Copy link
Contributor

res := -1 ^ x ^ const_c

We try to lift constants to the outermost level, and then fold them. For instance, we have these rules for xor:

// x ^ (C ^ z) -> C ^ (x ^ z)
(Xor64 (Xor64 i:(Const64 <t>) z) x) && (z.Op != OpConst64 && x.Op != OpConst64) => (Xor64 i (Xor64 <t> z x))
// C ^ (D ^ x) -> (C ^ D) ^ x
(Xor64 (Const64 <t> [c]) (Xor64 (Const64 <t> [d]) x)) => (Xor64 (Const64 <t> [c^d]) x)

Which should do the right thing for this example, at least in generic.rules.

When we lower, we do need rules to combine (for amd64) NOT and XORconst. They shouldn't be hard to do.

Yes, we could eliminate Com at the generic level. I'm not sure that would buy us much, as NOTs can be introduced in other ways at the arch-specific level, and thus the arch-specific rules need to know how to combine XOR and NOT anyway.
But I'm not against removing Com if it simplifies things.

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Projects
None yet
Development

No branches or pull requests

5 participants