cmd/compile: optimize comparisons to min/max (u)ints by checking for overflow #41664

josharian · 2020-09-27T22:29:56Z

package p

import "math"

func f(x int64) bool {
	return x == math.MinInt64
}

On amd64, the core of this compiles to:

	0x0005 00005 (z.go:6)	MOVQ	$-9223372036854775808, CX
	0x000f 00015 (z.go:6)	CMPQ	AX, CX
	0x0012 00018 (z.go:6)	SETEQ	"".~r1+16(SP)

It would be cheaper and smaller instead to decrement CX and check the flags for underflow. A similar trick can be used for checking min and max ints and uints of all sizes. It might also be useful in the division fix-up code, where we must check for min int divisor.

cc @randall77 @dr2chase @martisch @mundaym

The text was updated successfully, but these errors were encountered:

quasilyte · 2021-11-29T15:38:21Z

I tried to implement this. The idea looks interesting, at least on the surface.

Unfortunately, I get 5-8% slowdown in comparison with the original ("unoptimized") code. It does generate smaller code, but it runs slower.

The benchmarking code:

package foo

import (
	"testing"
	"math"
)

//go:noinline
func checkInt64(i int64, u uint64) int {
	if i == math.MaxInt64 {
		if u != math.MaxUint64 {
			return 10
		}
		return 1
	}
	if i == math.MinInt64 {
		return 2
	}
	return 0
}

func BenchmarkCheckInt(b *testing.B) {
	for i := 0; i < b.N; i++ {
		checkInt64(0, 0)
		checkInt64(math.MaxInt64, 0)
		checkInt64(math.MinInt64, 0)
		checkInt64(math.MaxInt64, math.MaxUint64)
		checkInt64(math.MinInt64, math.MaxUint64)
	}
}

Optimization rules:

// Convert `x == int64max` to increment+overflow check,
// for int64min test do a decrement+overflow check.
// For unsigned values, check for the carry flag instead.
//
// TODO: we could emit INC/DEC instead of ADDQ, but that would require a separate
// flag-aware SSA op: INC/DEC do not set carry flag, so we can't produce them for ADDQconstcarry.
// This rewrite needs overflow flags that are modified by both ADD and INC instructions.
((EQ|NE) (CMPQ <flags> x intmin:(MOVQconst [c])) yes no) && c == math.MinInt64 && intmin.Uses == 1 =>
  ((OS|OC) (Select1 <flags> (ADDQconstcarry [-1] x)) yes no)
((SETEQ|SETNE) (CMPQ <flags> x intmin:(MOVQconst [c]))) && c == math.MinInt64 && intmin.Uses == 1 =>
  ((SETO|SETNO) (Select1 <flags> (ADDQconstcarry [-1] x)))
((EQ|NE) (CMPQ <flags> x intmax:(MOVQconst [c])) yes no) && c == math.MaxInt64 && intmax.Uses == 1 =>
  ((OS|OC) (Select1 <flags> (ADDQconstcarry [1] x)) yes no)
((SETEQ|SETNE) (CMPQ <flags> x intmax:(MOVQconst [c]))) && c == math.MaxInt64 && intmax.Uses == 1 =>
  ((SETO|SETNO) (Select1 <flags> (ADDQconstcarry [1] x)))
// For the reference:
// SETB is "overflow flag is set" (SETCS in asm)
// SETAE is "overflow flag is not set" (SETCC in asm); also known as SETNB
// UGE is AJCS in asm
// ULT is AJCC in asm
((EQ|NE) intmax:(CMPQconst <flags> x [c]) yes no) && uint64(c) == math.MaxUint64 && intmax.Uses == 1 =>
  ((ULT|UGE) (Select1 <flags> (ADDQconstcarry [1] x)) yes no)
((SETEQ|SETNE) (CMPQ <flags> x intmin:(MOVQconst [c]))) && uint64(c) == math.MaxUint64 && intmin.Uses == 1 =>
  ((SETB|SETAE) (Select1 <flags> (ADDQconstcarry [1] x)))

I can submit a CL with tests if anyone is interested in seeing the whole thing.

Note: I haven't tried changing ADDQ 1/-1 to INC/DEC. I'm not sure it can help there.

Note: gcc/clang for C and C++ seem not to use this optimization idea either. Maybe there is a good reason not to do this.

I see two ways right now:

Figuring out what's wrong with the results (using different bench or reading the CPU counters for more info) or the implementation.
Marking this idea as wontfix with this issue being closed.

This is better than someone else spending their evening (or weekend) pursuing something that will result in disappointment. :)

josharian · 2021-11-29T21:38:54Z

Thanks so much for digging into this! Sounds like we shouldn't do this.

josharian added Performance help wanted labels Sep 27, 2020

josharian added this to the Unplanned milestone Sep 27, 2020

andybons added the NeedsFix The path to resolution is known, but the work has not been done. label Sep 29, 2020

josharian closed this as completed Nov 29, 2021

golang locked and limited conversation to collaborators Nov 29, 2022

gopherbot added the FrozenDueToAge label Nov 29, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cmd/compile: optimize comparisons to min/max (u)ints by checking for overflow #41664

cmd/compile: optimize comparisons to min/max (u)ints by checking for overflow #41664

josharian commented Sep 27, 2020

quasilyte commented Nov 29, 2021

josharian commented Nov 29, 2021

cmd/compile: optimize comparisons to min/max (u)ints by checking for overflow #41664

cmd/compile: optimize comparisons to min/max (u)ints by checking for overflow #41664

Comments

josharian commented Sep 27, 2020

quasilyte commented Nov 29, 2021

josharian commented Nov 29, 2021