Navigation Menu

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: optimize type switch #18492

Closed
bradfitz opened this issue Jan 2, 2017 · 15 comments
Closed

cmd/compile: optimize type switch #18492

bradfitz opened this issue Jan 2, 2017 · 15 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. Performance
Milestone

Comments

@bradfitz
Copy link
Contributor

bradfitz commented Jan 2, 2017

In code review https://go-review.googlesource.com/c/34773/, John Doe points out that a type switch in an inner loop can be optimized by pulling it out of the loop, setting an integer, and switching on the integer value instead.

John Doe provided this benchmark: https://play.golang.org/p/6LXF82e6U4

With tip, I get:

$ go test -v -bench=. -benchtime=2s
BenchmarkTypeSwitchInside-4         2000           1298710 ns/op
BenchmarkTypeSwitchOutside-4       10000            475899 ns/op

These seem like they could be identical. In both cases, it's just a switch on an integer. In the "Inside" case, that integer just happens to be in the first word of an interface value.

/cc @randall77 @mdempsky @josharian

@bradfitz bradfitz added this to the Go1.9Maybe milestone Jan 2, 2017
@josharian
Copy link
Contributor

cc @dr2chase who likes hoisting expensive operations out of loops

@mvdan
Copy link
Member

mvdan commented Jan 2, 2017

Just to clarify, is this about optimizing all type switches or just about those that can be moved outside of a loop?

@josharian
Copy link
Contributor

As I understand it, moving expensive things out of loops. But maybe I'm missing something.

@minux
Copy link
Member

minux commented Jan 3, 2017 via email

@mvdan
Copy link
Member

mvdan commented Jan 3, 2017

I agree that the former is more interesting. It's also what I would understand given the issue title.

@randall77
Copy link
Contributor

randall77 commented Jan 3, 2017

In go 1.8, the type switch compiles fairly small. The non-empty-interface-to-concrete-type switch compiles to just a few instructions.

func f(t I) int {
	switch t.(type) {
	case *T:
		return 1
	}
	return 0
}

generates for the switch

	0x0000 00000 (tmp1.go:14)	MOVQ	"".t+8(FP), AX    // AX = itab of interface
	0x0005 00005 (tmp1.go:14)	TESTQ	AX, AX
	0x0008 00008 (tmp1.go:14)	JEQ	52                // nil interface
	0x000a 00010 (tmp1.go:14)	MOVQ	8(AX), CX         // CX = type
	0x000e 00014 (tmp1.go:14)	MOVL	16(CX), DX        // DX = hash of type
	0x0011 00017 (tmp1.go:14)	CMPL	DX, $432690315
	0x0017 00023 (tmp1.go:14)	JNE	52
	0x0019 00025 (tmp1.go:14)	TESTQ	AX, AX
	0x001c 00028 (tmp1.go:14)	JEQ	$0, 62
	0x001e 00030 (tmp1.go:18)	LEAQ	type.*"".T(SB), AX
	0x0025 00037 (tmp1.go:14)	CMPQ	AX, CX
	0x0028 00040 (tmp1.go:14)	JNE	$0, 52
	0x002a 00042 (tmp1.go:16)	MOVQ	$1, "".~r1+24(FP)
	0x0033 00051 (tmp1.go:16)	RET
	0x0034 00052 (tmp1.go:18)	MOVQ	$0, "".~r1+24(FP)
	0x003d 00061 (tmp1.go:18)	RET
	0x003e 00062 (tmp1.go:14)	MOVQ	AX, CX
	0x0041 00065 (tmp1.go:18)	JMP	30

It's not optimal yet (the redundant TESTQ), but it's pretty small. The code is even slightly smaller when switching on empty interfaces.

Possibly for small switches we could get rid of the hash comparison and check the type pointers directly. (The hash enables binary search. We'd have to do linear search if we just used the type pointers.)

For nonempty interfaces, maybe we could even compare the itab with the address of the now-statically-known I/*T itab struct? Then we wouldn't even need to load the type out of the itab.

The more general question, how to pull this stuff out of the loop altogether, is harder. It involves memory operations so it isn't clear lifting is legal just from looking at the SSA form. We'd have to understand that some of the loads are from never-changing memory locations.

@gopherbot
Copy link

CL https://golang.org/cl/34810 mentions this issue.

@rasky
Copy link
Member

rasky commented Jan 4, 2017

For larger switches, given that we already have a compile-time known hash per type, it looks like it would be quite easy to use something along the lines of MRST. In other words, a small sequence of bits in those hashes could be used as perfect hashing to index a small jump table.

@randall77
Copy link
Contributor

Jump tables: #5496, #10870

@josharian
Copy link
Contributor

Jump tables see also #15780 (comment)

gopherbot pushed a commit that referenced this issue Feb 13, 2017
When doing i.(T) for non-empty-interface i and concrete type T,
there's no need to read the type out of the itab. Just compare the
itab to the itab we expect for that interface/type pair.

Also optimize type switches by putting the type hash of the
concrete type in the itab. That way we don't need to load the
type pointer out of the itab.

Update #18492

Change-Id: I49e280a21e5687e771db5b8a56b685291ac168ce
Reviewed-on: https://go-review.googlesource.com/34810
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
@ghost
Copy link

ghost commented Jun 30, 2017

With go1.9beta2, I get

BenchmarkTypeSwitchInside-4         2000            687881 ns/op
BenchmarkTypeSwitchOutside-4        3000            443001 ns/op

@bradfitz bradfitz modified the milestones: Go1.10, Go1.9Maybe Jun 30, 2017
@bradfitz bradfitz added the NeedsFix The path to resolution is known, but the work has not been done. label Jun 30, 2017
@bradfitz bradfitz modified the milestones: Go1.10, Go1.11 Nov 28, 2017
@bradfitz bradfitz modified the milestones: Go1.11, Unplanned May 18, 2018
@ghost
Copy link

ghost commented Nov 19, 2018

Go 1.11.2:

BenchmarkTypeSwitchInside-4         3000            513823 ns/op
BenchmarkTypeSwitchOutside-4        3000            444997 ns/op

@odeke-em
Copy link
Member

For Go1.15, here are the results

$ benchstat before.txt after.txt 
name          old time/op  new time/op  delta
TypeSwitch-8   576µs ± 4%   579µs ± 4%   ~     (p=0.548 n=5+5)

after

$ go test -run=^$ -bench=. -count=5 
goos: darwin
goarch: amd64
pkg: github.com/odeke-em/bugs/golang/18492
BenchmarkTypeSwitchInside-8    	    2011	    572145 ns/op
BenchmarkTypeSwitchInside-8    	    2148	    555198 ns/op
BenchmarkTypeSwitchInside-8    	    2050	    579787 ns/op
BenchmarkTypeSwitchInside-8    	    2040	    586727 ns/op
BenchmarkTypeSwitchInside-8    	    1983	    583754 ns/op
BenchmarkTypeSwitchOutside-8   	    2128	    576922 ns/op
BenchmarkTypeSwitchOutside-8   	    2080	    557053 ns/op
BenchmarkTypeSwitchOutside-8   	    2116	    588772 ns/op
BenchmarkTypeSwitchOutside-8   	    2005	    582071 ns/op
BenchmarkTypeSwitchOutside-8   	    2013	    592535 ns/op
PASS
ok  	github.com/odeke-em/bugs/golang/18492	13.956s

Perhaps we can close this issue, as the tertiary improvements have their own issues.

@josharian
Copy link
Contributor

I wonder whether the thing that "fixed" this was https://go-review.googlesource.com/c/go/+/228106

@josharian
Copy link
Contributor

Either way, yes, let's call this good. Thanks for checking, @odeke-em.

@golang golang locked and limited conversation to collaborators May 29, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. Performance
Projects
None yet
Development

No branches or pull requests

8 participants