Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: compiler/inliner leads to suboptimal register usage #49634

Open
holiman opened this issue Nov 17, 2021 · 1 comment
Open

cmd/compile: compiler/inliner leads to suboptimal register usage #49634

holiman opened this issue Nov 17, 2021 · 1 comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Milestone

Comments

@holiman
Copy link

holiman commented Nov 17, 2021

What version of Go are you using (go version)?

go version go1.17 linux/amd64

and

go version go1.17.3 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
GO111MODULE="on"
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/user/.cache/go-build"
GOENV="/home/user/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/user/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/user/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.17.3"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/user/go/src/github.com/ethereum/go-ethereum/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build833321828=/tmp/go-build -gno-record-gcc-switches"

What did you do?

I have these two methods, where one just wraps the other for type casting. However, the two types are really just aliases:

type OpCode byte

// GetOp returns the n'th element in the contract's byte array
func (c *Contract) GetOp(n uint64) OpCode {
	return OpCode(c.GetByte(n))
}

// GetByte returns the n'th byte in the contract's byte array
func (c *Contract) GetByte(n uint64) byte {
	if n < uint64(len(c.Code)) {
		return c.Code[n]
	}

	return 0
}

From a compiler PoV, it seems to me that these two methods should be identical. However, they are not:

TEXT github.com/ethereum/go-ethereum/core/vm.(*Contract).GetOp(SB) /home/user/go/src/github.com/ethereum/go-ethereum/core/vm/contract.go
  contract.go:146       0x9191a0                488b4858                MOVQ 0x58(AX), CX       
  contract.go:151       0x9191a4                48395860                CMPQ BX, 0x60(AX)       
  contract.go:151       0x9191a8                7606                    JBE 0x9191b0            
  contract.go:152       0x9191aa                0fb60c19                MOVZX 0(CX)(BX*1), CX   
  contract.go:146       0x9191ae                eb02                    JMP 0x9191b2            
  contract.go:146       0x9191b0                31c9                    XORL CX, CX             
  contract.go:146       0x9191b2                89c8                    MOVL CX, AX             
  contract.go:146       0x9191b4                c3                      RET                     

TEXT github.com/ethereum/go-ethereum/core/vm.(*Contract).GetByte(SB) /home/user/go/src/github.com/ethereum/go-ethereum/core/vm/contract.go
  contract.go:151       0x9191c0                488b4858                MOVQ 0x58(AX), CX       
  contract.go:151       0x9191c4                48395860                CMPQ BX, 0x60(AX)       
  contract.go:151       0x9191c8                7605                    JBE 0x9191cf            
  contract.go:152       0x9191ca                0fb60419                MOVZX 0(CX)(BX*1), AX   
  contract.go:152       0x9191ce                c3                      RET                     
  contract.go:155       0x9191cf                31c0                    XORL AX, AX             
  contract.go:155       0x9191d1                c3                      RET                     

It seems that the GetOp version makes a less optimal use of the registers: moving the result into CX, and then copy the result to AX. Whereas GetByte stores the result directly into AX and returns it.

If I modify the code, to manually inline it:

// GetOp returns the n'th element in the contract's byte array
func (c *Contract) GetOp(n uint64) OpCode {
	if n < uint64(len(c.Code)) {
		return OpCode(c.Code[n])
	}
	return OpCode(0)
}

// GetByte returns the n'th byte in the contract's byte array
func (c *Contract) GetByte(n uint64) byte {
	if n < uint64(len(c.Code)) {
		return c.Code[n]
	}

	return 0
}

Then the two outputs become identical:

TEXT github.com/ethereum/go-ethereum/core/vm.(*Contract).GetOp(SB) /home/user/go/src/github.com/ethereum/go-ethereum/core/vm/contract.go
  contract.go:146       0x9191a0                488b4858                MOVQ 0x58(AX), CX       
  contract.go:146       0x9191a4                48395860                CMPQ BX, 0x60(AX)       
  contract.go:146       0x9191a8                7605                    JBE 0x9191af            
  contract.go:147       0x9191aa                0fb60419                MOVZX 0(CX)(BX*1), AX   
  contract.go:147       0x9191ae                c3                      RET                     
  contract.go:149       0x9191af                31c0                    XORL AX, AX             
  contract.go:149       0x9191b1                c3                      RET                     

TEXT github.com/ethereum/go-ethereum/core/vm.(*Contract).GetByte(SB) /home/user/go/src/github.com/ethereum/go-ethereum/core/vm/contract.go
  contract.go:154       0x9191c0                488b4858                MOVQ 0x58(AX), CX       
  contract.go:154       0x9191c4                48395860                CMPQ BX, 0x60(AX)       
  contract.go:154       0x9191c8                7605                    JBE 0x9191cf            
  contract.go:155       0x9191ca                0fb60419                MOVZX 0(CX)(BX*1), AX   
  contract.go:155       0x9191ce                c3                      RET                     
  contract.go:158       0x9191cf                31c0                    XORL AX, AX             
  contract.go:158       0x9191d1                c3                      RET     

Tested with go1.17 linux/amd64 and go1.17.3 linux/amd64.

I thought this was a strange quirk with the compiler/inliner/optimizer, and possibly worth mentioning here. Feel free to close this issue if it's not relevant.

What did you expect to see?

That the assembly output for the two methods were identical.

What did you see instead?

Less optimal output for the version returning an alias type.

@dr2chase dr2chase added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance labels Nov 17, 2021
@dr2chase dr2chase added this to the Go1.19 milestone Nov 17, 2021
@randall77
Copy link
Contributor

Here's a simpler example:

func f(x int) int {
	if x < 5 {
		return 3
	}
	return 4
}
func g(x int) int {
	r := 4
	if x < 5 {
		r = 3
	}
	return r
}

They compute the same thing, but in f there are 2 different return blocks, and in g there is only one. The compiler doesn't duplicate return blocks in g to avoid the merge point (which would fix this issue, I think) or merge return blocks in f (which is #11189).

@randall77 randall77 modified the milestones: Go1.19, Unplanned May 10, 2022
@randall77 randall77 changed the title compiler/inliner leads to suboptimal register usage cmd/compile: compiler/inliner leads to suboptimal register usage May 10, 2022
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Projects
None yet
Development

No branches or pull requests

4 participants