cmd/compile: inefficient CALL setup when more than 32bytes of args #23377

Open
ALTree opened this issue Jan 8, 2018 · 3 comments
Labels
binary-size compiler/runtime Issues related to the Go compiler and/or runtime. Performance

@ALTree
Member

ALTree commented Jan 8, 2018

$ gotip version
go version devel +a62071a209 Sat Jan 6 04:52:00 2018 +0000 linux/amd64

type T struct {
	s1, s2 string
}

//go:noinline
func foo(t T) { _ = t }

func bar() {
	var t T
	foo(t)
}

generates

0x0020 00032 (test.go:14)	MOVUPS	X0, (SP)
0x0024 00036 (test.go:14)	MOVUPS	X0, 16(SP)
0x0029 00041 (test.go:14)	CALL	"".foo(SB)

but when T has one more field

type T struct {
	s1, s2, s3 string    // one more string
}

it generates

0x001d 00029 (test.go:13)	XORPS	X0, X0
0x0020 00032 (test.go:13)	MOVUPS	X0, "".t+48(SP)
0x0025 00037 (test.go:13)	MOVUPS	X0, "".t+64(SP)
0x002a 00042 (test.go:13)	MOVUPS	X0, "".t+80(SP)
0x002f 00047 (test.go:13)	MOVQ	SP, DI
0x0032 00050 (test.go:14)	LEAQ	"".t+48(SP), SI
0x0037 00055 (test.go:14)	DUFFCOPY	$854
0x004a 00074 (test.go:14)	CALL	"".foo(SB)

The stack frame is bigger: first we MOVUPS a bunch of zeros to 48/64/80(SP), then we call DUFFCOPY to copy them again to (SP). This seems wasteful. Even past the threshold where multiple MOVs are replaced by a DUFF call, it should be possible to just DUFFZERO at (SP), which is essentially what the first snippet does.

This also happens when there's no zeroing going on. For example, for struct { a, b, c, d int64 }, when initialized as t = {1, 2, 3, 4}, the values are moved directly to (SP), but for struct { a, b, c, d, e int64 }, which is bigger than 32 bytes, they aren't: there are 5 moves high into the stack, and then a DUFFCOPY call moves them to (SP).

@randall77
Contributor

This is a special case of the more general problem that large structs (> 4 words) aren't handled efficiently.
It is on my radar, and I've even attempted a CL. But I don't like my solution yet.

@ALTree
Member Author

ALTree commented Jan 8, 2018

@randall77 thanks. I'll let you decide whether it's worth keeping this open to track the issue, or whether we can close it since the general problem is already known.

@randall77
Contributor

Let's leave it open for now. It will be good as an example to double-check when the general issue gets fixed.

@randall77 randall77 added this to the Unplanned milestone Jan 8, 2018
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 13, 2022