Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: Instruction split on amd64 #56476

Closed
klauspost opened this issue Oct 28, 2022 · 6 comments
Closed

cmd/compile: Instruction split on amd64 #56476

klauspost opened this issue Oct 28, 2022 · 6 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@klauspost
Copy link
Contributor

What version of Go are you using (go version)?

Latest version.

I just want to confirm that this is intended behavior.

Does this issue reproduce with the latest release?

Yes.

What did you do?

I was investigating assembly output from various variations and noticed some peculiar output with go tip amd64 compiler:

Code in question: bits.TrailingZeros64(a^b) >= 51

Godbolt output: https://godbolt.org/z/z1jfE85ax

Playground link of the same code: https://go.dev/play/p/UpuJvUnS1Lm

What did you expect to see?

        XORQ    CX, AX
        BSFQ    AX, CX
        MOVL    $64, DX
        CMOVQEQ DX, CX
        CMPQ    CX, $51
        SETGE   CL        

What did you see instead?

        XORQ    CX, AX
        BSFQ    AX, CX
[code not modifying AX or CX, but modifying flags]
        BSFQ    AX, DX
        MOVL    $64, DX
        CMOVQEQ DX, CX
        CMPQ    CX, $51
        SETGE   CL        

My main objection is that is re-executes BSF.

It seems to calculate the result on CX, and then later it re-computes to get the flag for the CMOV. If code is simplified (by removing call to A or observing the not-inlined function), this split goes away.

On Intel this is a rather expensive way to check if AX is zero. If this is a more common pattern it seems like it should be avoided. I know I cannot expect perfect assembly, but this seems a bit odd to me.

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Oct 28, 2022
@heschi
Copy link
Contributor

heschi commented Oct 28, 2022

cc @golang/compiler , though questions might be better suited for golang-dev.

@heschi heschi added this to the Backlog milestone Oct 28, 2022
@heschi heschi added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Oct 28, 2022
@renthraysk
Copy link

Only see two BSFQ instructions with go version devel go1.20-b726b0cadb Fri Oct 28 23:35:08 2022 +0000 linux/amd64

One is in B() and the other is from B() being in-lined into main().

@wdvxdr1123
Copy link
Contributor

I find go tip in compiler explorer is not go tip. I add -goversion in compiler options, only to see:
compile: version "devel go1.18-088bb4b Tue Nov 2 06:25:39 2021 +0000" does not match go tool version "-nolocalimports"

And I can't reproduce with current go tip(go version devel go1.20-3c17053bba Sat Oct 29 00:40:18 2022 +0000 linux/amd64)

@klauspost
Copy link
Contributor Author

@wdvxdr1123 I can reproduce locally on 1.19.2 - you must check the main() function where it is inlined:

	0x00ad 00173 (d:\temp\2\main.go:18)	XORQ	CX, AX
	0x00b0 00176 (d:\temp\2\main.go:22)	BSFQ	AX, CX
	0x00b4 00180 (d:\temp\2\main.go:12)	XCHGL	AX, AX
	0x00b5 00181 (d:\temp\2\main.go:13)	XCHGL	AX, AX
	0x00b6 00182 (d:\temp\2\main.go:14)	MOVUPS	X15, main..autotmp_35+56(SP)
	0x00bc 00188 (d:\temp\2\main.go:14)	MOVUPS	X15, main..autotmp_35+72(SP)
	0x00c2 00194 (d:\temp\2\main.go:14)	LEAQ	type.bool(SB), DX
	0x00c9 00201 (d:\temp\2\main.go:14)	MOVQ	DX, main..autotmp_35+56(SP)
	0x00ce 00206 (d:\temp\2\main.go:18)	MOVQ	$1125899906842623, BX
	0x00d8 00216 (d:\temp\2\main.go:18)	TESTQ	AX, BX
	0x00db 00219 (d:\temp\2\main.go:18)	SETEQ	BL
	0x00de 00222 (d:\temp\2\main.go:14)	MOVBLZX	BL, BX
	0x00e1 00225 (d:\temp\2\main.go:14)	LEAQ	runtime.staticuint64s(SB), SI
	0x00e8 00232 (d:\temp\2\main.go:14)	LEAQ	(SI)(BX*8), BX
	0x00ec 00236 (d:\temp\2\main.go:14)	MOVQ	BX, main..autotmp_35+64(SP)
	0x00f1 00241 (d:\temp\2\main.go:14)	MOVQ	DX, main..autotmp_35+72(SP)
	0x00f6 00246 (d:\temp\2\main.go:22)	BSFQ	AX, DX
	0x00fa 00250 (d:\temp\2\main.go:22)	MOVL	$64, DX
	0x00ff 00255 (d:\temp\2\main.go:22)	CMOVQEQ	DX, CX
	0x0103 00259 (d:\temp\2\main.go:22)	CMPQ	CX, $51
	0x0107 00263 (d:\temp\2\main.go:22)	SETGE	CL
	0x010a 00266 (d:\temp\2\main.go:14)	MOVBLZX	CL, CX

Output above is produced with:

λ go version
go version go1.19.2 windows/amd64
λ go build -asmflags="-D=GOAMD64_v3 -S" -gcflags=-S . 2>package.asm

@wdvxdr1123
Copy link
Contributor

@klauspost Yes, I can reproduce with go 1.19.2. I think this issue has been fixed in 669ec54. I revert this commit in my local branch and got the bad assembly output.
See comment from @randall77 (https://go-review.googlesource.com/c/go/+/432275/comment/30b66b90_67cd39ef/), I think this commit does improve codegen with flags.

@klauspost
Copy link
Contributor Author

klauspost commented Oct 29, 2022

If it is gone in tip we can just close it. Thanks for taking the time to investigate.

@golang golang locked and limited conversation to collaborators Oct 29, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

5 participants