    package p

    import "testing"

    var sink uint8

    func BenchmarkSub(b *testing.B) {
    	for i := 0; i < b.N; i++ {
    		sink = 64 - sink
    	}
    }
Currently, the core of the inner loop compiles to:
    0x0009 00009 (/Users/josh/src/mask_test.go:29)	MOVBLZX	"".sink(SB), DX
    0x0010 00016 (/Users/josh/src/mask_test.go:29)	ADDL	$-64, DX
    0x0013 00019 (/Users/josh/src/mask_test.go:29)	NEGL	DX
    0x0015 00021 (/Users/josh/src/mask_test.go:29)	MOVB	DL, "".sink(SB)
Disabling a few optimizations makes the compiler revert to:
    0x0009 00009 (/Users/josh/src/mask_test.go:29)	MOVBLZX	"".sink(SB), DX
    0x0010 00016 (/Users/josh/src/mask_test.go:29)	MOVL	$64, BX
    0x0015 00021 (/Users/josh/src/mask_test.go:29)	SUBL	DX, BX
    0x0017 00023 (/Users/josh/src/mask_test.go:29)	MOVB	BL, "".sink(SB)
The second version is two bytes longer, but executes faster: 1.60 ns vs 2.06 ns for the current code. (Both measurements are extremely consistent.)
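For readers comparing the two sequences, here is a minimal sketch of the arithmetic each one performs, written in plain Go. It only illustrates that `-(x - 64)` and `64 - x` agree after truncation to 8 bits; it is not the compiler's rewrite rule, and it says nothing about the timing difference. The names `subDirect`, `subViaNeg`, and `countMismatches` are hypothetical, chosen for this example.

```go
package main

import "fmt"

// subDirect mirrors the MOVL $64 / SUBL form: materialize 64, then subtract x.
func subDirect(x uint8) uint8 {
	return 64 - x
}

// subViaNeg mirrors the ADDL $-64 / NEGL form: compute x-64 in 32 bits,
// negate it, and keep only the low byte (as MOVB DL does).
func subViaNeg(x uint8) uint8 {
	d := int32(x) - 64
	return uint8(-d)
}

// countMismatches checks every possible uint8 input and returns how many
// inputs give different results for the two forms.
func countMismatches() int {
	n := 0
	for x := 0; x < 256; x++ {
		if subDirect(uint8(x)) != subViaNeg(uint8(x)) {
			n++
		}
	}
	return n
}

func main() {
	fmt.Println("mismatches:", countMismatches())
}
```

Since both forms give the same result for every input, the choice between them comes down to code size, register use, and speed, which is the trade-off this issue is about.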
Perhaps we should remove the optimizations, or perhaps we should make them contingent on something (what?).
cc @randall77
I guess the existing code is not just shorter, it only requires a single register. I'll find a different attack on this code.