cmd/compile: favour UDIV over UMULH + LSR on arm64 for 64 bit integer division by a constant #37773
Labels
FrozenDueToAge
NeedsInvestigation
Performance
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env)?
What did you do?
What did you expect to see?
I expected the compiler to generate the UDIV instruction rather than UMULH + LSR, as empirical testing (below) shows it to be twice as fast on an arm64 Cortex-A53 (BCM2837) SoC.
The final assembly instructions are:
The division by 216 consists of the following six instructions:
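As a rough illustration of the arithmetic behind a UMULH + LSR pair (a sketch only, not the compiler's output), division by 216 can be replaced by taking the high 64 bits of a product with a precomputed "magic" constant and then shifting right. The helper names `magic216` and `div216` and the shift of 7 are assumptions made for this example. The constant is computed at run time with `math/bits` rather than hard-coded, and the constant the compiler actually materialises with mov/movk may be derived differently.

```go
package main

import (
	"fmt"
	"math/bits"
)

// shift is a hypothetical choice for this sketch: with shift = 7 the
// magic constant ceil(2^(64+7) / 216) still fits in 64 bits.
const shift = 7

// magic216 returns ceil(2^(64+shift) / 216).
// bits.Div64 divides the 128-bit value hi:lo by y and requires y > hi,
// which holds here because 216 > 1<<7.
func magic216() uint64 {
	q, r := bits.Div64(1<<shift, 0, 216)
	if r != 0 {
		q++ // round up
	}
	return q
}

// div216 computes x / 216 without a divide instruction:
// take the high 64 bits of x * magic (what UMULH returns),
// then shift right (what LSR does).
func div216(x uint64) uint64 {
	hi, _ := bits.Mul64(x, magic216())
	return hi >> shift
}

func main() {
	for _, x := range []uint64{0, 215, 216, 217, 1 << 32, 1<<64 - 1} {
		fmt.Println(x, div216(x), x/216)
	}
}
```

The last two columns printed for each input should agree. The trade-off under discussion in this issue is whether this multiply-and-shift form, plus the instructions needed to load the 64-bit constant, is actually cheaper than a single UDIV on a given core.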
Benchmark results
I created two simple executables to compare the generated mov/movk/movk/movk/umulh/lsr instructions with the equivalent mov/udiv instructions that I had expected to see. The mov/udiv version is consistently twice as fast: