cmd/compile: performance regression on ppc64 due to change in scheduling CL 270940 #57976
Labels: arch-ppc64x, compiler/runtime, FrozenDueToAge, NeedsInvestigation, Performance
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env)?
What did you do?
I noticed some regressions in performance for a few benchmarks since the scheduling change was made in CL 270940.
What did you expect to see?
Same or better performance.
What did you see instead?
10-20% degradation for some crypto benchmarks. So far I've mostly looked in the crypto package, but I'm still looking.
This example is from crypto/internal/bigmod when comparing the commit before the CL was merged against the latest. I verified the degradation happened with CL 270940, and ran against latest to make sure it hadn't been fixed yet.
The degradation in Add and Sub is due to a change in the sub loop:
Before:
After:
In the latest code, r3 is incremented at the top of the loop and the result is put into r8, which is then moved back to r3 at the bottom of the loop, even though r3 is never clobbered in the loop. In the previous code, r3 was incremented at the bottom of the loop and stayed in r3 throughout. I also see that the shift of res is in a different place, although I think the degradation is due to using another register and the extra move.
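For readers without the source at hand, here is a minimal Go sketch of the kind of loop being discussed: a constant-time conditional subtraction over limbs, where the loop index is the induction variable (the role r3 plays in the generated code) and res is shifted to produce the borrow. This is an illustration, not the actual crypto/internal/bigmod code. The names condSub, ctSelect, limbBits, and limbMask are hypothetical, and the 63-bit limb width is an assumption.

```go
package main

import "fmt"

// Assumption for this sketch: each limb holds 63 bits of the number, so the
// top bit of a 64-bit word is free to capture the borrow.
const (
	limbBits = 63
	limbMask = (1 << limbBits) - 1
)

// ctSelect returns x when on == 1 and y when on == 0, without branching.
func ctSelect(on, x, y uint64) uint64 {
	mask := -on // all ones when on == 1, zero when on == 0
	return y ^ (mask & (y ^ x))
}

// condSub subtracts y from x limb by limb when on == 1 and leaves x unchanged
// when on == 0. It returns the final borrow in either case.
func condSub(on uint64, x, y []uint64) (borrow uint64) {
	y = y[:len(x)] // help the compiler eliminate bounds checks in the loop
	for i := 0; i < len(x); i++ {
		res := x[i] - y[i] - borrow
		x[i] = ctSelect(on, res&limbMask, x[i])
		borrow = res >> limbBits // the "shift of res" mentioned above
	}
	return borrow
}

func main() {
	x := []uint64{0, 1}
	y := []uint64{1, 0}
	borrow := condSub(1, x, y)
	fmt.Println(x, borrow)
}
```

The loop body is small enough that an extra register move per iteration, as described above, can plausibly account for a measurable share of the run time.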