-
Notifications
You must be signed in to change notification settings - Fork 18k
math: Poor performance for math.Sqrt on ppc64le/ppc64 #14349
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
https://go-review.googlesource.com/#/c/19515/ Improved some sqrt testcases by 10X or more. Further improvement if this was inlined so that the load of the argument and store of the return value could be avoided. |
CL https://golang.org/cl/19515 mentions this issue. |
There is an easy to miss hack in cmd/compile/internal/gc/walk.go which inlines the square root function on platforms where it is a single instruction: if n.Left.Op == ONAME && n.Left.Sym.Name == "Sqrt" && n.Left.Sym.Pkg.Path == "math" {
switch Thearch.Thechar {
case '5', '6', '7':
n.Op = OSQRT
n.Left = n.List.N
n.List = nil
break opswitch
}
} It sounds like |
Thanks! I'll try it out. On 02/19/2016 10:04 AM, Michael Munday wrote:
|
You also need to handle gc.OSQRT in optoas in the ppc64
backend.
|
I had to change a few files to get the inlining to work for sqrt but I think it is right now. Are these additional files something I can add to my original CL or should I create a new one? |
Add to your existing CL. |
The existing implementation uses code written in Go to implement Sqrt; this adds the assembler to use the sqrt instruction for Power and makes the necessary changes to allow it to be inlined. The following tests showed this relative improvement: benchmark delta BenchmarkSqrt -97.91% BenchmarkSqrtIndirect -96.65% BenchmarkSqrtGo -35.93% BenchmarkSqrtPrime -96.94% Fixes golang#14349 Change-Id: I8074f4dc63486e756587564ceb320aca300bf5fa Reviewed-on: https://go-review.googlesource.com/19515 Reviewed-by: Minux Ma <minux@golang.org>
The implementation of the Sqrt function is written in Go and performs poorly on ppc64le & ppc64. For other platforms there are asm implementations.
I have a change to use the asm sqrt instruction to improve the performance on Power. I will post that change shortly.
The text was updated successfully, but these errors were encountered: