New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
math: consider removing Dim assembly (and possibly Max/Min too) #21913
Comments
/cc @TocarIP @cherrymui |
Also, if we don't care about the bit pattern used for NaN (which we probably don't) then we could just do: func Dim(x, y float64) float64 {
x = x - y
if x <= 0 {
return 0
}
return x
} |
Change https://golang.org/cl/64194 mentions this issue: |
Removing asm version of Dim everywhere, to make it faster/inlinable is great. We will need to add it to cmd/compile/internal/gc/inl_test.go to avoid regressions in the future. I don't like different Nan bit-patterns |
By calculating dim directly, rather than calling max, we can simplify the generated code significantly. The compiler now reports that dim is easily inlineable, but it can't be inlined because there is still an assembly stub for Dim. Since dim is now very simple I no longer think it is worth having assembly implementations of it. I have therefore removed the s390x assembly. Removing the other assembly for Dim is #21913. name old time/op new time/op delta Dim 4.29ns ± 0% 3.53ns ± 0% -17.62% (p=0.000 n=9+8) Change-Id: Ic38a6b51603cbc661dcdb868ecf2b1947e9f399e Reviewed-on: https://go-review.googlesource.com/64194 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>
In your Max example above, I don't understand why there are functions for NaN() and Inf() when they are compile time constants. Based on what you are saying, if those could be represented as constants instead of function calls that should allow more functions that contain them to be inlined because they will have lower cost? |
@laboger Yes, if we could specify NaN and ±∞ as constants then it would be relatively easy to reduce the cost of Min and Max enough to make them inlineable. If we had mid-stack inlining then another approach we could take would be to push NaN handling into a helper function to reduce the inlining cost (though we'd probably need to use func Max(x, y float64) float64 {
switch {
case y > x:
return y
case x > y:
return x
case x == y:
// return +0 in preference to -0
return Float64frombits(Float64bits(x) & Float64bits(y))
}
// x or y is NaN (rare)
return maxNaN(x, y)
}
// maxNaN returns +Inf if x or y is +Inf, otherwise it returns NaN.
//go:noinline
func maxNaN(x, y float64) float64 {
if IsInf(x, 1) || IsInf(y, 1) {
return Inf()
}
return NaN()
} |
Change https://golang.org/cl/80695 mentions this issue: |
Dim performance has regressed by 14% vs 1.9 on amd64. Current pure go version of Dim is faster and, what is even more important for performance, is inlinable, so instead of tweaking asm implementation, just remove it. I had to update BenchmarkDim, because it was simply reloading constant(answer) in a loop. Perf data below: name old time/op new time/op delta Dim-6 6.79ns ± 0% 1.60ns ± 1% -76.39% (p=0.000 n=7+10) If I modify benchmark to be the same as in this CL results are even better: name old time/op new time/op delta Dim-6 10.2ns ± 0% 1.6ns ± 1% -84.27% (p=0.000 n=8+10) Updates #21913 Change-Id: I00e23c8affc293531e1d9f0e0e49f3a525634f53 Reviewed-on: https://go-review.googlesource.com/80695 Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
As far as I can see, this is fixed. |
I've just sent CL 64194 that makes
dim
(the generic implementation ofDim
) simple enough to be easily inlineable. I've deleted the s390x assembly since I don't think it is worth having any more. I could have made it better but I think inlining the code would be more of a win. If this CL is accepteddim
will look something like this:It would be really nice if we could delete the assembly for the other platforms so that we can get rid of the assembly stubs and
Dim
itself can be made inlineable.Min
andMax
could also be reworked to make them inlineable (and, theoretically, faster in the usual case), using code like this:Note that we can't use the
IsInf
,Inf
andNaN
functions because they increase the complexity of the code the compiler sees and pushmax
over the inlining threshold.It is less clear cut that this is better than the assembly on all platforms (arm64 seems to have an FMAX instruction for example) and my benchmarking results were more mixed, so I haven't made this change.
The text was updated successfully, but these errors were encountered: