Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile,math: improve performance for some math functions for ppc64x #21390

Closed
laboger opened this issue Aug 10, 2017 · 7 comments
Closed

Comments

@laboger
Copy link
Contributor

laboger commented Aug 10, 2017

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go tip

What operating system and processor architecture are you using (go env)?

Ubuntu (or any) ppc64le

Making some improvements to several math functions by writing some in asm and implementing some as intrinsics.

I will be creating multiple CLs to do this, so instead of opening multiple issues, I would like to assign them all to this one.

The first change will be to implement the trunc, floor, and ceil functions as intrinsics.

@ALTree
Copy link
Member

ALTree commented Aug 10, 2017

This is (a subset of) #8913 (math, crypto, hash: Write power64 versions of assembler routines where applicable)

@gopherbot
Copy link

Change https://golang.org/cl/54654 mentions this issue: cmd/compile: intrinsics for trunc, floor, ceil on ppc64x

gopherbot pushed a commit that referenced this issue Aug 11, 2017
This implements trunc, floor, and ceil in the math package
as intrinsics on ppc64x.  Significant improvement mainly due
to avoiding call overhead of args and return value.

BenchmarkCeil-16                    5.95          0.69          -88.40%
BenchmarkFloor-16                   5.95          0.69          -88.40%
BenchmarkTrunc-16                   5.82          0.69          -88.14%

Updates #21390

Change-Id: I951e182694f6e0c431da79c577272b81fb0ebad0
Reviewed-on: https://go-review.googlesource.com/54654
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Reviewed-by: David Chase <drchase@google.com>
@gopherbot
Copy link

Change https://golang.org/cl/59932 mentions this issue: cmd/compile, math/bits: intrinsics for rotates on ppc64le

@gopherbot
Copy link

Change https://golang.org/cl/63290 mentions this issue: cmd/compile,math: improve int<->float conversions on ppc64x

gopherbot pushed a commit that referenced this issue Sep 12, 2017
This adds rules to match the code in math/bits RotateLeft,
RotateLeft32, and RotateLef64 to allow them to be inlined.

The rules are complicated because the code in these function
use different types, and the non-const version of these
shifts generate Mask and Carry instructions that become
subexpressions during the match process.

Also adds a testcase to asm_test.go.

Improvement in math/bits:

BenchmarkRotateLeft-16       1.57     1.32      -15.92%
BenchmarkRotateLeft32-16     1.60     1.37      -14.37%
BenchmarkRotateLeft64-16     1.57     1.32      -15.92%

Updates #21390

Change-Id: Ib6f17669ecc9cab54f18d690be27e2225ca654a4
Reviewed-on: https://go-review.googlesource.com/59932
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
gopherbot pushed a commit that referenced this issue Sep 14, 2017
The functions Float64bits and Float64frombits perform
poorly on ppc64x because the int<->float conversions
often result in load and store sequences to handle the
type change. This patch adds more rules to recognize
those sequences and use register to register moves
and avoid unnecessary loads and stores where possible.

There were some existing rules to improve these conversions,
but this provides additional improvements. Included here:

- New instruction FCFIDS to improve on conversion to 32 bit
- Rename Xf2i64 and Xi2f64 as MTVSRD, MFVSRD, to match the asm
- Add rules to lower some of the load/store sequences for
- Added new go asm to ppc64.s testcase.
conversions

Improvements:

BenchmarkAbs-16                2.16          0.93          -56.94%
BenchmarkCopysign-16           2.66          1.18          -55.64%
BenchmarkRound-16              4.82          2.69          -44.19%
BenchmarkSignbit-16            1.71          1.14          -33.33%
BenchmarkFrexp-16              11.4          7.94          -30.35%
BenchmarkLogb-16               10.4          7.34          -29.42%
BenchmarkLdexp-16              15.7          11.2          -28.66%
BenchmarkIlogb-16              10.2          7.32          -28.24%
BenchmarkPowInt-16             69.6          55.9          -19.68%
BenchmarkModf-16               10.1          8.19          -18.91%
BenchmarkLog2-16               17.4          14.3          -17.82%
BenchmarkCbrt-16               45.0          37.3          -17.11%
BenchmarkAtanh-16              57.6          48.3          -16.15%
BenchmarkRemainder-16          76.6          65.4          -14.62%
BenchmarkGamma-16              26.0          22.5          -13.46%
BenchmarkPowFrac-16            197           174           -11.68%
BenchmarkMod-16                112           99.8          -10.89%
BenchmarkAsinh-16              59.9          53.7          -10.35%
BenchmarkAcosh-16              44.8          40.3          -10.04%

Updates #21390

Change-Id: I56cc991fc2e55249d69518d4e1ba76cc23904e35
Reviewed-on: https://go-review.googlesource.com/63290
Reviewed-by: Michael Munday <mike.munday@ibm.com>
@gopherbot
Copy link

Change https://golang.org/cl/67130 mentions this issue: cmd/compile,math,cmd/internal/obj/ppc64: abs,copysign instrinsics on ppc64x

gopherbot pushed a commit that referenced this issue Oct 30, 2017
…insics on ppc64x

This adds support for math Abs, Copysign to be instrinsics on ppc64x.

New instruction FCPSGN is added to generate fcpsgn. Some new
rules are added to improve the int<->float conversions that are
generated mainly due to the Float64bits and Float64frombits in
the math package. PPC64.rules is also modified as suggested
in the review for CL 63290.

Improvements:
benchmark                           old ns/op     new ns/op     delta
BenchmarkAbs-16                   1.12          0.69          -38.39%
BenchmarkCopysign-16              1.30          0.93          -28.46%
BenchmarkNextafter32-16           9.34          8.05          -13.81%
BenchmarkFrexp-16                 8.81          7.60          -13.73%

Others that used Copysign also saw smaller improvements.

I attempted to make this work using rules since that
seems to be preferred, but due to the use of Float64bits and
Float64frombits in these functions, several rules had to be added and
even then not all cases were matched. Using rules became too
complicated and seemed too fragile for these.

Updates #21390

Change-Id: Ia265da9a18355e08000818a4fba1a40e9e031995
Reviewed-on: https://go-review.googlesource.com/67130
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Keith Randall <khr@golang.org>
@gopherbot
Copy link

Change https://golang.org/cl/74411 mentions this issue: math: improve dim, min, max, modf for ppc64x

gopherbot pushed a commit that referenced this issue Nov 2, 2017
This change adds an asm implementations modf for ppc64x.

Improvements:

BenchmarkModf-16               7.48          6.26          -16.31%

Updates: #21390

Change-Id: I9c4f3213688e3e8842d050840dc04fc9c0bf6ce4
Reviewed-on: https://go-review.googlesource.com/74411
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <mike.munday@ibm.com>
@laboger
Copy link
Contributor Author

laboger commented Nov 2, 2017

I have merged all the math changes I had planned for go 1.10, so will close this one.

@laboger laboger closed this as completed Nov 2, 2017
@golang golang locked and limited conversation to collaborators Nov 2, 2018
@rsc rsc unassigned laboger Jun 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants