cmd/asm, math: improve performance for math.Floor, math.Ceil, math.Trunc with asm on ppc64x #17185
CL https://golang.org/cl/29654 mentions this issue.
frim breaks ppc970 compatibility, reopen.
The decision was that ppc64 would support instructions back to power5, not ppc970. This was documented in the Go 1.7 release notes: https://golang.org/doc/go1.7
Yes, looks like it.
ceseo pushed a commit to powertechpreview/go that referenced this issue on Dec 1, 2016
…s.Compare with asm on ppc64x

bytes: fix typo in ppc64le asm for Compare

Correcting a line in asm_ppc64x.s in the cmpbodyLE function that originally was R14 but accidentally changed to R4.

Fixes golang#17488

Change-Id: Id4ca6fb2e0cd81251557a0627e17b5e734c39e01
Reviewed-on: https://go-review.googlesource.com/31266
Reviewed-by: Michael Munday <munday@ca.ibm.com>
Run-TryBot: Michael Munday <munday@ca.ibm.com>

Backport of 2190f77 by Lynn Boger <laboger@linux.vnet.ibm.com>

bytes: improve performance for bytes.Compare on ppc64x

This improves the performance for byte.Compare by rewriting the cmpbody function in runtime/asm_ppc64x.s. The previous code had a simple loop which loaded a pair of bytes and compared them, which is inefficient for long buffers. The updated function checks for 8 or 32 byte chunks and then loads and compares double words where possible.

Because the byte.Compare result indicates greater or less than, the doubleword loads must take endianness into account, using a byte reversed load in the little endian case.

Fixes golang#17433

benchmark                        old ns/op   new ns/op   delta
BenchmarkBytesCompare/8-16       13.6        7.16        -47.35%
BenchmarkBytesCompare/16-16      25.7        7.83        -69.53%
BenchmarkBytesCompare/32-16      38.1        7.78        -79.58%
BenchmarkBytesCompare/64-16      63.0        10.6        -83.17%
BenchmarkBytesCompare/128-16     112         13.0        -88.39%
BenchmarkBytesCompare/256-16     211         28.1        -86.68%
BenchmarkBytesCompare/512-16     410         38.6        -90.59%
BenchmarkBytesCompare/1024-16    807         60.2        -92.54%
BenchmarkBytesCompare/2048-16    1601        103         -93.57%

Change-Id: I121acc74fcd27c430797647b8d682eb0607c63eb
Reviewed-on: https://go-review.googlesource.com/30949
Reviewed-by: David Chase <drchase@google.com>

Backport of 1e28dce by Lynn Boger <laboger@linux.vnet.ibm.com>

math, cmd/internal/obj/ppc64: improve floor, ceil, trunc with asm

This adds the instructions frim, frip, and friz to the ppc64x assembler for use in implementing the math.Floor, math.Ceil, and math.Trunc functions to improve performance.

Fixes golang#17185

benchmark             old ns/op   new ns/op   delta
BenchmarkCeil-128     21.4        6.99        -67.34%
BenchmarkFloor-128    13.9        6.37        -54.17%
BenchmarkTrunc-128    12.7        6.33        -50.16%

Change-Id: I96131bd4e8c9c8dbafb25bfeb544cf9d2dbb4282
Reviewed-on: https://go-review.googlesource.com/29654
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Michael Munday <munday@ca.ibm.com>

Backport of 3311275 by Lynn Boger <laboger@linux.vnet.ibm.com>
Please answer these questions before submitting your issue. Thanks!
What version of Go are you using (go version)?
tip
What operating system and processor architecture are you using (go env)?
Ubuntu 16.04 ppc64le
What did you do?
Built and ran the math package benchmark test.
What did you expect to see?
Better performance for floor, ceil, and trunc functions in math package.
What did you see instead?
Room for improvement
The Power instruction set includes frim, frip, and friz which can be used to implement the Go math package's floor, ceil, and trunc functions more efficiently. These instructions are used in the existing gccgo implementation for these math functions for ppc64x. These are compatible with power5 so can be used in both ppc64le & ppc64.