runtime: performance improvements for memclr on ppc64x #17348

Closed
laboger opened this issue Oct 5, 2016 · 3 comments

@laboger
Contributor

laboger commented Oct 5, 2016

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go version devel +a0d83eb Wed Oct 5 12:44:29 2016 -0500 linux/ppc64le

What operating system and processor architecture are you using (go env)?

Ubuntu 16.04 ppc64le

While looking at the performance of the runtime benchmarks, I found that memclr could be improved on ppc64x.

@bradfitz
Contributor

bradfitz commented Oct 5, 2016

This needs more details. What is the code for memclr now, and what should it be?

@laboger
Contributor Author

laboger commented Oct 5, 2016

The file to be changed is memclr_ppc64x.s. The change will be similar to what is currently done in memmove_ppc64x.s, where loops are unrolled to improve performance.

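For example, when it can be determined that chunks of 32 bytes are being cleared, this loop:

loop:
std r0, 0(r3)
std r0, 8(r3)
std r0, 16(r3)
std r0, 24(r3)
addi r3, r3, 32    # advance the pointer past the 32 bytes just cleared
bc loop

performs much better than this one, which stores a single doubleword per iteration:

loop:
stdu r0, 8(r3)
bc loop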
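
The same contrast can be sketched in Go. This is only an illustration of the loop structure, not the runtime's implementation: the function names clearSimple and clearUnrolled are hypothetical, and the real change is written in assembly in memclr_ppc64x.s. Note also that the Go compiler may recognize the simple zeroing loop and replace it with a call to the runtime's clear routine, so timings of this sketch say little about the assembly.

```go
package main

import "fmt"

// clearSimple zeroes one 8-byte word per iteration,
// mirroring the single-store stdu loop.
func clearSimple(words []uint64) {
	for i := range words {
		words[i] = 0
	}
}

// clearUnrolled zeroes four words (32 bytes) per iteration,
// mirroring the four-store loop, then clears any leftover tail.
func clearUnrolled(words []uint64) {
	i := 0
	for ; i+4 <= len(words); i += 4 {
		words[i] = 0
		words[i+1] = 0
		words[i+2] = 0
		words[i+3] = 0
	}
	for ; i < len(words); i++ {
		words[i] = 0
	}
}

// allZero reports whether every word in the slice is zero.
func allZero(words []uint64) bool {
	for _, w := range words {
		if w != 0 {
			return false
		}
	}
	return true
}

func main() {
	// 11 words: two unrolled iterations plus a 3-word tail.
	buf := make([]uint64, 11)
	for i := range buf {
		buf[i] = uint64(i) + 1
	}
	clearUnrolled(buf)
	fmt.Println(allZero(buf)) // prints "true"
}
```

The unrolled version executes one loop branch per 32 bytes instead of one per 8 bytes, which is the effect the assembly change is after.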

@gopherbot

CL https://golang.org/cl/30373 mentions this issue.

ceseo pushed a commit to powertechpreview/go that referenced this issue Dec 1, 2016
This updates runtime/memclr_ppc64x.s to improve performance,
by unrolling loops for larger clears.

Fixes golang#17348

benchmark                    old MB/s     new MB/s     speedup
BenchmarkMemclr/5-80         199.71       406.63       2.04x
BenchmarkMemclr/16-80        693.66       1817.41      2.62x
BenchmarkMemclr/64-80        2309.35      5793.34      2.51x
BenchmarkMemclr/256-80       5428.18      14765.81     2.72x
BenchmarkMemclr/4096-80      8611.65      27191.94     3.16x
BenchmarkMemclr/65536-80     8736.69      28604.23     3.27x
BenchmarkMemclr/1M-80        9304.94      27600.09     2.97x
BenchmarkMemclr/4M-80        8705.66      27589.64     3.17x
BenchmarkMemclr/8M-80        8575.74      23631.04     2.76x
BenchmarkMemclr/16M-80       8443.10      19240.68     2.28x
BenchmarkMemclr/64M-80       8390.40      9493.04      1.13x
BenchmarkGoMemclr/5-80       263.05       630.37       2.40x
BenchmarkGoMemclr/16-80      904.33       1148.49      1.27x
BenchmarkGoMemclr/64-80      2830.20      8756.70      3.09x
BenchmarkGoMemclr/256-80     6064.59      20299.46     3.35x

Change-Id: Ic76c9183c8b4129ba3df512ca8b0fe6bd424e088
Reviewed-on: https://go-review.googlesource.com/30373
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <munday@ca.ibm.com>
Reviewed-by: David Chase <drchase@google.com>

Backport of 3107c91
by Lynn Boger <laboger@linux.vnet.ibm.com>
@golang golang locked and limited conversation to collaborators Oct 5, 2017