bytes: Equal and other byte compare functions perform poorly on ppc64le/ppc64 #14368

Closed
laboger opened this issue Feb 17, 2016 · 2 comments
@laboger
Contributor

laboger commented Feb 17, 2016

The bytes.Equal function in runtime/asm_ppc64x.s performs poorly. It compares a single byte at a time, which is very inefficient on Power, especially for very long buffers.

I'm working on a change that adds some basic byte-compare functions that are more efficient on Power and can be reused by the other functions in this file (memeq, memeq_varlen, eqstring, cmpstring, Compare, Equal), similar to what is done in some of the asm files for other platforms.

@ianlancetaylor ianlancetaylor added this to the Go1.7 milestone Feb 17, 2016
@gopherbot

CL https://golang.org/cl/20249 mentions this issue.

@laboger
Contributor Author

laboger commented Mar 8, 2016

One note about this change: further improvements could be made if there were a directive to align the code so that heavily executed loops stay in the icache during execution. Is there any directive in the Plan 9 assembler or linker that would do that?

ceseo pushed a commit to powertechpreview/go that referenced this issue May 3, 2016
The existing implementation for Equal and similar
functions in the bytes package operates on one byte at
a time.  This performs poorly on ppc64/ppc64le, especially
when the byte buffers are large.  This change improves
those functions by loading and comparing double words where
possible.  The common code has been moved to a function
that can be shared by the other functions in this
file which perform the same type of comparison.
Further optimizations are done for the case where
>= 32 bytes are being compared.  The new function
memeqbody is used by memeq_varlen, Equal, and eqstring.
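
The idea of comparing double words where possible can be sketched in plain Go. This is only an illustration of the technique, not the memeqbody assembly from the CL: the name equalWords is hypothetical, the use of encoding/binary to load 8 bytes at once is an assumption made for portability, and the extra path for >= 32 bytes is left out.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// equalWords reports whether a and b hold the same bytes.
// Hypothetical Go-level analogue of the approach in the commit message;
// it is not the memeqbody routine itself.
func equalWords(a, b []byte) bool {
	if len(a) != len(b) {
		return false
	}
	// Compare one double word (8 bytes) per iteration while possible.
	for len(a) >= 8 {
		if binary.LittleEndian.Uint64(a) != binary.LittleEndian.Uint64(b) {
			return false
		}
		a, b = a[8:], b[8:]
	}
	// Fall back to a byte loop for the remaining 0..7 bytes.
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func main() {
	x := []byte("the quick brown fox jumps")
	y := []byte("the quick brown fox jumpS")
	// prints: true false false
	fmt.Println(equalWords(x, x), equalWords(x, y), bytes.Equal(x, y))
}
```

For an equality check the byte order used for the 8-byte loads does not matter, as long as both sides are loaded the same way.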

When running the bytes test with -test.bench=Equal

benchmark                     old MB/s     new MB/s     speedup
BenchmarkEqual1               164.83       129.49       0.79x
BenchmarkEqual6               563.51       445.47       0.79x
BenchmarkEqual9               656.15       1099.00      1.67x
BenchmarkEqual15              591.93       1024.30      1.73x
BenchmarkEqual16              613.25       1914.12      3.12x
BenchmarkEqual20              682.37       1687.04      2.47x
BenchmarkEqual32              807.96       3843.29      4.76x
BenchmarkEqual4K              1076.25      23280.51     21.63x
BenchmarkEqual4M              1079.30      13120.14     12.16x
BenchmarkEqual64M             1073.28      10876.92     10.13x
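
For readers unfamiliar with how MB/s figures like these arise, here is a minimal sketch of a throughput measurement using testing.Benchmark and SetBytes. It is not the benchmark from the bytes package: the helper name equalThroughput and the chosen sizes are assumptions for illustration, and the numbers it prints will differ from the table above.

```go
package main

import (
	"bytes"
	"fmt"
	"testing"
)

// equalThroughput benchmarks bytes.Equal on two identical buffers of the
// given size and returns the throughput in MB/s (10^6 bytes per second).
// Hypothetical helper, not part of the Go tree.
func equalThroughput(size int) float64 {
	a := make([]byte, size)
	b := make([]byte, size)
	r := testing.Benchmark(func(tb *testing.B) {
		tb.SetBytes(int64(size)) // bytes processed per iteration
		for i := 0; i < tb.N; i++ {
			if !bytes.Equal(a, b) {
				tb.Fatal("buffers should be equal")
			}
		}
	})
	// Same formula the testing package uses for its MB/s column.
	return float64(r.Bytes) * float64(r.N) / 1e6 / r.T.Seconds()
}

func main() {
	for _, size := range []int{1, 16, 4 << 10} {
		fmt.Printf("Equal%-6d %10.2f MB/s\n", size, equalThroughput(size))
	}
}
```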

It was determined that the degradation in the smaller byte tests
was due to unfavorable code alignment of the single byte loop.

Fixes golang#14368

Change-Id: I0dd87382c28887c70f4fbe80877a8ba03c31d7cd
Reviewed-on: https://go-review.googlesource.com/20249
Reviewed-by: Minux Ma <minux@golang.org>
@golang golang locked and limited conversation to collaborators Mar 23, 2017