runtime: add asm version of cmpstring and bytes·Compare for ppc64 #10007
Here's a first pass at an arm version for
Update #10007

Implement runtime.cmpstring and bytes.Compare in asm for arm.

benchmark                                old ns/op     new ns/op     delta
BenchmarkCompareBytesEqual               254           91.4          -64.02%
BenchmarkCompareBytesToNil               41.5          37.6          -9.40%
BenchmarkCompareBytesEmpty               40.7          37.6          -7.62%
BenchmarkCompareBytesIdentical           255           96.3          -62.24%
BenchmarkCompareBytesSameLength          125           60.9          -51.28%
BenchmarkCompareBytesDifferentLength     133           60.9          -54.21%
BenchmarkCompareBytesBigUnaligned        17985879      5669706       -68.48%
BenchmarkCompareBytesBig                 17097634      4926798       -71.18%
BenchmarkCompareBytesBigIdentical        16861941      4389206       -73.97%

benchmark                             old MB/s     new MB/s     speedup
BenchmarkCompareBytesBigUnaligned     58.30        184.95       3.17x
BenchmarkCompareBytesBig              61.33        212.83       3.47x
BenchmarkCompareBytesBigIdentical     62.19        238.90       3.84x

This is a collaboration between Josh Bleecher Snyder and myself.

Change-Id: Ib3944b8c410d0e12135c2ba9459bfe131df48edd
Reviewed-on: https://go-review.googlesource.com/8010
Reviewed-by: Keith Randall <khr@golang.org>
Reducing the scope of this bug to be just about arm. We might want to do something for arm in Go 1.5, since @mwhudson is putting string compares into interface checks, although it's only on the slow path. arm64 and ppc64 are not going to be completed for Go 1.5 and are missing tons of other assembly anyway.
arm and arm64 are done; it's really just ppc64 left.
Looks like this issue has been fixed on ppc64{,le}: 32d3b96 (and 1e28dce).
Currently only amd64 and 386 have asm versions of `cmpstring` and `bytes·Compare`. These can be important for performance (see e.g. #10000). The compiler generates ok but not amazing code for these. Here's the ARM output for `cmpstring`:

At first glance, there are extraneous `panicindex` calls, and we could be doing 32 bit comparisons instead of 8 bit comparisons as we loop. And I'm sure it could be optimized further.

cc @davecheney