New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runtime/internal/atomic: improve performance for atomic functions on ppc64le/ppc64 #15469
Labels
Milestone
Comments
CL https://golang.org/cl/22549 mentions this issue. |
ceseo
pushed a commit
to powertechpreview/go
that referenced
this issue
May 5, 2016
The following performance improvements have been made to the low-level atomic functions for ppc64le & ppc64: - For those cases containing a lwarx and stwcx (or other sizes): sync, lwarx, maybe something, stwcx, loop to sync, sync, isync The sync is moved before (outside) the lwarx/stwcx loop, and the sync after is removed, so it becomes: sync, lwarx, maybe something, stwcx, loop to lwarx, isync - For the Or8 and And8, the shifting and manipulation of the address to the word aligned version were removed and the instructions were changed to use lbarx, stbcx instead of register shifting, xor, then lwarx, stwcx. - New instructions LWSYNC, LBAR, STBCC were tested and added. runtime/atomic_ppc64x.s was changed to use the LWSYNC opcode instead of the WORD encoding. Fixes golang#15469 Ran some of the benchmarks in the runtime and sync directories. Some results varied from run to run but the trend was improvement based on best times for base and new: runtime.test: BenchmarkChanNonblocking-128 0.88 0.89 +1.14% BenchmarkChanUncontended-128 569 511 -10.19% BenchmarkChanContended-128 63110 53231 -15.65% BenchmarkChanSync-128 691 598 -13.46% BenchmarkChanSyncWork-128 11355 11649 +2.59% BenchmarkChanProdCons0-128 2402 2090 -12.99% BenchmarkChanProdCons10-128 1348 1363 +1.11% BenchmarkChanProdCons100-128 1002 746 -25.55% BenchmarkChanProdConsWork0-128 2554 2720 +6.50% BenchmarkChanProdConsWork10-128 1909 1804 -5.50% BenchmarkChanProdConsWork100-128 1624 1580 -2.71% BenchmarkChanCreation-128 237 212 -10.55% BenchmarkChanSem-128 705 667 -5.39% BenchmarkChanPopular-128 5081190 4497566 -11.49% BenchmarkCreateGoroutines-128 532 473 -11.09% BenchmarkCreateGoroutinesParallel-128 35.0 34.7 -0.86% BenchmarkCreateGoroutinesCapture-128 4923 4200 -14.69% sync.test: BenchmarkUncontendedSemaphore-128 112 94.2 -15.89% BenchmarkContendedSemaphore-128 133 128 -3.76% BenchmarkMutexUncontended-128 1.90 1.67 -12.11% BenchmarkMutex-128 353 310 -12.18% BenchmarkMutexSlack-128 304 283 -6.91% BenchmarkMutexWork-128 554 541 -2.35% BenchmarkMutexWorkSlack-128 567 556 -1.94% BenchmarkMutexNoSpin-128 275 242 -12.00% BenchmarkMutexSpin-128 1129 1030 -8.77% BenchmarkOnce-128 1.08 0.96 -11.11% BenchmarkPool-128 29.8 27.4 -8.05% BenchmarkPoolOverflow-128 40564 36583 -9.81% BenchmarkSemaUncontended-128 3.14 2.63 -16.24% BenchmarkSemaSyntNonblock-128 1087 1069 -1.66% BenchmarkSemaSyntBlock-128 897 893 -0.45% BenchmarkSemaWorkNonblock-128 1034 1028 -0.58% BenchmarkSemaWorkBlock-128 949 886 -6.64% Change-Id: I4403fb29d3cd5254b7b1ce87a216bd11b391079e Reviewed-on: https://go-review.googlesource.com/22549 Reviewed-by: Michael Munday <munday@ca.ibm.com> Reviewed-by: Minux Ma <minux@golang.org>
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Please answer these questions before submitting your issue. Thanks!
go version
)?go version devel +0436a89 Mon Apr 25 19:21:40 2016 +0000 linux/ppc64le
go env
)?go env
GOARCH="ppc64le"
GOBIN=""
GOEXE=""
GOHOSTARCH="ppc64le"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/boger/gotools"
GORACE=""
GOROOT="/home/boger/golang/atomic/go"
GOTOOLDIR="/home/boger/golang/atomic/go/pkg/tool/linux_ppc64le"
CC="gcc"
GOGCCFLAGS="-fPIC -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build937495002=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"
If possible, provide a recipe for reproducing the error.
A complete runnable program is good.
A link on play.golang.org is best.
During some performance analysis of various applications it was found that some of the atomic functions appeared high on the profile list, and on further inspection of that code it was determined that some of those code sequences could be improved for ppc64le & ppc64.
The files to be changed are:
runtime/internal/atomic/asm_ppc64x.s
sync/atomic/asm_ppc64x.s
The main purpose is to eliminate the extra sync instructions. I have a patch and will post it shortly.
The text was updated successfully, but these errors were encountered: