-
Notifications
You must be signed in to change notification settings - Fork 17.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runtime: memset/memmove clobbers r9/r10 on Linux/ARM #3718
Labels
Comments
i'm afraid the reason is not that simple. i straced profilesegv, and found out the thread handled several SIGPROF signals before the one that triggered SIGSEGV. 4435 --- SIGPROF (Profiling timer expired) @ 0 (0) --- 4435 rt_sigreturn(0x1) = 0 4435 --- SIGPROF (Profiling timer expired) @ 0 (0) --- 4435 rt_sigreturn(0x1) = 2 4435 --- SIGPROF (Profiling timer expired) @ 0 (0) --- 4435 rt_sigreturn(0x1) = -94133986 4435 --- SIGPROF (Profiling timer expired) @ 0 (0) --- 4435 rt_sigreturn(0x1) = 2 4435 --- SIGPROF (Profiling timer expired) @ 0 (0) --- 4435 --- SIGSEGV (Segmentation fault) @ 0 (0) --- 4434 +++ killed by SIGSEGV +++ 4433 +++ killed by SIGSEGV +++ because the strace log is very small, I attached it. Attachments:
|
finally, I've found the reason, it has nothing to do with signal masking (although we do have a bug there, http://golang.org/cl/6297087/) the signal occurred in runtime.memset, which clobbers r9 and r10; I think we never anticipated that possibility (except kaib's TODO note in memset_arm.s, but that only mentioned SIGSEGV triggered by memset itself). any suggestions? should we fix memset/memmove to not clobber r9/r10 or we block all signals in them? @dfc, you have some improvements on these function, what do you think? can you make memset/memmove not touching r9/r10? Labels changed: added priority-soon, removed priority-triage. Status changed to Accepted. |
http://golang.org/cl/6300043 Fixes r9/r10 during memset. I hadn't mailed it because it produced no measurable improvement in performance. |
I've confirmed avoiding r9/10 during memmove solves the problem (http://golang.org/cl/6300096 _really_ not ready for submission). I'll continue to try to find a nicer way to do memmove. |
Responding to comment #3, doing two back to back 16 byte moves, vs a single 32byte move looks like it costs about 15% when the data fits in the cache. pando(~/go/src/pkg/runtime) % ~/go/misc/benchcmp {old,new}.txt benchmark old ns/op new ns/op delta BenchmarkMemmove32 113 112 -0.88% BenchmarkMemmove4K 1242 1437 +15.70% BenchmarkMemmove64K 60473 68747 +13.68% BenchmarkMemmove4M 28684690 28676750 -0.03% BenchmarkMemmove64M 460833600 460650800 -0.04% benchmark old MB/s new MB/s speedup BenchmarkMemmove32 282.05 284.38 1.01x BenchmarkMemmove4K 3295.79 2849.69 0.86x BenchmarkMemmove64K 1083.72 953.28 0.88x BenchmarkMemmove4M 146.22 146.26 1.00x BenchmarkMemmove64M 145.62 145.68 1.00x |
Here is an alternative proposal for memmove that avoids spilling r9/10 in favour of spilling and reusing the start/end register during the block move section. http://golang.org/cl/6305100 @minux: I could not make MOVM.IA.W [R4-R7,R11] style syntax work, possibly the 5a assembler doesn't understand it. |
This issue was closed by revision e34079b. Status changed to Fixed. |
davecheney
added a commit
that referenced
this issue
May 11, 2015
««« backport 5ca8acc84025 runtime: avoid r9/r10 during memmove Fixes #3718. Requires CL 6300043. R=rsc, minux.ma, extraterrestrial.neighbour CC=golang-dev https://golang.org/cl/6305100 »»»
This issue was closed.
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Attachments:
The text was updated successfully, but these errors were encountered: