New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runtime: significant regression in pthread_cond_timedwait performance on Darwin #26019
Comments
|
CC @randall77 We changed the way we handle condition variables to be more compatible with Darwin mechanisms. There may not be much we can do here. |
Or perhaps we're not calling I haven't spent more than about 5m looking at this so it's possible I'm entirely off base. |
And, indeed, replacing diff --git a/src/runtime/os_darwin.go b/src/runtime/os_darwin.go
index 5019b9423c..ff375004a3 100644
--- a/src/runtime/os_darwin.go
+++ b/src/runtime/os_darwin.go
@@ -45,7 +45,7 @@ func semasleep(ns int64) int32 {
if ns >= 0 {
var t timespec
t.set_nsec(ns)
- err := pthread_cond_timedwait(&mp.cond, &mp.mutex, &t)
+ err := pthread_cond_timedwait_relative_np(&mp.cond, &mp.mutex, &t)
if err == _ETIMEDOUT {
pthread_mutex_unlock(&mp.mutex)
return -1
diff --git a/src/runtime/sys_darwin.go b/src/runtime/sys_darwin.go
index ef5aef1929..f0d0815903 100644
--- a/src/runtime/sys_darwin.go
+++ b/src/runtime/sys_darwin.go
@@ -287,10 +287,10 @@ func pthread_cond_wait_trampoline()
//go:nosplit
//go:cgo_unsafe_args
-func pthread_cond_timedwait(c *pthreadcond, m *pthreadmutex, t *timespec) int32 {
- return libcCall(unsafe.Pointer(funcPC(pthread_cond_timedwait_trampoline)), unsafe.Pointer(&c))
+func pthread_cond_timedwait_relative_np(c *pthreadcond, m *pthreadmutex, t *timespec) int32 {
+ return libcCall(unsafe.Pointer(funcPC(pthread_cond_timedwait_relative_np_trampoline)), unsafe.Pointer(&c))
}
-func pthread_cond_timedwait_trampoline()
+func pthread_cond_timedwait_relative_np_trampoline()
//go:nosplit
//go:cgo_unsafe_args
@@ -348,7 +348,7 @@ func closeonexec(fd int32) {
//go:cgo_import_dynamic libc_pthread_mutex_unlock pthread_mutex_unlock "/usr/lib/libSystem.B.dylib"
//go:cgo_import_dynamic libc_pthread_cond_init pthread_cond_init "/usr/lib/libSystem.B.dylib"
//go:cgo_import_dynamic libc_pthread_cond_wait pthread_cond_wait "/usr/lib/libSystem.B.dylib"
-//go:cgo_import_dynamic libc_pthread_cond_timedwait pthread_cond_timedwait "/usr/lib/libSystem.B.dylib"
+//go:cgo_import_dynamic libc_pthread_cond_timedwait_relative_np pthread_cond_timedwait_relative_np "/usr/lib/libSystem.B.dylib"
//go:cgo_import_dynamic libc_pthread_cond_signal pthread_cond_signal "/usr/lib/libSystem.B.dylib"
// Magic incantation to get libSystem actually dynamically linked.
diff --git a/src/runtime/sys_darwin_amd64.s b/src/runtime/sys_darwin_amd64.s
index 5522a86a1f..142933585d 100644
--- a/src/runtime/sys_darwin_amd64.s
+++ b/src/runtime/sys_darwin_amd64.s
@@ -447,13 +447,13 @@ TEXT runtime·pthread_cond_wait_trampoline(SB),NOSPLIT,$0
POPQ BP
RET
-TEXT runtime·pthread_cond_timedwait_trampoline(SB),NOSPLIT,$0
+TEXT runtime·pthread_cond_timedwait_relative_np_trampoline(SB),NOSPLIT,$0
PUSHQ BP
MOVQ SP, BP
MOVQ 8(DI), SI // arg 2 mutex
MOVQ 16(DI), DX // arg 3 timeout
MOVQ 0(DI), DI // arg 1 cond
- CALL libc_pthread_cond_timedwait(SB)
+ CALL libc_pthread_cond_timedwait_relative_np(SB)
POPQ BP
RET I'm going to verify this patch more scientifically. Assuming it withstands scrutiny, I'll clean it up and submit a CL. |
Results from a slightly more scientific experiment that involved running the kv generator that Nathan mentioned in the original description:
I'd say we've found our culprit. Preparing a CL now. |
Change https://golang.org/cl/120635 mentions this issue: |
Solid catch! Glad this wasn't an issue in libc itself and that we'll be able to resolve it shortly. Out of curiosity, how did you find out about |
Heh, I was reading the libpthead source code. The link is in the CL if
you’re curious. You can also see the function declaration in pthread.h.
…On Fri, Jun 22, 2018 at 10:35 PM Nathan VanBenschoten < ***@***.***> wrote:
Solid catch! Glad this wasn't an issue in libc itself and that we'll be
able to resolve it shortly.
Out of curiosity, how did you find out about
pthread_cond_timedwait_relative_np? I can see it in the global symbol
table of /usr/lib/system/libsystem_pthread.dylib now that I know what to
look for, but that's about it. It's not referenced at all in the pthread
documentation. Instead they just have a long (but easy to miss) rationale
<http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_cond_timedwait.html>
about why pthread_cond_timedwait doesn't take a relative time measurement.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#26019 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AA15IEsWsNpgeDPD9M_L2gkzqePNYbIgks5t_dOkgaJpZM4U0kMF>
.
|
Wow, thanks for the investigation and patch! In case anyone else sees this, I observed ~200% CPU utilization + heavy fan usage on Mac, by starting a HTTP server that opened some Postgres connections. With this patch applied, the CPU usage from my Go binary is at 0%. |
What version of Go are you using (
go version
)?Custom build of
go1.11beta1
Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (
go env
)?What did you do?
Compiled CockroachDB, ran basic
kv
load generator against it.What did you expect to see?
Comparable (hopefully better!) performance to go1.10.
What did you see instead?
A large (~30%) throughput reduction. Comparing CPU profiles between 1.10 and go1.11beta1 revealed a significant increase in time spent in
runtime.pthread_cond_timedwait
:go1.11beta1
go1.10
A git blame of
runtime.pthread_cond_timedwait
indicated that 5b5473c could be the cause of the regression. Re-running my benchmark with a binary compiled with and without that commit confirmed it to be cuprit.cc. @randall77
The text was updated successfully, but these errors were encountered: