runtime: hang in pthread_cond_timedwait_relative_np on darwin/arm64 #35800

Closed
eliasnaur opened this issue Nov 23, 2019 · 7 comments
Labels
FrozenDueToAge mobile Android, iOS, and x/mobile NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Darwin

@eliasnaur
Contributor

Similar in spirit to the hangs on Android, I had to kill -QUIT a hanging darwin/arm64 build:

https://build.golang.org/log/9678d54fa369d914b12c3b32df3a2470ba7ee4ec

It hung after bootstrap, but before tests:

Building packages and commands for darwin/arm64.
SIGQUIT: quit
PC=0x18ed54c8c m=1 sigcode=0

goroutine 0 [idle]:
runtime.pthread_cond_timedwait_relative_np(0x13003e388, 0x13003e348, 0x16b94eda8, 0x100000000)
	/private/var/tmp/workdir-host-darwin-arm64-corellium-ios/go/src/runtime/sys_darwin.go:393 +0x38
runtime.semasleep(0xdf8475800, 0x130000480)
	/private/var/tmp/workdir-host-darwin-arm64-corellium-ios/go/src/runtime/os_darwin.go:57 +0xd8
runtime.notetsleep_internal(0x104bc5a58, 0xdf8475800, 0x130000480, 0x2c37bfd1dfea0, 0x130030f60)
	/private/var/tmp/workdir-host-darwin-arm64-corellium-ios/go/src/runtime/lock_sema.go:224 +0x128
runtime.notetsleep(0x104bc5a58, 0xdf8475800, 0x104847400)
	/private/var/tmp/workdir-host-darwin-arm64-corellium-ios/go/src/runtime/lock_sema.go:275 +0x44
runtime.sysmon()
	/private/var/tmp/workdir-host-darwin-arm64-corellium-ios/go/src/runtime/proc.go:4491 +0x450
runtime.mstart1()
	/private/var/tmp/workdir-host-darwin-arm64-corellium-ios/go/src/runtime/proc.go:1133 +0xa4
runtime.mstart()
	/private/var/tmp/workdir-host-darwin-arm64-corellium-ios/go/src/runtime/proc.go:1098 +0x54

goroutine 1 [syscall]:
syscall.syscall(0x1045d6158, 0x151d02840, 0x0, 0x0, 0x0, 0x0, 0x0)
	/private/var/tmp/workdir-host-darwin-arm64-corellium-ios/go/src/runtime/sys_darwin.go:63 +0x14
syscall.closedir(0x151d02840, 0xc8, 0x130381e00)
	/private/var/tmp/workdir-host-darwin-arm64-corellium-ios/go/src/syscall/zsyscall_darwin_arm64.go:534 +0x3c
os.(*dirInfo).close(...)
	/private/var/tmp/workdir-host-darwin-arm64-corellium-ios/go/src/os/dir_darwin.go:23
os.(*file).close(0x13038fe00, 0xffffffffffffffff, 0x1303f8000)
	/private/var/tmp/workdir-host-darwin-arm64-corellium-ios/go/src/os/file_unix.go:245 +0x194
os.(*File).Close(...)
	/private/var/tmp/workdir-host-darwin-arm64-corellium-ios/go/src/os/file_unix.go:237
path/filepath.readDirNames(0x130381e00, 0x53, 0x0, 0x1045f2ff4, 0x130381e00, 0x53, 0x130381e4b)
	/private/var/tmp/workdir-host-darwin-arm64-corellium-ios/go/src/path/filepath/path.go:422 +0xf8
path/filepath.walk(0x130381e00, 0x53, 0x104985680, 0x130383520, 0x130113650, 0x0, 0x0)
	/private/var/tmp/workdir-host-darwin-arm64-corellium-ios/go/src/path/filepath/path.go:363 +0x44
path/filepath.walk(0x13030b900, 0x4a, 0x104985680, 0x130359110, 0x130113650, 0x0, 0x0)
	/private/var/tmp/workdir-host-darwin-arm64-corellium-ios/go/src/path/filepath/path.go:384 +0x22c
path/filepath.walk(0x1301f3680, 0x46, 0x104985680, 0x13020e9c0, 0x130113650, 0x0, 0x0)
	/private/var/tmp/workdir-host-darwin-arm64-corellium-ios/go/src/path/filepath/path.go:384 +0x22c
path/filepath.walk(0x130024440, 0x40, 0x104985680, 0x1301060d0, 0x130113650, 0x0, 0x2)
	/private/var/tmp/workdir-host-darwin-arm64-corellium-ios/go/src/path/filepath/path.go:384 +0x22c
path/filepath.Walk(0x130024440, 0x40, 0x1300ab650, 0x1048c6316, 0x1)
	/private/var/tmp/workdir-host-darwin-arm64-corellium-ios/go/src/path/filepath/path.go:406 +0xdc
cmd/go/internal/search.MatchPackages(0x16b8c7a83, 0x3, 0x2)
	/private/var/tmp/workdir-host-darwin-arm64-corellium-ios/go/src/cmd/go/internal/search/search.go:59 +0x2a8
cmd/go/internal/search.ImportPathsQuiet(0x130018280, 0x2, 0x2, 0x1048bd620, 0x104839700, 0x104e08000)
	/private/var/tmp/workdir-host-darwin-arm64-corellium-ios/go/src/cmd/go/internal/search/search.go:338 +0xc4
cmd/go/internal/search.ImportPaths(0x130018280, 0x2, 0x2, 0x13001c480, 0x7, 0x104546c20)
	/private/var/tmp/workdir-host-darwin-arm64-corellium-ios/go/src/cmd/go/internal/search/search.go:328 +0x30
cmd/go/internal/load.ImportPaths(0x130018280, 0x2, 0x2, 0x1300aba00, 0x7, 0x104573968)
	/private/var/tmp/workdir-host-darwin-arm64-corellium-ios/go/src/cmd/go/internal/load/pkg.go:2118 +0xa0
cmd/go/internal/load.PackagesAndErrors(0x130018280, 0x2, 0x2, 0x0, 0x0, 0x0)
	/private/var/tmp/workdir-host-darwin-arm64-corellium-ios/go/src/cmd/go/internal/load/pkg.go:2063 +0xb4
cmd/go/internal/load.PackagesForBuild(0x130018280, 0x2, 0x2, 0x104875700, 0x0, 0x0)
	/private/var/tmp/workdir-host-darwin-arm64-corellium-ios/go/src/cmd/go/internal/load/pkg.go:2125 +0x3c
cmd/go/internal/work.runInstall(0x104bbdb40, 0x130018280, 0x2, 0x2)
	/private/var/tmp/workdir-host-darwin-arm64-corellium-ios/go/src/cmd/go/internal/work/build.go:516 +0x34
main.main()
	/private/var/tmp/workdir-host-darwin-arm64-corellium-ios/go/src/cmd/go/main.go:189 +0x51c

r0      0x104
r1      0x0
r2      0x0
r3      0x0
r4      0x0
r5      0xa0
r6      0x3b
r7      0x3b9abfbf
r8      0x9ba00
r9      0x9b901
r10     0x13003e360
r11     0x2
r12     0x20a0
r13     0x0
r14     0x0
r15     0x0
r16     0x131
r17     0x0
r18     0x0
r19     0x13003e348
r20     0x13003e388
r21     0x0
r22     0x3b9abfbf
r23     0x3b
r24     0x0
r25     0x9b901
r26     0x9ba00
r27     0x104bdf958
r28     0x130000480
r29     0x16b94ecc0
lr      0x18ec72238
sp      0x16b94ec50
pc      0x18ed54c8c
fault   0x18ed54c8c
go tool dist: FAILED: /private/var/tmp/workdir-host-darwin-arm64-corellium-ios/go/pkg/tool/darwin_arm64/go_bootstrap install -gcflags=all= -ldflags=all= std cmd: exit status 2
@ianlancetaylor
Contributor

It's not obvious to me that this has anything to do with pthread_cond_timedwait_relative_np. The backtrace suggests that sysmon called notetsleep to sleep for up to 0xdf8475800 nanoseconds on the note, which is 1 minute. That is normal enough, and it's possible that sysmon was in a loop sleeping for a minute or two each time waiting for something to happen.

The stack trace shows the other goroutine calling entersyscall. It would be nice to know what it was doing in that function. Clearly closedir should not take any noticeable amount of time to run.

@ianlancetaylor ianlancetaylor added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Nov 25, 2019
@ianlancetaylor ianlancetaylor added this to the Go1.14 milestone Nov 25, 2019
@ianlancetaylor ianlancetaylor added mobile Android, iOS, and x/mobile OS-Darwin labels Nov 25, 2019
@eliasnaur
Contributor Author

More weird crashes on darwin/arm64:

https://build.golang.org/log/186ffe8d4a32f24c719cb49eddf6e37e73957239 ("fatal: bad g in signal handler")
https://build.golang.org/log/8da8b3d360f7b226bbc021f29c3d9617d35773f9 ("signal: illegal instruction")

@ianlancetaylor
Contributor

CC @cherrymui

@cherrymui
Member

Like Android, this is also the go_bootstrap program, and it also has the warning

warning: unable to find runtime/cgo.a

On darwin/arm64, we use libc calls, but because go_bootstrap fails to load the runtime/cgo package, it does not save/restore G in TLS. If C code temporarily clobbers the G register and a signal is received, bad things will happen.

I guess the only thing we could do is to have async preemption disabled in go_bootstrap (and hope that we never receive a signal during C execution).
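For experimenting with this hypothesis outside the runtime, asynchronous preemption can also be turned off per process via the GODEBUG setting asyncpreemptoff=1 (available from Go 1.14). The sketch below is not what the proposed runtime change does; it only shows one way to build a child-process environment with that setting. The helper name withAsyncPreemptOff is hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// withAsyncPreemptOff (hypothetical helper) returns a copy of env in which
// GODEBUG contains asyncpreemptoff=1, preserving any existing GODEBUG settings.
func withAsyncPreemptOff(env []string) []string {
	const setting = "asyncpreemptoff=1"
	out := make([]string, 0, len(env)+1)
	found := false
	for _, kv := range env {
		if strings.HasPrefix(kv, "GODEBUG=") {
			found = true
			if val := strings.TrimPrefix(kv, "GODEBUG="); val == "" {
				kv = "GODEBUG=" + setting
			} else {
				kv = kv + "," + setting
			}
		}
		out = append(out, kv)
	}
	if !found {
		out = append(out, "GODEBUG="+setting)
	}
	return out
}

func main() {
	// Print only the GODEBUG entry that a child process would receive.
	for _, kv := range withAsyncPreemptOff(os.Environ()) {
		if strings.HasPrefix(kv, "GODEBUG=") {
			fmt.Println(kv)
		}
	}
}
```

The returned slice can be assigned to the Env field of an exec.Cmd to run a test binary with asynchronous preemption disabled. If the hang or crash disappears under that setting, that would be consistent with a signal arriving while C code has clobbered the G register.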

@gopherbot

Change https://golang.org/cl/208818 mentions this issue: runtime: disable async preemption on darwin/arm(64) if no cgo

@eliasnaur
Contributor Author

So you think https://build.golang.org/log/8da8b3d360f7b226bbc021f29c3d9617d35773f9 is a different issue? It happens during cmd/compile tests.

@cherrymui
Member

Yeah, I think that is probably a different issue.
