New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gccgo, runtime: panic with checkdead: find g 19 in status 1 #11406
Comments
We have a testcase that reproduces the problem, which shows that it is due to the Entersyscall and Exitsyscall within libgo/go/syscall/libcall_linux.go function Getdents. It should not be in Getdents, since Entersyscall and Exitsyscall are called within the call to Syscall(). Calling Exitsyscall on this path is what causes the panic in checkdead. I have a patch to fix this here: https://go-review.googlesource.com/#/c/11761/ I built did a bootstrap build on ppc64le and tested it. Testcase to reproduce the problem with gccgo from gcc5 branch on both ppc64le and x86 is below. You must do mkdir temp to cause the panic to occur:
|
I did not intend to close it, sorry. |
CL https://golang.org/cl/11761 mentions this issue. |
Thanks for sorting this out. Your patch has been committed to GCC mainline and GCC 4.9 branch. Waiting for RM approval for GCC 5 branch. |
The syscall for Getdents in syscall/libcall_linux.go called Entersyscall and Exitsyscall, causing the runtime_sched counts for goroutines to be incorrect. Inconsistent counts caused checkdead in runtime/proc.c to panic. Fixes golang/go#11406 Reviewed-on: https://go-review.googlesource.com/11761 git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@225745 138bc75d-0d04-0410-961f-82ee72b054a4
The syscall for Getdents in syscall/libcall_linux.go called Entersyscall and Exitsyscall, causing the runtime_sched counts for goroutines to be incorrect. Inconsistent counts caused checkdead in runtime/proc.c to panic. Fixes golang/go#11406 Reviewed-on: https://go-review.googlesource.com/11761 git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-4_9-branch@225746 138bc75d-0d04-0410-961f-82ee72b054a4
Also committed to GCC 5 branch. |
The syscall for Getdents in syscall/libcall_linux.go called Entersyscall and Exitsyscall, causing the runtime_sched counts for goroutines to be incorrect. Inconsistent counts caused checkdead in runtime/proc.c to panic. Fixes golang/go#11406 Reviewed-on: https://go-review.googlesource.com/11761 git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-5-branch@225773 138bc75d-0d04-0410-961f-82ee72b054a4
The syscall for Getdents in syscall/libcall_linux.go called Entersyscall and Exitsyscall, causing the runtime_sched counts for goroutines to be incorrect. Inconsistent counts caused checkdead in runtime/proc.c to panic. Fixes golang/go#11406 Reviewed-on: https://go-review.googlesource.com/11761 git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-4_9-branch@225746 138bc75d-0d04-0410-961f-82ee72b054a4
The syscall for Getdents in syscall/libcall_linux.go called Entersyscall and Exitsyscall, causing the runtime_sched counts for goroutines to be incorrect. Inconsistent counts caused checkdead in runtime/proc.c to panic. Fixes golang/go#11406 Reviewed-on: https://go-review.googlesource.com/11761 From-SVN: r225746
The syscall for Getdents in syscall/libcall_linux.go called Entersyscall and Exitsyscall, causing the runtime_sched counts for goroutines to be incorrect. Inconsistent counts caused checkdead in runtime/proc.c to panic. Fixes golang/go#11406 Reviewed-on: https://go-review.googlesource.com/11761 From-SVN: r225745
Using gccgo from latest gcc5 branch. Building Docker from their master branch on ppc64le trying to provide continuous integration. This had been working without issue up until sometime in the past 2 weeks, where there is a failure during the Docker build. This same failure occurs when running one of the Docker unit tests after we've built Docker this way.
Failure occurs when building on Ubuntu 14.04 and 14.10 ppc64le.
Here is and example of the stack information for the Docker build failure:
runtime: checkdead: find g 19 in status 1
fatal error: checkdead: runnable g
runtime stack:
runtime_dopanic
../../../src/libgo/runtime/panic.c:135
runtime_throw
../../../src/libgo/runtime/panic.c:193
checkdead
../../../src/libgo/runtime/proc.c:2866
mput
../../../src/libgo/runtime/proc.c:3113
stopm
../../../src/libgo/runtime/proc.c:1456
findrunnable
../../../src/libgo/runtime/proc.c:1764
schedule
../../../src/libgo/runtime/proc.c:1847
park0
../../../src/libgo/runtime/proc.c:1912
runtime_mstart
../../../src/libgo/runtime/proc.c:1076
goroutine 16 [chan receive]:
main.mainDaemon
/go/src/github.com/docker/docker/docker/daemon.go:186
main.main
/go/src/github.com/docker/docker/docker/docker.go:99
created by main
../../../src/libgo/runtime/go-main.c:48
goroutine 18 [finalizer wait]:
created by runtime_createfing
../../../src/libgo/runtime/mgc0.c:2572
goroutine 19 [runnable]:
syscall.Exitsyscall
../../../src/libgo/runtime/proc.c:2122
os_signal.signal_recv
../../../src/libgo/runtime/sigqueue.goc:110
signal.loop
../../../src/libgo/go/os/signal/signal_unix.go:21
created by os_signal..import
../../../src/libgo/go/os/signal/signal_unix.go:25
goroutine 25 [chan receive]:
github_com_docker_docker_api_server.ServeApi.pN42_github_com_docker_docker_api_server.Server
/go/src/github.com/docker/docker/api/server/server.go:110
main.$nested2
/go/src/github.com/docker/docker/docker/daemon.go:136
created by main.mainDaemon
/go/src/github.com/docker/docker/docker/daemon.go:135
goroutine 26 [chan receive]:
daemon.$nested20
/go/src/github.com/docker/docker/daemon/debugtrap.go:17
created by daemon.setupSigusr1Trap
/go/src/github.com/docker/docker/daemon/debugtrap.go:16
goroutine 63 [chan receive]:
database_sql.connectionOpener.pN15_database_sql.DB
../../../src/libgo/go/database/sql/sql.go:589
We can also generate a failure in one of the unit tests with the same message, and the stack trace is not quite the same:
runtime: checkdead: find g 20 in status 1
fatal error: checkdead: runnable g
runtime stack:
runtime_dopanic
../../../src/libgo/runtime/panic.c:135
runtime_throw
../../../src/libgo/runtime/panic.c:193
checkdead
../../../src/libgo/runtime/proc.c:2866
mput
../../../src/libgo/runtime/proc.c:3113
stopm
../../../src/libgo/runtime/proc.c:1456
exitsyscall0
../../../src/libgo/runtime/proc.c:2205
runtime_mstart
../../../src/libgo/runtime/proc.c:1076
goroutine 16 [chan receive]:
testing.RunTests
../../../src/libgo/go/testing/testing.go:556
testing.Run.pN9_testing.M
../../../src/libgo/go/testing/testing.go:485
main.main
/tmp/go-build271522865/github.com/docker/docker/pkg/archive/_test/_testmain.go:180
created by main
../../../src/libgo/runtime/go-main.c:48
goroutine 18 [finalizer wait]:
created by runtime_createfing
../../../src/libgo/runtime/mgc0.c:2572
goroutine 73 [runnable]:
syscall.Exitsyscall
../../../src/libgo/runtime/proc.c:2122
syscall.Getdents
../../../src/libgo/go/syscall/libcall_linux.go:236
syscall.ReadDirent
../../../src/libgo/go/syscall/libcall_linux.go:250
archive.readdirnames
/go/src/github.com/docker/docker/pkg/archive/changes_linux.go:236
github_com_docker_docker_pkg_archive.walk.pN43_github_com_docker_docker_pkg_archive.walker
/go/src/github.com/docker/docker/pkg/archive/changes_linux.go:120
archive.collectFileInfoForChanges
/go/src/github.com/docker/docker/pkg/archive/changes_linux.go:53
github_com_docker_docker_pkg_archive.ChangesDirs
/go/src/github.com/docker/docker/pkg/archive/changes.go:309
archive.checkNoChanges
/go/src/github.com/docker/docker/pkg/archive/archive_test.go:553
github_com_docker_docker_pkg_archive.TestTarFiles
/go/src/github.com/docker/docker/pkg/archive/archive_test.go:521
testing.tRunner
../../../src/libgo/go/testing/testing.go:447
created by testing.RunTests
../../../src/libgo/go/testing/testing.go:555
goroutine 74 [runnable]:
sync.runtime_Syncsemacquire
../../../src/libgo/runtime/sema.goc:262
sync.Wait.pN9_sync.Cond
../../../src/libgo/go/sync/cond.go:62
io.write.pN7_io.pipe
../../../src/libgo/go/io/pipe.go:94
io.Write.pN13_io.PipeWriter
../../../src/libgo/go/io/pipe.go:161
bufio.flush.pN12_bufio.Writer
../../../src/libgo/go/bufio/bufio.go:530
bufio.Flush.pN12_bufio.Writer
../../../src/libgo/go/bufio/bufio.go:519
pools.$nested3
/go/src/github.com/docker/docker/pkg/pools/pools.go:102
github_com_docker_docker_pkg_ioutils.Close.pN55_github_com_docker_docker_pkg_ioutils.writeCloserWrapper
/go/src/github.com/docker/docker/pkg/ioutils/writers.go:31
archive.$nested0
/go/src/github.com/docker/docker/pkg/archive/archive.go:476
created by github_com_docker_docker_pkg_archive.TarWithOptions
/go/src/github.com/docker/docker/pkg/archive/archive.go:393
Tests failed: ./pkg/archive
Interesting that in both cases there is a stack marked as runnable with syscall.Exitsyscall at the top.
I have been looking at the code in checkdead where the message is coming from. In order for this to correctly detect a deadlock, the counts found in the runtime_sched structure must be in sync (i.e. must reflect the same state) with the entries runtime_allg list and I don't see how that is enforced. Seems like the run count could be computed but before the runtime_allg list is processed, a new goroutine could be added, resulting in a new run count, identifying a deadlock when there isn't one.
We will continue to try to reproduce this problem in a smaller testcase and do some more debugging.
The text was updated successfully, but these errors were encountered: