Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: deadlock detection broken with external linking #46260

Closed
4a6f656c opened this issue May 19, 2021 · 10 comments
Closed

runtime: deadlock detection broken with external linking #46260

4a6f656c opened this issue May 19, 2021 · 10 comments

Comments

@4a6f656c
Copy link
Contributor

The following code (based on test/fixedbugs/issue21576.go) works as expected when compiled with internal linking, however deadlocks when compiled with external linking:

$ cat b.go
package main

func main() {
  c := make(chan int)
  c <- 1
}
$ go build -o b b.go
$ time ./b
fatal error: all goroutines are asleep - deadlock!

goroutine 1 [chan send]:
main.main()
        /tmp/b.go:5 +0x50
    0m00.06s real     0m00.00s user     0m00.01s system
$ go build -ldflags=-linkmode=external -o b b.go
$ time ./b
^C    0m10.47s real     0m00.00s user     0m00.02s system


This does not appear to be new (Go 1.15, Go 1.16 and -tip result in the same on initial testing) and impacts multiple platforms (tested on openbsd/mips64, openbsd/amd64, linux/amd64).

@maxsxu
Copy link

maxsxu commented May 19, 2021

It does not reproduce in darwin/amd64.

➜  go version                                      
go version go1.16.3 darwin/amd64
➜  cat test.go  
package main

func main() {
  c := make(chan int)
  c <- 1
}
➜  go build test.go        
➜  ./test 
fatal error: all goroutines are asleep - deadlock!

goroutine 1 [chan send]:
main.main()
        /tmp/test.go:5 +0x50
➜  go build -ldflags=-linkmode=external test.go
➜  ./test
fatal error: all goroutines are asleep - deadlock!

goroutine 1 [chan send]:
main.main()
        /tmp/test.go:5 +0x50
➜  go build -ldflags=all=-linkmode=external test.go
➜  ./test 
fatal error: all goroutines are asleep - deadlock!

goroutine 1 [chan send]:
main.main()
        /tmp/test.go:5 +0x50

@AlexRouSg

@thanm thanm added this to the Go1.17 milestone May 19, 2021
@thanm
Copy link
Contributor

thanm commented May 19, 2021

I can reproduce this on tip for linux/amd64.

@thanm
Copy link
Contributor

thanm commented May 19, 2021

@AlexRouSg
Copy link
Contributor

AlexRouSg commented May 19, 2021

On go1.16.4 windows/amd64 this does not reproduce with go build -ldflags=-linkmode=external .
But does reproduce with go build -ldflags=all=-linkmode=external . and go build -ldflags=-linkmode=external b.go

There is one notable difference, when running go build -ldflags=-linkmode=external .
It prints loadinternal: cannot find runtime/cgo

This could explain why @jsonbruce couldn't reproduce, can you confirm the exact command you ran?

@cherrymui
Copy link
Member

cherrymui commented May 19, 2021

Yes, this is known and expected. External linking in general brings in cgo. For a cgo binary even all goroutines are asleep C code can make progress, and we don't know. So we don't signal an error for all goroutines being asleep. That's why runtime deadlock tests require internal linking, e.g. https://tip.golang.org/src/runtime/crash_test.go#L184 .

Maybe we could do better detecting just external linking vs. real cgo. Other than that, this is expected.

@cherrymui cherrymui modified the milestones: Go1.17, Unplanned May 19, 2021
@bcmills
Copy link
Contributor

bcmills commented May 19, 2021

At one point there was an accepted proposal for partial deadlock detection (#13759), but it was apparently not implemented. The proposed partial-deadlock detector would be able to diagnose Go deadlocks even in cgo programs.

@ianlancetaylor
Copy link
Contributor

See also #29322.

Closing this issue because there is nothing new to do here. It's unfortunate but known. In general the deadlock detector is best-effort. I believe that it will also fail in a program that uses the network, even if there are no currently open network connections.

@4a6f656c
Copy link
Contributor Author

Yes, this is known and expected. External linking in general brings in cgo. For a cgo binary even all goroutines are asleep C code can make progress, and we don't know. So we don't signal an error for all goroutines being asleep.

Thanks for confirming - it makes sense with the external linking/cgo connection.

That's why runtime deadlock tests require internal linking, e.g. https://tip.golang.org/src/runtime/crash_test.go#L184 .

Heh, the tests in src/runtime do, however at least one in test does not (hence running into this). I'll send a CL to make this more obvious for future travelers.

Maybe we could do better detecting just external linking vs. real cgo. Other than that, this is expected.

Right, in this case there is no actual cgo, with all of the threads being Go controlled. Looking at the backtrace I was expecting it to work (and even reading the code and seeing iscgo in the runtime, I failed to realise the connection).

@cherrymui
Copy link
Member

Heh, the tests in src/runtime do, however at least one in test does not (hence running into this). I'll send a CL to make this more obvious for future travelers.

Yeah, unfortunately GOROOT/test tests do not have access to internal/testenv, so it has to be done explicitly.

@gopherbot
Copy link

Change https://golang.org/cl/321450 mentions this issue: test: use internal linking with deadlock detector test

@golang golang locked and limited conversation to collaborators May 20, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

8 participants