runtime/race: Potential race in math/big during TLS handshake #60247
Comments
Go 1.12 is way out of our support window, sorry. Are you running your production server with the race detector on? That would be fine, I guess, but unusual. It looks like most of these threads are still doing work; they aren't totally stuck. What do you mean by "hangs", exactly?
Here's the build command:

    go build -race -x -i $(LDFLAGS)

The application process gets stuck waiting for some event that never gets triggered (i.e. some goroutines are stuck on futexes). Some additional goroutine dumps (from strace) for your reference:

goroutine 0 [idle]:
runtime.futex(0xe701c8, 0x80, 0x0, 0x0, 0x0, 0x508, 0x0, 0x0, 0x7fff0de4d1b0, 0x40bcb1, ...)
    /usr/local/go/src/runtime/sys_linux_amd64.s:535 +0x21
runtime.futexsleep(0xe701c8, 0x7fff00000000, 0xffffffffffffffff)
    /usr/local/go/src/runtime/os_linux.go:46 +0x4b
runtime.notesleep(0xe701c8)
    /usr/local/go/src/runtime/lock_futex.go:151 +0xa1
runtime.stoplockedm()
    /usr/local/go/src/runtime/proc.go:2076 +0x8c
runtime.schedule()
    /usr/local/go/src/runtime/proc.go:2477 +0x3ba
runtime.park_m(0xc000000180)
    /usr/local/go/src/runtime/proc.go:2605 +0xa1
runtime.mcall(0x0)
    /usr/local/go/src/runtime/asm_amd64.s:299 +0x5b

goroutine 1 [IO wait, locked to thread]:
internal/poll.runtime_pollWait(0x7fea6e067f08, 0x72, 0x0)
    /usr/local/go/src/runtime/netpoll.go:182 +0x56
internal/poll.(*pollDesc).wait(0xc000267498, 0x72, 0x0, 0x0, 0x9ce241)
    /usr/local/go/src/internal/poll/fd_poll_runtime.go:87 +0x9b
internal/poll.(*pollDesc).waitRead(...)
    /usr/local/go/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Accept(0xc000267480, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
    /usr/local/go/src/internal/poll/fd_unix.go:384 +0x1ba
net.(*netFD).accept(0xc000267480, 0xc000010001, 0x0, 0x0)
    /usr/local/go/src/net/fd_unix.go:238 +0x42
net.(*TCPListener).accept(0xc000010700, 0xc000260f08, 0xc000260f10, 0x18)
    /usr/local/go/src/net/tcpsock_posix.go:139 +0x32
net.(*TCPListener).Accept(0xc000010700, 0x9f55e0, 0xc000199800, 0xaa9480, 0xc0000100e0)
    /usr/local/go/src/net/tcpsock.go:260 +0x48
./vendor/google.golang.org/grpc.(*Server).Serve(0xc000199800, 0xaa2220, 0xc000010700, 0x0, 0x0)

goroutine 387 [select, 2 minutes]:
net/http.(*persistConn).writeLoop(0xc0001e3e60)
    /usr/local/go/src/net/http/transport.go:1979 +0x113
created by net/http.(*Transport).dialConn
    /usr/local/go/src/net/http/transport.go:1361 +0xb1d

goroutine 386 [IO wait, 2 minutes]:
internal/poll.runtime_pollWait(0x7fea6e067c98, 0x72, 0xffffffffffffffff)
    /usr/local/go/src/runtime/netpoll.go:182 +0x56
internal/poll.(*pollDesc).wait(0xc000266298, 0x72, 0x5700, 0x571b, 0xffffffffffffffff)
    /usr/local/go/src/internal/poll/fd_poll_runtime.go:87 +0x9b
internal/poll.(*pollDesc).waitRead(...)
    /usr/local/go/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Read(0xc000266280, 0xc000584000, 0x571b, 0x571b, 0x0, 0x0, 0x0)
    /usr/local/go/src/internal/poll/fd_unix.go:169 +0x19b
net.(*netFD).Read(0xc000266280, 0xc000584000, 0x571b, 0x571b, 0x203000, 0x0, 0x39)
    /usr/local/go/src/net/fd_unix.go:202 +0x4f
net.(*conn).Read(0xc0001f0008, 0xc000584000, 0x571b, 0x571b, 0x0, 0x0, 0x0)
    /usr/local/go/src/net/net.go:177 +0x69
crypto/tls.(*atLeastReader).Read(0xc0003080e0, 0xc000584000, 0x571b, 0x571b, 0xe75bc0, 0x7fea6e02d0d0, 0xc0006a5938)
    /usr/local/go/src/crypto/tls/conn.go:761 +0x60
bytes.(*Buffer).ReadFrom(0xc000114cd8, 0xa96d80, 0xc0003080e0, 0x40b1e5, 0x93d040, 0x9b0560)
    /usr/local/go/src/bytes/buffer.go:207 +0xbd
crypto/tls.(*Conn).readFromUntil(0xc000114a80, 0xa97640, 0xc0001f0008, 0x5, 0xc0001f0008, 0x203000)
    /usr/local/go/src/crypto/tls/conn.go:783 +0xf8
crypto/tls.(*Conn).readRecordOrCCS(0xc000114a80, 0x9f6400, 0xc000114bb8, 0xc0006a5b88)
    /usr/local/go/src/crypto/tls/conn.go:590 +0x125
crypto/tls.(*Conn).readRecord(...)
    /usr/local/go/src/crypto/tls/conn.go:558
crypto/tls.(*Conn).Read(0xc000114a80, 0xc000370000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
    /usr/local/go/src/crypto/tls/conn.go:1236 +0x137
net/http.(*persistConn).Read(0xc0001e3e60, 0xc000370000, 0x1000, 0x1000, 0xc0006a5c88, 0x406815, 0xc00046a660)
    /usr/local/go/src/net/http/transport.go:1527 +0x7b
bufio.(*Reader).fill(0xc0005945a0)
    /usr/local/go/src/bufio/bufio.go:100 +0x10f
bufio.(*Reader).Peek(0xc0005945a0, 0x1, 0x0, 0x0, 0x1, 0xc00091c200, 0x0)
    /usr/local/go/src/bufio/bufio.go:138 +0x4f
net/http.(*persistConn).readLoop(0xc0001e3e60)
    /usr/local/go/src/net/http/transport.go:1680 +0x1a3
created by net/http.(*Transport).dialConn
    /usr/local/go/src/net/http/transport.go:1360 +0xaf8
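As an aside, one common way to pull dumps like the one above out of a live process is the standard library's net/http/pprof handler. The sketch below is only illustrative; the listen address is an assumption and none of it comes from the service in this report:

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/* handlers on http.DefaultServeMux
)

func main() {
	// Expose the pprof endpoints on a side port. Once the process wedges,
	// a full goroutine dump can be fetched with:
	//   curl 'http://localhost:6060/debug/pprof/goroutine?debug=2'
	log.Println(http.ListenAndServe("localhost:6060", nil))
}
```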
Timed out in state WaitingForInfo. Closing. (I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)
What version of Go are you using (`go version`)?

Does this issue reproduce with the latest release?

The issue is hard to reproduce even with the above Go version due to its random nature. We can't roll out the service with the latest Go without knowing the actual cause.

What operating system and processor architecture are you using (`go env`)?

go env Output

What did you do?
We have a gRPC server configured with TLS that hangs on a random set of hosts in the production fleet.
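For context, here is a minimal, hypothetical sketch of the kind of TLS-enabled gRPC server setup described; the certificate paths, port, and service registration are placeholders rather than the actual production configuration:

```go
package main

import (
	"log"
	"net"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials"
)

func main() {
	// Placeholder certificate and key paths; the real service's TLS
	// configuration is not known from this issue.
	creds, err := credentials.NewServerTLSFromFile("server.crt", "server.key")
	if err != nil {
		log.Fatalf("loading TLS credentials: %v", err)
	}

	lis, err := net.Listen("tcp", ":8443")
	if err != nil {
		log.Fatalf("listen: %v", err)
	}

	srv := grpc.NewServer(grpc.Creds(creds))
	// Register service implementations here, e.g. pb.RegisterFooServer(srv, &fooServer{}).

	// Serve blocks in the same net.(*TCPListener).Accept path that appears
	// in goroutine 1 of the dump quoted above.
	if err := srv.Serve(lis); err != nil {
		log.Fatalf("serve: %v", err)
	}
}
```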
What did you expect to see?
The gRPC server should keep responding to requests without getting stuck on a race condition.
What did you see instead?
The application process hangs. pprof shows the goroutine traces quoted above.