net: race between Close and Read #3507
I believe that what happened is:

c.Read: check c.ok(), which checks c.fd != nil
<pause>
c.Close: c.fd = nil
c.Read: <resume> call c.fd.Read

Then (*netFD).Read uses its receiver as if non-nil, and the crash happens. The fix may be to delete the c.fd = nil line. The fd knows whether it's closed or not.

Labels changed: added priority-later, go1.1, removed priority-triage.
Status changed to Accepted.
I believe rsc is correct. I've tried to cook up a little test program which will blow up pretty quickly on a 2 core machine with GOMAXPROCS=100 (probably overkill). The busier the machine, the more likely it is to hit the race.

fullung, can you please test this on your big host and see if you can replicate the problem? Then we can discuss the fix.
Hello

Yes, as soon as I do something to make the machine a bit busier, this triggers immediately.

panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x70 pc=0x47cae6]

goroutine 479618 [running]:
sync/atomic.CompareAndSwapUint32(0x70, 0x100000000, 0xf8401b2a20, 0x0, 0x421d42, ...)
    go/src/pkg/sync/atomic/asm_amd64.s:12 +0xd
sync.(*Mutex).Lock(0x70, 0x0)
    go/src/pkg/sync/mutex.go:40 +0x35
net.(*netFD).Read(0x0, 0xf840068000, 0x200000002000, 0x0, 0x0, ...)
    go/src/pkg/net/fd.go:417 +0x42
net.(*TCPConn).Read(0xf84018c1b0, 0xf840068000, 0x200000002000, 0x2e3732310000000e, 0x0, ...)
    go/src/pkg/net/tcpsock_posix.go:87 +0xce
created by main.main
    netstress.go:35 +0x294

goroutine 1 [running]:

goroutine 2 [syscall]:
created by runtime.main
    go/src/pkg/runtime/proc.c:221

goroutine 3 [syscall]:
syscall.Syscall6()
    go/src/pkg/syscall/asm_linux_amd64.s:40 +0x5
syscall.EpollWait(0xf84006a010, 0xf84006a010, 0xa0000000a, 0xffffffff, 0xc, ...)
    go/src/pkg/syscall/zerrors_linux_amd64.go:1781 +0xa1
created by net.newPollServer
    go/src/pkg/net/newpollserver.go:35 +0x382

goroutine 4 [chan receive]:
net.(*pollServer).WaitRead(0xf840034800, 0xf84006b000, 0xf840036450, 0xb, 0x1, ...)
    go/src/pkg/net/fd.go:268 +0x73
net.(*netFD).accept(0xf84006b000, 0x401027, 0x0, 0xf840036420, 0xf84005a030, ...)
    go/src/pkg/net/fd.go:622 +0x20d
net.(*TCPListener).AcceptTCP(0xf84005a0a0, 0x1, 0x0, 0x0, 0x10, ...)
    go/src/pkg/net/tcpsock_posix.go:322 +0x71
net.(*TCPListener).Accept(0xf84005a0a0, 0x0, 0x0, 0x0, 0x0, ...)
    go/src/pkg/net/tcpsock_posix.go:332 +0x49
main._func_001(0xf84003a260, 0xf84003a250, 0x0, 0x0)
    netstress.go:19 +0x5a
created by main.main
    netstress.go:28 +0x189

go version weekly.2012-03-27 +1b2b113a2d66
Hi,

Thanks for testing. Could you please apply this CL, which incorporates rsc's suggested fix, and see how you go? I've been running this fix for a few hours and netstress hasn't faulted yet.

http://golang.org/cl/6002053
Hello,

Could you please reapply http://golang.org/cl/6002053 and try these additional test files? unixstress.go needs a bit of TLC if you run it more than once.

The same issue probably affects iprawsock_posix.go, but it's pretty hard to test that, so I'm going to have to trust that it also benefits from this change.
No panics.

On a busy system, unixstress dies with:

dial unix /tmp/x: resource temporarily unavailable

and tcpstress dies with:

dial tcp 127.0.0.1:9999: cannot assign requested address

I'm guessing the UNIX one is due to the backlog filling up and the TCP one is due to too many sockets in the TIME_WAIT state.
Owner changed to @davecheney.
This issue was closed by revision 1f14d45.
Status changed to Fixed.
davecheney added a commit that referenced this issue May 11, 2015
««« backport 5f24ff99b5f1
net: fix race between Close and Read

Fixes #3507.

Applied the suggested fix from rsc. If the connection is in closing state then errClosing will bubble up to the caller.

The fix has been applied to udp, ip and unix as well, since their code paths also include nil'ing c.fd on close. Func tests are available in the linked issue that verified the bug existed there as well.

R=rsc, fullung, alex.brainman, mikioh.mikioh
CC=golang-dev
https://golang.org/cl/6002053
»»»
This issue was closed.