Rietveld Code Review Tool
Help | Bug tracker | Discussion group | Source code | Sign in
(176)

Issue 11920044: code review 11920044: runtime: fix potential deadlock in netpoll on windows (Closed)

Can't Edit
Can't Publish+Mail
Start Review
Created:
11 years, 8 months ago by dvyukov
Modified:
11 years, 8 months ago
Reviewers:
brainman
CC:
golang-dev, brainman
Visibility:
Public.

Description

runtime: fix potential deadlock in netpoll on windows If netpoll has been told to block, it must not return with nil, otherwise scheduler assumes that netpoll is disabled.

Patch Set 1 #

Patch Set 2 : diff -r 654ca7de0282 https://dvyukov%40google.com@code.google.com/p/go/ #

Patch Set 3 : diff -r 654ca7de0282 https://dvyukov%40google.com@code.google.com/p/go/ #

Patch Set 4 : diff -r 9ad8f5ef60ae https://dvyukov%40google.com@code.google.com/p/go/ #

Patch Set 5 : diff -r 9ad8f5ef60ae https://dvyukov%40google.com@code.google.com/p/go/ #

Unified diffs Side-by-side diffs Delta from patch set Stats (+4 lines, -1 line) Patch
M src/pkg/runtime/netpoll_windows.c View 1 2 chunks +4 lines, -1 line 0 comments Download

Messages

Total messages: 4
dvyukov
Hello golang-dev@googlegroups.com, I'd like you to review this change to https://dvyukov%40google.com@code.google.com/p/go/
11 years, 8 months ago (2013-07-26 14:37:43 UTC) #1
brainman
LGTM I assumed that netpollready will always set gp to non nil given it io ...
11 years, 8 months ago (2013-07-27 02:30:44 UTC) #2
dvyukov
*** Submitted as https://code.google.com/p/go/source/detail?r=c3466cad62b0 *** runtime: fix potential deadlock in netpoll on windows If netpoll ...
11 years, 8 months ago (2013-07-27 09:46:47 UTC) #3
dvyukov
11 years, 8 months ago (2013-07-27 09:52:41 UTC) #4
Message was sent while issue was closed.
On 2013/07/27 02:30:44, brainman wrote:
> LGTM
> 
> I assumed that netpollready will always set gp to non nil given it io
completion
> packet. How is it possible to otherwise? Do you think this could be solution
to
> our problem?

There a logical race condition between waiter calling netpollblock and netpoll
calling netpollunblock. If the latter happens first, then netpollunblock just
sets the semaphore to READY and returns with nil gp. netpollblock will consume
READY w/o blocking.
The protocol between scheduler and netpoll is that netpoll(block=true) must not
return with nil gp, otherwise scheduler assumes that netpoll is disabled.

Unfortunately it is not the cause of the hang we observe, I've patched it very
early during my investigation (though, it could be).


> I haven't seen BSOD but because of faulty drivers or hardware. I am very
> surprised with what you are seeing. I will keep looking. I will try and see
> again, if I did see CPU 100% or process was just locked like you have
described.

I've seen in goroutine tracebacks that the goroutine is blocked in
WaitCancelled, which means that it has received the timeout.
I've inserted time.Sleep(time.Hour) when the hang is detected and seen with
ProcExplorer that one of the threads is blocked in GQCS, and it also does not
consume CPU during hang.


> Do you think we should disable net test temporarily until we decide what to do
> next?

I've filed https://code.google.com/p/go/issues/detail?id=5971
and disabled the test for now.
Sign in to reply to this message.

Powered by Google App Engine
RSS Feeds Recent Issues | This issue
This is Rietveld f62528b