Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net: apparent deadlock in testSpliceIssue25985 on linux-arm64-aws #41326

Open
bcmills opened this issue Sep 10, 2020 · 3 comments
Open

net: apparent deadlock in testSpliceIssue25985 on linux-arm64-aws #41326

bcmills opened this issue Sep 10, 2020 · 3 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@bcmills
Copy link
Contributor

bcmills commented Sep 10, 2020

2020-09-10T14:44:25-8098dbb/linux-arm64-aws shows an apparent deadlock in the net test: various goroutines associated with testSpliceIssue25985 (#25985) were stuck in internal/poll for ~2 minutes.

panic: test timed out after 3m0s

goroutine 3250 [running]:
testing.(*M).startAlarm.func1()
/workdir/go/src/testing/testing.go:1664 +0xc8
created by time.goFunc
/workdir/go/src/time/sleep.go:167 +0x40

goroutine 1 [chan receive, 2 minutes]:
testing.(*T).Run(0x4000082f00, 0x27267a, 0xa, 0x284c30, 0x5f5a3c01)
/workdir/go/src/testing/testing.go:1203 +0x260
testing.runTests.func1(0x4000106480)
/workdir/go/src/testing/testing.go:1475 +0x74
testing.tRunner(0x4000106480, 0x400015dd28)
/workdir/go/src/testing/testing.go:1157 +0xdc
testing.runTests(0x400012e0a0, 0x408320, 0xed, 0xed, 0xbfcead39c8ca3c92, 0x29e8def1ac, 0x40b9c0, 0x40001127a0)
/workdir/go/src/testing/testing.go:1473 +0x29c
testing.(*M).Run(0x400014c000, 0x0)
/workdir/go/src/testing/testing.go:1381 +0x1b4
net.TestMain(0x400014c000)
/workdir/go/src/net/main_test.go:52 +0x30
main.main()
_testmain.go:587 +0x160

goroutine 321 [chan receive, 2 minutes]:
testing.(*T).Parallel(0x4000107680)
/workdir/go/src/testing/testing.go:1039 +0xd4
net.TestLookupGoogleSRV(0x4000107680)
/workdir/go/src/net/lookup_test.go:74 +0x30
testing.tRunner(0x4000107680, 0x2848d8)
/workdir/go/src/testing/testing.go:1157 +0xdc
created by testing.(*T).Run
/workdir/go/src/testing/testing.go:1202 +0x244

goroutine 3039 [chan receive, 2 minutes]:
testing.(*T).Parallel(0x400042dc80)
/workdir/go/src/testing/testing.go:1039 +0xd4
net.TestCloseWrite(0x400042dc80)
/workdir/go/src/net/net_test.go:83 +0x28
testing.tRunner(0x400042dc80, 0x284388)
/workdir/go/src/testing/testing.go:1157 +0xdc
created by testing.(*T).Run
/workdir/go/src/testing/testing.go:1202 +0x244

goroutine 546 [chan receive, 2 minutes]:
testing.(*T).Parallel(0x4000107800)
/workdir/go/src/testing/testing.go:1039 +0xd4
net.TestLookupGmailMX(0x4000107800)
/workdir/go/src/net/lookup_test.go:123 +0x28
testing.tRunner(0x4000107800, 0x2848a8)
/workdir/go/src/testing/testing.go:1157 +0xdc
created by testing.(*T).Run
/workdir/go/src/testing/testing.go:1202 +0x244

goroutine 3130 [chan receive, 2 minutes]:
testing.(*T).Run(0x40001f9200, 0x272fc5, 0xb, 0x284c28, 0x5f5a3c01)
/workdir/go/src/testing/testing.go:1203 +0x260
net.TestSplice(0x4000082f00)
/workdir/go/src/net/splice_test.go:26 +0x94
testing.tRunner(0x4000082f00, 0x284c30)
/workdir/go/src/testing/testing.go:1157 +0xdc
created by testing.(*T).Run
/workdir/go/src/testing/testing.go:1202 +0x244

goroutine 2974 [chan receive, 2 minutes]:
testing.(*T).Parallel(0x400048f500)
/workdir/go/src/testing/testing.go:1039 +0xd4
net.TestNotTemporaryRead(0x400048f500)
/workdir/go/src/net/net_test.go:566 +0x28
testing.tRunner(0x400048f500, 0x2849c0)
/workdir/go/src/testing/testing.go:1157 +0xdc
created by testing.(*T).Run
/workdir/go/src/testing/testing.go:1202 +0x244

goroutine 3038 [chan receive, 2 minutes]:
testing.(*T).Parallel(0x400042db00)
/workdir/go/src/testing/testing.go:1039 +0xd4
net.TestCloseRead(0x400042db00)
/workdir/go/src/net/net_test.go:26 +0x28
testing.tRunner(0x400042db00, 0x284368)
/workdir/go/src/testing/testing.go:1157 +0xdc
created by testing.(*T).Run
/workdir/go/src/testing/testing.go:1202 +0x244

goroutine 3075 [chan receive, 2 minutes]:
testing.(*T).Parallel(0x400050c600)
/workdir/go/src/testing/testing.go:1039 +0xd4
net.TestSendfilePipe(0x400050c600)
/workdir/go/src/net/sendfile_test.go:226 +0x48
testing.tRunner(0x400050c600, 0x284bd8)
/workdir/go/src/testing/testing.go:1157 +0xdc
created by testing.(*T).Run
/workdir/go/src/testing/testing.go:1202 +0x244

goroutine 547 [chan receive, 2 minutes]:
testing.(*T).Parallel(0x4000107980)
/workdir/go/src/testing/testing.go:1039 +0xd4
net.TestLookupGmailNS(0x4000107980)
/workdir/go/src/net/lookup_test.go:169 +0x28
testing.tRunner(0x4000107980, 0x2848b0)
/workdir/go/src/testing/testing.go:1157 +0xdc
created by testing.(*T).Run
/workdir/go/src/testing/testing.go:1202 +0x244

goroutine 2973 [chan receive, 2 minutes]:
testing.(*T).Parallel(0x400048f380)
/workdir/go/src/testing/testing.go:1039 +0xd4
net.TestCloseUnblocksRead(0x400048f380)
/workdir/go/src/net/net_test.go:541 +0x28
testing.tRunner(0x400048f380, 0x284380)
/workdir/go/src/testing/testing.go:1157 +0xdc
created by testing.(*T).Run
/workdir/go/src/testing/testing.go:1202 +0x244

goroutine 548 [chan receive, 2 minutes]:
testing.(*T).Parallel(0x4000107b00)
/workdir/go/src/testing/testing.go:1039 +0xd4
net.TestLookupGmailTXT(0x4000107b00)
/workdir/go/src/net/lookup_test.go:218 +0x30
testing.tRunner(0x4000107b00, 0x2848b8)
/workdir/go/src/testing/testing.go:1157 +0xdc
created by testing.(*T).Run
/workdir/go/src/testing/testing.go:1202 +0x244

goroutine 3041 [chan receive, 2 minutes]:
testing.(*T).Parallel(0x40001a8000)
/workdir/go/src/testing/testing.go:1039 +0xd4
net.TestListenerClose(0x40001a8000)
/workdir/go/src/net/net_test.go:228 +0x28
testing.tRunner(0x40001a8000, 0x284880)
/workdir/go/src/testing/testing.go:1157 +0xdc
created by testing.(*T).Run
/workdir/go/src/testing/testing.go:1202 +0x244

goroutine 3042 [chan receive, 2 minutes]:
testing.(*T).Parallel(0x40001a8180)
/workdir/go/src/testing/testing.go:1039 +0xd4
net.TestPacketConnClose(0x40001a8180)
/workdir/go/src/net/net_test.go:286 +0x28
testing.tRunner(0x40001a8180, 0x2849e0)
/workdir/go/src/testing/testing.go:1157 +0xdc
created by testing.(*T).Run
/workdir/go/src/testing/testing.go:1202 +0x244

goroutine 3040 [chan receive, 2 minutes]:
testing.(*T).Parallel(0x400042de00)
/workdir/go/src/testing/testing.go:1039 +0xd4
net.TestConnClose(0x400042de00)
/workdir/go/src/net/net_test.go:183 +0x28
testing.tRunner(0x400042de00, 0x2843e8)
/workdir/go/src/testing/testing.go:1157 +0xdc
created by testing.(*T).Run
/workdir/go/src/testing/testing.go:1202 +0x244

goroutine 3047 [chan receive, 2 minutes]:
testing.(*T).Parallel(0x40001a8600)
/workdir/go/src/testing/testing.go:1039 +0xd4
net.TestZeroByteRead(0x40001a8600)
/workdir/go/src/net/net_test.go:401 +0x28
testing.tRunner(0x40001a8600, 0x284f08)
/workdir/go/src/testing/testing.go:1157 +0xdc
created by testing.(*T).Run
/workdir/go/src/testing/testing.go:1202 +0x244

goroutine 3164 [chan receive, 2 minutes]:
testing.(*T).Run(0x40001f9800, 0x2727f6, 0xa, 0x4000297bf0, 0x2d34f73d01)
/workdir/go/src/testing/testing.go:1203 +0x260
net.testSplice(0x40001f9200, 0x270be8, 0x4, 0x27098a, 0x3)
/workdir/go/src/net/splice_test.go:39 +0x4b8
net.TestSplice.func2(0x40001f9200)
/workdir/go/src/net/splice_test.go:26 +0x50
testing.tRunner(0x40001f9200, 0x284c28)
/workdir/go/src/testing/testing.go:1157 +0xdc
created by testing.(*T).Run
/workdir/go/src/testing/testing.go:1202 +0x244

goroutine 3222 [IO wait, 2 minutes]:
internal/poll.runtime_pollWait(0xffff8e599ab8, 0x72, 0x3e0548)
/workdir/go/src/runtime/netpoll.go:220 +0x44
internal/poll.(*pollDesc).wait(0x40002a3518, 0x72, 0x400053ea00, 0x200, 0x200)
/workdir/go/src/internal/poll/fd_poll_runtime.go:87 +0x38
internal/poll.(*pollDesc).waitRead(...)
/workdir/go/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Read(0x40002a3500, 0x400053ea00, 0x200, 0x200, 0x0, 0x0, 0x0)
/workdir/go/src/internal/poll/fd_unix.go:159 +0x164
net.(*netFD).Read(0x40002a3500, 0x400053ea00, 0x200, 0x200, 0x0, 0x400053ea00, 0x400051bd78)
/workdir/go/src/net/fd_posix.go:55 +0x44
net.(*conn).Read(0x40000102b0, 0x400053ea00, 0x200, 0x200, 0x0, 0x0, 0x0)
/workdir/go/src/net/net.go:183 +0x74
bytes.(*Buffer).ReadFrom(0x400022ce10, 0x2becd8, 0x40000102b0, 0x2becd8, 0x40000102b0, 0x40002a3500)
/workdir/go/src/bytes/buffer.go:204 +0xa4
io/ioutil.readAll(0x2becd8, 0x40000102b0, 0x200, 0x0, 0x0, 0x0, 0x0, 0x0)
/workdir/go/src/io/ioutil/ioutil.go:36 +0xa8
io/ioutil.ReadAll(...)
/workdir/go/src/io/ioutil/ioutil.go:45
net.testSpliceIssue25985(0x40001f9800, 0x270be8, 0x4, 0x27098a, 0x3)
/workdir/go/src/net/splice_test.go:205 +0x3c0
net.testSplice.func2(0x40001f9800)
/workdir/go/src/net/splice_test.go:39 +0x48
testing.tRunner(0x40001f9800, 0x4000297bf0)
/workdir/go/src/testing/testing.go:1157 +0xdc
created by testing.(*T).Run
/workdir/go/src/testing/testing.go:1202 +0x244

goroutine 3223 [semacquire, 2 minutes]:
internal/poll.runtime_Semacquire(0x40002a34a8)
/workdir/go/src/runtime/sema.go:61 +0x38
internal/poll.(*FD).Close(0x40002a3480, 0x40002a3480, 0x0)
/workdir/go/src/internal/poll/fd_unix.go:108 +0x84
net.(*netFD).Close(0x40002a3480, 0x0, 0x0)
/workdir/go/src/net/fd_posix.go:37 +0x48
net.(*conn).Close(0x4000120338, 0x0, 0x0)
/workdir/go/src/net/net.go:207 +0x58
net.testSpliceIssue25985.func1()
/workdir/go/src/net/splice_test.go:187 +0x1dc
created by net.testSpliceIssue25985
/workdir/go/src/net/splice_test.go:189 +0x208

goroutine 3235 [IO wait, 2 minutes]:
internal/poll.runtime_pollWait(0xffff8e599b98, 0x72, 0x3e0548)
/workdir/go/src/runtime/netpoll.go:220 +0x44
internal/poll.(*pollDesc).wait(0x40002a3498, 0x72, 0x40002a8000, 0x8000, 0x8000)
/workdir/go/src/internal/poll/fd_poll_runtime.go:87 +0x38
internal/poll.(*pollDesc).waitRead(...)
/workdir/go/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Read(0x40002a3480, 0x40002a8000, 0x8000, 0x8000, 0x0, 0x0, 0x0)
/workdir/go/src/internal/poll/fd_unix.go:159 +0x164
net.(*netFD).Read(0x40002a3480, 0x40002a8000, 0x8000, 0x8000, 0x40001ed6b8, 0x80d4c, 0x8000)
/workdir/go/src/net/fd_posix.go:55 +0x44
net.(*conn).Read(0x4000120338, 0x40002a8000, 0x8000, 0x8000, 0x0, 0x0, 0x0)
/workdir/go/src/net/net.go:183 +0x74
io.copyBuffer(0xffff8db043e0, 0x40000102a8, 0x2becd8, 0x4000120338, 0x40002a8000, 0x8000, 0x8000, 0x6d57c030, 0x15f650, 0x4000010290)
/workdir/go/src/io/io.go:409 +0xd4
io.Copy(...)
/workdir/go/src/io/io.go:368
net.testSpliceIssue25985.func1.1(0x2c5be0, 0x40000102a8, 0x2c5b30, 0x4000120338, 0x400033e300)
/workdir/go/src/net/splice_test.go:180 +0x98
created by net.testSpliceIssue25985.func1
/workdir/go/src/net/splice_test.go:179 +0x168
FAIL net 180.006s

CC @mikioh @bradfitz @ianlancetaylor

@bcmills bcmills added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Sep 10, 2020
@bcmills bcmills added this to the Backlog milestone Sep 10, 2020
@ianlancetaylor
Copy link
Contributor

CC @acln0

@acln0
Copy link
Contributor

acln0 commented Sep 10, 2020

I don't see what it is, at first sight. I will keep looking.

@acln0
Copy link
Contributor

acln0 commented Sep 10, 2020

I notice that one of the goroutines is stuck in Close, which looks like #25985, but I thought we fixed that correctly back then. I don't know what (if anything) has changed from then to now. Perhaps some of the recent defer optimization work has something to do with it? I have no other ideas. Unfortunately, I cannot do a bisection or further investigations this week. My development machine isn't available at the moment. Sorry. I will look at it some more next week.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

3 participants