Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

time: TestTicker timeouts after CL 232298 #42424

Closed
bcmills opened this issue Nov 6, 2020 · 9 comments
Closed

time: TestTicker timeouts after CL 232298 #42424

bcmills opened this issue Nov 6, 2020 · 9 comments
Assignees
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. release-blocker
Milestone

Comments

@bcmills
Copy link
Contributor

bcmills commented Nov 6, 2020

(Issue forked from #42237.)

panic: test timed out after 15m0s

goroutine 219 [running]:
testing.(*M).startAlarm.func1()
	C:/workdir/go/src/testing/testing.go:1680 +0xe6
created by time.goFunc
	C:/workdir/go/src/time/sleep.go:167 +0x4b

goroutine 1 [chan receive, 14 minutes]:
testing.(*T).Run(0xc000058000, 0x4cb903, 0xa, 0x545b78, 0x368a01)
	C:/workdir/go/src/testing/testing.go:1219 +0x2da
testing.runTests.func1(0xc000048f00)
	C:/workdir/go/src/testing/testing.go:1491 +0x7f
testing.tRunner(0xc000048f00, 0xc000133de0)
	C:/workdir/go/src/testing/testing.go:1173 +0xef
testing.runTests(0xc000004558, 0x674680, 0x6f, 0x6f, 0xbfe1343f8f278ecc, 0xd18c5b20e5, 0x678480, 0x4cca88)
	C:/workdir/go/src/testing/testing.go:1489 +0x310
testing.(*M).Run(0xc00001e100, 0x0)
	C:/workdir/go/src/testing/testing.go:1397 +0x1f5
main.main()
	_testmain.go:377 +0x145

goroutine 31 [chan receive, 14 minutes]:
testing.(*T).Parallel(0xc00019d680)
	C:/workdir/go/src/testing/testing.go:1039 +0x119
time_test.TestParseYday(0xc00019d680)
	C:/workdir/go/src/time/format_test.go:762 +0x52
testing.tRunner(0xc00019d680, 0x545ae8)
	C:/workdir/go/src/testing/testing.go:1173 +0xef
created by testing.(*T).Run
	C:/workdir/go/src/testing/testing.go:1218 +0x2b3

goroutine 231 [chan receive, 14 minutes]:
time_test.TestTicker(0xc000058000)
	C:/workdir/go/src/time/tick_test.go:45 +0xc5
testing.tRunner(0xc000058000, 0x545b78)
	C:/workdir/go/src/testing/testing.go:1173 +0xef
created by testing.(*T).Run
	C:/workdir/go/src/testing/testing.go:1218 +0x2b3
FAIL	time	900.029s

2020-11-05T19:22:02-370682a/windows-amd64-longtest
2020-11-05T16:47:45-a19a4dc/windows-amd64-longtest
2020-11-05T15:16:57-34c0969/windows-amd64-longtest
2020-11-02T21:08:14-4fcb506/windows-amd64-longtest
2020-11-02T15:23:43-d5388e2/linux-amd64-longtest
2020-11-02T05:29:59-f2ee58b/windows-amd64-longtest
2020-11-02T03:03:16-0387bed/windows-amd64-longtest
2020-10-30T18:06:13-6d087c8/windows-amd64-longtest
2020-10-30T16:29:11-1af388f/windows-amd64-longtest
2020-10-30T15:25:49-e1faebe/linux-386-longtest
2020-10-29T20:50:02-38d1ec8/windows-amd64-longtest
2020-10-29T19:06:32-5cc43c5/windows-amd64-longtest
2020-10-29T03:52:23-3089ef6/windows-amd64-longtest
2020-10-28T00:48:22-49b017f/windows-amd64-longtest
2020-10-28T00:25:05-40d1ec5/windows-amd64-longtest
2020-10-27T22:57:36-b4b0144/windows-amd64-longtest

CC @prattmic @ChrisHines @aclements @ianlancetaylor

@bcmills bcmills added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. release-blocker labels Nov 6, 2020
@bcmills bcmills added this to the Go1.16 milestone Nov 6, 2020
@bcmills

This comment has been minimized.

@bcmills

This comment has been minimized.

@prattmic
Copy link
Member

prattmic commented Nov 9, 2020

One of these failures was after http://golang.org/cl/267257, so it seems that's not the fix.

2020-11-06T23:59:04-f7ef5ca/windows-amd64-longtest

@aclements
Copy link
Member

@laboger may have an easy reproducer for this on linux/amd64 (and linux/ppc64le) in #42402.

@bcmills
Copy link
Contributor Author

bcmills commented Nov 16, 2020

Sounds like #42402 was fixed by CL 267257, so unfortunately we don't seem to have a lead on a reliable reproducer.

@prattmic prattmic self-assigned this Nov 30, 2020
@prattmic
Copy link
Member

prattmic commented Dec 1, 2020

The root cause is a timer in TestCheckRuntimeTimerOverflow, where we explictly set t.when = 0: https://cs.opensource.google/go/go/+/master:src/time/internal_test.go;l=67;bpv=0;bpt=1.

This timer is on top of the heap, and in turn, pp.timer0When = 0, and then checkTimers will decide there are no timers at all, thus missing any timers further down the heap.

I'm still looking into when we broke this, because as I recall, we've treated timer0When == 0 as "no timers" for a while.

As an aside, @mknyszek notes that we may want timer0When's sentinel value to be maxWhen to make this more obvious (regardless if it fixes the issue directly).


[5.158222300 P 2] M 4 modtimer: modify t 0xc0000661e0 when 9223372036854775807 nextwhen 0 P 2 
        0xec3288 [time.resetTimer+0x68 C:/workdir/go/src/runtime/time.go:611]
        0xec3244 [time.resetTimer+0x24 C:/workdir/go/src/runtime/time.go:224]
        0xef58c4 [time.CheckRuntimeTimerOverflow.func1+0x64 C:/workdir/go/src/time/internal_test.go:67]
        0xef56bb [time.CheckRuntimeTimerOverflow+0xfb C:/workdir/go/src/time/internal_test.go:78]
        0xfe5cee [time_test.TestOverflowRuntimeTimer+0x4e C:/workdir/go/src/time/sleep_test.go:482]
        0xf4de2e [testing.tRunner+0xee C:/workdir/go/src/testing/testing.go:1195]
        0xec6100 [runtime.goexit+0x0 C:/workdir/go/src/runtime/asm_amd64.s:1367]
...
[5.158222300 P 2] M 4 cleantimers: modified t 0xc0000661e0 when 9223372036854775807 nextwhen 0
...
[5.158222300 P 2] P 2 dodeltimer0: t 0xc0000661e0
...
[5.158222300 P 2] P 2 doaddtimer: t 0xc0000661e0 when 0 now 7187394008100 until -7187394008100 len(pp.timers) 0 
... (when = 0 at top of heap)
[5.158222300 P 2] P 2 doaddtimer: t 0xc000066378 when 7187414008100 now 7187394008100 until 20000000 len(pp.timers) 1 
        0xeacabc [runtime.addtimer+0xbc C:/workdir/go/src/runtime/time.go:260]
        0xec3191 [time.startTimer+0x31 C:/workdir/go/src/runtime/time.go:207]
        0xeeaa64 [time.NewTicker+0xe4 C:/workdir/go/src/time/tick.go:39]
        0xfe6c89 [time_test.TestTicker+0x349 C:/workdir/go/src/time/tick_test.go:42]
        0xf4de2e [testing.tRunner+0xee C:/workdir/go/src/testing/testing.go:1195]
        0xec6100 [runtime.goexit+0x0 C:/workdir/go/src/runtime/asm_amd64.s:1367]
[5.158222300 P 2] P 2 doaddtimer done: t 0xc000066378 len(pp.timers) 2 timer0When 0 numTimers 2
... (added new timer, timer0When still 0)
[5.158222300 P 2] M 4 checkTimers: P 2
[5.158222300 P 2] M 4 checkTimers: no timers: timer0When 0 timerModifiedEarliest 0
... (no timers!)

@prattmic
Copy link
Member

prattmic commented Dec 1, 2020

Bisect shows that this was broken by golang.org/cl/258303 (cc @ianlancetaylor). If I understand correctly, previously this happened:

  1. TestCheckRuntimeTimerOverflow timer sets nextwhen = 0 in modtimer, and increments adjustTimers.
  2. checkTimers sees adjustTimers != 0, adjusts timer, and then immediately runs the when = 0 timer, removing it from the heap.

Now we've only updated timerModifiedEarliest. Since timer0When and timerModifiedEarliest are 0, checkTimers will never run the when = 0 timer, causing the stuckness.

Now, how to fix? The simple fix is to change https://cs.opensource.google/go/go/+/master:src/time/internal_test.go;l=67;bpv=1;bpt=0 to set when to 1 instead of 0. The overflow test still does it's job, the only question is could a timer ever legitimately get when = 0? I don't think so, but I may be missing something.

@gopherbot
Copy link

Change https://golang.org/cl/274632 mentions this issue: time, runtime: don't set timer when = 0

@golang golang locked and limited conversation to collaborators Dec 3, 2021
@prattmic prattmic self-assigned this Jun 24, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. release-blocker
Projects
None yet
Development

No branches or pull requests

4 participants