Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

time: TestTimerModifiedEarlier failures with "timer took too long" on plan9-arm #50470

Open
bcmills opened this issue Jan 6, 2022 · 5 comments
Labels
arch-arm Issues solely affecting the 32-bit arm architecture. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Plan9
Milestone

Comments

@bcmills
Copy link
Contributor

bcmills commented Jan 6, 2022

#!watchflakes
post <- builder ~ `^plan9` && pkg == "time" && test == "TestTimerModifiedEarlier" && `timer took too long`
--- FAIL: TestTimerModifiedEarlier (10.08s)
    sleep_test.go:552: timer took too long (10.006295837s)
    sleep_test.go:561: 1 failures
FAIL
FAIL	time	13.241s

greplogs --dashboard -md -l -e 'FAIL: TestTimerModifiedEarlier'

2022-01-04T21:10:15-1c8f9d2/plan9-arm
2021-12-02T23:02:33-8835343/plan9-arm
2021-11-05T21:34:10-bb53fd7/plan9-arm

The failing test is the regression test for #47329.

CC @millerresearch @ianlancetaylor

@bcmills bcmills added arch-arm Issues solely affecting the 32-bit arm architecture. OS-Plan9 labels Jan 6, 2022
@bcmills bcmills added this to the Backlog milestone Jan 6, 2022
@gopherbot
Copy link

Change https://golang.org/cl/375934 mentions this issue: time: skip TestTimerModifiedEarlier on plan9/arm

gopherbot pushed a commit that referenced this issue Jan 6, 2022
This test is observed to be flaky on the plan9-arm builder.
Skip it on that platform until it can be diagnosed and fixed.

For #50470

Change-Id: If626af426d856c377e00ac5baaca52899456556e
Reviewed-on: https://go-review.googlesource.com/c/go/+/375934
Trust: Bryan Mills <bcmills@google.com>
Run-TryBot: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@cagedmantis cagedmantis added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jan 7, 2022
jproberts pushed a commit to jproberts/go that referenced this issue Jun 21, 2022
This test is observed to be flaky on the plan9-arm builder.
Skip it on that platform until it can be diagnosed and fixed.

For golang#50470

Change-Id: If626af426d856c377e00ac5baaca52899456556e
Reviewed-on: https://go-review.googlesource.com/c/go/+/375934
Trust: Bryan Mills <bcmills@google.com>
Run-TryBot: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@bcmills
Copy link
Contributor Author

bcmills commented Aug 23, 2023

I wonder if this may be related to the race fixed in https://go.dev/cl/470215.

@millerresearch
Copy link
Contributor

I wonder if this may be related to the race fixed in https://go.dev/cl/470215.

It doesn't look related to me. A possible clue is that the delayed read of the reset-into-the-past timer always occurs after just over 10 seconds, ie when the other (10 second) deadline timer fires. Hypothesis: the select for some reason is not detecting that the timer has already expired, so the goroutine is descheduled until the deadine timer fires. The deadline timer reschedules the goroutine which then finds both timer channels ready, and inputs from the first one giving the observed error.

But what's special about plan9-arm builders here? I don't think the timer or scheduling logic is different from other platforms. But they are Raspberry Pi boards with small RAM and SD cards as swap device, so potentially subject to extremely long delays in wall-clock time.

@bcmills
Copy link
Contributor Author

bcmills commented Aug 23, 2023

Hypothesis: the select for some reason is not detecting that the timer has already expired, so the goroutine is descheduled until the deadine timer fires.

That sounds very plausible to me.

It's been quite a while since we started skipping this test. Should we remove the Skip and see if it's still happening?

@millerresearch
Copy link
Contributor

Should we remove the Skip and see if it's still happening?

Yes, it's worth a try.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arch-arm Issues solely affecting the 32-bit arm architecture. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Plan9
Projects
Status: No status
Development

No branches or pull requests

4 participants