Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

time: timer reset sometimes ignored, causing delayed ticks #47762

Closed
jgraettinger opened this issue Aug 17, 2021 · 13 comments
Closed

time: timer reset sometimes ignored, causing delayed ticks #47762

jgraettinger opened this issue Aug 17, 2021 · 13 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@jgraettinger
Copy link

What version of Go are you using (go version)?

Go 1.16.7

I noticed the issue only with the 1.16.6 => 1.16.7 minor version bump, and cannot reproduce with 1.16.6.

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/flow/.cache/go-build"
GOENV="/home/flow/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/flow/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/flow/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.16.7"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/workspaces/flow/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/var/tmp/go-build1486076951=/tmp/go-build -gno-record-gcc-switches"

What did you do?

In rare circumstances, using Timer.Stop and Timer.Reset fails to cause a timer to properly deliver the expected tick on time.
The tick is eventually delivered, and experimentally it's roughly delivered according to previous Reset-and-then-Stop'd values.
I believe this is related to the timer-related fixes applied in the 1.16.7 point release discussed in issue #47329

I managed to distill down pseudo-random but deterministic test case. I only encounter a failure some of the time -- you'll want to use -count=10 for example, and should then observe delays.

func TestTimerStopAndReset(t *testing.T) {
	var rnd = rand.New(rand.NewSource(1234567))

	timers := make([]*time.Timer, 100)
	states := make([]int, len(timers))
	indices := rnd.Perm(len(timers))

	for len(indices) != 0 {
		var ii = rnd.Intn(len(indices))
		var i = indices[ii]

		var timer = timers[i]
		var state = states[i]
		states[i]++

		// t.Logf("timer %d state %d", i, state)

		switch state {
		case 0:
			timers[i] = time.NewTimer(0)
		case 1:
			<-timer.C // Timer is now idle.

		// Reset to various long durations, which we'll cancel.
		case 2:
			if timer.Reset(1 * time.Minute) {
				panic("shouldn't be active (1)")
			}
		case 4:
			if timer.Reset(3 * time.Minute) {
				panic("shouldn't be active (3)")
			}
		case 6:
			if timer.Reset(2 * time.Minute) {
				panic("shouldn't be active (2)")
			}

		// Stop and drain a long-duration timer.
		case 3, 5, 7:
			if !timer.Stop() {
				<-timer.C
			}

		// Start a short-duration timer we expect to select without blocking.
		case 8:
			timer.Reset(0)
		case 9:
			t.Logf("timer %d awaiting final tick... ", i)
			<-timer.C
			t.Logf("timer %d ticked", i)

		// Timer is done. Swap with tail and remove.
		case 10:
			indices[ii] = indices[len(indices)-1]
			indices = indices[:len(indices)-1]
		}
	}
}

What did you expect to see?

The test should complete quickly, since all timers are stopped before awaiting longer-duration time intervals,
and timer channels are only selected after being Reset(0).

What did you see instead?

The test frequently blocks for longer periods of time, on the order of a minute or two (and correlated to the long-duration time interval used).

@jgraettinger
Copy link
Author

Here's a reproduction in the Go playground, against 1.17.

@mknyszek mknyszek changed the title Timer reset sometimes ignored, causing delayed ticks time: timer reset sometimes ignored, causing delayed ticks Aug 18, 2021
@mknyszek mknyszek added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Aug 18, 2021
@mknyszek mknyszek added this to the Backlog milestone Aug 18, 2021
@mknyszek
Copy link
Contributor

I think this may be related to #44343 as well. CC @prattmic @ChrisHines

@mpx
Copy link
Contributor

mpx commented Aug 19, 2021

Cc @ianlancetaylor (for timers)

@ianlancetaylor
Copy link
Contributor

Thanks for the test case. I think I found the problem. Sending a CL.

@ianlancetaylor
Copy link
Contributor

@gopherbot Please open a backport to 1.16 and 1.17.

This bug can cause timers to fail to fire. It appears to be new in recent 1.16 releases.

@gopherbot
Copy link

Backport issue(s) opened: #47858 (for 1.16), #47859 (for 1.17).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases.

@ianlancetaylor
Copy link
Contributor

Thanks for the nice test case.

@gopherbot
Copy link

Change https://golang.org/cl/343882 mentions this issue: runtime: in adjustTimers back up as far as necessary

@jgraettinger
Copy link
Author

jgraettinger commented Sep 13, 2021

Not to nag, but is there a timeline for when the CL might be wrapped up ? I'd love to try out the fix. Thanks

@ianlancetaylor
Copy link
Contributor

ianlancetaylor commented Sep 14, 2021

See the discussion at https://golang.org/cl/343882.

@gopherbot
Copy link

Change https://golang.org/cl/350000 mentions this issue: [release-branch.go1.16] runtime: in adjustTimers back up as far as necessary

@gopherbot
Copy link

Change https://golang.org/cl/350001 mentions this issue: [release-branch.go1.17] runtime: in adjustTimers back up as far as necessary

@jgraettinger
Copy link
Author

I've tested with gotip and can confirm this fixes the problem as it originally manifested for us.

gopherbot pushed a commit that referenced this issue Sep 15, 2021
…cessary

When the adjustTimers function removed a timer it assumed it was
sufficient to continue the heap traversal at that position.
However, in some cases a timer will be moved to an earlier
position in the heap. If that timer is timerModifiedEarlier,
that can leave timerModifiedEarliest not correctly representing
the earlier such timer.

Fix the problem by restarting the heap traversal at the earliest
changed position.

For #47762
Fixes #47859

Change-Id: I152bbe62793ee40a680baf49967bcb89b1f94764
Reviewed-on: https://go-review.googlesource.com/c/go/+/343882
Trust: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
(cherry picked from commit 2da3375)
Reviewed-on: https://go-review.googlesource.com/c/go/+/350001
gopherbot pushed a commit that referenced this issue Sep 15, 2021
…cessary

When the adjustTimers function removed a timer it assumed it was
sufficient to continue the heap traversal at that position.
However, in some cases a timer will be moved to an earlier
position in the heap. If that timer is timerModifiedEarlier,
that can leave timerModifiedEarliest not correctly representing
the earlier such timer.

Fix the problem by restarting the heap traversal at the earliest
changed position.

For #47762
Fixes #47858

Change-Id: I152bbe62793ee40a680baf49967bcb89b1f94764
Reviewed-on: https://go-review.googlesource.com/c/go/+/343882
Trust: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
(cherry picked from commit 2da3375)
Reviewed-on: https://go-review.googlesource.com/c/go/+/350000
@golang golang locked and limited conversation to collaborators Sep 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

5 participants