Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

time, runtime: zero duration timer takes 2 minutes to fire #44868

Closed
prattmic opened this issue Mar 8, 2021 · 9 comments
Closed

time, runtime: zero duration timer takes 2 minutes to fire #44868

prattmic opened this issue Mar 8, 2021 · 9 comments
Assignees
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@prattmic
Copy link
Member

prattmic commented Mar 8, 2021

Very rarely, zero duration timers fail to fire for 2 minutes. This program will occassionally fail. e.g.,

$ while ./timer; do echo again; done
again
again
again
again
again
again
again
again
again
panic: Expected time to get value from Timer channel to be less than 2 sec, took 2m0.006359722s, time on channel is 2021-03-08 15:17:38.623999432 -0500 EST m=+120.798106019, current time is 2021-03-08 15:17:38.624042501 -0500 EST m=+120.798149081

This reproduces on 1.16 (linux-amd64, other platforms not tested), but not 1.15. Thus this is probably more fallout from http://golang.org/cl/232298.

Failure seems to consistently occur at 2 minutes, which is a strong indication that this is related to the 2 minute force GC period. Most likely the scheduler is completely missing the timer until forced GC kicks it awake again. That makes this distinct from #44343, where timer expiration is slower but never missed.

cc @ChrisHines @aclements @mknyszek

@prattmic prattmic added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Mar 8, 2021
@prattmic prattmic added this to the Go1.17 milestone Mar 8, 2021
@prattmic prattmic self-assigned this Mar 8, 2021
@ianlancetaylor
Copy link
Contributor

@gopherbot Please open backport to 1.16.

This issue can cause timers to fire up to two minutes late.

@gopherbot
Copy link

Backport issue(s) opened: #44869 (for 1.16).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases.

@mdempsky
Copy link
Member

mdempsky commented Mar 8, 2021

NewTimer creates a new Timer that will send the current time on its channel after at least duration d.

Technically, 2 minutes is after at least 0 seconds. :>

@prattmic
Copy link
Member Author

prattmic commented Mar 8, 2021

This is a bug in timerpMask. i.e, http://golang.org/cl/264477.

The logic behind timerpMask is based on the assumption that a P will only add timers to its own heap (comment).

This isn't quite true, as addtimer runs in a preemptible context, making the following sequence possible:

  1. Call addtimer on P 0, G 1.
  2. addtimer computes pp := <*p for P 0>.
  3. P 0 preempted (likely in morestack on the lock call). G 1 placed on runq as runnable.
  4. timerpMask cleared for P 0, as it has no timers.
  5. P 1 starts running. timerpMask set for P 1, as "it may add a timer at any time".
  6. G 1 continues execution on P 1, adding the timer to P 0's timer heap.

P 0's timerpMask bit will never be set until P 0 runs again, causing all timer stealing to skip P 0 and miss the timer.

timerpMask has this strange update location instead of simply updating directly in doaddtimer because updating a centralized data structure in the middle of the timer code adds too much contention.

Since it doesn't affect correctness, I believe it would be sufficient to fix this problem by updating timerpMask in the case of doaddtimer pp != getg().m.p.ptr(). This should be extremely rare, so it won't affect performance of the normal case.

@prattmic
Copy link
Member Author

prattmic commented Mar 8, 2021

Alternatively, it may be cleaner to keep the expectation the addtimer only adds to its own P by making it retry if the P has changed after locking.

@gopherbot
Copy link

Change https://golang.org/cl/300610 mentions this issue: runtime, time: disable preemption in addtimer

@gopherbot
Copy link

Change https://golang.org/cl/300611 mentions this issue: [release-branch.go1.16] runtime, time: disable preemption in addtimer

gopherbot pushed a commit that referenced this issue Mar 12, 2021
The timerpMask optimization updates a mask of Ps (potentially)
containing timers in pidleget / pidleput. For correctness, it depends on
the assumption that new timers can only be added to a P's own heap.

addtimer violates this assumption if it is preempted after computing pp.
That G may then run on a different P, but adding a timer to the original
P's heap.

Avoid this by disabling preemption while pp is in use.

Other uses of doaddtimer should be OK:

* moveTimers: always moves to the current P's heap
* modtimer, cleantimers, addAdjustedTimers, runtimer: does not add net
  new timers to the heap while locked

For #44868
Fixes #44869

Change-Id: I4a5d080865e854931d0a3a09a51ca36879101d72
Reviewed-on: https://go-review.googlesource.com/c/go/+/300610
Trust: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
(cherry picked from commit aa26687)
Reviewed-on: https://go-review.googlesource.com/c/go/+/300611
@gopherbot
Copy link

Change https://golang.org/cl/313129 mentions this issue: [release-branch.go1.15] runtime, time: disable preemption in addtimer

@gopherbot
Copy link

Change https://golang.org/cl/313130 mentions this issue: [release-branch.go1.15] runtime, time: disable preemption in addtimer

gopherbot pushed a commit that referenced this issue Apr 29, 2021
The timerpMask optimization updates a mask of Ps (potentially)
containing timers in pidleget / pidleput. For correctness, it depends on
the assumption that new timers can only be added to a P's own heap.

addtimer violates this assumption if it is preempted after computing pp.
That G may then run on a different P, but adding a timer to the original
P's heap.

Avoid this by disabling preemption while pp is in use.

Other uses of doaddtimer should be OK:

* moveTimers: always moves to the current P's heap
* modtimer, cleantimers, addAdjustedTimers, runtimer: does not add net
  new timers to the heap while locked

For #44868
Fixes #45731

Change-Id: I4a5d080865e854931d0a3a09a51ca36879101d72
Reviewed-on: https://go-review.googlesource.com/c/go/+/300610
Trust: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-on: https://go-review.googlesource.com/c/go/+/313129
Trust: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
@golang golang locked and limited conversation to collaborators Apr 23, 2022
@prattmic prattmic self-assigned this Jun 24, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

4 participants