time: add ExternalNow, etc for external time and timers #36141

zx2c4 · 2019-12-14T13:27:53Z

Update May 5 2021: The current proposed API is in #36141 (comment). - rsc

Vocabulary:

Program time: monotonic, but stops when the computer is in S3 sleep.
Real time: monotonic, but continues to advance when the computer is in S3 sleep.
Wall time: non-monotonic thing on your wristwatch or wall clock that NTP messes with. This one plays no role in this discussion here at all.
Operating system: this always refers to the tuple of OS+ParticularVersion+ParticularConfiguration.

(These vocabulary terms can be nitpicked - maybe program time should be cpu time or something - but we've been using them prior in discussion, so let's continue to use them so as not to introduce confusion.)

Proposal:

Find some way to introduce "real time" semantics into Go, which currently mostly uses "program time", except on Windows, where it's always been "real time" for historical reasons.

Motivation:

Network protocols need to keep track of timeouts independent of whether a computer is asleep, since parties on a network exist in the real world, rather than virtualized on a CPU.

Landscape:

On some operating systems, the poll/select/kqueue/WaitFor*Object/futex family of functions takes a timeout that is measured in "real time", and on others measured in "program time".
Most operating systems support a "program time" counter. Some support a "real time" counter, but some do not, depending on configuration or existence of S3.
Most operating systems offer a notifier for resuming from sleep, though some may not, depending on configuration or existence of S3.
Important observation: operating systems that do not offer a notifier support "program time" rather than "real time".

Possibilities:

a. Make the existing time.* and time.Timer.* functions use "real time" exclusively, when possible. Introduce a function runtime.RealtimeTimers() bool to indicate whether Go successfully enabled "real time" timers rather than "program time" timers, the fallback.

b. Introduce additional duplicated functions to time.* and time.Timer.* that use "real time" rather than "program time". Introduce a function time.RealtimeTimersAreRealTime() bool to indicate whether Go successfully enabled "real time" timers on this new set of functions, or if the new set of functions behave identically to the old.

c. Introduce additional duplicated functions to time.* and time.Timer.* that use "real time" rather than "program time", and throw an error if "real time" capabilities aren't available, forcing users to introduce verbose fallback code if they only want to support "real time" opportunistically.

d. Add a function runtime.UseRealtimeTimers() error that attempts to change the runtime to use "real time" timers everywhere, like (a).

e. Add runtime function runtime.UseRealtimeTimers(yes bool) error that attempts to change the runtime to use "real time" or "program time" timers everywhere, like (a) but with the ability to toggle. Add runtime function runtime.RealtimeTimers() bool to indicate the current state. The default start-state would be either OS-defined or "real time" or "program time", depending on what we decide.

f. Other options?

My personal preference would be (a) or (e), but I'm open to discussion.

CC @ianlancetaylor @bradfitz @aclements @rsc

zephyrtronium · 2019-12-14T14:58:52Z

Making time sense a configuration option in runtime as in (d) or (e) means that packages which want to use real time can't be composed with packages which want to use program time, hence almost no one can depend on either behavior. That seems like probably the worst option.

From reading the discussion at #35482, it seems to me like the only cases where real time is mandatory are cases where the program interacts with the outside world, and furthermore that all cases where the program interacts with the outside world really want real time, even if they currently don't (or previously didn't) use it due to operating system behavior. That implies that the suggestion in #35482 (comment) of TimerAt, SleepUntil, &c., which I believe is a strict subset of your possibility (b), could and should be used everywhere in net. At the risk of bikeshedding, I would suggest that those new functions could be in a new x/time package that has runtime support, but then net can't use them. Maybe a new time/realtime package would help keep the API surface area of time small for typical wall time users while supporting those who must have real time.

As a supporting anecdote: I have an IRC bot that I run using Go 1.13.0 on my local Windows machine, which I also put to sleep from time to time. Sometimes-but-not-always after waking my computer, the program will actively spin for 300 seconds while waiting for the network deadlines to "actually" happen. I haven't tried it yet with Go 1.13.3+, but from what I understand, that behavior would be fixed there – the deadlines would always occur immediately after the computer wakes. This indicates to me that real time is the correct time for all network protocols, not just for WireGuard.

zx2c4 · 2019-12-14T15:05:46Z

> This indicates to me that real time is the correct time for all network protocols, not just for WireGuard.

Yes, probably more so than not. Brad wanted "real time" on a networking CL recently too and settled for the unreliable "wall time" instead.

networkimprov · 2019-12-14T18:39:14Z

We're expecting a separate proposal from the Go team along the lines described by @aclements in #35482 (comment).

I posted some followup questions to it in #35482 (comment)

See also #24595, #35012, #29485 (comment)

@bradfitz mentions his wall (i.e. pseudo-real) timers in #35482 (comment)

It would probably help if Go allowed us to set a suspend/resume handler.

Note that network protocol use between apps on the same machine is also common, and would generally not want real-time timers.

ianlancetaylor · 2019-12-14T19:07:14Z

I agree with @zephyrtronium that a global setting as proposed in possibilities d and e is a non-starter. Large programs include many packages, and those packages may legitimately have different requirements.

Possibility a is essentially #24595 as a choice made for all systems where possible. The Linux kernel attempted to make this choice, and reverted it, as discussed at https://www.spinics.net/lists/linux-tip-commits/msg43709.html. Watchdog timers failing on resume are a particular issue. So are timers for single user games. Very roughly, timers associated with behavior of other programs on the current system, or with behavior of the user of the system, seem likely to want "program time". Timers associated with behavior of other systems on the network seem likely to want "real time". Given that Go has historically used "program time" on all systems other than Windows, shifting all systems to "real time" seems likely to hurt more existing code than it helps.

The distinction between possibilities b and c is subtle, and it's not yet clear to me that it matters. If we have a way to detect that a system was suspended, which is the only case where "program time" differs from "real time", then we can always implement timers that use "real time".

ianlancetaylor · 2019-12-14T19:17:32Z

From the API perspective I think the relevant functions and methods are:

time.Sleep
time.NewTimer (and its variants time.After and time.AfterFunc)
time.(*Timer).Reset
net.Conn.SetDeadline (and SetReadDeadline and SetWriteDeadline and implementations)
os.(*File).SetDeadline (and SetReadDeadline and SetWriteDeadline)

If there are other relevant functions/methods, please point them out.

We also need to pay particular attention to time.Sub and the related functions time.Until and time.Since. Currently time.Sub sometimes computes "real time" and sometimes computes "program time". This is documented in the time package docs, which say

> On some systems the monotonic clock will stop if the computer goes to sleep. On such a system, t.Sub(u) may not accurately reflect the actual time that passed between t and u.

If we are going to make changes in this area, it would be nice to eliminate this inconsistency if possible.

ianlancetaylor · 2019-12-14T19:26:29Z

As others have said, one way that an API can permit programs to distinguish between "real time" and "program time" is to use the difference between time.Duration and time.Time. Currently time.NewTimer takes an argument of type time.Duration. We could define that as meaning to send a value on the channel after that duration in "program time" (I believe that is the current meaning on all systems other than Windows 8 and later).

We could add a new function time.NewTimerAt which would take an argument of time.Time. That would be defined as sending a value on the channel on or shortly after that instant in time. Thus time.NewTimer(d) would measure time in "program time" while time.NewTimerAt(time.Now().Add(d)) would measure time in "real time."

A disadvantage of this approach is that net.Conn.SetDeadline and friends already take a time.Time value. By analogy with the change in timers, we would expect to change the existing meaning of net.Conn.SetDeadline to measure in "real time" rather than "program time". But that change might affect existing programs that use deadlines for network communication on the same machine.

ianlancetaylor · 2019-12-14T19:40:21Z

Turning to implementation, I believe that we can straightforwardly implement both "real time" and "program time" timers if we have a way to receive a notification that the system has woken up from a suspension. Unfortunately, as far as I know, GNU/Linux provides no such notification capability. We can detect a suspension after the fact by comparing the results of clock_gettime using CLOCK_MONOTONIC and CLOCK_BOOTTIME. But we don't want to be continually checking those times.
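The after-the-fact detection described above can be sketched as follows. This is Linux-only and uses raw system calls through the standard syscall package; suspendedTotal and clockGettime are hypothetical helper names, and the clock ID constants are copied from the Linux headers rather than taken from any Go API.

```go
//go:build linux

package main

import (
	"fmt"
	"syscall"
	"time"
	"unsafe"
)

// Clock IDs from <linux/time.h>.
const (
	clockMonotonic = 1 // CLOCK_MONOTONIC: does not advance during suspend
	clockBoottime  = 7 // CLOCK_BOOTTIME: keeps advancing during suspend
)

// clockGettime reads the given clock with a raw clock_gettime system call.
func clockGettime(id uintptr) (time.Duration, error) {
	var ts syscall.Timespec
	_, _, errno := syscall.Syscall(syscall.SYS_CLOCK_GETTIME, id, uintptr(unsafe.Pointer(&ts)), 0)
	if errno != 0 {
		return 0, errno
	}
	return time.Duration(syscall.TimespecToNsec(ts)), nil
}

// suspendedTotal (a hypothetical helper) estimates how long the system has
// spent suspended since boot as CLOCK_BOOTTIME minus CLOCK_MONOTONIC.
// The result also includes the tiny delay between the two reads.
func suspendedTotal() (time.Duration, error) {
	mono, err := clockGettime(clockMonotonic)
	if err != nil {
		return 0, err
	}
	boot, err := clockGettime(clockBoottime)
	if err != nil {
		return 0, err
	}
	return boot - mono, nil
}

func main() {
	d, err := suspendedTotal()
	if err != nil {
		fmt.Println("clock_gettime failed:", err)
		return
	}
	fmt.Println("time spent suspended since boot (approx.):", d)
}
```

Comparing two successive results reveals whether a suspension happened in between, but, as noted, something still has to decide when to check.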

The problem is that for any sort of timer we need to have a way to sleep until the timer is ready to fire. To support both timers that work in "program time" and timers that work in "real time" we need to have a way to sleep in "program time" and in "real time", or a way to detect that we've woken up from suspend and need to recalculate sleep times.

As far as I know, the only way to sleep in "real time" on GNU/Linux is via timer_create, which is somewhat horrible in that it reports timer completion by sending a signal.

I don't know what is available on other systems.

networkimprov · 2019-12-14T20:48:41Z

systemd sees suspend/resume events. Go could provide an optional resume hook that pings a unix socket or other pollable widget.

zx2c4 · 2019-12-14T22:09:44Z

> As far as I know, the only way to sleep in "real time" on GNU/Linux is via timer_create, which is somewhat horrible in that it reports timer completion by sending a signal.

There's timerfd_create(2), actually, which gives you a normal fd that you can then netpoll on. I talked to tglx, the kernel timer maintainer, a few months ago about adding boottime support to futex(2), and he seemed interested, but of course kernel changes mean something far off in the future.

I'll do some research on Linux sleep resumption notification mechanisms. Tailing dmesg sounds pretty bad. Hooking into systemd or android power binder misery or similar sounds less bad but still bad. I wonder if there's some magic sysfs file we can epoll on. I'll look around.

> I agree with @zephyrtronium that a global setting as proposed in possibilities d and e is a non-starter. Large programs include many packages, and those packages may legitimately have different requirements.

Yea, I was afraid of that. I guess runtime has a lot of global knobs, but I understand that you don't want to add more, especially important ones.

> Possibility a is essentially #24595 as a choice made for all systems where possible.

Sounds like you don't like that either. I'm biased based on my own needs of course, but something to consider is the slight misunderstanding implied by @aclements's original nomenclature we've been using. "Program time" isn't actually the time spent by a program, since the scheduler implies it might not actually be running that whole time, or might be SIGSTOP'd or something. That means that ordinary programs may very well see large leaps in "program time", just as they will for "real time" in the case of resuming from suspend. With large leaps being a potential either way, what's the point of the distinction? Seems like programs are better off coding around "real time" assumptions since "program time" assumptions don't actually apply in the case of SIGSTOP or scheduler contention and such. Maybe the distinction between the two is only important for something like watchdog timers? That seems like more of the exception than the rule though. But, as said, I'm a bit biased here.

> The distinction between possibilities b and c is subtle, and it's not yet clear to me that it matters. If we have a way to detect that a system was suspended, which is the only case where "program time" differs from "real time", then we can always implement timers that use "real time".

Do you mean to say that we could perhaps add the new API without needing a function to determine whether or not we have a detection mechanism (or without returning an error), because in the case where we don't have a detection mechanism, there is no S3 sleep anyway, so we might as well consider them the same? That would be fine with me, I suppose.

> If there are other relevant functions/methods, please point them out.

time.Ticker is another one.

> We also need to pay particular attention to time.Sub and the related functions time.Until and time.Since. Currently time.Sub sometimes computes "real time" and sometimes computes "program time". This is documented in the time package docs, which say
>
> > On some systems the monotonic clock will stop if the computer goes to sleep. On such a system, t.Sub(u) may not accurately reflect the actual time that passed between t and u.

Right now a time object has internally both wall time and program time. We could slow it down further and have it always fetch real time. And then introduce more comparison functions. But that seems tricky. Alternatively, we could change the construction functions, for NowReal() and NowProgram() or something, but then there are hairy issues on what to do when comparing unlike objects. Or maybe we'll just want to introduce a whole new object altogether?

> A disadvantage of this approach is that net.Conn.SetDeadline and friends already take a time.Time value. By analogy with the change in timers, we would expect to change the existing meaning of net.Conn.SetDeadline to measure in "real time" rather than "program time". But that change might affect existing programs that use deadlines for network communication on the same machine.

But actually, isn't the net case the one area where we actually do want to prefer "real time"? For example, see Brad's CLs where he tries to approximate "real time" by using "wall time" (via Round(0)). So maybe that's okay.

networkimprov · 2019-12-14T23:43:18Z

On laptops, programs are rarely stopped, and if they are, it affects network peers on the same machine. Suspend/resume is frequent, and does not affect peers on the same machine.

ianlancetaylor · 2019-12-15T03:53:53Z

> Yea, I was afraid of that. I guess runtime has a lot of global knobs, but I understand that you don't want to add more, especially important ones.

As far as I know the runtime package does not have any knobs that change the meaning of any APIs.

ianlancetaylor · 2019-12-15T03:55:59Z

I did forget about timerfd_create, thanks for pointing it out.

ianlancetaylor · 2019-12-15T03:58:17Z

> But actually, isn't the net case the one area where we actually do want to prefer "real time"?

In general, yes, but not always. For processes communicating on the same system, "program time" may be expected. We could go ahead and decide to change it anyhow, but it requires thought.

ianlancetaylor · 2019-12-15T04:01:03Z

> Sounds like you don't like that either. I'm biased based on my own needs of course, but something to consider is the slight misunderstanding implied by @aclements's original nomenclature we've been using. "Program time" isn't actually the time spent by a program, since the scheduler implies it might not actually be running that whole time, or might be SIGSTOP'd or something. That means that ordinary programs may very well see large leaps in "program time", just as they will for "real time" in the case of resuming from suspend. With large leaps being a potential either way, what's the point of the distinction? Seems like programs are better off coding around "real time" assumptions since "program time" assumptions don't actually apply in the case of SIGSTOP or scheduler contention and such. Maybe the distinction between the two is only important for something like watchdog timers? That seems like more of the exception than the rule though. But, as said, I'm a bit biased here.

It's true of course that "program time" (which I agree is a misnomer) can see delays. But consider the case of an interactive game with no network connection that sets up dozens of timers for different times that cause events to occur in the game. If the user suspends a laptop while playing the game, those timers should also suspend; they shouldn't all fire when the laptop is resumed.

networkimprov · 2019-12-15T08:53:54Z

Assuming the Windows runtime switches to program/monotonic time, it might help to provide a switch (e.g. env var) to return it to real/boot time.

Conversely, it would help for 1.14 to have a switch to enable program time; the work for that is mostly done.

zx2c4 · 2019-12-16T10:07:11Z

Alright here's where we are with technical solutions for supporting "real time":

Windows

Reading from 0x7ffe0008 returns "real time", as we do now.
The WaitFor functions we're using for sleeps use "real time" on Windows 7.
The WaitFor functions we're using for sleeps use "program time" on Windows 8+, but PowerRegisterSuspendResumeNotification is also available for Windows 8+, so we can interrupt the sleep and recalculate. This is what the Go runtime does now.

Linux

clock_gettime(CLOCK_BOOTTIME) returns "real time" and is in the VDSO in modern kernels.
timerfd_create(CLOCK_BOOTTIME) creates something we can sleep with in "real time".

macOS

mach_continuous_time returns "real time".
Passing in NOTE_MACH_CONTINUOUS_TIME as the flags argument to any of the event functions causes the sleep to be in "real time".

I haven't looked into the BSDs yet.

networkimprov · 2019-12-16T18:39:20Z

It might be essential to make real timers on Windows use a high-precision source; current resolution is 2ms, which isn't enough for some protocols, for example #29485 (comment).

Instead of changing time.Sub() to compute real time, we could consider an API which provides any delta between program & real time, e.g. func SuspendedTotal() Duration.

Re SetDeadline() computing a duration internally, I'd suggest this API, and deprecating Set*Deadline().

func (c *T) SetDeadlineAt(t time.Time)
func (c *T) SetDeadlineFor(d time.Duration)

balasanjay · 2019-12-16T22:24:15Z

SetDeadline is also on interfaces (like net.Conn), where we cannot add methods directly.

networkimprov · 2019-12-16T23:00:10Z

True; requiring a type-assert to call a new method.

Maybe new fields in net.Dialer & net.ListenConfig should set the meaning of the Set*Deadline() time.Time argument.

beoran · 2019-12-17T09:29:31Z

I would prefer a whole new API, in a separate package, 'time/realtime' that has all the functionality of the time package but that explicitly uses real time clocks on all platforms. The current time package then should use program time clocks on all platforms. That is IMHO the cleanest solution.

And @ianlancetaylor, as an aside, thanks for thinking about game programming using Go. I definitely appreciate that core Go developers are starting to take non-server applications for Go seriously. As you say, for single player games, timers of game events really need to be suspended when the program is suspended.

As an aside, over 20 years ago, I learned programming because I wanted to make games. Likewise I want to teach my daughter how to program Go by letting her make games. If we want Go to become more popular, then we have to make it even easier to use for game programming as well.

networkimprov · 2020-01-14T21:44:52Z

@ianlancetaylor is the Go team working on a concrete proposal? If so, is there an ETA?

Presumably we'd want to see a draft implementation early in 1.15 cycle...

ianlancetaylor · 2020-01-14T23:44:28Z

There is no ETA.

beoran · 2020-01-15T07:51:12Z

@networkimprov Perhaps this doesn't need to be in the Go standard library? I think it would not be that hard to write an external realtime module that uses real time timers with high precision.

networkimprov · 2020-01-15T08:40:36Z

@beoran, it's discussed at length in #35482

beoran · 2020-01-15T11:12:02Z

I looked at that discussion, and after that I'm not very happy with the current state of things. What I would like is consistency between platforms. The WireGuard project should not have to use a patched runtime on Linux, but to me it looks like they got a patch into the Windows Go runtime that worsened the inconsistency overall.

As I said above, I would solve the problem by making 'time' use program time/wall time, and have a 'time/realtime' package for real time/monotonic clocks. The former can be done relatively easily in Go itself, the latter could be implemented as a third party library at first.

rsc · 2021-05-26T17:37:56Z

Overall it seems like people agree on making Since and Until just work.

rsc · 2021-05-26T18:14:16Z

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

networkimprov · 2021-05-26T20:28:04Z

Under this plan, it's easy to mistakenly call time.Now() when you meant time.ExternalNow(), and then make a comparison that silently yields the wrong result. Isn't a panic preferable to a bug?

hherman1 · 2021-05-27T01:49:31Z

I’m not sure about this case specifically, but sometimes crashing is much worse for users than slightly wrong results.

networkimprov · 2021-05-27T03:48:59Z

In this case, you'd see the panic when testing; it wouldn't be intermittent.

rsc · 2021-06-02T17:07:47Z

> Under this plan, it's easy to mistakenly call time.Now() when you meant time.ExternalNow(), and then make a comparison that silently yields the wrong result. Isn't a panic preferable to a bug?

No.

rsc · 2021-06-02T18:00:31Z

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

antichris · 2021-06-03T08:28:38Z

> > Isn't a panic preferable to a bug?
>
> No.

That's a surprise. I thought that failing fast and loud is preferable to silently accumulating error. Especially considering that panics are not crashes and can be recovered from.

hherman1 · 2021-06-03T15:12:09Z

If you make it fail-able then you make it harder to use, as devs need to consider handling that failure instead of just having a guaranteed successful function

networkimprov · 2021-06-03T16:00:47Z

Hunter, what you suggest is akin to saying that out-of-bounds writes like s[:1][8] = v should silently do nothing.
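For reference, the slice expression in that comparison behaves as sketched below: the index is checked against the length of s[:1], not the capacity of the backing array, so the write panics rather than silently succeeding or doing nothing. The helper name writeAt is hypothetical, and the recover is there only to show the panic and that it can be recovered from.

```go
package main

import "fmt"

// writeAt attempts s[:1][i] = v and reports whether the write panicked.
func writeAt(s []int, i, v int) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	s[:1][i] = v // bounds-checked against len(s[:1]) == 1
	return false
}

func main() {
	s := make([]int, 10)
	fmt.Println(writeAt(s, 8, 42), s[8]) // index 8 is outside the length-1 slice
	fmt.Println(writeAt(s, 0, 42), s[0]) // index 0 is in range
}
```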

hherman1 · 2021-06-03T16:27:39Z

I don’t think it’s black and white. I do think it’s undeniable that can-panic is part of an API and makes that API harder to use. It can still sometimes be the right decision, such as in the out of bounds array access case.

ALTree added the Proposal label Dec 14, 2019

ALTree added this to the Proposal milestone Dec 14, 2019

ianlancetaylor changed the title ~~proposal[s]: "real time" timer semantics for 1.15~~ proposal: support "real time" timer semantics Dec 14, 2019

networkimprov mentioned this issue Jan 23, 2020

net/http: permanently broken connection with error "read: connection reset by peer" when response body is not closed #36700

Open

anjmao mentioned this issue Feb 20, 2020

Fix time.Sub after device resumes from sleep mysteriumnetwork/node#1749

Closed

rsc moved this from Active to Likely Accept in Proposals (old) May 26, 2021

rsc moved this from Likely Accept to Accepted in Proposals (old) Jun 2, 2021

rsc changed the title ~~proposal: time: add ExternalNow, etc for external time and timers~~ time: add ExternalNow, etc for external time and timers Jun 2, 2021

rsc added the Proposal-Accepted label Jun 2, 2021

rsc modified the milestones: Proposal, Backlog Jun 2, 2021

smlx mentioned this issue Sep 19, 2021

Switch to time.ExternalNow() in exit timer smlx/piv-agent#65

Open

jstarks mentioned this issue Nov 2, 2021

use CLOCK_BOOTTIME in Instant::now rust-lang/rust#88714

Closed

kazrakcom mentioned this issue Jan 19, 2022

Move to golang ExternalNow or a similar API google/calblink#18

Open

komuw mentioned this issue Mar 15, 2022

std.time.Timer counts suspend time ziglang/zig#11175

Open

honDhan mentioned this issue May 13, 2022

Certificate refresh cycle does not work with laptop sleep GoogleCloudPlatform/cloud-sql-proxy#1199

Closed

problame mentioned this issue Jul 31, 2022

Timers don't resume counting after system suspend/resume cycle zrepl/zrepl#611

Closed

msz-kp mentioned this issue Jan 27, 2023

No spindown after resume in v1.18. adelolmo/hd-idle#81

Closed

rsc removed the Proposal-FinalCommentPeriod label Jul 26, 2023

This was referenced Dec 23, 2023

Android: Instant not advancing while the screen is off rust-lang/rust#71860

Open

Provide boottime timers tokio-rs/tokio#3185

Open
