Go Wiki: Go 1.23 Timer Channel Changes

Go 1.23 includes a new implementation of the channel-based timers created by time.NewTimer, time.After, time.NewTicker, and time.Tick.

The new implementation makes two important changes:

  1. Unstopped timers and tickers that are no longer referenced can be garbage collected. Before Go 1.23, unstopped timers could not be garbage collected until the timer went off, and unstopped tickers could never be garbage collected. The Go 1.23 implementation avoids resource leaks in programs that don’t use t.Stop.

  2. Timer channels are now synchronous (unbuffered), giving the t.Reset and t.Stop methods a stronger guarantee: after one of those methods returns, no future receive from the timer channel will observe a stale time value corresponding to the old timer configuration. Before Go 1.23, it was impossible to avoid stale values with t.Reset, and avoiding stale values with t.Stop required careful use of the return value from t.Stop. The Go 1.23 implementation removes this concern entirely.

The implementation changes have two observable side effects that may affect production behavior or tests, described in the following sections.

The new implementation is only used in programs where package main is in a module with a go.mod declaring go 1.23 or later. Other programs continue to use the old semantics. The GODEBUG setting asynctimerchan=1 forces the old semantics; conversely, asynctimerchan=0 forces the new semantics.

Cap and Len

Before Go 1.23, the cap of a timer channel was 1, and the len of a timer channel indicated whether a value was waiting to be received (1 if so, 0 if not). The Go 1.23 implementation creates timer channels with cap and len always 0.

In general, using len to poll any channel is usually not helpful, since another goroutine may receive from the channel concurrently, invalidating the result of len at any time. Code that polls a timer channel using len should instead use a non-blocking select.

That is, code that does:

if len(t.C) == 1 {
    <-t.C
    more code
}

should instead do:

select {
default:
case <-t.C:
    more code
}

Select Races

Before Go 1.23, a timer created with a very short interval, like 0ns or 1ns, would take significantly longer than that interval to make its channel ready for receiving, due to scheduling delays. That delay can be observed in code that selects between a channel that is ready before the select and a newly created timer with a very short timeout:

c := make(chan bool)
close(c)

select {
case <-c:
    println("done")
case <-time.After(1*time.Nanosecond):
    println("timeout")
}

By the time the select arguments are evaluated and select looks at the channels involved, the timer should have expired, meaning that both cases are ready to proceed. Select chooses between multiple ready cases by choosing one randomly, so this program should select each case about half the time.

Due to the scheduling delays in the timer implementation before Go 1.23, programs like this incorrectly executed the “done” case 100% of the time.

The Go 1.23 timer implementation is not affected by the same scheduling delays, so in Go 1.23, that program executes each case about half the time.

During testing of Go 1.23 in Google’s code base, we found a handful of tests that used select to race channels ready to proceed (often context Done channels) against timers with very low timeouts. Usually, the production code would use a real timeout, in which case the race is uninteresting, but for testing the timeout would be set to something very small. And then the test would insist on the non-timeout case executing, failing if the timeout was reached. A reduced example might look like:

select {
case <-ctx.Done():
    return nil
case <-time.After(timeout):
    return errors.New("timeout")
}

Then the test would call this code with timeout set to 1ns and fail if the code returned an error.

To fix a test like this, either the caller can be changed to understand that timeouts are possible, or the code can be changed to prefer the done channel even in the timeout case, like this:

select {
case <-ctx.Done():
    return nil
case <-time.After(timeout):
    // Double-check that Done is not ready,
    // in case of short timeout during test.
    select {
    default:
    case <-ctx.Done():
        return nil
    }
    return errors.New("timeout")
}

Debugging

If a program or test fails using Go 1.23 but worked using Go 1.22, the asynctimerchan GODEBUG setting can be used to check whether the new timer implementation triggers the failure:

GODEBUG=asynctimerchan=0 mytest  # force Go 1.23 timers
GODEBUG=asynctimerchan=1 mytest  # force Go 1.22 timers

If the program or test consistently passes using Go 1.22 but consistently fails using Go 1.23, that is a strong sign that the problem is related to timers.

In all the test failures we have observed, the problem has been in the test itself, not the timer implementation, so the next step is to identify exactly which code in mytest depends on the old implementation. To do that, you can use the bisect tool:

go install golang.org/x/tools/cmd/bisect@latest
bisect -godebug asynctimerchan=1 mytest

Invoked this way, bisect runs mytest repeatedly, toggling the new timer implementation on and off depending on the stack trace leading to the timer call. Using binary search, it narrows down an induced failure to enabling the new timers during specific stack traces, which it reports. While bisect runs, it prints status messages about its trials, mainly so that when the test is slow you know it is still running.

An example bisect run looks like:

$ bisect -godebug asynctimerchan=1 ./view.test
bisect: checking target with all changes disabled
bisect: run: GODEBUG=asynctimerchan=1#n ./view.test... FAIL (7 matches)
bisect: run: GODEBUG=asynctimerchan=1#n ./view.test... FAIL (7 matches)
bisect: checking target with all changes enabled
bisect: run: GODEBUG=asynctimerchan=1#y ./view.test... ok (7 matches)
bisect: run: GODEBUG=asynctimerchan=1#y ./view.test... ok (7 matches)
bisect: target fails with no changes, succeeds with all changes
bisect: searching for minimal set of disabled changes causing failure
bisect: run: GODEBUG=asynctimerchan=1#!+0 ./view.test... FAIL (3 matches)
bisect: run: GODEBUG=asynctimerchan=1#!+0 ./view.test... FAIL (3 matches)
bisect: run: GODEBUG=asynctimerchan=1#!+00 ./view.test... ok (1 matches)
bisect: run: GODEBUG=asynctimerchan=1#!+00 ./view.test... ok (1 matches)
bisect: run: GODEBUG=asynctimerchan=1#!+10 ./view.test... FAIL (2 matches)
bisect: run: GODEBUG=asynctimerchan=1#!+10 ./view.test... FAIL (2 matches)
bisect: run: GODEBUG=asynctimerchan=1#!+0010 ./view.test... ok (1 matches)
bisect: run: GODEBUG=asynctimerchan=1#!+0010 ./view.test... ok (1 matches)
bisect: run: GODEBUG=asynctimerchan=1#!+1010 ./view.test... FAIL (1 matches)
bisect: run: GODEBUG=asynctimerchan=1#!+1010 ./view.test... FAIL (1 matches)
bisect: confirming failing change set
bisect: run: GODEBUG=asynctimerchan=1#v!+x65a ./view.test... FAIL (1 matches)
bisect: run: GODEBUG=asynctimerchan=1#v!+x65a ./view.test... FAIL (1 matches)
bisect: FOUND failing change set
--- change set #1 (disabling changes causes failure)
internal/godebug.(*Setting).Value()
    go/src/internal/godebug/godebug.go:165
time.syncTimer()
    go/src/time/sleep.go:25
time.NewTimer()
    go/src/time/sleep.go:144
time.After()
    go/src/time/sleep.go:202
region_dash/regionlist.(*Cache).Top()
    region_dash/regionlist/regionlist.go:89
region_dash/view.(*Page).ServeHTTP()
    region_dash/view/view.go:45
region_dash/view.TestServeHTTPStatus.(*Router).Handler.func2()
    httprouter/httprouter/params_go17.go:27
httprouter/httprouter.(*Router).ServeHTTP()
    httprouter/httprouter/router.go:339
region_dash/view.TestServeHTTPStatus.func1()
    region_dash/view/view.test.go:105
testing.tRunner()
    go/src/testing/testing.go:1689
runtime.goexit()
    go/src/runtime/asm_amd64.s:1695

---
bisect: checking for more failures
bisect: run: GODEBUG=asynctimerchan=1#!-x65a ./view.test... ok (6 matches)
bisect: run: GODEBUG=asynctimerchan=1#!-x65a ./view.test... ok (6 matches)
bisect: target succeeds with all remaining changes disabled

In this case, the stack trace makes clear exactly which call to time.After causes a failure when using the new timers.


This content is part of the Go Wiki.