Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: new timer causes racy use of timer #35394

Closed
changkun opened this issue Nov 6, 2019 · 6 comments
Closed

runtime: new timer causes racy use of timer #35394

changkun opened this issue Nov 6, 2019 · 6 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@changkun
Copy link
Member

changkun commented Nov 6, 2019

What version of Go are you using (go version)?

$ go version
go version devel +6dc250f456 Wed Nov 6 13:55:04 2019 +0000 darwin/amd64

Does this issue reproduce with the latest release?

Yes, it is reproducible in the current master branch(6dc250f), but it does not appear in go1.13.3 darwin/amd64.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/changkun/Library/Caches/go-build"
GOENV="/Users/changkun/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/changkun/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/Users/changkun/dev/golang-go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/Users/changkun/dev/golang-go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/Users/changkun/dev/gobase/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/j_/h4j7spgj3znbfxcbbph5fp4w0000gn/T/go-build604565365=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

$ git clone https://github.com/changkun/gobase.git && cd gobase && git checkout bad-timer && cd sched                                                                                                      [16:19:33]
Cloning into 'gobase'...
remote: Enumerating objects: 357, done.
remote: Counting objects: 100% (357/357), done.
remote: Compressing objects: 100% (253/253), done.
remote: Total 357 (delta 118), reused 318 (delta 87), pack-reused 0
Receiving objects: 100% (357/357), 1.68 MiB | 3.26 MiB/s, done.
Resolving deltas: 100% (118/118), done.
Branch 'bad-timer' set up to track remote branch 'bad-timer' from 'origin'.
Switched to a new branch 'bad-timer'

$ ~/dev/golang-go/bin/go test -bench=. -run=^$ .           # new compiled go command                                                                                                           bad-timer ● [16:19:45]
size:  10
goos: darwin
goarch: amd64
pkg: github.com/changkun/gobase/sched
BenchmarkSubmit/#tasks-10-12                   0                      NaN ns/op
panic: runtime error: racy use of timers

goroutine 35 [running]:
time.stopTimer(0xc00040aea8, 0x0)
        /Users/changkun/dev/golang-go/src/runtime/time.go:297 +0x2b
time.(*Timer).Stop(...)
        /Users/changkun/dev/golang-go/src/time/sleep.go:81
github.com/changkun/gobase/sched.(*sched).pause(0x1250100)
        /Users/changkun/dev/experiments/gobase/sched/sched.go:151 +0x3a
github.com/changkun/gobase/sched.(*sched).schedule(0x1250100, 0x117a120, 0xc000010740, 0xbf68d706a30d30b8, 0x31137b84, 0x1250220, 0x1178a00, 0xc0003ea800)
        /Users/changkun/dev/experiments/gobase/sched/sched.go:106 +0x2f
github.com/changkun/gobase/sched.Submit(0x117a120, 0xc000010740, 0x1178a00, 0xc0003ea800)
        /Users/changkun/dev/experiments/gobase/sched/sched.go:60 +0x77
github.com/changkun/gobase/sched.BenchmarkSubmit.func1(0xc0000bc380)
        /Users/changkun/dev/experiments/gobase/sched/sched_test.go:305 +0x7c
testing.(*B).runN(0xc0000bc380, 0xeeef)
        /Users/changkun/dev/golang-go/src/testing/benchmark.go:190 +0xbd
testing.(*B).launch(0xc0000bc380)
        /Users/changkun/dev/golang-go/src/testing/benchmark.go:320 +0xea
created by testing.(*B).doBench
        /Users/changkun/dev/golang-go/src/testing/benchmark.go:275 +0x55
exit status 2
FAIL    github.com/changkun/gobase/sched        1.071s
FAIL

What did you expect to see?

$ go test -bench=. -run=^$ .     # go1.13.3bad-timer ● [16:19:54]
size:  10
goos: darwin
goarch: amd64
pkg: github.com/changkun/gobase/sched
BenchmarkSubmit/#tasks-10-12               27680             42454 ns/op
PASS
ok      github.com/changkun/gobase/sched        3.358s

Without any panics.

What did you see instead?

Runtime panic.

@ianlancetaylor
Copy link
Contributor

I haven't yet been able to recreate the problem myself.

Your code looks like it can make simultaneous calls to the Stop and Reset methods of a timer. That does seem like a racy use of a timer. Am I misreading the code?

@ianlancetaylor ianlancetaylor added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Nov 6, 2019
@ianlancetaylor ianlancetaylor added this to the Go1.14 milestone Nov 6, 2019
@changkun
Copy link
Member Author

changkun commented Nov 6, 2019

@ianlancetaylor The problem appears with very high probability(the test may success), perhaps you will need to run the test many times.

The code does not have simultaneous calls to the Stop and Reset because of the sched.setTimer call CAS the old timer out of sched.timer pointer so that it prevents race.

@changkun
Copy link
Member Author

changkun commented Nov 6, 2019

@ianlancetaylor BTW, the test never fails in Go 1.12 and 1.13 releases. The use of runtime.Gosched causes the reported panic, see changkun/pkg@6897eac

@ianlancetaylor
Copy link
Contributor

What prevents the setTimer and pause methods from running simultaneously in different goroutines?

@changkun
Copy link
Member Author

changkun commented Nov 6, 2019

You are right, setTimer and pause can run in different Gs. They are racing timer on Stop from pause and Reset from setTimer. Feel bad about reporting a false positive.

However, if so, the previous releases are failed to report this issue. Does it still count as issue report?

In Go 1.13:

go test -v -run=^$ -bench=.  -count=100 . # never panic

In Go 1.14:

~/dev/golang-go/bin/go test -v -run=^$ -bench=.  -count=100 .   # panic very quickly                                                                                                                                          master ✖︎ [21:29:58]
goos: darwin
goarch: amd64
pkg: github.com/changkun/gobase/sched
BenchmarkSubmit
size:  10
BenchmarkSubmit/#tasks-10
BenchmarkSubmit/#tasks-10-6                54175             21358 ns/op
panic: runtime error: racy use of timers

goroutine 1221450 [running]:
time.stopTimer(0xc00052e368, 0x0)
        /Users/changkun/dev/golang-go/src/runtime/time.go:297 +0x2b
time.(*Timer).Stop(...)
        /Users/changkun/dev/golang-go/src/time/sleep.go:81
github.com/changkun/gobase/sched.(*sched).pause(0x124a9c0)
        /Users/changkun/dev/gobase/sched/sched.go:151 +0x3a
github.com/changkun/gobase/sched.(*sched).schedule(0x124a9c0, 0x1176a60, 0xc00007aa00, 0xbf68e933866b5750, 0x1654c0e76, 0x124aae0, 0x11753c0, 0xc000528090)
        /Users/changkun/dev/gobase/sched/sched.go:106 +0x2f
github.com/changkun/gobase/sched.Submit(0x1176a60, 0xc00007aa00, 0x11753c0, 0xc000528090)
        /Users/changkun/dev/gobase/sched/sched.go:60 +0x77
github.com/changkun/gobase/sched.BenchmarkSubmit.func1(0xc0000d4540)
        /Users/changkun/dev/gobase/sched/sched_test.go:309 +0x7c
testing.(*B).runN(0xc0000d4540, 0xda14)
        /Users/changkun/dev/golang-go/src/testing/benchmark.go:190 +0xbd
testing.(*B).launch(0xc0000d4540)
        /Users/changkun/dev/golang-go/src/testing/benchmark.go:320 +0xea
created by testing.(*B).doBench
        /Users/changkun/dev/golang-go/src/testing/:275 +0x55
exit status 2
FAIL    github.com/changkun/gobase/sched        7.868s
FAIL

@ianlancetaylor
Copy link
Contributor

Thanks for figuring it out. No, it's not a bug that the new version reports a race when the old version does not. This kind of race reporting is not deterministic. It might miss some races. The most important thing is that it shouldn't have any false positives: any race that it reports should be a real race.

liov pushed a commit to liov/hoper that referenced this issue Feb 25, 2020
liov pushed a commit to liov/hoper that referenced this issue Feb 26, 2020
@golang golang locked and limited conversation to collaborators Feb 25, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

3 participants