Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

time: document proper usage of Timer.Stop #14383

Closed
anacrolix opened this issue Feb 18, 2016 · 23 comments
Closed

time: document proper usage of Timer.Stop #14383

anacrolix opened this issue Feb 18, 2016 · 23 comments
Labels
Documentation NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@anacrolix
Copy link
Contributor

After calling Timer.Stop(), and then clearing the channel, the channel can later be filled anyway. Presumably the task to send to the Timer channel has been put on a run queue, and isn't remove when Stop is called. Here's a test that demonstrates it:

func TestTimerFiresAfterStop(t *testing.T) {
    fail := make(chan struct{})
    done := make(chan struct{})
    defer close(done)
    for range iter.N(1000) {
        tr := time.NewTimer(0)
        tr.Stop()
        // There may or may not be a value in the channel now. But definitely
        // one should not be added after we receive it.
        select {
        case <-tr.C:
        default:
        }
        // Now set the timer to trigger in hour. It definitely shouldn't be
        // receivable now for an hour.
        tr.Reset(time.Hour)
        go func() {
            select {
            case <-tr.C:
                // As soon as the channel receives, notify failure.
                fail <- struct{}{}
            case <-done:
            }
        }()
    }
    select {
    case <-fail:
        t.FailNow()
    case <-time.After(time.Second):
    }
}

Increase the value in iter.N if it doesn't trigger on your system initially. Note that it isn't even necessary to call Reset() to trigger this. You can remove that line. You can Stop() a timer, and then still receive a value from it, some time after you drain it.

@ianlancetaylor ianlancetaylor changed the title Timers can fire some time after Stop() time: timers can fire some time after Stop Feb 18, 2016
@ianlancetaylor ianlancetaylor added this to the Go1.7 milestone Feb 18, 2016
@ulikunitz
Copy link
Contributor

The go routine should get the timer as argument, because otherwise there will be a race. But even with that fixed the channel fires every few test runs.

@bradfitz
Copy link
Contributor

Aside: you do realize the iter package was a joke, right?

I'd change your example to use for i := 0; i < 1000; i++.

@ulikunitz, I think the code looks fine. The goroutine closure captures the new tr variable each iteration.

@cespare
Copy link
Contributor

cespare commented Feb 18, 2016

This bit of code:

tr := time.NewTimer(0)
tr.Stop()
// There may or may not be a value in the channel now. But definitely
// one should not be added after we receive it.
select {
        case <-tr.C:
        default:
}

does not guarantee that there isn't an in-flight value on tr.C, because Stop may not have stopped the timer before it fired, yet that doesn't mean that the channel was ready to receive when the select ran. So it's a race.

See extensive previous discussion on #11513 and https://groups.google.com/forum/#!topic/golang-dev/c9UUfASVPoU.

In the latter thread, @rsc suggests doing

if !timer.Stop() {
  <-timer.C
}

I also previously filed #12721 to ask for an easier way to create a new, stopped timer.

@bradfitz
Copy link
Contributor

Indeed, looks like a dup of #12721.

But let's keep this bug open as a documentation fix issue for Timer.Stop. An example might help too.

@bradfitz bradfitz changed the title time: timers can fire some time after Stop time: document proper usage of Timer.Stop Feb 19, 2016
@anacrolix
Copy link
Contributor Author

Yes, it is a race.

if !timer.Stop() {
  <-timer.C
}

Is not sufficient, as there's no way to tell if the fired event has already been received. Which in my use case it often has. This is necessary, as linked in the thread: http://play.golang.org/p/Ys9qqanqmU

@ulikunitz
Copy link
Contributor

The race appears to be in the function timerproc() in runtime/time.go. The mutex timers.lock is unlocked for calling the runtime timer function, which sends the time into the channel. Therefore timer.Stop can return before the time value is written to the channel.

Calling the runtime timer function with the mutex locked, ensures that timer.Stop cannot return before the channel is written too. I tested all.bash and also checked the test function from @anacrolix. I think this can be justified because there is no way for a user to set the runtime timer function. I even checked whether I can compile the time package outside of GOROOT; it didn't work.

On the other hand I can't see how

if !tr.Stop() {
    <-tr.C
}

should fail. At least on Linux AMD64 I was not able to trigger the modified test to fail, even after creating 1 million goroutines.

It appears there are two ways to address this issue:

A) Amend the documentation of time.Stop(), or
B) calling the runtime timer function with the timers.lock hold.

I prefer B because it fixes the Test above.

@bradfitz You are right, there is no race in the test code.

@quentinmit quentinmit added the NeedsFix The path to resolution is known, but the work has not been done. label May 20, 2016
@rsc
Copy link
Contributor

rsc commented May 20, 2016

I still don't think there's a race here. This is a documentation issue. See
https://groups.google.com/d/msg/golang-dev/c9UUfASVPoU/tlbK2BpFEwAJ for how to use Stop correctly.

See also #11513.

@rsc
Copy link
Contributor

rsc commented May 20, 2016

I should add: There is an argument for making

tr := time.NewTimer(0)
tr.Stop()
// There may or may not be a value in the channel now. But definitely
// one should not be added after we receive it.
select {
        case <-tr.C:
        default:
}

work, but it doesn't today, and we're certainly not going to do that for Go 1.7. Maybe early in Go 1.8 if someone wants to file a different bug.

@adg adg self-assigned this May 31, 2016
@gopherbot
Copy link

CL https://golang.org/cl/23575 mentions this issue.

@r0l1
Copy link

r0l1 commented Dec 6, 2016

Just implemented a small fix, which can be used as a drop-in replacement.

https://github.com/desertbit/timer

@rsc
Copy link
Contributor

rsc commented Dec 6, 2016

@m4ng0squ4sh FWIW, in the revised docs (https://beta.golang.org/pkg/time/#Timer.Stop) please note the phrase "assuming the program has not received from t.C already". That wrapper package is actually impossible to use correctly if the program has received from t.C already: the receive in the wrapper's Stop method will block waiting for a value that isn't coming.

@r0l1
Copy link

r0l1 commented Dec 6, 2016

@rsc Thanks for pointing that out. That's why I included the following phrase:

This should not be done concurrent to other receives from the Timer's channel.

I'll add some more documentation so it's clear for everybody. It might be possible to even solve that problem by adding another channel and a Mutex to the wrapper. However I have to try that first. Are there any other solutions?

@rsc
Copy link
Contributor

rsc commented Dec 6, 2016

Note that most people would not consider:

t := time.NewTimer(1*time.Second)
<-t.C
t.Reset(2*time.Second)

to be calling t.Reset "concurrent to other receives". But the wrapper will hang in that case.

The obvious solution is to use time.Timer directly and not write a wrapper. I suppose you could kick off a new goroutine to pass time events along a new channel and keep track of whether any have been seen, but then you're creating a much more heavyweight timer, all to avoid one or two lines of code.

@r0l1
Copy link

r0l1 commented Dec 6, 2016

True story. Just have seen the updated doc (beta). Now it's clear. Thanks.
Yeah, firing up a new goroutine,... works. Just tried a small implementation but it's really too heavyweight.

Edit: Deleted the wrapper.

@r0l1
Copy link

r0l1 commented Dec 11, 2016

Finally, here is a lightweight drop-in replacement. This makes more sense for my projects, because sometimes it's just time consuming to fix the broken Reset behavior. Especially in a concurrent situation...

@rsc Would love to get some feedback and/or some hints, if everything is correct. Thanks!

@funny-falcon
Copy link
Contributor

@m4ng0squ4sh your Reset() is correct.
@rsc, there is really should be separate function in runtime to serve Timer.Reset() instead of calling stoptimer()+starttimer(), cause timer.lock should be captured once. Currently concurrently called Timer.Reset() could damage timers heap (this is not my finding).

@r0l1
Copy link

r0l1 commented Jan 30, 2017

@funny-falcon Thanks for having a look at it!

@rsc
Copy link
Contributor

rsc commented Jan 30, 2017

@m4ng0squ4sh I don't think the Reset is correct. If it were correct I don't think this program would panic:

package main

import (
	"time"

	"github.com/desertbit/timer"
)

func main() {
	for {
		print(".")
		c := make(chan int, 1)
		t := timer.AfterFunc(1*time.Millisecond, func() { c <- 1 })
		timer.AfterFunc(1*time.Millisecond, func() { t.Reset(100 * time.Second); close(c); t.Stop() })
		<-c
		<-c
	}
}

and if you fix that, be careful not to make this one hang:

package main

import (
	"time"

	"github.com/desertbit/timer"
)

func main() {
	done := make(chan bool)
	timer.AfterFunc(1*time.Second, func() {
		println("1")
		timer.AfterFunc(1*time.Second, func() {
			println("2")
			done <- true
		})
	})
	<-done
}

I don't think you can paper over this.

@funny-falcon
Copy link
Contributor

@rsc first, it panics with "time" package also.
It panics not because of some mistake, but because function passed to AfterFunc runs in separate goroutine. So, both AfterFunc's handlers are triggered simultaneously after 1 millisecond timeout: goroutines are spawned and scheduled for execution, - and it happens that second one runs before first one.
It is cause of design of AfterFunc in standard package: "user function should not affect internal timerproc, so that, goroutine is fired to run function". It was told me in https://codereview.appspot.com/12876047

@r0l1
Copy link

r0l1 commented Jan 31, 2017

@rsc AfterFunc was added to the package to stay compatible to the standard lib timer implementation. As mentioned in the doc:

AfterFunc waits for the duration to elapse and then calls f in its own goroutine.

So the panic must be expected.

Here is a simple ping example showing the general problem. I do believe that many gophers are not aware of the special Reset() conditions and that this is a common mistake.

package main

import "time"

const (
	keepaliveInterval = 2 * time.Millisecond
)

var (
	resetC = make(chan struct{}, 1)
)

func main() {
	go keepaliveLoop()

	// Sample routine triggering the reset.
	// Example: this could be due to incoming peer requests and
	// a keepalive check should be reset to the max keepalive timeout.
	for i := 0; i < 1000; i++ {
		time.Sleep(time.Millisecond)
		resetKeepalive()
	}
}

func resetKeepalive() {
	// Don't block if there is already a reset request.
	select {
	case resetC <- struct{}{}:
	default:
	}
}

func keepaliveLoop() {
	t := time.NewTimer(keepaliveInterval)

	for {
		select {
		case <-resetC:
			time.Sleep(3 * time.Millisecond) // Simulate some reset work...
			t.Reset(keepaliveInterval)
		case <-t.C:
			ping()
			t.Reset(keepaliveInterval)
		}
	}
}

func ping() {
	panic("ping must not be called in this example")
}

@rsc
Copy link
Contributor

rsc commented Jan 31, 2017

@funny-falcon Yes, I know it panics with the Go time package. My point is that this package claims to provide a "fixed" Reset without these problems. It does not.

@funny-falcon
Copy link
Contributor

@rsc, it claims to fix Reset only for channel variant. There were no mention of AfterFunc before your comment.

@rsc
Copy link
Contributor

rsc commented Jan 31, 2017

Let's move this discussion off of a closed issue and onto golang-nuts. Thanks.

@golang golang locked and limited conversation to collaborators Jan 31, 2017
@rsc rsc unassigned adg Jun 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Documentation NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests