testing: add T.Context() accessor #16221

bradfitz · 2016-06-30T00:12:09Z

Can we add a func (*T) Context() context.Context accessor in the testing package?

And then make each test run with a context which is canceled when it finishes.

Then goroutines created by tests can select on the test's completion (pass or fail).

Thoughts?

The text was updated successfully, but these errors were encountered:

robpike · 2016-08-22T23:42:43Z

Requirement: This CL should not add any dependencies to the testing package (other than context of course),

bradfitz · 2016-08-22T23:43:58Z

I'll prototype an implementation for review.

broady · 2016-08-22T23:45:43Z

Transitive imports from context are a subset of those from testing.

dsnet · 2016-10-06T23:10:53Z

Something to think about when implementing this. Currently, it is a bug when people call T.FatalX or T.SkipX inside a goroutine other than the main test function. The current behavior is subtly wrong, but we could consider having them cancel the context as a possible alternative to #15758

rsc · 2016-10-20T23:56:16Z

SGTM. Seems orthogonal to #15758. @dsnet, remember that cancelling a context is a polite request only. It doesn't force things to be torn down.

gopherbot · 2016-10-22T14:30:40Z

CL https://golang.org/cl/31724 mentions this issue.

gopherbot · 2016-11-03T19:10:18Z

CL https://golang.org/cl/32648 mentions this issue.

So that testing can use context in its public API. For #16221. Change-Id: I6263fa7266c336c9490f20164ce79336df44a57e Reviewed-on: https://go-review.googlesource.com/32648 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

dsnet · 2016-12-02T20:51:31Z

I started some discussion about this feature in: https://groups.google.com/forum/#!topic/golang-dev/rS6psxIf17Q

bcmills · 2016-12-02T21:07:27Z

A Context that is canceled when the test fails would be fine, but trying to use it in isolation to signal the return of the Test function seems like a mistake. Context cancellation is asynchronous, so nothing guarantees that the goroutines started as part of the test have actually returned when the test ends, which means they are likely to interfere with subsequent tests.

For the "clean up because the test is done" use-case, it seems like the testing.T needs some mechanism to wait for the cleanup to finish. You could provide a Go method for that, along the lines of errgroup but with an implicit Wait when the test returns:

func TestFoo(t *testing.T) {
  ctx := t.Context()  // Canceled when the test fails or the test function returns.
  t.Go(func() {
    doSomeBackgroundThing(ctx, t)  // The test runner will not begin the next test until this returns.
  })
  … // Run the test.
}

dsnet · 2016-12-02T21:18:11Z

@bcmills, a Go method is very similar to my proposal in #18022.

neild · 2016-12-02T22:36:22Z

I strongly feel that any interface which encourages allowing goroutines to outlive the test that starts them is a mistake.

@bcmills's proposal is pretty much exactly what I first thought of, but I wonder if saving a couple lines of code in tests that use concurrency is worth adding more methods to testing.T.

bcmills · 2016-12-02T23:35:36Z

One alternative would be to make t.Context() crash loudly (per #15758) if any of its methods are called after the Test function returns. That would make it much easier to detect interfering goroutines, while still keeping the synchronization explicitly in the test code rather than adding more methods to testing.T.

cstockton · 2016-12-18T00:13:22Z

I just wanted to chime in and say I don't fully agree with @dsnet that having Context in the testing package simply "saves two lines of code".. if implemented to cancel on Fatal() the moment before Goexit() was called. Having a context that was canceled when Fatal() was called would simplify tight integration between test coverage, passing context through your systems and the synchronization primitives used to instrument it all would be easier to reason with. I personally (which I may just suck at writing Go) find it a pretty big mental drain orchestrate synchronization between things during testing. Soon as Fatal() is called and Goexit() begins Done() selects that are working in harmony with defer'd channel sends or unlocks are tricky since your defers are always behind the context Done(), which is a bit backwards from how I normally think. Start ctx, add sync prims, channels to unify context pipelines, exit context, exit channels, exit sync prims.

I think adding Context to testing makes for simpler usage for. If there is debate on when a context should be canceled maybe some form of WithErrors(ctx) or WithFatal(ctx) could be supplied to allow it to cancel on errors/fatal, and the default be on test exit. Or maybe a go test **-ctx={exit|fatal|error}** flag.

quentinmit · 2016-12-18T01:55:29Z

@cstockton What is the point of cancelling the context on Fatal calls? Even if you do cancel the context, because it's asynchronous your code is not guaranteed to notice before the Goexit call runs.

cstockton · 2016-12-18T04:16:10Z

@quentinmit I'm not sure what you mean, calling cancel() causes the context to begin ending immediately. If you mean that anything waiting for Done() channels in a select are not guaranteed to happen immediately I understand that, the aim here is to have them happen at all. The order does not matter, just that done is called when runtime.Goexit() is, so blocked goroutines eventually wake up and exit. There are a lot of ways to illustrate this, i.e.:

func TestDeadlock(t *testing.T) {
  go t.Fatal(`oops`)
  <-t.Context().Done()
}

func TestDeadlock(t *testing.T) {
  defer func() {
    <-t.Context().Done()
  }()
  t.Fatal(`oops`)
}

Any of them are fixed immediately by calling Cancel when the test is over.

func TestNoDeadlock(t *testing.T) {
  ctx, cancel := context.WithCancel(context.Background())
  go func() {
    go cancel()
    t.Fatal(`oops`)
  }()
  <-ctx.Done()
}

dsnet · 2016-12-19T07:11:45Z

Please note that both of your examples are racy. It is not valid to call t.Fatal in any other routine other than the original Test function. If your proposal in #18368 is based on this pattern, then it is going to be rejected since it an abuse of the testing API.

cstockton · 2016-12-19T07:30:53Z

The examples I posted where to show how things can deadlock, I know they are racy.. that is the entire point.

The first example is to illustrate that if you call fatal while waiting on Done(), you deadlock.

The second example shows how this can happen in defers as well, since you haven't returned to the caller yet.

Apparently I wrote more text than worth reading for you, so I'll dumb it down to this: runtime.Goexit is not friendly to deal with when you orchestrating highly concurrent testing. Having a context that will be canceled when my test fails an assertion would relieve the mental drain of writing unit tests because I have to worry about one less factor.

Share by communicating, runtime.Goexit doesn't communicate anything.

neild · 2016-12-19T20:15:04Z

I believe t.Fatal is a bit of a distraction here, since it can only be called from the main test goroutine. If I understand @cstockton correctly, the idea is that t.Context would become done if t.Error is called from any goroutine. (This is not, AFAIK, what the original proposal did.)

I'm not certain this would be useful in the general case. A test which starts a goroutine which may produce an error will need to wait for that goroutine to exit or otherwise report completion.

To use a slightly modified version one of the above examples,

func TestDeadlock(t *testing.T) {
  go t.Error(`oops`)
  <-t.Context().Done()
}

This test reliably fails, but it cannot succeed. If we modify it so that the child goroutine does not produce an error, we need to introduce an additional synchronization mechanism; waiting on t.Context.Done does not work, because t.Context never becomes done. Once we introduce that synchronization mechanism, t.Context no longer adds any value:

func TestDeadlock(t *testing.T) {
  ctx, cancel := context.WithCancel(context.Background())
  go func() {
    defer cancel()
    if someErrorCondition {
      t.Error("oops")
    }
  }()
  <-ctx.Done()

If I am misunderstanding the argument, perhaps a more complete example demonstrating both success and failure paths would be useful.

cstockton · 2016-12-19T23:02:04Z

Thanks @neild for thoughtful reply, the aim of my examples was to illustrate that waiting on a context provided by T.Context() is not intuitive as implemented since it is not canceled when fatal is called. I think it the fact that dsnet focused on the example being racy, and you tried to further decompose them proves that writing tests is more difficult then it should be. Which.. does circle back to my originating point.

Having a context that was canceled when Fatal() was called would simplify tight integration between test coverage, passing context through your systems and the synchronization primitives used to instrument it all would be easier to reason with. I personally (which I may just suck at writing Go) find it a pretty big mental drain orchestrate synchronization between things during testing.

t.Fatal() races, my app races, my dependencies race, everything may race. It's a huge mental drain to orchestrate synchronization when testing it with the current implementation. You have no signal for when your test is going to exit, other than being the only one to call t.Fatal(). This makes reusing test fixtures very difficult. T.Context() helps these problems by giving you a signal to select on when things go wrong.

As for Error canceling the context, I don't think that's intuitive, I use t.Error to spit out a set of failures to dive as deep into a failure as I can before I know I shouldn't continue without serious issues. Once I call t.Fatal I know my goroutine exits, and the test is now over. That is why I expect t.Context to be canceled when Fatal is called, I canceled my test, why wouldn't the context be canceled? I believe the original implementation was otherwise fine, just add an additional cancellation condition For (Skip|Fail)Now

Finally I'm going to note this here now so I don't forget, but you may ignore this, it might not be reasonable anyways.

All this focus about T.Fatal() only being callable from goroutines outside the main goroutine has made me recognize how unfriendly it is, it's something you learn and deal with once, but thinking about having this constraint lifted would make writing tests easier if it could be sanely implemented. Maybe having it exit the goroutine t.Fatal was called from and mark the executing test belonging to T as Failed() and canceling the Context(). This would allow the originating goroutine to launch multiple test fixtures with his instance of T and select on the context in the event of failure or wait for a finishing condition.

bcmills · 2016-12-19T23:07:43Z

@cstockton

You have no signal for when your test is going to exit, other than being the only one to call t.Fatal().

That's simply not true. goexit runs deferred functions: to a first approximation, all you have to do to synchronize your test correctly is to defer the cleanup.

That does mean that you have to structure your API in such a way that the deferred function call can determine when the cleanup has completed, but that's a good idea anyway: that's also what allows you to write programs with predictable memory footprints.

cstockton · 2016-12-20T00:31:09Z

@bcmills I'm well aware that defers are ran, or I wouldn't have posted this example:

func TestDeadlock(t *testing.T) {
  defer func() {
    <-t.Context().Done()
  }()
  t.Fatal(`oops`)
}

Which I made to illustrate that the context not being done when t.Fatal is called is not intuitive to me in the prior implementation of T.Context().

Which I brought up because of my Request:

Add back t.Context()
t.Fatal() should cancel the context, because the test is done.

Because:

I personally (which I may just suck at writing Go) find it a pretty big mental drain orchestrate synchronization between things during testing.

But instead of extrapolating it just a little along with what I said to imagine the scenarios a user could run into it's taken in literal form where it can be proved "incorrect" so the originating argument can be discarded.

Anyways, I really do have an issue with your statement:

all you have to do to synchronize your test correctly is to defer the cleanup.
That does mean that you have to structure your API in such a way that the deferred function call can determine when the cleanup has completed, but that's a good idea anyway: that's also what allows you to write programs with predictable memory footprints.

As for API design and memory footprints, we are talking about testing here. If you mean the application API's, then yes, good API design is important. There is already a mechanism in place to allow mutual communication of cancellation when an API that crosses your API's boundaries (the executing test) is done and ready to cleanup, it's called context.Context. Something the testing API is lacking. Which is why I am advocating to add it back. To make writing concurrent tests easier to reason about. I am completely perplexed how you are comfortable telling me to use a LIFO queue to manage synchronization of my tests with a hint of "design good API's". I mean, seriously?

Anyways, I think I'm done here. I made my points best I could. Thanks for taking the time to respond.

bcmills · 2016-12-20T19:04:57Z

@cstockton

There is already a mechanism in place to allow mutual communication of cancellation

That's kind of my point: Context cancellation is unidirectional, not "mutual". It provides a way to notify calls that their results are no longer relevant, but does not wait for them to finish using resources. That latter step is important, not only for testing (to reduce interference between tests) but also for production code (to reduce the potential for runaway resource consumption under unusual conditions).

The usual way that we signal that a call is done using resources is by returning from it. But that's not the case for background goroutines.

I actually agree with you that a Context that is cancelled when SkipNow or FailNow is called would be useful. But it needs to be paired with synchronization in the other direction: simplifying one but not the other makes for fragile tests (and fragile APIs).

To give a concrete example: consider the race that @bradfitz fixed in change 34021. Simply canceling the acceptor goroutine would still leave the test flaky. The fix Brad implemented was to switch from an asynchronous API (calling t.Error in a background goroutine) to a somewhat more synchronous one (returning the error on a channel). That fixes the immediate flakiness, but it's still a bit leaky: on failure, it may leave a net.Conn open until the goroutine is scheduled and closes it, interfering with port assignments for subsequent tests. (In a very heavily-loaded environment with lots of similar test-cases in the same binary, that pattern can lead to port exhaustion and spurious failures.) A more robust fix for test cancellation should make it easier to eliminate that leak too.

cstockton · 2016-12-20T21:21:36Z

Ohh @bcmills

That's kind of my point: Context cancellation is unidirectional, not "mutual". It provides a way to notify calls that their results are no longer relevant, but does not wait for them to finish using resources. That latter step is important, not only for testing (to reduce interference between tests) but also for production code (to reduce the potential for runaway resource consumption under unusual conditions).

I didn't realize that the context interface didn't have a way for me to pass messages upstream, ignore my request. Thanks.

bradfitz added the Proposal label Jun 30, 2016

bradfitz added this to the Go1.8Maybe milestone Jun 30, 2016

bradfitz self-assigned this Aug 22, 2016

robpike added the Proposal-Accepted label Aug 22, 2016

dsnet mentioned this issue Oct 15, 2016

Skip the "context.Context should be the first parameter of a function" error if the first parameter is *testing.T golang/lint#248

Closed

bradfitz changed the title ~~proposal: testing: add T.Context() accessor?~~ testing: add T.Context() accessor Oct 22, 2016

gopherbot closed this as completed in 26827bc Nov 3, 2016

bradfitz mentioned this issue Dec 5, 2016

Proposal: add context.Context objects to testing.T EDIT: and testing.M #18182

Closed

dsnet mentioned this issue Dec 5, 2016

testing: T.Context usability concerns #18199

Closed

cstockton mentioned this issue Dec 18, 2016

proposal: testing: add T.Context #18368

Closed

nvanbenschoten mentioned this issue Mar 14, 2017

all: audit context.TODO usage cockroachdb/cockroach#14128

Merged

golang locked and limited conversation to collaborators Dec 20, 2017

gopherbot added the FrozenDueToAge label Dec 20, 2017

rsc unassigned bradfitz Jun 23, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

testing: add T.Context() accessor #16221

testing: add T.Context() accessor #16221

bradfitz commented Jun 30, 2016

robpike commented Aug 22, 2016 •

edited

bradfitz commented Aug 22, 2016

broady commented Aug 22, 2016

dsnet commented Oct 6, 2016

rsc commented Oct 20, 2016

gopherbot commented Oct 22, 2016

gopherbot commented Nov 3, 2016

dsnet commented Dec 2, 2016

bcmills commented Dec 2, 2016 •

edited

dsnet commented Dec 2, 2016

neild commented Dec 2, 2016

bcmills commented Dec 2, 2016

cstockton commented Dec 18, 2016

quentinmit commented Dec 18, 2016

cstockton commented Dec 18, 2016

dsnet commented Dec 19, 2016

cstockton commented Dec 19, 2016

neild commented Dec 19, 2016

cstockton commented Dec 19, 2016

bcmills commented Dec 19, 2016

cstockton commented Dec 20, 2016

bcmills commented Dec 20, 2016

cstockton commented Dec 20, 2016

testing: add T.Context() accessor #16221

testing: add T.Context() accessor #16221

Comments

bradfitz commented Jun 30, 2016

robpike commented Aug 22, 2016 • edited

bradfitz commented Aug 22, 2016

broady commented Aug 22, 2016

dsnet commented Oct 6, 2016

rsc commented Oct 20, 2016

gopherbot commented Oct 22, 2016

gopherbot commented Nov 3, 2016

dsnet commented Dec 2, 2016

bcmills commented Dec 2, 2016 • edited

dsnet commented Dec 2, 2016

neild commented Dec 2, 2016

bcmills commented Dec 2, 2016

cstockton commented Dec 18, 2016

quentinmit commented Dec 18, 2016

cstockton commented Dec 18, 2016

dsnet commented Dec 19, 2016

cstockton commented Dec 19, 2016

neild commented Dec 19, 2016

cstockton commented Dec 19, 2016

bcmills commented Dec 19, 2016

cstockton commented Dec 20, 2016

bcmills commented Dec 20, 2016

cstockton commented Dec 20, 2016

robpike commented Aug 22, 2016 •

edited

bcmills commented Dec 2, 2016 •

edited