testing: run Cleanup if SIGINT occurs #41891

katiehockman · 2020-10-09T19:12:37Z

Currently, testing.T.Cleanup and testing.B.Cleanup are only run if the test and all of its subtests have finished running. This means it won't run if the test is stopped with a SIGINT (e.g. Ctrl+C). There are several things that may be in such a Cleanup function that would be helpful to run regardless of whether or not the test fully completed.

This will be especially unfortunate in the case of fuzzing (http://golang.org/s/draft-fuzzing-design). Take for example:

func FuzzFoo(f *testing.F) {
	f.Cleanup(func() {
		// Check for things like stray go routines, etc
		// Delete tmp resources that may be left behind
	})
	f.Fuzz(func(t *testing.T, b []byte) {
		// some code being fuzzed
	})
}

It may be a very common use case that someone could run go test -fuzz FuzzFoo and give a SIGINT after a few hours. If no crash occurs, there would be no way to run any cleanup code or do any checks after fuzzing completed since the fuzzing runs in an infinite loop.

I would like to expand the behavior of Cleanup to not only run if tests finished, but also run in the case of a SIGINT. This could of course be special cased to fuzzing, but I can see good reasons why it should be a part of unit and benchmark testing as well (e.g. if the test hangs for some reason).

/cc @jayconrod @bcmills @FiloSottile

The text was updated successfully, but these errors were encountered:

katiehockman · 2020-10-09T19:14:10Z

A note on possible complexity from @jayconrod: "One problem is that we can't handle SIGINT on Windows. It looks like there a couple similar things (CTRL_CLOSE_EVENT). Not sure if we can hook into that without cgo."

jayconrod · 2020-10-09T19:36:06Z

Not sure whether we should handle SIGINT in all tests and benchmarks. User code may install signal handlers (for example, in an initializer in a cgo library), and we might accidentally replace those.

It makes more sense for fuzzing, specifically when running the fuzzing engine in worker processes, since 1) we don't need to be compatible with existing fuzz targets because there are none, and 2) it will be common for users to stop fuzzing with ^C.

bcmills · 2020-10-09T20:09:43Z

Can the test function itself register the SIGINT handler, rather than having the testing package do that?

For example, something like: https://play.golang.org/p/ZsRDGgDMgGz

jayconrod · 2020-10-09T20:26:59Z

We were looking for something more general purpose with fuzzing. The fuzzing coordinator process (started by go test) should handle SIGINT and should flush pending updates to the corpus. It should also stop its worker processes.

bcmills · 2020-10-09T20:36:13Z

I don't think Cleanup is appropriate for that without coordination from the test itself, though.

The Cleanup function can normally assume that it does not race with any subtests, and can also assume that it is only called after the subtests are complete (compare #40908). So it seems to me that if you want to run Cleanup functions on SIGINT, you fundamentally have to thread that signal down to the test functions and have them cooperatively complete.

ianlancetaylor · 2020-10-10T04:17:26Z

Note that the last time we tried to handle SIGINT in the testing package, for #19397, we had to roll back the change because it broke tests that were testing code that itself handled SIGINT. See #21000.

dprotaso · 2020-10-26T22:35:23Z

Related #39382

dprotaso · 2020-10-27T02:33:08Z

Knative have written some test-infra code that deals with handling interrupts but it's not ideal to maintain.

Generally

calls to our custom 'Cleanup' function registers Cleanup with 'testing.T` and our own interrupt handler
After handling the interrupt you can't cleanly stop test execution - we end up calling os.Exit(1) after the cleanup finishes
you actually need to wrap the clean up funcs in a sync.Once since there could be race between a test cleaning up and a signal being handled. You could end up with double invocations

I also noticed timeouts don't cancel tests gracefully either - #42217

odeke-em · 2021-02-04T03:52:00Z

Thanks everyone for the input here, @ianlancetaylor raised the cautionary tale and precedent of us previously having tried to handle SIGINT in testing before, and then rollbacked, and @bcmills showed how the caller could invoke it and handle directly. Perhaps the fuzzing package could handle it somehow or if the testing package could detect that fuzzing is being done that it could add hooks. Either way, we haven’t resolved this issue for Go1.16, and I shall kindly implore us to try again for Go1.17. Moving to Go1.17. Thank you.

cbednarski · 2021-02-16T06:35:06Z

My use case is to terminate a docker container that is started in a test helper. Since the container process is not parented to the test process it keeps running after the test is aborted.

bcmills · 2021-02-16T14:03:02Z

@cbednarski, #24050 (comment) might be more relevant for that use-case?

ianlancetaylor · 2021-05-04T03:38:12Z

I don't see how we can make this change. I'm at least inclined to change to milestone, which is currently Go1.17.

ianlancetaylor · 2021-11-16T23:57:44Z

@katiehockman This is in the Go 1.18 milestone. Is it likely to happen for 1.18? Or should it move to Backlog? Thanks.

katiehockman · 2021-11-17T16:20:13Z

I went ahead and moved it to Backlog, thanks.

krashish8 · 2022-01-02T09:07:34Z

Does anyone have any workaround for this, which one can implement temporarily?
Something like catching and handling interrupts would work, but I don't want to write the same code in "every" Test() function, so, any workaround for this?

benmoss · 2022-12-01T15:38:00Z

@krashish8 https://github.com/knative/serving/blob/4e17128158b206220eccbfa8b60c749811dab9a1/test/cleanup.go is a decent place to look for a pattern here. Basically replace t.Cleanup with a different function that handles the signal and calls t.Cleanup

krashish8 · 2022-12-02T13:05:11Z

Thanks a lot for the link @benmoss

pPanda-beta · 2023-08-03T14:36:34Z

For folks who landed here from #42217 , I have made a workaround to run all cleanup functions (yes go testing's cleanup, and I promise you no extra framework/slice is needed).

(Sorry for spamming this issue, the timeout related issue is closed and we can not comment there)

func RunAndFailTestBeforeTimeout(t *testing.T, tf func(*testing.T)) {
	deadline, hasTimeout := t.Deadline()
	if !hasTimeout {
		tf(t)
		return
	}

	const gracePercentage = 0.05

	tillDeadline := deadline.Sub(time.Now())
	gracePeriod := time.Duration(float64(tillDeadline) * gracePercentage)
	waitPeriod := tillDeadline - gracePeriod

	finished := make(chan any, 1)
	go func() {
		defer func() { finished <- recover() }()
		tf(t)
	}()

	select {
	case <-time.After(waitPeriod):
		t.Fatalf("test failed early, %v before actual deadline %v", gracePeriod, deadline)
	case somePanic := <-finished: // This may look like over-engineering, but it allows you to run the cleanups.
		if somePanic != nil {
			panic(somePanic)
		}
	}
}

Usage:

// Before 
func TestXyz(t *testing.T) {
// ...
}



// After
func TestXyz(t *testing.T) {
	RunAndFailTestBeforeTimeout(t, XTestXyz)
}

func XTestXyz(t *testing.T) {
// ...
}

Check the playground: https://go.dev/play/p/KMOyjuu5D-x

As a bonus it takes care of cleanup functions even if the code-under-test panics (recoverable).

I believe we can build a similar solution for signals as well, and apply it to existing tests without moving away from `t.Cleanup`.

katiehockman added the NeedsInvestigation label Oct 9, 2020

katiehockman added this to the Go1.16 milestone Oct 9, 2020

dprotaso mentioned this issue Oct 27, 2020

testing: T.Cleanup is not invoked when -timeout=N is exceeded #42217

Closed

benmoss mentioned this issue Oct 28, 2020

After is not called if test is aborted by SIGINT sclevine/spec#17

Closed

odeke-em modified the milestones: Go1.16, Go1.17 Feb 4, 2021

bcmills mentioned this issue Mar 15, 2021

proposal: Go2: goroutines execute defer statements when main exits #44973

Closed

This was referenced Apr 1, 2021

Idea: Use the OS Feature trap to catch SIGINTs when running tests to force cleanup gruntwork-io/terratest#328

Closed

Interrupting test suite leaves orphan etcd processes and data in /tmp kubernetes-sigs/cluster-api#4278

Closed

katiehockman modified the milestones: Go1.17, Go1.18 May 4, 2021

katiehockman modified the milestones: Go1.18, Backlog Nov 17, 2021

AlekSi mentioned this issue Jan 4, 2022

Support graceful tests termination FerretDB/dance#20

Open

snormore mentioned this issue Oct 11, 2022

Deflake some tests xmtp/xmtp-node-go#156

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

testing: run Cleanup if SIGINT occurs #41891

testing: run Cleanup if SIGINT occurs #41891

katiehockman commented Oct 9, 2020

katiehockman commented Oct 9, 2020

jayconrod commented Oct 9, 2020

bcmills commented Oct 9, 2020

jayconrod commented Oct 9, 2020

bcmills commented Oct 9, 2020

ianlancetaylor commented Oct 10, 2020

dprotaso commented Oct 26, 2020 •

edited

Loading

dprotaso commented Oct 27, 2020

odeke-em commented Feb 4, 2021 •

edited

Loading

cbednarski commented Feb 16, 2021

bcmills commented Feb 16, 2021

ianlancetaylor commented May 4, 2021

ianlancetaylor commented Nov 16, 2021

katiehockman commented Nov 17, 2021

krashish8 commented Jan 2, 2022

benmoss commented Dec 1, 2022

krashish8 commented Dec 2, 2022

pPanda-beta commented Aug 3, 2023

testing: run Cleanup if SIGINT occurs #41891

testing: run Cleanup if SIGINT occurs #41891

Comments

katiehockman commented Oct 9, 2020

katiehockman commented Oct 9, 2020

jayconrod commented Oct 9, 2020

bcmills commented Oct 9, 2020

jayconrod commented Oct 9, 2020

bcmills commented Oct 9, 2020

ianlancetaylor commented Oct 10, 2020

dprotaso commented Oct 26, 2020 • edited Loading

dprotaso commented Oct 27, 2020

odeke-em commented Feb 4, 2021 • edited Loading

cbednarski commented Feb 16, 2021

bcmills commented Feb 16, 2021

ianlancetaylor commented May 4, 2021

ianlancetaylor commented Nov 16, 2021

katiehockman commented Nov 17, 2021

krashish8 commented Jan 2, 2022

benmoss commented Dec 1, 2022

krashish8 commented Dec 2, 2022

pPanda-beta commented Aug 3, 2023

Usage:

I believe we can build a similar solution for signals as well, and apply it to existing tests without moving away from t.Cleanup.

dprotaso commented Oct 26, 2020 •

edited

Loading

odeke-em commented Feb 4, 2021 •

edited

Loading

I believe we can build a similar solution for signals as well, and apply it to existing tests without moving away from `t.Cleanup`.