Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

testing: Data race on t.ran #20339

Closed
jrick opened this issue May 12, 2017 · 9 comments
Closed

testing: Data race on t.ran #20339

jrick opened this issue May 12, 2017 · 9 comments
Labels
Documentation FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@jrick
Copy link
Contributor

jrick commented May 12, 2017

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

1.8.1

What operating system and processor architecture are you using (go env)?

Linux amd64 (Travis CI runner)

What did you do?

One of our tests runs (*testing.T).Run concurrently and occasionally a data race is caused by unsafe access to the t.common.ran member.

WARNING: DATA RACE
Write at 0x00c420069081 by goroutine 21:
  testing.(*common).setRan()
      /home/travis/.gimme/versions/go1.8.1.linux.amd64/src/testing/testing.go:433 +0x97
  testing.(*common).setRan()
      /home/travis/.gimme/versions/go1.8.1.linux.amd64/src/testing/testing.go:429 +0xea
  testing.(*common).setRan()
      /home/travis/.gimme/versions/go1.8.1.linux.amd64/src/testing/testing.go:429 +0xea
  testing.tRunner.func1()
      /home/travis/.gimme/versions/go1.8.1.linux.amd64/src/testing/testing.go:650 +0x48c
  testing.tRunner()
      /home/travis/.gimme/versions/go1.8.1.linux.amd64/src/testing/testing.go:659 +0x123

Previous read at 0x00c420069081 by main goroutine:
  testing.runTests()
      /home/travis/.gimme/versions/go1.8.1.linux.amd64/src/testing/testing.go:890 +0x538
  testing.(*M).Run()
      /home/travis/.gimme/versions/go1.8.1.linux.amd64/src/testing/testing.go:822 +0x1c3
  main.main()
      github.com/decred/dcrwallet/wallet/udb/_test/_testmain.go:50 +0x20f

Goroutine 21 (running) created at:
  testing.(*T).Run()
      /home/travis/.gimme/versions/go1.8.1.linux.amd64/src/testing/testing.go:697 +0x543
  github.com/decred/dcrwallet/wallet/udb.TestUpgrades.func1()
      /home/travis/gopath/src/github.com/decred/dcrwallet/wallet/udb/upgrades_test.go:81 +0x17d

On testing.go:433 t.common.ran is protected by the t.common.mu mutex, but no mutex is grabbed during the read on testing.go:890.

@jrick jrick changed the title testing: Data race on t.run testing: Data race on t.ran May 12, 2017
@ianlancetaylor ianlancetaylor added this to the Go1.9 milestone May 12, 2017
@ianlancetaylor
Copy link
Contributor

CC @mpvl

@davecheney
Copy link
Contributor

davecheney commented May 12, 2017 via email

@jrick
Copy link
Contributor Author

jrick commented May 13, 2017

@davecheney Thanks for that info, I didn't realize that. I'll refactor the tests to fix that issue.

However I don't believe that is the problem in this case. These tests are not failing so none of those methods would have been called. The function passed to Run also does not appear in the stack trace indicating that this race was caused by concurrent calls to Run.

jrick added a commit to jrick/btcwallet that referenced this issue May 13, 2017
This refactors the udb TestUpgrades function to avoid a data race
(potentially a Go issue, see golang/go#20339) by replacing the
concurrent t.Run and sync.WaitGroup usage with a subtest composed of
other parallel subtests.  This allows cleanup to be run outside of the
outer subtest.
@jrick
Copy link
Contributor Author

jrick commented May 13, 2017

I created an isolated test case that did nothing except start a goroutine to call t.Run. Following the same waitgroup usage as in my linked test, I was able to reproduce the same race. However, by moving the wg.Done() call to outside of the test function run by t.Run I was unable to hit the race. I'm not sure if this indicates a bug in the testing package or not. The documentation states that calls to Run must occur before the outer test function returns, but it is unclear to me if this means the call must be started before the outer test returns, or the entire execution of t.Run must complete.

func TestIssue20399(t *testing.T) {
	var wg sync.WaitGroup
	wg.Add(10)
	for i := 0; i < 10; i++ {
		go func() {
			t.Run("", func(t *testing.T) {
				wg.Done() // No data race if this is moved after the call to t.Run
			})
		}()
	}
	wg.Wait()
}

@cespare
Copy link
Contributor

cespare commented May 13, 2017

The documentation states that calls to Run must occur before the outer test function returns, but it is unclear to me if this means the call must be started before the outer test returns, or the entire execution of t.Run must complete.

Yeah, when the docs say

Run may be called simultaneously from multiple goroutines, but all such calls must happen before the outer test function for t returns.

it means that the calls to t.Run must return before the outer test function returns.

Perhaps there's a documentation improvement needed, but there isn't a bug here AFAICT.

@ALTree
Copy link
Member

ALTree commented Jun 1, 2017

it means that the calls to t.Run must return before the outer test function returns.

@cespare can you expand on this? Because the doc at the top says:

A parent test will only complete once all of its subtests complete.

which I read as "the outer test will never return before the subtests are complete"; if this interpretation is correct then it's not true that the subtests need to return before the main, since the implementation should already guarantee that this always happen by preventing the outer test to return before all the subtests are done.

It looks like using Run in goroutines should be fine as long as the goroutine calls Run before the outer test exit.

@ALTree ALTree added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jun 1, 2017
@cespare
Copy link
Contributor

cespare commented Jun 1, 2017

@ALTree The doc for t.Run seems pretty clear. I'm not sure I can restate it better. The example after that sentence you quoted

A parent test will only complete once all of its subtests complete.

is

func TestGroupedParallel(t *testing.T) {
    for _, tc := range tests {
        tc := tc // capture range variable
        t.Run(tc.Name, func(t *testing.T) {
            t.Parallel()
            ...
        })
    }
}

I think the purpose of stating that is to clarify that Run doesn't fire-and-forget the subtest functions.

If you do go t.Run(...) there's no magic the testing package can do to prevent the outer test from returning before the subtests are done.

It looks like using Run in goroutines should be fine as long as the goroutine calls Run before the outer test exit.

Since this has apparently confused multiple people, perhaps we should update the t.Run docs from

all such calls must happen before the outer test function for t returns.

to

all such calls must return before the outer test function for t returns.

(But I think that's already what it means.)

@ALTree
Copy link
Member

ALTree commented Jun 1, 2017

If you are right then I agree we should definitely update the doc, because "all such calls must happen before" really does not make clear that the calls must actually return before the outer test returns.

@ALTree ALTree added Documentation and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Jun 1, 2017
@bradfitz bradfitz added the NeedsFix The path to resolution is known, but the work has not been done. label Jun 29, 2017
@gopherbot
Copy link

CL https://golang.org/cl/47150 mentions this issue.

@golang golang locked and limited conversation to collaborators Jun 29, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Documentation FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

7 participants