testing: race detected in goroutine can report an error in an unnamed test #60083

ianlancetaylor · 2023-05-09T19:58:39Z

Test case:

package foo_test

import (
	"fmt"
	"testing"
)

func makeRace() {
	var x int
	c := make(chan bool)
	go func() {
		x = 1
		c <- true
	}()
	fmt.Println(x)
	<-c
}

func TestFoo(t *testing.T) {
	makeRace()
}

Running this test with go test -race -test.v prints

=== RUN   TestFoo
0
==================
WARNING: DATA RACE
Write at 0x00c000014288 by goroutine 8:
  command-line-arguments_test.makeRace.func1()
      /home/iant/foo_test.go:12 +0x33

Previous read at 0x00c000014288 by goroutine 7:
  command-line-arguments_test.makeRace()
      /home/iant/foo_test.go:15 +0x10c
  command-line-arguments_test.TestFoo()
      /home/iant/foo_test.go:20 +0x1c
  testing.tRunner()
      /home/iant/go/src/testing/testing.go:1595 +0x238
  ??()
      -:0 +0x5243c4

Goroutine 8 (running) created at:
  command-line-arguments_test.makeRace()
      /home/iant/foo_test.go:11 +0xfc
  command-line-arguments_test.TestFoo()
      /home/iant/foo_test.go:20 +0x1c
  testing.tRunner()
      /home/iant/go/src/testing/testing.go:1595 +0x238
  ??()
      -:0 +0x5243c4

Goroutine 7 (running) created at:
  testing.(*T).Run()
      /home/iant/go/src/testing/testing.go:1648 +0x835
  testing.runTests.func1()
      /home/iant/go/src/testing/testing.go:2054 +0x84
  testing.tRunner()
      /home/iant/go/src/testing/testing.go:1595 +0x238
  testing.runTests()
      /home/iant/go/src/testing/testing.go:2052 +0x896
  testing.(*M).Run()
      /home/iant/go/src/testing/testing.go:1925 +0xb57
  main.main()
      _testmain.go:47 +0x2bd
==================
    testing.go:1465: race detected during execution of test
--- FAIL: TestFoo (0.00s)
=== NAME  
    testing.go:1465: race detected during execution of test
FAIL
FAIL	command-line-arguments	0.007s
FAIL

Note the === NAME near the end. A === NAME line should be followed by the name of the test. Omitting the name is confusing and can break code that is parsing the test output.

The text was updated successfully, but these errors were encountered:

cherrymui · 2023-05-09T20:30:23Z

cc @bcmills

bcmills · 2023-05-09T20:43:50Z

I think this is plausibly correct? The empty NAME line could mean “this line is independent of the individual tests”, which is true for race reports in general (albeit perhaps not this specific case).

ianlancetaylor · 2023-05-09T21:12:34Z

I tend to think that an empty NAME line is confusing and unhelpful. If we want to say that the report is independent of individual tests, let's say that.

bcmills · 2023-05-09T21:38:15Z

I believe it is currently necessary in order for cmd/test2json to correctly interpret the output, but perhaps it could be omitted (or changed) when -test.v is not set to test2json.

ianlancetaylor · 2023-05-09T21:41:23Z

Can you expand on that? test2json currently doesn't seem to count race detector reports at all.

cip999 · 2023-05-09T21:43:04Z

Hi! While we're at it, let me point out a related "weird" pattern in race error messages for tests. Unlike other errors, these get propagated to all the ancestors of the subtest that generated the race, and the same hardcoder error message (testing.go:1465: race detected during execution of test) is printed for all of them. [Since the ancestors include the "root" test that encapsulates the whole suite, and which has an empty name, this causes the === NAME line to be printed.]

Just wanted to mention this before I forgot. It's a different problem (if we want to call it that) from the subject of this issue, in the sense that fixing any of them won't solve the other. Maybe it's worth filing another bug for this?

ianlancetaylor · 2023-05-09T21:43:29Z

@cip999 Thanks, I was just looking at that.

cip999 · 2023-05-09T21:47:52Z

The reason for that is how race errors are counted:

Total number of race errors (race.Errors()) is logged right before running the (sub)test.
(Sub)test is run.
Total number of race errors is logged again, and if greater than the previous one the race detection error is thrown.

(By the way, for some reason this is done by storing the negative of the first number, adding the second number, and checking whether the result is greater than 0. I'm confused.)

It shouldn't be difficult to fix it.

ianlancetaylor · 2023-05-09T21:49:05Z

Using the negative count and then adding to it is because the number of race detector errors is monotonically increasing. We want to do the right thing when we run multiple tests, and some have race errors.

cip999 · 2023-05-09T21:57:25Z

Yes, my confusion was about why not storing the first error count (not its negative) and comparing it to the second one. But I think I figured it out, it's to handle the counting for parallel tests. Anyway, I'll stop diverging from the main subject. :)

gopherbot · 2023-05-09T22:12:11Z

Change https://go.dev/cl/494057 mentions this issue: testing: only report subtest races once

cip999 · 2023-05-10T21:03:08Z

Following up on this comment thread, I investigated the way race errors are handled and reported in parallel tests.

There are a couple notable points.

Parallel tests are run after the parent test has finished. Most importantly, the order of the events is the following: parent test function returns -> race errors for parent test are counted (and possibly reported) -> parallel subtests are released.
This means that direct parents of parallel tests aren't affected by this issue: if a race happens inside a parallel test, by that time its parent will have already logged the number of races. However, the race will be reported for the parent's parent (unless the parent is also run in parallel) and all previous ancestors.
As a concrete example, if we call t.Parallel() inside TestFoo in the proof of concept above, we no longer get the === NAME line and the race is reported only once.

[Incidentally, the semantics of T.Parallel is a bit counterintuitive; but let's not go there.]

Now, what happens with the proposed change? What I said in the comments isn't exactly true. Parallel tests will still be "deferred" to when race errors for the parent have been already reported, so the parent won't be affected at all. However, the parent's parent (and all previous ancestors) will be: if a race happens during parallel execution of two or more sub-sub-tests, the race count will be decremented by 2 or more (instead of 1). Then, if another race happens outside the subtests, it won't be detected. (Disclaimer: I haven't verified this, I'm too lazy to patch the CL...)

Admittedly, sub-sub-tests are seldom a thing (if ever). Still, something that works "by chance" is not super solid.

Edit: Just to clarify, a minimal working example of the scenario I describe should be the following:

func makeRace() {
  var x int
  c := make(chan bool)
  go func() {
    x = 1
    c <- true
  }()
  fmt.Println(x)
  <-c
}

func TestFoo(t *testing.T) {
  t.Run("subtest", func(t *testing.T) {
    c := make(chan bool)

    t.Run("first subsubtest", func(t *testing.T) {
      t.Parallel()
      <-c
      makeRace() // All the signaling is to make sure that this happens inside both subsubtests.
      c <- true
    })

    t.Run("second subsubtest", func(t *testing.T) {
      t.Parallel()
      c <- true
      <-c
    })
  })
}

ianlancetaylor · 2023-05-24T17:19:22Z

CL is being reverted.

gopherbot · 2023-06-08T17:35:57Z

Change https://go.dev/cl/501895 mentions this issue: testing: correctly handle race error detection in subtests

gopherbot · 2023-06-26T22:27:16Z

Change https://go.dev/cl/506300 mentions this issue: testing: use monotonic counts to attribute races in subtests

gopherbot · 2023-06-28T14:03:23Z

Change https://go.dev/cl/506755 mentions this issue: testing: simplify concurrency and cleanup

While investigating #60083, I found a couple of bugs (notably #61034) that had slipped through code review in part because the concurrency patterns used in the testing package were too complex for me to fully reason about. This change adjusts those patterns to be more in line with current idioms, and to reduce the number of special cases that depend on details that should be orthogonal. (For example: the details of how we invoke the Cleanup functions should not depend on whether the test happened to run any parallel subtests.) In the process, this change fixes a handful of bugs: - Concurrent calls to Run (explicitly allowed by TestParallelSub) could previously drive the testcontext.running count negative, causing the number of running parallel tests to exceed the -parallel flag. - The -failfast flag now takes effect immediately on failure. It no longer delays until the test finishes, and no longer misses failures during cleanup (fixing #61034). - If a Cleanup function calls runtime.Goexit (typically via t.FailNow) during a panic, Cleanup functions from its parent tests are no longer skipped and buffered logs from its parent tests are now flushed. - The time reported for a test with subtests now includes the time spent running those subtests, regardless of whether they are parallel. (Previously, non-parallel subtests were included but parallel subtests were not.) - Calls to (*B).Run in iterations after the first are now diagnosed with a panic. (This diagnoses badly-behaved benchmarks: if Run is called during the first iteration, no subsequent iterations are supposed to occur.) Fixes #61034. Change-Id: I3797f6ef5210a3d2d5d6c2710d3f35c0219b02ea Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest,gotip-linux-amd64-longtest-race,gotip-windows-amd64-longtest Reviewed-on: https://go-review.googlesource.com/c/go/+/506755 Auto-Submit: Bryan Mills <bcmills@google.com> Reviewed-by: Alan Donovan <adonovan@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

cherrymui added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label May 9, 2023

cherrymui added this to the Backlog milestone May 9, 2023

gopherbot closed this as completed in 5b93af7 May 24, 2023

ianlancetaylor reopened this May 24, 2023

bcmills modified the milestones: Backlog, Go1.22 Jun 28, 2023

bcmills self-assigned this Jun 28, 2023

gopherbot closed this as completed in 55b8e16 Oct 25, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

testing: race detected in goroutine can report an error in an unnamed test #60083

testing: race detected in goroutine can report an error in an unnamed test #60083

ianlancetaylor commented May 9, 2023

cherrymui commented May 9, 2023

bcmills commented May 9, 2023

ianlancetaylor commented May 9, 2023

bcmills commented May 9, 2023

ianlancetaylor commented May 9, 2023

cip999 commented May 9, 2023

ianlancetaylor commented May 9, 2023

cip999 commented May 9, 2023

ianlancetaylor commented May 9, 2023

cip999 commented May 9, 2023

gopherbot commented May 9, 2023

cip999 commented May 10, 2023 •

edited

ianlancetaylor commented May 24, 2023

gopherbot commented Jun 8, 2023

gopherbot commented Jun 26, 2023

gopherbot commented Jun 28, 2023

testing: race detected in goroutine can report an error in an unnamed test #60083

testing: race detected in goroutine can report an error in an unnamed test #60083

Comments

ianlancetaylor commented May 9, 2023

cherrymui commented May 9, 2023

bcmills commented May 9, 2023

ianlancetaylor commented May 9, 2023

bcmills commented May 9, 2023

ianlancetaylor commented May 9, 2023

cip999 commented May 9, 2023

ianlancetaylor commented May 9, 2023

cip999 commented May 9, 2023

ianlancetaylor commented May 9, 2023

cip999 commented May 9, 2023

gopherbot commented May 9, 2023

cip999 commented May 10, 2023 • edited

ianlancetaylor commented May 24, 2023

gopherbot commented Jun 8, 2023

gopherbot commented Jun 26, 2023

gopherbot commented Jun 28, 2023

cip999 commented May 10, 2023 •

edited