Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: race detector: more aggressive scheduler perturbations #22569

Open
bcmills opened this issue Nov 3, 2017 · 4 comments
Open

runtime: race detector: more aggressive scheduler perturbations #22569

bcmills opened this issue Nov 3, 2017 · 4 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. RaceDetector
Milestone

Comments

@bcmills
Copy link
Contributor

bcmills commented Nov 3, 2017

A program was posted to golang-nuts, with the question “is this code thread [safe]?”

The program has a race: it executes a select with a default case many times in parallel, and the default case calls close on a channel. (You can see the race by inserting a runtime.Gosched(): https://play.golang.org/p/_PWTTCwPgi.)

To my surprise, the program runs without error even with the race detector enabled.

The Go Memory Model defines an order between sends and receives but not between closes and sends or closes and other closes, so I think this is technically a data race (and not just a logic race). The race detector should report it.

src/racy/main.go:

package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

func main() {
	die := make(chan struct{})
	i := int32(0)
	closed := func() {
		select {
		case <-die:
			atomic.AddInt32(&i, 1)
		default:
			close(die)
			fmt.Println("closed")
		}
	}
	N := 100000
	for i := 0; i < N; i++ {
		go closed()
	}
	time.Sleep(10 * time.Second)
	fmt.Println(atomic.LoadInt32(&i))
}
$ go version
go version devel +1852573521 Wed Nov 1 16:58:36 2017 +0000 linux/amd64
$ go run -race src/racy/main.go
closed
99999
@ianlancetaylor ianlancetaylor changed the title race detector: detect close-after-close and send-after-close on channels runtime: race detector: detect close-after-close and send-after-close on channels Nov 3, 2017
@ianlancetaylor ianlancetaylor added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. RaceDetector labels Nov 3, 2017
@ianlancetaylor ianlancetaylor added this to the Unplanned milestone Nov 3, 2017
@ianlancetaylor
Copy link
Contributor

CC @dvyukov

@dvyukov
Copy link
Member

dvyukov commented Nov 6, 2017

Calling recv and close concurrently is perfectly fine and in fact a common idiom. Behavior is defined and is useful.
It's calling close and send concurrently is problematic.

@dvyukov dvyukov closed this as completed Nov 6, 2017
@bcmills
Copy link
Contributor Author

bcmills commented Nov 8, 2017

Calling recv and close concurrently is perfectly fine and in fact a common idiom. Behavior is defined and is useful.

Yes, that's true. However, a select does not induce a happens-before relationship if its default case is taken (i.e., if the receive operation does not execute).

In this particular example, that makes the existence of the race itself nondeterministic: if goroutine scheduling is favorable, the first default case executes before any other goroutines are scheduled. Then subsequent goroutines take the receive case, and no race occurs.

However, if goroutine scheduling is unfavorable, the default case is taken by multiple goroutines before the close executes, and there is a close-after-close race.


Perhaps one way to make the race detector more useful for select-with-default races would be to make the scheduler more aggressive (i.e., less deterministic) when the race detector is active.

A more deterministic option might be to check whether the first (or last) send or close event on the channel happens before the select statement itself: if not, evaluate the default case up to the first observable write (or next synchronization point) in addition to whatever branch was actually taken.

@dvyukov dvyukov reopened this Nov 8, 2017
@dvyukov
Copy link
Member

dvyukov commented Nov 9, 2017

The problem is more general and is not specific to selects. Consider:

if atomic.LoadUint32(&x) == 0 {
  atomic.StoreUint32(&x, 1)
  y++
}

If 2 goroutines sneak into the if before one of them sets x=1, there will be a race on y. But most of the time there won't be a race.
In another race detector (Relacy) I implemented systematic checking of all possible thread interleavings, but that requires executing the same code gazillions of times and is not generally feasible. Our best bet is randomizing execution order more. In the old tsan I implemented a feature called "scheduler shake" for this, but I never got around to implementing it in the new tsan.

@dvyukov dvyukov changed the title runtime: race detector: detect close-after-close and send-after-close on channels runtime: race detector: more aggressive scheduler perturbations Feb 26, 2018
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. RaceDetector
Projects
None yet
Development

No branches or pull requests

4 participants