Navigation Menu

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

math/rand: tests are too weak due to misuse of statsResults.maxError #21211

Open
josharian opened this issue Jul 29, 2017 · 3 comments
Open

math/rand: tests are too weak due to misuse of statsResults.maxError #21211

josharian opened this issue Jul 29, 2017 · 3 comments
Labels
help wanted NeedsFix The path to resolution is known, but the work has not been done. Testing An issue that has been verified to require only test changes, not just a test failure.
Milestone

Comments

@josharian
Copy link
Contributor

The math/rand tests use the function nearEqual to compare floats:

func nearEqual(a, b, closeEnough, maxError float64) bool {
	absDiff := math.Abs(a - b)
	if absDiff < closeEnough { // Necessary when one value is zero and one value is close to zero.
		return true
	}
	return absDiff/max(math.Abs(a), math.Abs(b)) < maxError
}

A bit of thought suggests that maxError should always be pretty small, if this function is ever to return false. However, maxError is usually scaled everywhere it is used (e.g. stddev * 0.08), and in practice it frequently ends up >> 2.

As a result, many of the math/rand tests have no teeth. An extreme example of this is TestReadUniformity, which requires (among other things) that the mean of just 2 randomly selected bytes must be very near 255.0/2, a test which should almost always fail, but which currently succeeds, due to a large value of maxError. (Related: CL 51310.)

Replacing maxError with a constant 0.08 causes most math/rand tests to pass, except TestReadUniformity when called with small n (in which case it should fail).

However, I don't have enough domain expertise to know whether this is reasonable or whether an alternative bound would be better. Advice requested.

golang.org/s/owners lists no one for math/rand, so I'll guess and cc: @rsc @aclements @robpike

@josharian josharian added the Testing An issue that has been verified to require only test changes, not just a test failure. label Jul 29, 2017
@josharian josharian added this to the Go1.10 milestone Jul 29, 2017
@gopherbot
Copy link

Change https://golang.org/cl/51891 mentions this issue: math/rand: add Shuffle

josharian added a commit to josharian/go that referenced this issue Aug 12, 2017
Shuffle uses the Fisher-Yates algorithm.

Since this is new API, it affords us the opportunity
to use a much faster Int31n implementation that mostly avoids division.
As a result, BenchmarkPerm30ViaShuffle is
about 30% faster than BenchmarkPerm30,
despite requiring a separate initialization loop
and using function calls to swap elements.

Fixes golang#20480
Updates golang#16213
Updates golang#21211

Change-Id: Ib8956c4bebed9d84f193eb98282ec16ee7c2b2d5
@josharian
Copy link
Contributor Author

This is even worse than I thought. The existing code reads:

func (this *statsResults) checkSimilarDistribution(expected *statsResults) error {
	if !nearEqual(this.mean, expected.mean, expected.closeEnough, expected.maxError) {
		s := fmt.Sprintf("mean %v != %v (allowed error %v, %v)", this.mean, expected.mean, expected.closeEnough, expected.maxError)
		fmt.Println(s)
		return errors.New(s)
	}
	if !nearEqual(this.stddev, expected.stddev, 0, expected.maxError) {
		s := fmt.Sprintf("stddev %v != %v (allowed error %v, %v)", this.stddev, expected.stddev, expected.closeEnough, expected.maxError)
		fmt.Println(s)
		return errors.New(s)
	}
	return nil
}

Note that when checking stddev, closeEnough is set to 0 in the call to nearEqual. This means that all the work is being done by expected.maxError, which is generally too large.

Also, there should probably be separate values for closeEnough and maxError for mean and stddev, given that they are generally of very different scales.

@gopherbot
Copy link

Change https://golang.org/cl/55972 mentions this issue: math/rand: make Perm match Shuffle

gopherbot pushed a commit that referenced this issue Sep 8, 2017
Shuffle uses the Fisher-Yates algorithm.

Since this is new API, it affords us the opportunity
to use a much faster Int31n implementation that mostly avoids division.
As a result, BenchmarkPerm30ViaShuffle is
about 30% faster than BenchmarkPerm30,
despite requiring a separate initialization loop
and using function calls to swap elements.

Fixes #20480
Updates #16213
Updates #21211

Change-Id: Ib8956c4bebed9d84f193eb98282ec16ee7c2b2d5
Reviewed-on: https://go-review.googlesource.com/51891
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
gopherbot pushed a commit that referenced this issue Sep 8, 2017
Perm and Shuffle are fundamentally doing the same work.
This change makes Perm's algorithm match Shuffle's.
In addition to allowing developers to switch more
easily between the two methods, it affords a nice speed-up:

name      old time/op  new time/op  delta
Perm3-8   75.7ns ± 1%  51.8ns ± 1%  -31.59%  (p=0.000 n=9+8)
Perm30-8   610ns ± 1%   405ns ± 1%  -33.67%  (p=0.000 n=9+9)

This change alters the output from Perm,
given the same Source and seed.
This is a change from Go 1.0 behavior.
This necessitates updating the regression test.

This also changes the number of calls made to the Source
during Perm, which changes the output of the math/rand examples.

This also slightly perturbs the output of Perm,
nudging it out of the range currently accepted by TestUniformFactorial.
However, it is complete unclear that the helpers relied on
by TestUniformFactorial are correct. That is #21211.
This change updates checkSimilarDistribution to respect
closeEnough for standard deviations, which makes the test pass.
The whole situation is muddy; see #21211 for details.

There is an alternative implementation of Perm
that avoids initializing m, which is more similar
to the existing implementation, plus some optimizations:

func (r *Rand) Perm(n int) []int {
	m := make([]int, n)
	max31 := n
	if n > 1<<31-1-1 {
		max31 = 1<<31 - 1 - 1
	}
	i := 1
	for ; i < max31; i++ {
		j := r.int31n(int32(i + 1))
		m[i] = m[j]
		m[j] = i
	}
	for ; i < n; i++ {
		j := r.Int63n(int64(i + 1))
		m[i] = m[j]
		m[j] = i
	}
	return m
}

This is a tiny bit faster than the implementation
actually used in this change:

name      old time/op  new time/op  delta
Perm3-8   51.8ns ± 1%  50.3ns ± 1%  -2.83%  (p=0.000 n=8+9)
Perm30-8   405ns ± 1%   394ns ± 1%  -2.66%  (p=0.000 n=9+8)

However, 3% in performance doesn't seem worth
having the two algorithms diverge,
nor the reduced readability of this alternative.

Updates #16213.

Change-Id: I11a7441ff8837ee9c241b4c88f7aa905348be781
Reviewed-on: https://go-review.googlesource.com/55972
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
josharian added a commit to josharian/go that referenced this issue Nov 13, 2017
Shuffle uses the Fisher-Yates algorithm.

Since this is new API, it affords us the opportunity
to use a much faster Int31n implementation that mostly avoids division.
As a result, BenchmarkPerm30ViaShuffle is
about 30% faster than BenchmarkPerm30,
despite requiring a separate initialization loop
and using function calls to swap elements.

Fixes golang#20480
Updates golang#16213
Updates golang#21211

Change-Id: Ib8956c4bebed9d84f193eb98282ec16ee7c2b2d5
josharian added a commit to josharian/go that referenced this issue Nov 13, 2017
Perm and Shuffle are fundamentally doing the same work.
This change makes Perm's algorithm match Shuffle's.
In addition to allowing developers to switch more
easily between the two methods, it affords a nice speed-up:

name      old time/op  new time/op  delta
Perm3-8   75.7ns ± 1%  51.8ns ± 1%  -31.59%  (p=0.000 n=9+8)
Perm30-8   610ns ± 1%   405ns ± 1%  -33.67%  (p=0.000 n=9+9)

This change alters the output from Perm,
given the same Source and seed.
This is a change from Go 1.0 behavior.
This necessitates updating regress test.

This also changes the number of calls made to the Source
during Perm, which changes the output of the math/rand examples.

This also slightly perturbs the output of Perm,
nudging it out of the range currently accepted by TestUniformFactorial.
However, it is complete unclear that the helpers relied on
by TestUniformFactorial are correct. That is golang#21211.
This change updates checkSimilarDistribution to respect
closeEnough for standard deviations, which makes the test pass.
The whole situation is muddy; see golang#21211 for details.


There is an alternative implementation of Perm
that avoids initializing m, which is more similar
to the existing implementation, plus some optimizations:

func (r *Rand) Perm(n int) []int {
	m := make([]int, n)
	max31 := n
	if n > 1<<31-1-1 {
		max31 = 1<<31 - 1 - 1
	}
	i := 1
	for ; i < max31; i++ {
		j := r.int31n(int32(i + 1))
		m[i] = m[j]
		m[j] = i
	}
	for ; i < n; i++ {
		j := r.Int63n(int64(i + 1))
		m[i] = m[j]
		m[j] = i
	}
	return m
}

This is a tiny bit faster than the implementation
actually used in this change:

name      old time/op  new time/op  delta
Perm3-8   51.8ns ± 1%  50.3ns ± 1%  -2.83%  (p=0.000 n=8+9)
Perm30-8   405ns ± 1%   394ns ± 1%  -2.66%  (p=0.000 n=9+8)

However, 3% in performance doesn't seem worth
having the two algorithms diverge,
nor the reduced readability of this alternative.


Updates golang#16213.

Change-Id: I11a7441ff8837ee9c241b4c88f7aa905348be781
@rsc rsc modified the milestones: Go1.10, Go1.11 Nov 22, 2017
@bradfitz bradfitz modified the milestones: Go1.11, Go1.12 Jun 20, 2018
@bradfitz bradfitz added help wanted NeedsFix The path to resolution is known, but the work has not been done. labels Jun 20, 2018
@ianlancetaylor ianlancetaylor modified the milestones: Go1.12, Go1.13 Dec 13, 2018
@andybons andybons modified the milestones: Go1.13, Go1.14 Jul 8, 2019
@rsc rsc modified the milestones: Go1.14, Backlog Oct 9, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted NeedsFix The path to resolution is known, but the work has not been done. Testing An issue that has been verified to require only test changes, not just a test failure.
Projects
None yet
Development

No branches or pull requests

6 participants