testing: don't truncate allocs/op #24631

teh-cmc · 2018-04-01T21:16:17Z

$ go version
go version go1.10.1 linux/amd64

testing.Benchmark gives me flaky allocation reports in the following case:

package main

import (
	"fmt"
	"testing"
)

func main() {
	var Eface interface{}
	res := testing.Benchmark(func(b *testing.B) {
		// i := 1, just in case, to avoid convT2E32's optimization wrt zero-values.
		for i := 1; i < b.N; i++ {
			Eface = uint32(i)
		}
	})
	fmt.Println(res.MemString())
}

The generated code for Eface = uint32(i) is what you'd expect:

;; Eface = uint32(i)
0x0050 00080 (main.go:12)	MOVL	CX, ""..autotmp_3+36(SP)
0x0054 00084 (main.go:12)	LEAQ	type.uint32(SB), AX
0x005b 00091 (main.go:12)	MOVQ	AX, (SP)
0x005f 00095 (main.go:12)	LEAQ	""..autotmp_3+36(SP), DX
0x0064 00100 (main.go:12)	MOVQ	DX, 8(SP)
0x0069 00105 (main.go:12)	PCDATA	$0, $1
0x0069 00105 (main.go:12)	CALL	runtime.convT2E32(SB)
0x006e 00110 (main.go:12)	MOVQ	24(SP), AX
0x0073 00115 (main.go:12)	MOVQ	16(SP), CX
0x0078 00120 (main.go:12)	MOVQ	"".&Eface+48(SP), DX
0x007d 00125 (main.go:12)	MOVQ	CX, (DX)
0x0080 00128 (main.go:12)	MOVL	runtime.writeBarrier(SB), CX
0x0086 00134 (main.go:12)	LEAQ	8(DX), DI
0x008a 00138 (main.go:12)	TESTL	CX, CX
0x008c 00140 (main.go:12)	JNE	148
0x008e 00142 (main.go:12)	MOVQ	AX, 8(DX)
0x0092 00146 (main.go:12)	JMP	46
0x0094 00148 (main.go:12)	CALL	runtime.gcWriteBarrier(SB)
0x0099 00153 (main.go:12)	JMP	46

Results:

$ go run main.go
       4 B/op	       1 allocs/op
$ go run main.go
       4 B/op	       1 allocs/op
$ go run main.go
       4 B/op	       1 allocs/op
$ go run main.go
       4 B/op	       0 allocs/op
$ go run main.go
       4 B/op	       0 allocs/op
$ go run main.go
       4 B/op	       1 allocs/op

I.e. while the number of allocated bytes per op is always correct, the number of allocations per op is sometimes wrong (expecting 1 allocs/op).

I guess I'm hitting some kind of rounding error here, but then again why would this behavior not be deterministic? Maybe due to the heuristics around the value of b.N?
I'm not sure what I'm missing here?

Thanks.


teh-cmc · 2018-04-01T21:26:35Z

Just in case, here's the complete result string, which shows that the program does the same amount of work in both cases:

$ go run main.go 
100000000	        12.7 ns/op        4 B/op	       1 allocs/op
$ go run main.go 
100000000	        12.4 ns/op        4 B/op	       0 allocs/op
$ go run main.go 
100000000	        12.3 ns/op        4 B/op	       1 allocs/op
$ go run main.go 
100000000	        12.2 ns/op        4 B/op	       0 allocs/op

odeke-em · 2018-04-02T07:38:34Z

/cc @randall77 @aclements @ianlancetaylor

ALTree · 2018-04-02T10:43:01Z

The number of reported allocs varies between 999999 and 1000003; when it's the former, 0 allocs are reported, and when it's the latter, 1 alloc is reported.

Either we fix the flakiness in the memstat calls (I don't know if that can be done reliably) or we stop using integer truncated division to compute the result (where 999999 / 1000000 = 0) and instead we round to the nearest integer. This will give us more accurate results in the cases that are more likely to be flaky (very fast benchmarks that only allocate once or twice).
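To make the arithmetic concrete, here is a minimal sketch comparing truncating division with round-to-nearest for the two totals mentioned above. The helper names `truncatedPerOp` and `roundedPerOp` are made up for illustration and are not part of package testing.

```go
package main

import "fmt"

// truncatedPerOp divides like plain integer division: the remainder is dropped.
// Hypothetical helper, for illustration only.
func truncatedPerOp(total, n int64) int64 {
	if n <= 0 {
		return 0
	}
	return total / n
}

// roundedPerOp rounds to the nearest integer (halves round up).
// It assumes total and n are non-negative. Hypothetical helper.
func roundedPerOp(total, n int64) int64 {
	if n <= 0 {
		return 0
	}
	return (total + n/2) / n
}

func main() {
	const n = 1000000
	for _, total := range []int64{999999, 1000003} {
		fmt.Printf("allocs=%d truncated=%d rounded=%d\n",
			total, truncatedPerOp(total, n), roundedPerOp(total, n))
	}
}
```

With these inputs, truncation gives 0 for 999999 and 1 for 1000003, while rounding gives 1 for both.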

gopherbot · 2018-04-02T10:43:22Z

Change https://golang.org/cl/104055 mentions this issue: testing: round (instead of truncating) in AllocsPerOp

aclements · 2018-04-02T16:28:52Z

@teh-cmc, your benchmark loop is slightly non-linear, which I believe is what's causing this. The loop needs to repeat the benchmark b.N times, but your loop repeats it b.N-1 times. Either:

for i := 1; i < b.N+1; i++ {
	Eface = uint32(i)
}

Or, better, since it keeps the standard benchmark loop form:

for i := 0; i < b.N; i++ {
	Eface = uint32(i+1)
}

Regardless, perhaps we shouldn't be truncating the reported allocs/op to an integer at all. The benchmark format is allowed to contain floating point numbers, so we could print out a few decimal places, like we already do for ns/op and MB/s.

teh-cmc · 2018-04-02T16:56:48Z

Silly me. That makes sense @aclements, thanks.

I do agree with you and @ALTree that this shouldn't get truncated down to zero in any case though, that seems quite deceptive.

bcmills · 2018-04-02T18:43:42Z

In #19128 (comment), @rsc suggested:

I'd also like to have a check somewhere (in package testing or benchstat, probably the former) that b.N really is scaling linearly.

I wonder whether that check would be sensitive enough to catch the nonlinearity here.

At any rate, it sounds like just fixing the rounding would help. Shall we retitle the bug for that?
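For reference, the nonlinearity in question is a plain off-by-one, which the following sketch counts directly. The function names are hypothetical and this is not the linearity check proposed in #19128, only an illustration of why the loop body ran b.N-1 times.

```go
package main

import "fmt"

// iterationsFromOne counts how often the body of
// `for i := 1; i < n; i++` executes. Hypothetical helper.
func iterationsFromOne(n int) int {
	count := 0
	for i := 1; i < n; i++ {
		count++
	}
	return count
}

// iterationsStandard counts how often the body of the standard
// benchmark loop `for i := 0; i < n; i++` executes. Hypothetical helper.
func iterationsStandard(n int) int {
	count := 0
	for i := 0; i < n; i++ {
		count++
	}
	return count
}

func main() {
	for _, n := range []int{1, 100, 10000} {
		fmt.Printf("b.N=%d: from-one loop runs %d times, standard loop runs %d times\n",
			n, iterationsFromOne(n), iterationsStandard(n))
	}
}
```

With one allocation per iteration, the first form performs b.N-1 allocations, so the truncated quotient is 0 unless unrelated background allocations add at least one more.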

aclements · 2018-04-02T18:50:59Z

OTOH, unlike ns/op and MB/s, allocs/op really is a whole number for a given iteration. I worry slightly that the random background allocations that aren't coupled to iterations (e.g., when a GC cycle runs) would almost always cause this number to be slightly above a whole number and that that could confuse people.

bcmills · 2018-04-02T18:59:28Z

Is allocs/op really a whole number? I would think that any API that involves a cache could legitimately end up with a fractional mean.

aclements · 2018-04-02T19:09:58Z

For a given iteration, yes, it's a whole number. But caches are another good example of iteration-to-iteration variance and yet another example where summarizing a benchmark result distribution as just a mean can lose a great deal of information.

rsc · 2018-04-09T20:57:00Z

The code under test was buggy, which caused the flake. When we added allocs/op we explicitly decided to truncate it, to report the number of allocs that happen every single time. It's too late to change that definition.

odeke-em added the NeedsInvestigation label Apr 2, 2018

aclements added NeedsDecision and removed NeedsInvestigation labels Apr 2, 2018

aclements changed the title ~~testing: Benchmark reports flaky number of allocations~~ testing: don't truncate allocs/op Apr 2, 2018

bcmills added this to the Go1.11 milestone Apr 2, 2018

josharian mentioned this issue Apr 2, 2018

strings: TestBuilderGrow test flake #24647

Closed

rsc closed this as completed Apr 9, 2018

golang locked and limited conversation to collaborators Apr 9, 2019

gopherbot added the FrozenDueToAge label Apr 9, 2019
