
net: TCP benchmarks are wildly variable #9774

Closed
josharian opened this issue Feb 5, 2015 · 8 comments

Comments

@josharian
Contributor

The net TCP benchmarks are all over the map. Here are two consecutive runs:

benchmark                            old ns/op     new ns/op     delta
BenchmarkTCP4OneShot                 475408        544021        +14.43%
BenchmarkTCP4OneShotTimeout          805033        1563184       +94.18%
BenchmarkTCP4Persistent              50894         22601         -55.59%
BenchmarkTCP4PersistentTimeout       23367         22396         -4.16%
BenchmarkTCP6OneShot                 648696        636950        -1.81%
BenchmarkTCP6OneShotTimeout          1084069       989891        -8.69%
BenchmarkTCP6Persistent              25445         24205         -4.87%
BenchmarkTCP6PersistentTimeout       23346         23225         -0.52%
BenchmarkTCP4ConcurrentReadWrite     16441         15232         -7.35%
BenchmarkTCP6ConcurrentReadWrite     17069         15819         -7.32%

This is in my experience a typical amount of variation. Is there a way to stabilize them? If not, what purpose do they serve?

@dvyukov
Member

dvyukov commented Feb 5, 2015

What machine and OS do you use?

@josharian
Contributor Author

OS X 10.10.2, Core i7.

@dvyukov
Member

dvyukov commented Feb 5, 2015

To measure benchmark stability, you need to disable TurboBoost and SpeedStep in the BIOS. Also bind the program to a specific set of cores (for GOMAXPROCS=1, to a single core). Also make the system as idle as possible; in particular, disable background processes like updaters, antiviruses, etc.
For networking benchmarks, an additional instability factor is the kernel, and I suspect OS X is not particularly good in this respect.

On my linux machine the results are reasonably stable:

$ nice -20 taskset 16 go test -run=none -bench=BenchmarkTCP4Persistent$ -cpu=1,1,1,1,1,1,1,1 net
PASS
BenchmarkTCP4Persistent 20000 69208 ns/op
BenchmarkTCP4Persistent 20000 69545 ns/op
BenchmarkTCP4Persistent 20000 69302 ns/op
BenchmarkTCP4Persistent 20000 69450 ns/op
BenchmarkTCP4Persistent 20000 69359 ns/op
BenchmarkTCP4Persistent 20000 69095 ns/op
BenchmarkTCP4Persistent 20000 68679 ns/op
BenchmarkTCP4Persistent 20000 68962 ns/op

OneShot is more problematic, because it creates tons of TCP connections and the kernel can't keep up.

@RLH
Contributor

RLH commented Feb 5, 2015

In your table you say old ns/op and new ns/op. What is old and what is new?
Is old 1.3 and new 1.4, or is old 1.4 and new tip?


@cespare
Contributor

cespare commented Feb 5, 2015

@RLH I think it's the same version. Josh just used benchcmp as a way of showing the variance between two runs that one would hope would be nearly identical.
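For anyone unfamiliar with the workflow: the table came from running the identical code twice and diffing the outputs with benchcmp. The percentage arithmetic benchcmp does can be reproduced by hand; a minimal sketch using the Persistent numbers from the table above (the file names and awk one-liner are mine, not part of the original workflow):

```shell
# Two runs of identical code (numbers taken from the table above).
cat > old.txt <<'EOF'
BenchmarkTCP4Persistent 50894
EOF
cat > new.txt <<'EOF'
BenchmarkTCP4Persistent 22601
EOF
# Percent delta between the runs, computed the way benchcmp reports it:
paste old.txt new.txt | awk '{printf "%s %+.2f%%\n", $1, ($4-$2)/$2*100}'
```

This prints a -55.59% swing between two runs of the same code, which is exactly the kind of variance being reported.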

@josharian
Contributor Author

@dvyukov thanks for the tips. Most benchmarks are quite stable on my system; the net ones are the only ones that show significant variability. It seems plausible that this is OS X-specific and kernel-related.

Any objection to me adding b.Skip to the relevant benchmarks on darwin, since they're so noisy?

@dvyukov
Member

dvyukov commented Feb 7, 2015

I object to adding b.Skip. It may be appropriate when a benchmark is failing episodically, but not merely because it is noisy.
We run all benchmarks on the race builders to get at least some concurrent coverage, and these look like useful benchmarks for that.
I also run all benchmarks to measure differences in allocs/op, and the benchmarks look useful for that too.

You can filter these benchmarks with -bench flag or just grep -v afterwards.
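The grep -v approach might look like the following sketch. The real pipeline would be something like `go test -run=NONE -bench=. net | grep -v TCP`; it is demonstrated here on canned output, so the benchmark names and numbers below are illustrative samples, not real measurements:

```shell
# Canned sample of benchmark output (names/numbers are illustrative).
cat > bench.out <<'EOF'
BenchmarkTCP4OneShot 475408 ns/op
BenchmarkTCP4Persistent 50894 ns/op
BenchmarkSetReadDeadline 120 ns/op
EOF
# Drop the noisy TCP lines before comparing runs:
grep -v 'BenchmarkTCP' bench.out
```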

@josharian
Contributor Author

Fair enough.

@golang golang locked and limited conversation to collaborators Jun 25, 2016