
net: TCP benchmarks are wildly variable #9774

Closed
josharian opened this issue Feb 5, 2015 · 8 comments

Comments

@josharian
Contributor

The net TCP benchmarks are all over the map. Here are two consecutive runs:

benchmark                            old ns/op     new ns/op     delta
BenchmarkTCP4OneShot                 475408        544021        +14.43%
BenchmarkTCP4OneShotTimeout          805033        1563184       +94.18%
BenchmarkTCP4Persistent              50894         22601         -55.59%
BenchmarkTCP4PersistentTimeout       23367         22396         -4.16%
BenchmarkTCP6OneShot                 648696        636950        -1.81%
BenchmarkTCP6OneShotTimeout          1084069       989891        -8.69%
BenchmarkTCP6Persistent              25445         24205         -4.87%
BenchmarkTCP6PersistentTimeout       23346         23225         -0.52%
BenchmarkTCP4ConcurrentReadWrite     16441         15232         -7.35%
BenchmarkTCP6ConcurrentReadWrite     17069         15819         -7.32%

This is in my experience a typical amount of variation. Is there a way to stabilize them? If not, what purpose do they serve?

@dvyukov
Member

dvyukov commented Feb 5, 2015

What machine and OS do you use?

@josharian
Contributor Author

OS X 10.10.2, Core i7.

@dvyukov
Member

dvyukov commented Feb 5, 2015

To measure benchmark stability, you need to disable TurboBoost and SpeedStep in the BIOS. Also bind the program to a specific set of cores (for GOMAXPROCS=1, to a single core). Also make the system as idle as possible; in particular, disable background processes like updaters, antiviruses, etc.
For networking benchmarks, an additional instability factor is the kernel, and I suspect OS X is not particularly good in this respect.

On my linux machine the results are reasonably stable:

$ nice -20 taskset 16 go test -run=none -bench=BenchmarkTCP4Persistent$ -cpu=1,1,1,1,1,1,1,1 net
PASS
BenchmarkTCP4Persistent 20000 69208 ns/op
BenchmarkTCP4Persistent 20000 69545 ns/op
BenchmarkTCP4Persistent 20000 69302 ns/op
BenchmarkTCP4Persistent 20000 69450 ns/op
BenchmarkTCP4Persistent 20000 69359 ns/op
BenchmarkTCP4Persistent 20000 69095 ns/op
BenchmarkTCP4Persistent 20000 68679 ns/op
BenchmarkTCP4Persistent 20000 68962 ns/op

OneShot is more problematic, because it creates tons of TCP connections and the kernel can't keep up.

@RLH
Contributor

RLH commented Feb 5, 2015

In your table you say old ns/op and new ns/op. What is old and what is new?
Is old 1.3 and new 1.4, or is old 1.4 and new tip?


@cespare
Contributor

cespare commented Feb 5, 2015

@RLH I think it's the same version. Josh just used benchcmp as a way of showing the variance between two runs that one would hope would be nearly identical.
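For anyone unfamiliar with the workflow: the table came from running the identical code twice and diffing the outputs with benchcmp. The percentage arithmetic benchcmp does can be reproduced by hand; a minimal sketch using the Persistent numbers from the table above (the file names and awk one-liner are mine, not part of the original workflow):

```shell
# Two runs of identical code (numbers taken from the table above).
cat > old.txt <<'EOF'
BenchmarkTCP4Persistent 50894
EOF
cat > new.txt <<'EOF'
BenchmarkTCP4Persistent 22601
EOF
# Percent delta between the runs, computed the way benchcmp reports it:
paste old.txt new.txt | awk '{printf "%s %+.2f%%\n", $1, ($4-$2)/$2*100}'
```

This prints a -55.59% swing between two runs of the same code, which is exactly the kind of variance being reported.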

@josharian
Contributor Author

@dvyukov thanks for the tips. Most benchmarks are quite stable on my system; the net ones are the only ones that show significant variability. It seems plausible that this is OS X-specific and kernel-related.

Any objection to me adding b.Skip to the relevant benchmarks on darwin, since they're so noisy?

@dvyukov
Member

dvyukov commented Feb 7, 2015

I object to adding b.Skip. It may be appropriate when a benchmark is failing episodically, but not merely because it is noisy.
We run all benchmarks on the race builders to get at least some concurrent coverage, and these look like useful benchmarks for that.
I also run all benchmarks to measure differences in allocs/op, and the benchmarks look useful for that too.

You can filter these benchmarks with -bench flag or just grep -v afterwards.
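The grep -v approach might look like the following sketch. The real pipeline would be something like `go test -run=NONE -bench=. net | grep -v TCP`; it is demonstrated here on canned output, so the benchmark names and numbers below are illustrative samples, not real measurements:

```shell
# Canned sample of benchmark output (names/numbers are illustrative).
cat > bench.out <<'EOF'
BenchmarkTCP4OneShot 475408 ns/op
BenchmarkTCP4Persistent 50894 ns/op
BenchmarkSetReadDeadline 120 ns/op
EOF
# Drop the noisy TCP lines before comparing runs:
grep -v 'BenchmarkTCP' bench.out
```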

@josharian
Contributor Author

Fair enough.

@golang golang locked and limited conversation to collaborators Jun 25, 2016