In general, benchmarks in isolation aren't very useful. To compare benchmarks we need an external tool (like benchstat). It'd be nice if we could do comparative benchmarking from within the go test tool itself.
When refactoring code, it is often useful to maintain an old version of a function or method to test for regressions or change in functionality.
Likewise, when refactoring for performance we should test for functional regressions AND performance regressions. That is: We should ensure that a change represents a performance improvement and make sure that it remains the case over time.
For example, let's say we have a function fib() that we think we've sped up. I believe a reasonable thing to do would be to create a test that compares our new version of fib() with our old version. That might entail something like:
```go
// in fib_test.go

func oldFib(n int) int {
	// insert the previous fib() implementation here...
}

// Imagine our "new" fib() has replaced the old implementation -- presumably in fib.go.

// TestFibRegression ensures that our new optimized fib is bug-for-bug compatible with the previous version.
func TestFibRegression(t *testing.T) {
	for i := 0; i < 50; i++ {
		if got, expected := fib(i), oldFib(i); got != expected {
			t.Fatalf("regression: fib(%d) returned %d; oldFib(%[1]d) returned %[3]d", i, got, expected)
		}
	}
}
```
So maybe that's a reasonable way to ensure that our new fib() is a valid replacement for the old one. But how do we ensure that our new fib() is always faster than the old?
How about this:
```go
// BenchmarkFib benchmarks and compares our optimized and unoptimized fib implementations.
// If, somehow, the optimized version becomes slower, this benchmark should fail.
func BenchmarkFib(b *testing.B) {
	funcs := map[string]func(int) int{
		"oldFib": oldFib,
		"fib":    fib,
	}

	// b.Execute is the proposed API: run a sub-benchmark and return its result.
	r := map[string]testing.BenchmarkResult{}
	for name, f := range funcs {
		r[name] = b.Execute(func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				f(100) // whatever -- i know, inlining etc.
			}
		})
	}

	if newns, oldns := r["fib"].NsPerOp(), r["oldFib"].NsPerOp(); oldns > 0 && newns > oldns {
		b.Fatalf("regression: new fib is %d ns/op; should be less than old fib at %d ns/op", newns, oldns)
	}
}
```
The functions are compared on the same machine and at as near the same time as possible. oldFib can be excluded from the benchmark when necessary by filtering out sub-benchmarks whose names start with old, and we get an easy way to be alerted if the optimization somehow regresses (say, because of a new compiler optimization or a standard library change).
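For reference, that kind of name-based exclusion already works for ordinary b.Run sub-benchmarks today; here's a minimal sketch (the names and the input value are illustrative, and the results still have to be compared externally, e.g. with benchstat):

```go
// BenchmarkFibCompare groups both implementations as sub-benchmarks so they
// can be selected or excluded by name with the -bench regexp, e.g.:
//   go test -bench 'BenchmarkFibCompare/new'
func BenchmarkFibCompare(b *testing.B) {
	impls := map[string]func(int) int{
		"old": oldFib,
		"new": fib,
	}
	for name, f := range impls {
		f := f // capture loop variable (needed before Go 1.22)
		b.Run(name, func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				f(30)
			}
		})
	}
}
```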
The goal is to allow comparative benchmarks without external tools.
I propose this instead of using testing.Benchmark() unless testing.Benchmark() can be called from... an actual benchmark in a meaningful way.
Otherwise, I don't think we should have to run benchmarks from a Test function for the same reason we generally don't want to run benchmarks when we test.
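For context, here is a rough sketch of that workaround as it exists today: calling testing.Benchmark() from an ordinary test (function names and inputs are illustrative). It works, but the benchmark code runs on every plain go test invocation, which is exactly what we'd like to avoid:

```go
// TestFibFaster is a sketch of the current workaround: driving
// testing.Benchmark from a regular test and comparing the results in-process.
func TestFibFaster(t *testing.T) {
	if testing.Short() {
		t.Skip("skipping benchmark comparison in -short mode")
	}
	bench := func(f func(int) int) testing.BenchmarkResult {
		return testing.Benchmark(func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				f(30)
			}
		})
	}
	oldRes, newRes := bench(oldFib), bench(fib)
	if oldRes.NsPerOp() > 0 && newRes.NsPerOp() > oldRes.NsPerOp() {
		t.Fatalf("regression: new fib is %d ns/op; old fib is %d ns/op",
			newRes.NsPerOp(), oldRes.NsPerOp())
	}
}
```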
Maybe additional methods could be used to compute and output differences in a way similar to benchstat.
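One possible shape for that, sketched with the existing testing.BenchmarkResult type (percentDelta is a made-up helper for illustration, not part of any existing or proposed API):

```go
// percentDelta reports the relative change in ns/op between two benchmark
// results, roughly in the spirit of benchstat's delta column. Negative means
// newRes is faster than oldRes.
func percentDelta(newRes, oldRes testing.BenchmarkResult) float64 {
	oldNs := oldRes.NsPerOp()
	if oldNs == 0 {
		return 0
	}
	return 100 * float64(newRes.NsPerOp()-oldNs) / float64(oldNs)
}

// Usage inside the comparative benchmark above, for example:
//   b.Logf("fib vs oldFib: %+.1f%% ns/op", percentDelta(r["fib"], r["oldFib"]))
```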