Skip to content

cmd/go: go test running bench twice with -benchtime=1x #32051

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
vitesse-ftian opened this issue May 15, 2019 · 18 comments
Closed

cmd/go: go test running bench twice with -benchtime=1x #32051

vitesse-ftian opened this issue May 15, 2019 · 18 comments
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Milestone

Comments

@vitesse-ftian
Copy link

What version of Go are you using (go version)?

1.12.5

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/ftian/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/ftian/go"
GOPROXY=""
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build598447146=/tmp/go-build -gno-record-gcc-switches"

What did you do?

ftian@ftian-xps:~/tmp/x$ cat xxx_test.go
package x

import (
"fmt"
"testing"
)

func BenchmarkOnce(b *testing.B) {
fmt.Printf("Calling me once.\n")
}

ftian@ftian-xps:~/tmp/x$ go test -run=NONONO -bench=Once -benchtime=1x
Calling me once.
goos: linux
goarch: amd64
BenchmarkOnce-12 Calling me once.
1 17290 ns/op
PASS
ok _/home/ftian/tmp/x 0.002s

What did you expect to see?

The BencharkOnce should be run only once. The line should be printed once.

What did you see instead?

Calling me once. printed twice.

@robpike
Copy link
Contributor

robpike commented May 15, 2019

Working as intended. The benchmark support code calls each benchmark function as many times as necessary to establish a reliable performance figure.

@robpike
Copy link
Contributor

robpike commented May 15, 2019

By the way, you're using the benchtime flag incorrectly. I'm surprised you didn't see an error. Its argument is a duration.

@cespare
Copy link
Contributor

cespare commented May 15, 2019

By the way, you're using the benchtime flag incorrectly. I'm surprised you didn't see an error. Its argument is a duration.

As of Go 1.12 (https://golang.org/doc/go1.12#testing):

The -benchtime flag now supports setting an explicit iteration count instead of a time when the value ends with an "x". For example, -benchtime=100x runs the benchmark 100 times.

@vitesse-ftian
Copy link
Author

benchtime=1x is a new feature in 1.12. According to doc, 100x means 100 times.

BTW, before benchtime=1x, I used to use -benchtime=0s to trick go test to only run my test case once. 1.12 will error, stating 0s is not a valid duration.

-benchtime=1ns still does the trick.

@cespare
Copy link
Contributor

cespare commented May 15, 2019

Looking at the code, it looks like we always call the benchmark once to initialize things (for example, to discover sub-benchmarks). Then we run it n times after that.

Perhaps we just need to subtract 1 from n.

In any case, it does seem confusing that 1x runs it twice. (And I suspect that 1 is the most common number to use with -benchtime=Nx. We use it in CI to verify that our benchmarks aren't broken, for example.)

@andybons andybons changed the title go test running bench twice with -benchtime=1x cmd/go: go test running bench twice with -benchtime=1x May 15, 2019
@andybons andybons added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label May 15, 2019
@andybons andybons added this to the Unplanned milestone May 15, 2019
@andybons
Copy link
Member

@rsc for a decision on behavior.

@robpike
Copy link
Contributor

robpike commented May 15, 2019

Well, benchtime needs some documentation then. The testing package doesn't mention it in the comment with the other flags.

Not a fan of this hack, but I obviously missed it when it went in.

@cespare
Copy link
Contributor

cespare commented May 15, 2019

#24735 has the context for the -benchtime=Nx syntax.

Well, benchtime needs some documentation then. The testing package doesn't mention it in the comment with the other flags.

As far as I can see, the testing package documentation only mentions three of the testing flags: -run, -bench, and -cpu. There are more than 20 other flags not mentioned. If we're going to describe -benchtime there, we should consider including some other important ones such as -v, -timeout, -count, and -short. (The testing package docs do already link to https://golang.org/cmd/go/#hdr-Testing_flags, though.)

@robpike
Copy link
Contributor

robpike commented May 16, 2019

Maybe that's enough, but I must say that 'go doc testing' is a lot easier to handle than the output of the go command's own documentation.

copybara-service bot pushed a commit to google/gvisor that referenced this issue Dec 12, 2020
- Skip the bazel clean command on the last run of the benchmark.
- Use --test.benchtime=1ns to force running the benchmark once
(golang/go#32051)

PiperOrigin-RevId: 347124606
@gopherbot
Copy link
Contributor

Change https://golang.org/cl/317916 mentions this issue: testing: Improve ambiguous error message about wrong value for -benchtime flag.

@tv42
Copy link

tv42 commented Jun 24, 2021

Could the discovery round be run with testing.B.N == 0? Calling B.Run inside your benchmark loop would be wrong anyway, so it doesn't have to be supported.

@rsc
Copy link
Contributor

rsc commented Jun 24, 2021

No, we tried b.N==0 when we introduced subtests and various things broke. Code assumes b.N > 0.

@cespare
Copy link
Contributor

cespare commented Jun 25, 2021

I investigated this a bit.

It's unfortunate that -benchtime=Nx runs the benchmark loop N+1 times. If the user asked for N it would be better to give them exactly N. The only two ways I can see to fix this are (1) using a discovery round of b.N == 0 and (2) not using a discovery round for -benchtime=Nx benchmarks.

I guess we've ruled out (1) for Hyrum's Law reasons. (2) seems plausibly doable from my read of the code, but would require some nontrivial restructuring.

However, I think we can special-case -benchtime=1x. If we go to run the final benchmark loop and we see that we're running in -benchtime=1x mode, at that point, we can skip the "main" benchmark loop entirely and let the discovery round stand in as our benchmark result. That's what happens today if you're running in "duration mode" and it turns out that even the b.N == 1 discovery round takes longer than the requested -benchtime. That's why -benchtime=1ns only runs the benchmark loop once in total. (So, arguably, treating the 1x in the special way makes it more consistent with a duration -benchtime.)

I think that fixing this only for -benchtime=1x will give most of the value of fixing it for -benchtime=Nx generally. In my experience, I've used the -benchtime=Nx feature for three different things:

  • Checking that benchmark code isn't broken (this is why we use -benchtime=1x in CI)
  • Checking for myself how long a single benchmark iteration takes using -benchtime=1x (perhaps I'm debugging some buggy benchmark code I'm writing)
  • Comparing two different benchmarks that I want to run for the same number of iterations (usually with -benchtime=10000x or some other large number)

I'm sure there are cases where people are using -benchtime=2x or 3x or 10x, but I suspect that most of the time that people use -benchtime=Nx they are using 1x and would really like it to run exactly once or they are using some large number of iterations where running N+1 total iterations isn't a big deal.

The special-casing of 1x is a trivial one-line change and all the tests pass. I can send a CL for when the tree opens in August.

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/331089 mentions this issue: testing: with -benchtime=1x, run the benchmark loop exactly once

@tv42
Copy link

tv42 commented Jun 26, 2021

Note: "discovery round" in benchmarks is serving two purposes: establishing timing, and enumerating subtests. Subtest discovery still needs to happen. The idea of making 1x use the "discovery" as the real benchmark sounds good to me; that should fulfill both roles.

@mvdan
Copy link
Member

mvdan commented Jan 22, 2022

I find myself coming back to this thread; I've got a benchmark that is quite slow by design, as it's benchmarking an operation akin to go build which takes in the order of seconds. For that reason, I almost exclusively use -benchtime=Nx with a low N such as 4 or 6, as this helps me keep numbers stable and avoid the overhead of estimating a duration.

Except that I'm not really avoiding the overhead of estimating the duration; if I use -benchtime=4x and each iteration takes 5s, I actually end up with a total run-time of ~25s rather than ~20s, as the discovery run with b.N == 1 still happens.

I fully appreciate the thoughtful investigation that happened here, especially in #32051 (comment). Still, my situation feels unfortunate because I don't have any sub-benchmarks, but the testing package doesn't know that.

I've come up with a horrible temporary work-around, in the form of:

benchtime := flag.Lookup("test.benchtime").Value.String()
if b.N == 1 && strings.HasSuffix(benchtime, "x") && benchtime != "1x" {
    println("skipping the sub-benchmark discovery run for -benchtime=Nx")
    return
}

And it works; for -benchtime=1x we already skip the discovery run, and the workaround doesn't trigger. But for -benchtime=4x it triggers, saving me those extra 5s.

Still, this is a hack, and I'm hoping we can do better by default. From my point of view as a user, go test should be able to see that there aren't any possible calls to B.Run whatsoever, so it should be able to skip the discovery run when -benchtime=Nx is used.

An alterntaive that comes to mind is some way for testing.B to signal to the benchmark function that a discovery run is taking place. Then, with the assumption that the discovery run is only used to discover sub-benchmarks, and knowing I have none, I could do something like:

for i := 0; i < b.N; i++ {
    if b.DiscoveryRun {
        break // we have no sub-benchmarks; avoid doign one iteration
    }

@mvdan
Copy link
Member

mvdan commented Jan 22, 2022

A related issue: it seems like the discovery run is re-ran many times if -count=N is used. Presumably, if one is using -count=10 to collect many results for benchstat, the discovery run would only need to run once, not ten times.

@bboreham
Copy link
Contributor

bboreham commented Feb 7, 2022

An alterntaive that comes to mind ...

Nit: you'd only want that break in the benchtime=Nx case. So maybe benchtime should be surfaced in testing.B.

@golang golang locked and limited conversation to collaborators Feb 7, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Projects
None yet
Development

No branches or pull requests

9 participants