Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: x/perf/cmd/benchrun: add -run and -count option #63233

Open
Jorropo opened this issue Sep 26, 2023 · 6 comments
Open

proposal: x/perf/cmd/benchrun: add -run and -count option #63233

Jorropo opened this issue Sep 26, 2023 · 6 comments
Labels
Milestone

Comments

@Jorropo
Copy link
Member

Jorropo commented Sep 26, 2023

Collecting high quality benchmarks using benchstat is a bit annoying. First you have to run them many times, ideally you want to interleave old and new binaries to make noise less relevant.


Add -run bool option, if set then the old, new, ... file paths are not outputs of go test but go test -c, benchstat will invoke the files using a Thue-Morse sequence and with options -test.run=$^ and -test.bench= with benchstat's -filter option value.
-count specify how many invocation of each should happen, each invocation will run with -test.count=1 (or unset).


Expected usage:

> go test -c -o /tmp/new
> git checkout HEAD~1
> go test -c -o /tmp/old
> benchstat -run -count=10 /tmp/{old,new}
@gopherbot gopherbot added this to the Proposal milestone Sep 26, 2023
@Jorropo Jorropo changed the title proposal: x/perf/cmd/benchstat: add -run option proposal: x/perf/cmd/benchstat: add -run and -count option Sep 26, 2023
@AlexanderYastrebov
Copy link
Contributor

AlexanderYastrebov commented Sep 27, 2023

And to get the binaries one would need to checkout specific revisions and compile. I crafted the following benchrev.sh script:

#!/bin/bash
set -euo pipefail

if [ "$#" -lt 6 ]; then
    echo "Usage: $0 <revision A> <revision B> [<go test -bench=...>]"
    echo "Runs benchmark for each revision and prints benchstat results."
    echo "Example:"
    echo "  $0 master HEAD ../bin/go test -run=NONE -bench=BenchmarkLarge -count=10 ./internal/zstd"
    exit 2
fi

if [ $(git status --porcelain | wc -l) -ne "0" ]; then
    echo "Please commit your changes or stash them before you switch branches."
    exit 3
fi

readonly REVA="$1"; shift
readonly REVB="$1"; shift

readonly HEAD=$(git rev-parse --abbrev-ref HEAD)
readonly COMMITA=$(git rev-parse "$REVA")
readonly COMMITB=$(git rev-parse "$REVB")
readonly OUT=$(mktemp -d benchrev.XXXXXXXXXX)

function cleanup {
    rm -rf "$OUT"
    git checkout -q "$HEAD"
}
trap cleanup EXIT

git checkout -q "$COMMITA"
"$@" | tee "$OUT/$COMMITA"

git checkout -q "$COMMITB"
"$@" | tee "$OUT/$COMMITB"

benchstat "$REVA=$OUT/$COMMITA" "$REVB=$OUT/$COMMITB"

@ianlancetaylor
Copy link
Contributor

CC @aclements

@aclements
Copy link
Member

Interesting. I haven't encountered the Thue-Morse sequence before. I have a hunch as to why it could be valuable for interleaving, but could you please explain?

I'm not sure benchstat is the right place for this because this is a separable problem from running the analysis itself. But I agree that some tool for running high quality benchmark sequences would be valuable. I've written a few messy prototypes in the past.

@Jorropo
Copy link
Member Author

Jorropo commented Sep 27, 2023

The Thue-Morse sequence isn't that important here, you could interleave all binaries in order and it would have a very similar effect.
Using Thue-Morse would help break up noise patterns if by malchance some periodic load noise on the machine has a period similar to the benchmarks being run (because they wouldn't be interleaved in order the noise would affect boths benchmark in similar fashions).

Arguably I could add a command in my shell which does:

for i in (seq 10); /tmp/old -test.bench=. | tee /dev/stderr >> /tmp/old.results; /tmp/new -test.bench=. | tee /dev/stderr >> /tmp/new.results; end; benchstat /tmp/{old,new}

but I feel benchstat -run -count=10 /tmp/{old,new} is a quick win that will also help more peoples.

@Jorropo
Copy link
Member Author

Jorropo commented Sep 27, 2023

The main comparison to do is with:

> go test -bench=. -count=10 | tee /tmp/new
> git checkout HEAD~1
> go test -bench=. -count=10 | tee /tmp/old
> benchstat /tmp/{old,new}

which has tangible realistical drawbacks.

@aclements
Copy link
Member

I agree it would be valuable to have a tool to help with running benchmarks for comparison. It shouldn't be added to benchstat, though, because this is pretty different from the functionality of benchstat. It should be a new tool.

There are many dimensions on which it may make sense to do an A/B comparison, too. Not just between two binaries. Two binaries built from different commits is probably the most common use case, but even that can have unintended results if the benchmark depends on any other checked-in files. Seven years ago (!) I wrote such a tool that was specifically for comparing between git commits, but even that got quite fiddly as I tried to reduce the time to repeatedly switch git checkouts. It may be that with things like build caching, more regular benchmarks, and the enhanced benchmark results format, that we're in a better place to do this now.

@aclements aclements changed the title proposal: x/perf/cmd/benchstat: add -run and -count option proposal: x/perf/cmd/benchrun: add -run and -count option Nov 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Incoming
Development

No branches or pull requests

5 participants