
testing: add -benchsplit to get more data points #19128

Open
bcmills opened this issue Feb 16, 2017 · 21 comments
Labels
FeatureRequest, help wanted, NeedsFix (the path to resolution is known, but the work has not been done), Proposal, Proposal-Accepted
Comments

@bcmills (Contributor) commented Feb 16, 2017

For #18177, we need to make tradeoffs between worst-case latency and average throughput for the functions in question.

It's relatively easy to measure the average throughput today using benchmarks with the testing package.

It would be nice if I could use those same benchmarks to measure the distribution of timings across iterations. That isn't really feasible using (*B).N with caller-side iteration, but it might at least be possible to tap into (*PB).Next to get some idea of the latency between calls for parallel benchmarks.
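As a concrete (if crude) sketch of that idea in user code today, where the map under test, the key, and the reporting step are all placeholders rather than a proposed API:

package bench_test

import (
    "sync"
    "testing"
    "time"
)

var m sync.Map // placeholder structure under test

// BenchmarkLoadLatency records the gap between successive pb.Next calls
// in each worker, yielding one latency sample per iteration. The extra
// time.Now calls add overhead, so treat the samples as indicative only.
func BenchmarkLoadLatency(b *testing.B) {
    m.Store("key", 1)
    var mu sync.Mutex
    var samples []time.Duration
    b.RunParallel(func(pb *testing.PB) {
        var local []time.Duration
        prev := time.Now()
        for pb.Next() {
            m.Load("key") // the operation under test
            now := time.Now()
            local = append(local, now.Sub(prev))
            prev = now
        }
        mu.Lock()
        samples = append(samples, local...)
        mu.Unlock()
    })
    _ = samples // e.g. sort and report p50/p99 here
}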

@gopherbot

CL https://golang.org/cl/37153 mentions this issue.

@gopherbot

CL https://golang.org/cl/37342 mentions this issue.

@rsc (Contributor) commented Mar 6, 2017

/cc @aclements

rsc added this to the Proposal milestone Mar 6, 2017
@josharian (Contributor)

Related: #7465

@aclements (Member)

> That isn't really feasible using (*B).N with caller-side iteration

One possibility is to crank up N only far enough to eliminate clock measurement overhead and then use caller-driven iteration to drive up the number of samples/total iterations to the desired measurement interval. However, this may not work well with benchmarks that perform non-trivial setup and use B.ResetTimer.
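A sketch of that scheme as it might look inside a hypothetical harness (collectSamples, nMin, and budget are all invented names, not existing testing API):

package harness

import (
    "testing"
    "time"
)

// collectSamples repeats a small fixed-N run until the time budget is
// spent, yielding one duration sample per run instead of a single total.
// nMin is the smallest b.N at which clock overhead is negligible. As noted
// above, this breaks down for benchmarks that do setup and b.ResetTimer,
// since the setup cost would be re-paid on every sample.
func collectSamples(fn func(b *testing.B), nMin int, budget time.Duration) []time.Duration {
    var samples []time.Duration
    for total := time.Duration(0); total < budget; {
        b := &testing.B{N: nMin}
        start := time.Now()
        fn(b)
        d := time.Since(start)
        samples = append(samples, d)
        total += d
    }
    return samples
}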

@josharian (Contributor)

> benchmarks that perform non-trivial setup and use B.ResetTimer

Also related: #10930. I bring it up because I learned there that you can in fact detect this case--both the case in which b.ResetTimer is called and the case in which b.ResetTimer is (erroneously) not called. Just look at how the total wall time to run the benchmark compares to the time as recorded by the timer.
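Inside the harness, the check might look something like this sketch (runN and the threshold are hypothetical):

package harness

import "time"

// usesResetTimer runs one fixed-N invocation of the benchmark and compares
// the wall-clock duration of the call against the duration the benchmark
// timer recorded. A timer reading well below the wall clock means the
// function spent significant time in setup excluded via b.ResetTimer.
func usesResetTimer(runN func(n int) time.Duration, n int) bool {
    start := time.Now()
    timed := runN(n) // duration as recorded by the benchmark timer
    wall := time.Since(start)
    return timed < wall/2 // noise threshold is arbitrary
}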

@rsc (Contributor) commented Mar 7, 2017

We can also tell whether b.ResetTimer is in use because we control the implementation of b.ResetTimer.

One of the reasons we've moved from benchcmp to benchstat is that the latter allows multiple data points for a given benchmark. Today those data points are all for roughly the same large size N, but the eventual plan was to emit more lines about smaller N, so that benchstat can do more work with distributions. If anyone wants to experiment with that and let us know how it goes, please do.
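To illustrate, benchstat already treats repeated lines for the same benchmark as samples of one distribution, so the plan above amounts to having package testing emit output like this itself (the numbers are invented):

BenchmarkLoad-8   2000000    98.1 ns/op
BenchmarkLoad-8   2000000   101.3 ns/op
BenchmarkLoad-8   2000000    99.7 ns/op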

I'd also like to have a check somewhere (in package testing or benchstat, probably the former) that b.N really is scaling linearly. If the time for b.N is 4X that of b.N/2 and 16X that of b.N/4, the benchmark function is using b.N incorrectly and shouldn't be reported at all. Doing this in package testing, without printing the smaller runs at all, would let benchstat continue to assume that all the reported runs are about the same size.
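A sketch of such a linearity check (timeN and the tolerance are hypothetical):

package harness

import "time"

// scalesLinearly times the benchmark at N/4, N/2, and N and checks that
// each doubling of N roughly doubles the measured time. Ratios near 4x
// per doubling, as described above, indicate the function's cost is
// quadratic in b.N, i.e. b.N is being used incorrectly.
func scalesLinearly(timeN func(n int) time.Duration, n int) bool {
    t1 := timeN(n / 4)
    t2 := timeN(n / 2)
    t4 := timeN(n)
    const tol = 3.0 // generous margin for noise; hypothetical
    return float64(t2)/float64(t1) < tol && float64(t4)/float64(t2) < tol
}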

@josharian (Contributor)

Adding a correlation coefficient test to benchmarks seems like a good idea. We could also use the estimated slope of a (weighted?) linear regression as the ns/op instead of straight division, which in theory might yield better results.

I tried hacking in a quick-and-dirty linear regression and got plausible slope numbers. (I didn't mail it because I used a naive rather than a numerically stable calculation.)
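For reference, a numerically stable form of that slope estimate uses an online (Welford-style) covariance update; everything below is illustrative:

package harness

import "time"

// slopeNsPerOp fits measured duration against iteration count and returns
// the slope in ns/op. Unlike straight division, the slope discounts any
// fixed per-run overhead (the N=0 intercept).
func slopeNsPerOp(ns []int, ds []time.Duration) float64 {
    var meanX, meanY, cov, varX float64
    for i, n := range ns {
        x := float64(n)
        y := float64(ds[i].Nanoseconds())
        k := float64(i + 1)
        dx := x - meanX
        meanX += dx / k
        meanY += (y - meanY) / k
        cov += dx * (y - meanY)  // co-moment, using the updated meanY
        varX += dx * (x - meanX) // variance moment, using the updated meanX
    }
    return cov / varX
}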

I don't plan to work on this further at the moment.

@rsc (Contributor) commented Mar 8, 2017

A few years ago I spent a while looking at what Haskell's benchmark tool does. I convinced myself that trying to be clever about slope vs N=0 cost was probably not worthwhile - better to brute force enough iterations that the slope is dominant.

The initial version of Haskell's tool (I forget the name) apparently did fancy things like bootstrapping too, but that seems like overkill as well: super subtle and easy to quietly get wrong. Later versions seem to have backed off some of the complexity.

@josharian (Contributor)

> I convinced myself that trying to be clever about slope vs N=0 cost was probably not worthwhile - better to brute force enough iterations that the slope is dominant.

Fair enough. Looking at the N=0 cost may still be worth doing, if only to catch bugs like dbf533a. (Of course, one could brute force enough iterations for that, but it seems inadvisable.)

gopherbot pushed a commit to golang/sync that referenced this issue Mar 15, 2017
In the previous version, Loads of keys with unchanged values would
block on the Mutex if the map was dirty, leading to potentially long
stalls while the read-only map is being copied to the dirty map.

This change adds a double pointer indirection to each entry and an
extra atomic load on the fast path. In exchange, it removes the need
to lock the map for nearly all operations on existing keys.

Each entry is represented by an atomically-updated pointer to an
interface{} value. The same entries are stored in both the read-only
and dirty maps. Keys deleted before the dirty map was last copied are
marked as "expunged" and omitted from the dirty map until the next
Store to that key.

Newly-stored values exist only in the dirty map. The existence of new
keys in the dirty map is indicated by an "amended" bit attached to the
read-only map, allowing Load or Delete calls for nonexistent keys to
sometimes return immediately (without acquiring the Mutex).

This trades off a bit of steady-state throughput in the "mostly hits"
case in exchange for less contention (and lower Load tail latencies)
for mixed Load/Store/Delete usage.

Unfortunately, the latency impact is currently difficult to measure
(#19128).

updates golang/go#19128
updates golang/go#18177

https://perf.golang.org/search?q=upload:20170315.5

name                                               old time/op    new time/op    delta
LoadMostlyHits/*syncmap_test.DeepCopyMap             70.3ns ± 6%    70.2ns ± 4%      ~     (p=0.886 n=4+4)
LoadMostlyHits/*syncmap_test.DeepCopyMap-48          10.9ns ± 4%    13.6ns ±24%      ~     (p=0.314 n=4+4)
LoadMostlyHits/*syncmap_test.RWMutexMap              86.0ns ± 9%    86.0ns ± 4%      ~     (p=1.000 n=4+4)
LoadMostlyHits/*syncmap_test.RWMutexMap-48            140ns ± 3%     139ns ± 2%      ~     (p=0.686 n=4+4)
LoadMostlyHits/*syncmap.Map                          70.6ns ± 8%    70.3ns ± 1%      ~     (p=0.571 n=4+4)
LoadMostlyHits/*syncmap.Map-48                       10.3ns ± 7%    11.2ns ± 5%      ~     (p=0.114 n=4+4)
LoadMostlyMisses/*syncmap_test.DeepCopyMap           59.4ns ± 4%    59.4ns ± 4%      ~     (p=1.000 n=4+4)
LoadMostlyMisses/*syncmap_test.DeepCopyMap-48        10.4ns ±30%    12.6ns ±29%      ~     (p=0.486 n=4+4)
LoadMostlyMisses/*syncmap_test.RWMutexMap            64.8ns ± 6%    63.8ns ± 4%      ~     (p=0.686 n=4+4)
LoadMostlyMisses/*syncmap_test.RWMutexMap-48          138ns ± 3%     139ns ± 2%      ~     (p=0.800 n=4+4)
LoadMostlyMisses/*syncmap.Map                        51.5ns ± 7%    50.4ns ± 2%      ~     (p=0.371 n=4+4)
LoadMostlyMisses/*syncmap.Map-48                     9.37ns ± 3%    9.40ns ± 3%      ~     (p=0.886 n=4+4)
LoadOrStoreBalanced/*syncmap_test.RWMutexMap          820ns ±17%     834ns ±14%      ~     (p=1.000 n=4+4)
LoadOrStoreBalanced/*syncmap_test.RWMutexMap-48      1.38µs ± 4%    1.38µs ± 3%      ~     (p=0.886 n=4+4)
LoadOrStoreBalanced/*syncmap.Map                      709ns ±13%    1085ns ± 9%   +53.09%  (p=0.029 n=4+4)
LoadOrStoreBalanced/*syncmap.Map-48                  1.30µs ±13%    1.08µs ± 8%   -17.40%  (p=0.029 n=4+4)
LoadOrStoreUnique/*syncmap_test.RWMutexMap           1.26µs ±14%    1.16µs ±29%      ~     (p=0.343 n=4+4)
LoadOrStoreUnique/*syncmap_test.RWMutexMap-48        1.82µs ±15%    1.98µs ± 6%      ~     (p=0.143 n=4+4)
LoadOrStoreUnique/*syncmap.Map                       1.06µs ± 7%    1.86µs ±11%   +76.09%  (p=0.029 n=4+4)
LoadOrStoreUnique/*syncmap.Map-48                    2.00µs ± 4%    1.62µs ± 4%   -19.20%  (p=0.029 n=4+4)
LoadOrStoreCollision/*syncmap_test.DeepCopyMap       32.9ns ± 8%    32.6ns ± 9%      ~     (p=0.686 n=4+4)
LoadOrStoreCollision/*syncmap_test.DeepCopyMap-48   2.26ns ±136%    3.41ns ±62%      ~     (p=0.114 n=4+4)
LoadOrStoreCollision/*syncmap_test.RWMutexMap        58.0ns ±20%    54.6ns ±17%      ~     (p=0.343 n=4+4)
LoadOrStoreCollision/*syncmap_test.RWMutexMap-48      458ns ± 2%     445ns ± 6%      ~     (p=0.229 n=4+4)
LoadOrStoreCollision/*syncmap.Map                    35.8ns ± 5%    39.6ns ± 9%      ~     (p=0.114 n=4+4)
LoadOrStoreCollision/*syncmap.Map-48                 1.24ns ± 2%    1.58ns ± 4%   +27.16%  (p=0.029 n=4+4)
Range/*syncmap_test.DeepCopyMap                      19.7µs ± 4%    19.8µs ± 3%      ~     (p=0.686 n=4+4)
Range/*syncmap_test.DeepCopyMap-48                    763ns ± 1%     864ns ± 3%   +13.24%  (p=0.029 n=4+4)
Range/*syncmap_test.RWMutexMap                       20.9µs ± 3%    20.4µs ± 4%      ~     (p=0.343 n=4+4)
Range/*syncmap_test.RWMutexMap-48                     764ns ± 1%     870ns ± 1%   +13.77%  (p=0.029 n=4+4)
Range/*syncmap.Map                                   20.4µs ± 5%    22.7µs ± 5%   +10.98%  (p=0.029 n=4+4)
Range/*syncmap.Map-48                                 776ns ± 3%     954ns ± 4%   +22.95%  (p=0.029 n=4+4)
AdversarialAlloc/*syncmap_test.DeepCopyMap            206ns ± 5%     199ns ± 2%      ~     (p=0.200 n=4+4)
AdversarialAlloc/*syncmap_test.DeepCopyMap-48        8.94µs ± 5%    9.21µs ± 4%      ~     (p=0.200 n=4+4)
AdversarialAlloc/*syncmap_test.RWMutexMap            63.4ns ± 4%    63.8ns ± 3%      ~     (p=0.686 n=4+4)
AdversarialAlloc/*syncmap_test.RWMutexMap-48          184ns ±10%     198ns ±11%      ~     (p=0.343 n=4+4)
AdversarialAlloc/*syncmap.Map                         213ns ± 3%     264ns ± 3%   +23.97%  (p=0.029 n=4+4)
AdversarialAlloc/*syncmap.Map-48                      556ns ± 5%    1389ns ± 8%  +149.93%  (p=0.029 n=4+4)
AdversarialDelete/*syncmap_test.DeepCopyMap           300ns ± 6%     304ns ± 7%      ~     (p=0.886 n=4+4)
AdversarialDelete/*syncmap_test.DeepCopyMap-48        647ns ± 3%     646ns ± 3%      ~     (p=0.943 n=4+4)
AdversarialDelete/*syncmap_test.RWMutexMap           69.1ns ± 1%    69.2ns ± 6%      ~     (p=0.686 n=4+4)
AdversarialDelete/*syncmap_test.RWMutexMap-48         289ns ±15%     300ns ±17%      ~     (p=0.829 n=4+4)
AdversarialDelete/*syncmap.Map                        198ns ± 5%     264ns ± 2%   +33.17%  (p=0.029 n=4+4)
AdversarialDelete/*syncmap.Map-48                     291ns ± 9%     173ns ± 8%   -40.50%  (p=0.029 n=4+4)

name                                               old alloc/op   new alloc/op   delta
LoadMostlyHits/*syncmap_test.DeepCopyMap              7.00B ± 0%     7.00B ± 0%      ~     (all equal)
LoadMostlyHits/*syncmap_test.DeepCopyMap-48           7.00B ± 0%     7.00B ± 0%      ~     (all equal)
LoadMostlyHits/*syncmap_test.RWMutexMap               7.00B ± 0%     7.00B ± 0%      ~     (all equal)
LoadMostlyHits/*syncmap_test.RWMutexMap-48            7.00B ± 0%     7.00B ± 0%      ~     (all equal)
LoadMostlyHits/*syncmap.Map                           7.00B ± 0%     7.00B ± 0%      ~     (all equal)
LoadMostlyHits/*syncmap.Map-48                        7.00B ± 0%     7.00B ± 0%      ~     (all equal)
LoadMostlyMisses/*syncmap_test.DeepCopyMap            7.00B ± 0%     7.00B ± 0%      ~     (all equal)
LoadMostlyMisses/*syncmap_test.DeepCopyMap-48         7.00B ± 0%     7.00B ± 0%      ~     (all equal)
LoadMostlyMisses/*syncmap_test.RWMutexMap             7.00B ± 0%     7.00B ± 0%      ~     (all equal)
LoadMostlyMisses/*syncmap_test.RWMutexMap-48          7.00B ± 0%     7.00B ± 0%      ~     (all equal)
LoadMostlyMisses/*syncmap.Map                         7.00B ± 0%     7.00B ± 0%      ~     (all equal)
LoadMostlyMisses/*syncmap.Map-48                      7.00B ± 0%     7.00B ± 0%      ~     (all equal)
LoadOrStoreBalanced/*syncmap_test.RWMutexMap          95.0B ± 0%     95.0B ± 0%      ~     (all equal)
LoadOrStoreBalanced/*syncmap_test.RWMutexMap-48       95.0B ± 0%     95.0B ± 0%      ~     (all equal)
LoadOrStoreBalanced/*syncmap.Map                      95.0B ± 0%     88.0B ± 0%    -7.37%  (p=0.029 n=4+4)
LoadOrStoreBalanced/*syncmap.Map-48                   95.0B ± 0%     88.0B ± 0%    -7.37%  (p=0.029 n=4+4)
LoadOrStoreUnique/*syncmap_test.RWMutexMap             175B ± 0%      175B ± 0%      ~     (all equal)
LoadOrStoreUnique/*syncmap_test.RWMutexMap-48          175B ± 0%      175B ± 0%      ~     (all equal)
LoadOrStoreUnique/*syncmap.Map                         175B ± 0%      161B ± 0%    -8.00%  (p=0.029 n=4+4)
LoadOrStoreUnique/*syncmap.Map-48                      175B ± 0%      161B ± 0%    -8.00%  (p=0.029 n=4+4)
LoadOrStoreCollision/*syncmap_test.DeepCopyMap        0.00B          0.00B           ~     (all equal)
LoadOrStoreCollision/*syncmap_test.DeepCopyMap-48     0.00B          0.00B           ~     (all equal)
LoadOrStoreCollision/*syncmap_test.RWMutexMap         0.00B          0.00B           ~     (all equal)
LoadOrStoreCollision/*syncmap_test.RWMutexMap-48      0.00B          0.00B           ~     (all equal)
LoadOrStoreCollision/*syncmap.Map                     0.00B          0.00B           ~     (all equal)
LoadOrStoreCollision/*syncmap.Map-48                  0.00B          0.00B           ~     (all equal)
Range/*syncmap_test.DeepCopyMap                       0.00B          0.00B           ~     (all equal)
Range/*syncmap_test.DeepCopyMap-48                    0.00B          0.00B           ~     (all equal)
Range/*syncmap_test.RWMutexMap                        0.00B          0.00B           ~     (all equal)
Range/*syncmap_test.RWMutexMap-48                     0.00B          0.00B           ~     (all equal)
Range/*syncmap.Map                                    0.00B          0.00B           ~     (all equal)
Range/*syncmap.Map-48                                 0.00B          0.00B           ~     (all equal)
AdversarialAlloc/*syncmap_test.DeepCopyMap            74.0B ± 0%     74.0B ± 0%      ~     (all equal)
AdversarialAlloc/*syncmap_test.DeepCopyMap-48        3.15kB ± 0%    3.15kB ± 0%      ~     (p=1.000 n=4+4)
AdversarialAlloc/*syncmap_test.RWMutexMap             8.00B ± 0%     8.00B ± 0%      ~     (all equal)
AdversarialAlloc/*syncmap_test.RWMutexMap-48          8.00B ± 0%     8.00B ± 0%      ~     (all equal)
AdversarialAlloc/*syncmap.Map                         74.0B ± 0%     55.0B ± 0%   -25.68%  (p=0.029 n=4+4)
AdversarialAlloc/*syncmap.Map-48                      8.00B ± 0%    56.25B ± 1%  +603.12%  (p=0.029 n=4+4)
AdversarialDelete/*syncmap_test.DeepCopyMap            155B ± 0%      155B ± 0%      ~     (all equal)
AdversarialDelete/*syncmap_test.DeepCopyMap-48         156B ± 0%      156B ± 0%      ~     (p=1.000 n=4+4)
AdversarialDelete/*syncmap_test.RWMutexMap            8.00B ± 0%     8.00B ± 0%      ~     (all equal)
AdversarialDelete/*syncmap_test.RWMutexMap-48         8.00B ± 0%     8.00B ± 0%      ~     (all equal)
AdversarialDelete/*syncmap.Map                        81.0B ± 0%     65.0B ± 0%   -19.75%  (p=0.029 n=4+4)
AdversarialDelete/*syncmap.Map-48                     23.0B ± 9%     15.2B ± 5%   -33.70%  (p=0.029 n=4+4)

name                                               old allocs/op  new allocs/op  delta
LoadMostlyHits/*syncmap_test.DeepCopyMap               0.00           0.00           ~     (all equal)
LoadMostlyHits/*syncmap_test.DeepCopyMap-48            0.00           0.00           ~     (all equal)
LoadMostlyHits/*syncmap_test.RWMutexMap                0.00           0.00           ~     (all equal)
LoadMostlyHits/*syncmap_test.RWMutexMap-48             0.00           0.00           ~     (all equal)
LoadMostlyHits/*syncmap.Map                            0.00           0.00           ~     (all equal)
LoadMostlyHits/*syncmap.Map-48                         0.00           0.00           ~     (all equal)
LoadMostlyMisses/*syncmap_test.DeepCopyMap             0.00           0.00           ~     (all equal)
LoadMostlyMisses/*syncmap_test.DeepCopyMap-48          0.00           0.00           ~     (all equal)
LoadMostlyMisses/*syncmap_test.RWMutexMap              0.00           0.00           ~     (all equal)
LoadMostlyMisses/*syncmap_test.RWMutexMap-48           0.00           0.00           ~     (all equal)
LoadMostlyMisses/*syncmap.Map                          0.00           0.00           ~     (all equal)
LoadMostlyMisses/*syncmap.Map-48                       0.00           0.00           ~     (all equal)
LoadOrStoreBalanced/*syncmap_test.RWMutexMap           2.00 ± 0%      2.00 ± 0%      ~     (all equal)
LoadOrStoreBalanced/*syncmap_test.RWMutexMap-48        2.00 ± 0%      2.00 ± 0%      ~     (all equal)
LoadOrStoreBalanced/*syncmap.Map                       2.00 ± 0%      3.00 ± 0%   +50.00%  (p=0.029 n=4+4)
LoadOrStoreBalanced/*syncmap.Map-48                    2.00 ± 0%      3.00 ± 0%   +50.00%  (p=0.029 n=4+4)
LoadOrStoreUnique/*syncmap_test.RWMutexMap             2.00 ± 0%      2.00 ± 0%      ~     (all equal)
LoadOrStoreUnique/*syncmap_test.RWMutexMap-48          2.00 ± 0%      2.00 ± 0%      ~     (all equal)
LoadOrStoreUnique/*syncmap.Map                         2.00 ± 0%      4.00 ± 0%  +100.00%  (p=0.029 n=4+4)
LoadOrStoreUnique/*syncmap.Map-48                      2.00 ± 0%      4.00 ± 0%  +100.00%  (p=0.029 n=4+4)
LoadOrStoreCollision/*syncmap_test.DeepCopyMap         0.00           0.00           ~     (all equal)
LoadOrStoreCollision/*syncmap_test.DeepCopyMap-48      0.00           0.00           ~     (all equal)
LoadOrStoreCollision/*syncmap_test.RWMutexMap          0.00           0.00           ~     (all equal)
LoadOrStoreCollision/*syncmap_test.RWMutexMap-48       0.00           0.00           ~     (all equal)
LoadOrStoreCollision/*syncmap.Map                      0.00           0.00           ~     (all equal)
LoadOrStoreCollision/*syncmap.Map-48                   0.00           0.00           ~     (all equal)
Range/*syncmap_test.DeepCopyMap                        0.00           0.00           ~     (all equal)
Range/*syncmap_test.DeepCopyMap-48                     0.00           0.00           ~     (all equal)
Range/*syncmap_test.RWMutexMap                         0.00           0.00           ~     (all equal)
Range/*syncmap_test.RWMutexMap-48                      0.00           0.00           ~     (all equal)
Range/*syncmap.Map                                     0.00           0.00           ~     (all equal)
Range/*syncmap.Map-48                                  0.00           0.00           ~     (all equal)
AdversarialAlloc/*syncmap_test.DeepCopyMap             1.00 ± 0%      1.00 ± 0%      ~     (all equal)
AdversarialAlloc/*syncmap_test.DeepCopyMap-48          1.00 ± 0%      1.00 ± 0%      ~     (all equal)
AdversarialAlloc/*syncmap_test.RWMutexMap              1.00 ± 0%      1.00 ± 0%      ~     (all equal)
AdversarialAlloc/*syncmap_test.RWMutexMap-48           1.00 ± 0%      1.00 ± 0%      ~     (all equal)
AdversarialAlloc/*syncmap.Map                          1.00 ± 0%      1.00 ± 0%      ~     (all equal)
AdversarialAlloc/*syncmap.Map-48                       1.00 ± 0%      1.00 ± 0%      ~     (all equal)
AdversarialDelete/*syncmap_test.DeepCopyMap            1.00 ± 0%      1.00 ± 0%      ~     (all equal)
AdversarialDelete/*syncmap_test.DeepCopyMap-48         1.00 ± 0%      1.00 ± 0%      ~     (all equal)
AdversarialDelete/*syncmap_test.RWMutexMap             1.00 ± 0%      1.00 ± 0%      ~     (all equal)
AdversarialDelete/*syncmap_test.RWMutexMap-48          1.00 ± 0%      1.00 ± 0%      ~     (all equal)
AdversarialDelete/*syncmap.Map                         1.00 ± 0%      1.00 ± 0%      ~     (all equal)
AdversarialDelete/*syncmap.Map-48                      1.00 ± 0%      1.00 ± 0%      ~     (all equal)

Change-Id: I93c9458505d84238cc8e9016a7dfd285868af236
Reviewed-on: https://go-review.googlesource.com/37342
Reviewed-by: Russ Cox <rsc@golang.org>
@rsc (Contributor) commented Apr 10, 2017

I'd like to move this along, assuming that we're not going to introduce a new API for writing benchmarks (so no equivalent of pb.Next) and that we can't emit extra output lines by default (preserving the current one benchmark = one output line). And when we do emit extra output lines, all the lines need to be about the same thing (no warmups to filter out from averaging calculations in benchstat or other tools).

That implies that we need to add some flag that causes a benchmark to be run multiple times for smaller counts, with one line of output for each of those runs.

Assuming the default -benchtime=1s, if instead of printing output for one run of 1s we want to print output for 10 runs of 0.1s, this can be specified in two possible ways: by setting the target run duration with the count implicit (-benchtime2=0.1s), or by setting the target number of runs with the duration implicit (-benchsplit=10). We only want to add one of these to package testing.

I don't have a great name for the target run duration (obviously), but I think benchsplit could work if we decide to specify the number of runs instead.

The split factor is subject to change to respect the overall time limit. If, say, one iteration of a benchmark takes 0.6 seconds, then with -benchtime=1s -benchsplit=10, we'll still only run ~2 of them, not 10. This makes it sensible to run go test -benchsplit=10 in the go1 directory, as compared to go test -benchtime=0.1s -count=10. The latter runs even very expensive benchmarks 10 times, while the former only runs them once.

So my proposal for this proposal is to add -benchsplit=N as described above.

I don't know if this really helps @bcmills's original request. If not I apologize for hijacking. We've been talking about getting this data out for years, and it does provide more fine-grained detail about the distribution of runs, just not individual runs. If there are rare high-latency individual runs then hopefully you'd be able to split down to a small enough interval that some are lucky and some are not, and then that would come out in the data as a bimodal distribution.

@bcmills (Contributor, Author) commented Apr 10, 2017

I don't think that really addresses my original request, unfortunately: for the kinds of latency distributions I'm interested in, one output line per run would be far more data than I want to sift through.

But it does at least seem like a step in the right direction.

@josharian (Contributor)

Agreed: this seems like a step in the right direction, and I would definitely use it. The interaction between slow benchmarks and -count has always been painful.

To be clear on the (non-)interaction between -benchsplit and -count, if you did -benchtime=1s -benchsplit=10 -count=5, you'd get anywhere between 5 and 50 lines of output?

@rsc (Contributor) commented Apr 10, 2017

@bcmills, sifting through the output is for computers.

@josharian, yes.

// Sketch, with runAndReport hypothetical: for each -count repetition,
// run and report chunks of benchtime/benchsplit until the budget is spent.
for i := 0; i < count; i++ {
    total := time.Duration(0)
    for total < benchtime {
        t := runAndReport(benchtime / time.Duration(benchsplit))
        total += t
    }
}

@rsc (Contributor) commented Apr 17, 2017

Objections to adding -benchsplit?

/cc @aclements @mpvl

@rsc (Contributor) commented Jun 12, 2017

Sounds like there are no objections to -benchsplit. Anyone want to implement it?

rsc changed the title from "proposal: testing: measure benchmark latency distributions" to "testing: add -benchsplit to get more data points" Jun 12, 2017
rsc modified the milestones: Go1.10, Proposal Jun 12, 2017
@meirf (Contributor) commented Jun 29, 2017

(I've started working on this and plan on having a CL by early next week. Hopefully my recent testing contributions grant me such a runway.)

@gopherbot

CL https://golang.org/cl/47411 mentions this issue.

@gopherbot

Change https://golang.org/cl/80841 mentions this issue: testing: remove claim that b.Run is safe for concurrent use

@cespare (Contributor) commented May 14, 2018

Has progress on this stalled?

ianlancetaylor modified the milestones: Go1.11, Unplanned Jun 29, 2018
ianlancetaylor added the help wanted, NeedsFix, and FeatureRequest labels Jun 29, 2018
@cespare (Contributor) commented Apr 25, 2019

Answering my own question from a year ago: yes, it seems like CL 47411 has stalled out, but the proposal is approved and the details seem to have been mostly ironed out in the extensive discussion on that CL. I think anyone could pick up these changes and take them across the finish line.
