testing: ease writing parallel benchmarks #7090

Closed
bradfitz opened this issue Jan 9, 2014 · 6 comments

@bradfitz (Contributor) commented Jan 9, 2014

Writing contention benchmarks involves some boilerplate:

https://golang.org/cl/46010043/diff/60001/src/pkg/sync/pool_test.go
https://golang.org/cl/49910043/

etc

The general form is:

        const CallsPerSched = 1000
        procs := runtime.GOMAXPROCS(-1)
        N := int32(b.N / CallsPerSched)
        c := make(chan bool, procs)
        for p := 0; p < procs; p++ {
                go func() {
                        var buf bytes.Buffer
                        for atomic.AddInt32(&N, -1) >= 0 {
                                for g := 0; g < CallsPerSched; g++ {
                                        f(&buf)
                                }
                        }
                        c <- true
                }()
        }
        for p := 0; p < procs; p++ {
                <-c
        }


But sometimes:

        n0 := uintptr(b.N)
        for atomic.AddUintptr(&n, ^uintptr(0)) < n0 {

The testing package seems to cap b.N at 2*1e9, but that's not publicly documented as a
guarantee.

Can/should we say that b.N will always fit in an int32, even if it's of type int?

I once even defensively wrote,

func BenchmarkPool(b *testing.B) {
        procs := runtime.GOMAXPROCS(-1)
        var dec func() bool
        if unsafe.Sizeof(b.N) == 8 {
                n := int64(b.N)
                dec = func() bool {
                        return atomic.AddInt64(&n, -1) >= 0
                }
        } else {
                n := int32(b.N)
                dec = func() bool {
                        return atomic.AddInt32(&n, -1) >= 0
                }
        }
        var p Pool
        var wg WaitGroup
        for i := 0; i < procs; i++ {
                wg.Add(1)
                go func() {
                        defer wg.Done()
                        for dec() {
                                p.Put(1)
                                p.Get()
                        }
                }()
        }
        wg.Wait()
}

... but felt gross about it.

We should either document this, or provide a means in the testing package to ease
writing benchmarks for contention.
@dvyukov (Member) commented Jan 10, 2014

Comment 1:

Yes, it would be handy. Lots of benchmarks do this. And even more do not, but should.
In the dashboard benchmarks I use the following helper function:
// Parallel is a public helper function that runs f N times in P*GOMAXPROCS goroutines.
func Parallel(N uint64, P int, f func()) {
        numProcs := P * runtime.GOMAXPROCS(0)
        var wg sync.WaitGroup
        wg.Add(numProcs)
        for p := 0; p < numProcs; p++ {
                go func() {
                        defer wg.Done()
                        for int64(atomic.AddUint64(&N, ^uint64(0))) >= 0 {
                                f()
                        }
                }()
        }
        wg.Wait()
}
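A minimal usage sketch for this helper inside a benchmark (op here is only a stand-in for the operation under test):
func BenchmarkOp(b *testing.B) {
        // Run the operation roughly b.N times total, spread across
        // 1*GOMAXPROCS worker goroutines.
        Parallel(uint64(b.N), 1, func() {
                op() // hypothetical operation under test
        })
}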
One aspect to consider is that it generally also needs to know the "grain size", because
synchronizing on each iteration can outweigh the thing under test. If it's incorporated
into the testing package, then we could probably remember ns/op from previous runs and
easily calculate the grain size from that.
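If grain size were added, a rough sketch of how the helper above could batch iterations might look like this (the ParallelGrained name and the grain parameter are purely illustrative; same runtime, sync, and sync/atomic imports as above):
// ParallelGrained runs f approximately N times in P*GOMAXPROCS goroutines,
// claiming `grain` iterations per atomic operation so the synchronization cost
// is amortized over the batch. Any final partial batch is dropped, the same
// approximation the CallsPerSched pattern above makes.
func ParallelGrained(N uint64, P int, grain uint64, f func()) {
        numProcs := P * runtime.GOMAXPROCS(0)
        var wg sync.WaitGroup
        wg.Add(numProcs)
        for p := 0; p < numProcs; p++ {
                go func() {
                        defer wg.Done()
                        for {
                                // Subtract `grain` in one atomic op (adding ^(grain-1)
                                // is the documented way to subtract with AddUint64).
                                if int64(atomic.AddUint64(&N, ^(grain-1))) < 0 {
                                        return
                                }
                                for i := uint64(0); i < grain; i++ {
                                        f()
                                }
                        }
                }()
        }
        wg.Wait()
}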

Labels changed: added repo-main, release-go1.3maybe.

@dvyukov (Member) commented Jan 26, 2014

Comment 2:

Out of the 27 parallel benchmarks in the std lib, 16 fit well into the simple form:
b.RunParallel(func() {
  ...
})
but 11 use local per-goroutine state, so they do not fit as-is into this simple pattern.
I see 2 options for per-goroutine state:
1.
b.RunParallel(func(x *interface{}) {
  ...
})
then the function can cache anything it wants in x. The overhead is merely an interface
cast.
2. Benchmarks can use sync.Pool to cache local state.
Pool.Get/Put overhead is 20-50 ns depending on the processor,
and this variant will most likely create more resources than there are goroutines (see the sketch below).
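For concreteness, a rough sketch of option 2 under the simple b.RunParallel(func()) form above (that signature is only the proposal being discussed here, not a shipped API; bufPool and the bytes.Buffer body are illustrative):
var bufPool = sync.Pool{New: func() interface{} { return new(bytes.Buffer) }}

func BenchmarkWithPooledState(b *testing.B) {
        b.RunParallel(func() {
                // Borrow state from the pool instead of holding true
                // per-goroutine state; each Get/Put pair pays the 20-50 ns
                // mentioned above.
                buf := bufPool.Get().(*bytes.Buffer)
                buf.Reset()
                buf.WriteString("hello") // placeholder for the code under test
                bufPool.Put(buf)
        })
}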
---
A separate question is whether we want to support goroutine excess, i.e. create
K*GOMAXPROCS goroutines.
The interface can be:
b.RunParallel(4, func() {
  ...
})
this will create 4*GOMAXPROCS goroutines.
This may be useful to benchmark something that includes IO operations, or has contention
(so that some goroutines are temporarily non-runnable).
But I am concerned that users may misinterpret this parameter.
Brad?

@bradfitz (Contributor, Author)

Comment 4:

Or even:
    b.RunParallel(f func() (loopFn func()))
f is called once per goroutine and returns a func to be called in a loop.
Then the per-goroutine state is simply created by f and closed over in loopFn.
That might be too complicated for the majority of cases, though.  We could provide a
simple method and a more complex method that gives you the K parameter too.
I don't have strong opinions here, other than wanting to make this easy to write and
cleaning up the boilerplate in these 27+ (and growing) places.
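A rough sketch of that shape, reusing the bytes.Buffer state and f from the boilerplate at the top of the issue (again a proposed signature, not a shipped one):
b.RunParallel(func() func() {
        // Called once per goroutine: per-goroutine state is set up here.
        var buf bytes.Buffer
        // The returned closure is the loop body and closes over buf.
        return func() {
                f(&buf)
        }
})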

@dvyukov (Member) commented Jan 27, 2014

Comment 5:

Here is what I have now:
https://golang.org/cl/57270043
Your "func() func()" idea works nicely, and it seems to be enough to express all common
cases.
Although this "b.RunParallel2(1, func() func() {" form looks somewhat clumsy for the std lib.
And, yes, we need a better name for RunParallel2.
Any suggestions?

Owner changed to @dvyukov.

Status changed to Started.

@bradfitz (Contributor, Author)

Comment 6:

Replied on codereview.

@dvyukov (Member) commented Feb 17, 2014

Comment 7:

This issue was closed by revision c3922f0.

Status changed to Fixed.
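For reference, the API that landed with that revision (testing.B.RunParallel and testing.PB, in Go 1.3) has roughly this shape; the loop body below is only a placeholder:
func BenchmarkFoo(b *testing.B) {
        // Optional: run K*GOMAXPROCS goroutines, which answers the
        // "goroutine excess" question from Comment 2.
        // b.SetParallelism(4)
        b.RunParallel(func(pb *testing.PB) {
                var buf bytes.Buffer // per-goroutine state, as in the sketches above
                for pb.Next() {
                        buf.Reset()
                        buf.WriteString("hello") // placeholder for the code under test
                }
        })
}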

@rsc rsc added this to the Go1.3 milestone Apr 14, 2015
@golang golang locked and limited conversation to collaborators Jun 25, 2016