Description: testing: ease writing parallel benchmarks
Add b.RunParallel function that captures parallel benchmark boilerplate:
creates worker goroutines, joins worker goroutines, distributes work
among them in an efficient way, auto-tunes grain size.
Fixes issue 7090.
Total messages: 61
https://codereview.appspot.com/57270043/diff/40001/src/pkg/testing/benchmark.go
File src/pkg/testing/benchmark.go (right):

src/pkg/testing/benchmark.go:355: func (b *B) RunParallel2(P int, ctor func() func()) {
I think just one function is fine. I'd rename this one "RunParallel" and rename "P" to "perProc" and document that it's normally just 1. Then I'd name both funcs: (perProc, int, setup func() (loop func()))
Hello bradfitz@golang.org (cc: golang-codereviews@googlegroups.com, r@golang.org), I'd like you to review this change to https://dvyukov%40google.com@code.google.com/p/go/
On 2014/01/30 09:29:58, bradfitz wrote:
> src/pkg/testing/benchmark.go:355: func (b *B) RunParallel2(P int, ctor func() func()) {
> I think just one function is fine. I'd rename this one "RunParallel" and rename
> "P" to "perProc" and document that it's normally just 1. Then I'd name both
> funcs: (perProc, int, setup func() (loop func()))

Done.
LGTM but wait for an API review from r or iant

https://codereview.appspot.com/57270043/diff/100001/src/pkg/testing/benchmark.go
File src/pkg/testing/benchmark.go (right):

src/pkg/testing/benchmark.go:354: // A typical value for perProc for CPU bound benchmarks is 1.
CPU-bound (adjective hyphen)

src/pkg/testing/benchmark.go:355: // Setup function is executed once in each worker goroutine
// The setup function

src/pkg/testing/benchmark.go:357: // Setup function may be used to create goroutine-local objects,
// The setup function

src/pkg/testing/benchmark.go:384: n := atomic.AddUint64(&N, grain)
will this suck on 32-bit, notably ARM? I think B.N always fits in uint32, no?
https://codereview.appspot.com/57270043/diff/100001/src/pkg/testing/benchmark.go
File src/pkg/testing/benchmark.go (right):

src/pkg/testing/benchmark.go:354: // A typical value for perProc for CPU bound benchmarks is 1.
On 2014/01/30 10:45:41, bradfitz wrote:
> CPU-bound (adjective hyphen)

Done.

src/pkg/testing/benchmark.go:355: // Setup function is executed once in each worker goroutine
On 2014/01/30 10:45:41, bradfitz wrote:
> // The setup function

Done.

src/pkg/testing/benchmark.go:357: // Setup function may be used to create goroutine-local objects,
On 2014/01/30 10:45:41, bradfitz wrote:
> // The setup function

Done.

src/pkg/testing/benchmark.go:384: n := atomic.AddUint64(&N, grain)
On 2014/01/30 10:45:41, bradfitz wrote:
> will this suck on 32-bit, notably ARM?
> I think B.N always fits in uint32, no?

If grain is 100us, and AddUint32 takes 300ns, that's still ~0.3% of total time. I would use atomic.AddInt, but Russ thinks it's useless.
And Go team must arrange for a persistent guest reviewer in Europe :)
I think we need to somehow mention that it's intended to be used with "go test -cpu=1,2,4". I see that people write:

func BenchmarkFoo(b *testing.B) {
	runtime.GOMAXPROCS(4)
	b.RunParallel(...)
}

Any suggestions for wording?
Is AddUint32 faster than AddUint64?
It's probably a bit faster on 386: XADD vs CMPXCHG8B loop. It would matter more in a highly contended case. ARMv7+ code looks mostly the same: LDREX/STREX vs LDREXD/STREXD loop.
https://codereview.appspot.com/57270043/diff/140001/src/pkg/testing/benchmark.go
File src/pkg/testing/benchmark.go (right):

src/pkg/testing/benchmark.go:352: // It creates perProc*GOMAXPROCS worker goroutines
I guess the general idea is to run some number of instances of the function in parallel, to see how long they take. I'm not sure why the API should express that in terms of GOMAXPROCS. Why not have the caller pass in the number of goroutines that should be started simultaneously, and let them worry about correlating that with GOMAXPROCS? In particular I can imagine getting rid of GOMAXPROCS some day, at which point it would be strange to have it be part of the API here.

src/pkg/testing/benchmark.go:356: // and must return the loop function which will be executed b.N times.
This isn't how non-parallel benchmarks behave. Those benchmarks handle b.N themselves. Why shouldn't parallel benchmarks behave the same? Also, the function should take a *testing.B anyhow, so that it can call Fail, Log, etc.

src/pkg/testing/benchmark.go:361: var wg sync.WaitGroup
May as well move wg down to just before the loop.

File src/pkg/testing/benchmark_test.go (right):

src/pkg/testing/benchmark_test.go:84: // Parallel benchmark for fmt.Fprintf into a bytes.Buffer.
This seems like a strange example, because I can't see why anybody would care about how fmt.Fprintf behaves when run in parallel.
https://codereview.appspot.com/57270043/diff/140001/src/pkg/testing/benchmark.go
File src/pkg/testing/benchmark.go (right):

src/pkg/testing/benchmark.go:352: // It creates perProc*GOMAXPROCS worker goroutines
On 2014/01/31 00:40:50, iant wrote:
> I guess the general idea is to run some number of instances of the function in
> parallel, to see how long they take. I'm not sure why the API should express
> that in terms of GOMAXPROCS. Why not have the caller pass in the number of
> goroutines that should be started simultaneously, and let them worry about
> correlating that with GOMAXPROCS? In particular I can imagine getting rid of
> GOMAXPROCS some day, at which point it would be strange to have it be part of
> the API here.

We have a nice, handy and official way to control GOMAXPROCS in benchmarks and tests, which is:

$ go test -cpu=1,2,4,8

In all instances of parallel benchmarks we start k*GOMAXPROCS goroutines. This very naturally leads to this flexible API (as compared to hardcoding GOMAXPROCS in benchmarks). In particular, one thing that I want to prevent is:

func BenchmarkFoo(b *testing.B) {
	runtime.GOMAXPROCS(4)
	b.RunParallel(4, ...)
}

src/pkg/testing/benchmark.go:356: // and must return the loop function which will be executed b.N times.
On 2014/01/31 00:40:50, iant wrote:
> This isn't how non-parallel benchmarks behave. Those benchmarks handle b.N
> themselves. Why shouldn't parallel benchmarks behave the same? Also, the
> function should take a *testing.B anyhow, so that it can call Fail, Log, etc.

I do not understand the comment about b.N. The automatic b.N tuning logic has already happened before BenchmarkFoo is called. RunParallel takes full advantage of it; the user does not need to do anything special with b.N. In fact, with RunParallel you care even less about b.N: you just provide the loop function.

b is easily captured by both setup and loop functions. Take a look at the example usages of RunParallel in chan_test.go.

src/pkg/testing/benchmark.go:361: var wg sync.WaitGroup
On 2014/01/31 00:40:50, iant wrote:
> May as well move wg down to just before the loop.

Done.

File src/pkg/testing/benchmark_test.go (right):

src/pkg/testing/benchmark_test.go:84: // Parallel benchmark for fmt.Fprintf into a bytes.Buffer.
On 2014/01/31 00:40:50, iant wrote:
> This seems like a strange example, because I can't see why anybody would care
> about how fmt.Fprintf behaves when run in parallel.

We do care: https://codereview.appspot.com/50140045/

I believe that all of our benchmarks must be parallel:
1. If you run without -cpu=1,2,4, then you get the same as for a non-parallel benchmark.
2. It can highlight any implicit or explicit synchronization in the component.
3. If there is none, it can still highlight bottlenecks in the runtime (e.g. malloc and GC always have some synchronization).
4. If it just scales linearly, that is still a good result.

Having said that, I am open to suggestions about a better example. I think we don't want to write a new component just for the example, so it must be something from the std lib.
https://codereview.appspot.com/57270043/diff/140001/src/pkg/testing/benchmark.go
File src/pkg/testing/benchmark.go (right):

src/pkg/testing/benchmark.go:352: // It creates perProc*GOMAXPROCS worker goroutines
On 2014/01/31 09:26:21, dvyukov wrote:
> We have a nice, handy and official way to control GOMAXPROCS in benchmarks and
> tests, which is:
> $ go test -cpu=1,2,4,8
> In all instances of parallel benchmarks we start k*GOMAXPROCS goroutines.
> This very naturally leads to this flexible API (as compared to hardcoding
> GOMAXPROCS in benchmarks).
>
> In particular one thing that I want to prevent is:
> func BenchmarkFoo(b *testing.B) {
>     runtime.GOMAXPROCS(4)
>     b.RunParallel(4, ...)
> }

OK, makes sense.

RunParallel helps write parallel benchmarks. It creates perProc worker goroutines on each operating system thread. The number of threads to use comes from GOMAXPROCS; it's most useful to set GOMAXPROCS with the go test -cpu option. For a CPU-bound benchmark, perProc would typically be set to 1.

The setup function is called once in each worker goroutine. It must return the loop function, which will be executed b.N times. The loop function should not use b.N itself. The setup function may be used to create goroutine-local variables that the loop function may refer to.

File src/pkg/testing/benchmark_test.go (right):

src/pkg/testing/benchmark_test.go:84: // Parallel benchmark for fmt.Fprintf into a bytes.Buffer.
On 2014/01/31 09:26:21, dvyukov wrote:
> We do care:
> https://codereview.appspot.com/50140045/

That is not a great example, as it is really a test of sync.Pool. I think this example here would make more sense to readers if you test sync.Pool here.

> I believe that all of our benchmarks must be parallel:

If you really believe that then I think we should consider a different approach here. This API is simple but looks somewhat cumbersome to use.

For example, perhaps we could say that go test will look for any function

ParallelBenchmark(*testing.PB, int)

and execute -test.cpu instances in parallel, passing the number of loop iterations as the int parameter. If you like this suggestion I recommend taking it to golang-dev to see what others think.
On Fri, Jan 31, 2014 at 6:45 PM, <iant@golang.org> wrote:
> That is not a great example, as it is really a test of sync.Pool. I
> think this example here would make more sense to readers if you test
> sync.Pool here.

Good idea!

> If you really believe that then I think we should consider a different
> approach here. This API is simple but looks somewhat cumbersome to use.
>
> For example, perhaps we could say that go test will look for any
> function
>
> ParallelBenchmark(*testing.PB, int)
>
> and execute -test.cpu instances in parallel, passing the number of loop
> iterations as the int parameter. If you like this suggestion I
> recommend taking it to golang-dev to see what others think.

The problem is that half of the parallel benchmarks in std lib need goroutine-local state, so they do not fit into your simple pattern.

The general structure of parallel benchmarks is:

func BenchmarkFoo(...) {
	// create some global state
	// create some goroutine-local state
	// do one iteration using global state and goroutine-local state
}

However, I like the idea of providing more fundamental (and easier to use) support for parallel benchmarks. I will try to figure out what it could look like.
PTAL

https://codereview.appspot.com/57270043/diff/140001/src/pkg/testing/benchmark.go
File src/pkg/testing/benchmark.go (right):

src/pkg/testing/benchmark.go:352: // It creates perProc*GOMAXPROCS worker goroutines
On 2014/01/31 14:45:01, iant wrote:
> OK, makes sense.
>
> RunParallel helps write parallel benchmarks. It creates perProc worker
> goroutines on each operating system thread. The number of threads to use comes
> from GOMAXPROCS; it's most useful to set GOMAXPROCS with the go test -cpu
> option. For a CPU-bound benchmark, perProc would typically be set to 1.
>
> The setup function is called once in each worker goroutine. It must return the
> loop function, which will be executed b.N times. The loop function should not
> use b.N itself. The setup function may be used to create goroutine-local
> variables that the loop function may refer to.

Done.

File src/pkg/testing/benchmark_test.go (right):

src/pkg/testing/benchmark_test.go:84: // Parallel benchmark for fmt.Fprintf into a bytes.Buffer.
On 2014/01/31 14:45:01, iant wrote:
> That is not a great example, as it is really a test of sync.Pool. I think this
> example here would make more sense to readers if you test sync.Pool here.

I really want to show goroutine-local state, but I don't know how to reasonably squeeze it into a sync.Pool benchmark. So I've changed it to a parallel benchmark of text/template.Template.Execute on a single object. It makes the global/shared state clearly visible.
On 2014/01/31 15:00:30, dvyukov wrote: > >> I believe that all of our benchmarks must be parallel: > > > > > > If you really believe that then I think we should consider a different > > approach here. This API is simple but looks somewhat cumbersome to use. > > > > For example, perhaps we could say that go test will look for any > > function > > > > ParallelBenchmark(*testing.PB, int) > > > > and execute -test.cpu instances in parallel, passing the number of loop > > iterations as the int parameter. If you like this suggestion I > > recommend taking it to golang-dev to see what others think. > > The problem is that half of the parallel benchmarks in std lib need > goroutine-local state, so they do not fit into your simple pattern. > > The general structure of parallel benchmarks is: > > func BenchmarkFoo(...) { > // create some global state > // create some goroutine-local state > // do one iteration using global state and goroutine-local state > } > > However, I like the idea of providing more fundamental (and easier to > use) support for parallel benchmark. I will try to figure out how it > can look like. I've failed to figure out how ParallelBenchmark can support global/local state. Except for: func ParallelBenchmark(b *testing.B) func() func() { templ := template.Must(template.New("test").Parse("Hello, {{.}}!")) return func() func() { var buf bytes.Buffer return func() { buf.Reset() templ.Execute(&buf, "World") } } } but it looks even more awkward and does not avoid this "func() func()".
On 2014/02/02 17:34:50, dvyukov wrote:
> I've failed to figure out how ParallelBenchmark can support global/local state.
> [...]
> but it looks even more awkward and does not avoid this "func() func()".

At the cost of verbosity, we could avoid the func() func() funkiness by introducing an interface, something like:

// BWorker is a worker in a parallel benchmark.
type BWorker interface {
	// NewWorker creates a new BWorker from the receiver.
	// NewWorker can be used to propagate global state to new
	// workers and to set up local state per worker.
	NewWorker() BWorker
	// RunOnce is the unit of work that is being benchmarked.
	// RunOnce will be called multiple times as part of a benchmark loop.
	RunOnce()
}

Using this in a parallel benchmark would look something like:

type parallelExecute struct {
	tpl *template.Template
	buf *bytes.Buffer
}

func (e *parallelExecute) NewWorker() BWorker {
	// Pass the template on to the new worker.
	// There is no local state to set up in this example.
	return &parallelExecute{tpl: e.tpl}
}

func (e *parallelExecute) RunOnce() {
	e.buf.Reset()
	e.tpl.Execute(e.buf, "World")
}

func BenchmarkExecute(b *testing.B) {
	tpl := template.Must(template.New("test").Parse("Hello, {{.}}!"))
	e := &parallelExecute{tpl: tpl}
	b.RunParallel(1, e)
}

Having written all that out, I'm not sure that it is actually any better than func() func(). Still, I figure it is good to have an alternative on the table.

Unrelatedly, RunParallel seems useful, non-trivial, and general purpose. (I plan to steal it for my own non-benchmarking purposes.) If you come up with a really clean way to expose it, maybe it should live somewhere other than testing.

-josh
Similarly, it would be great to have an auto-load-balanced parallel for loop (similar to that in https://groups.google.com/forum/#!searchin/golang-nuts/parallel$20for$20atomi...). I can open up a thread/issue if this is something that may have merit.
On 2014/02/04 22:17:02, josharian wrote:
> Having written all that out, I'm not sure that it is actually any better than
> func() func(). Still, I figure it is good to have an alternative on the table.

I agree that it does not look better. And I agree that we need to try to find something better. How about this?

func BenchmarkFoo(b *testing.B) {
	b.RunParallel(1, func() {
		var buf bytes.Buffer
		b.Iter = func() {
			buf.Reset()
			fmt.Fprintf(&buf, "foo")
		}
	})
}

Still, just returning the function looks like the most natural way to... return the function.

> Unrelatedly, RunParallel seems useful, non-trivial, and general purpose. (I plan
> to steal it for my own non-benchmarking purposes.) If you come up with a really
> clean way to expose it, maybe it should live somewhere other than testing.

Currently the std lib does not provide any parallel algorithms, so just putting this single function somewhere does not sound like a good plan. A more coherent plan is to design a parallel package that provides parallel.For and parallel.Sort. However, as always, it can live on github.
On 2014/02/05 06:01:54, dvyukov wrote:
> How about this?
>
>     func BenchmarkFoo(b *testing.B) {
>         b.RunParallel(1, func() {
>             var buf bytes.Buffer
>             b.Iter = func() {
>                 buf.Reset()
>                 fmt.Fprintf(&buf, "foo")
>             }
>         })
>     }
>
> Still, just returning the function looks like the most natural way to...
> return the function.

Yep. And to any new reader of such code, at first glance it will look like
there might be a race on b.Iter. I like the func func better.

Here's yet another try. Instead of trying to make this a one-liner, we could
add a helper Parallel field to testing.B. Then, just as you write an explicit
for loop up to b.N in regular benchmarks, you would follow something like this
template to write a parallel benchmark:

    func BenchmarkParallelTemplate(b *testing.B) {
        b.Parallel.Init()
        // do global state initialization
        for p := 0; p < b.Parallel.NProcs(1); p++ {
            go func() {
                defer b.Parallel.Done()
                // do local state initialization
                for {
                    n := b.Parallel.N()
                    if n == 0 {
                        break
                    }
                    for i := 0; i < n; i++ {
                        // do work
                    }
                }
            }()
        }
        b.Parallel.Wait()
    }

Yes, it's a lot of boilerplate code, but the concurrency structure is exposed
nicely, and it avoids returning functions that return functions.

> Currently the std lib does not provide any parallel algorithms, so just
> putting this single function somewhere does not sound like a good plan. A
> more coherent plan is to design a parallel package that provides parallel.For
> and parallel.Sort. However, as always, it can live on github.

All true. Nudge nudge wink wink. :)
On 2014/02/05 17:35:48, josharian wrote:
> Here's yet another try. Instead of trying to make this a one-liner, we could
> add a helper Parallel field to testing.B. [...]
>
> Yes, it's a lot of boilerplate code, but the concurrency structure is exposed
> nicely, and it avoids returning functions that return functions.

You will need to always look up what functions you need to call, in what
order, where, and what to do with the return values.
On 2014/02/05 18:26:31, dvyukov wrote:
> You will need to always lookup what functions you need to call in what order,
> where and what to do with return values.

Yeah. Well, I'm out of ideas for the moment...
On Fri, Jan 31, 2014 at 7:00 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Fri, Jan 31, 2014 at 6:45 PM, <iant@golang.org> wrote:
>> If you really believe that then I think we should consider a different
>> approach here. This API is simple but looks somewhat cumbersome to use.
>>
>> For example, perhaps we could say that go test will look for any
>> function
>>
>>     ParallelBenchmark(*testing.PB, int)
>>
>> and execute -test.cpu instances in parallel, passing the number of loop
>> iterations as the int parameter. If you like this suggestion I
>> recommend taking it to golang-dev to see what others think.
>
> The problem is that half of the parallel benchmarks in std lib need
> goroutine-local state, so they do not fit into your simple pattern.
>
> The general structure of parallel benchmarks is:
>
>     func BenchmarkFoo(...) {
>         // create some global state
>         // create some goroutine-local state
>         // do one iteration using global state and goroutine-local state
>     }

I guess the part I haven't managed to grasp is why it can only do one
iteration. I was suggesting that the number of iterations be passed into the
function. Then the function can set up state and run that many iterations. If
setting up state is expensive, it can use StopTimer and StartTimer. I
understand that the state needs to live across one set of loops, but why does
it need to live across multiple sets of loops?

Ian
On 2014/02/05 21:30:47, iant wrote:
> I guess the part I haven't managed to grasp is why it can only do one
> iteration. I was suggesting that the number of iterations be passed
> into the function. Then the function can set up state, and run that
> many iterations. If setting up state is expensive, it can use
> StopTimer and StartTimer. I understand that the state needs to live
> across one set of loops, but why does it need to live across multiple
> sets of loops?

Do you mean:

    b.RunParallel(func(n int) {
        var buf bytes.Buffer
        for i := 0; i < n; i++ {
            buf.Reset()
            fmt.Fprintf(&buf, "abc")
        }
    })

? Currently I limit one loop to 100us; this is required for dynamic load
balancing. So one loop can be just a dozen iterations, and there will be
100'000 such loops. So, yes, setting up state can be expensive.

StopTimer/StartTimer won't work in a parallel setting, and I think we cannot
make it work. But I agree that it looks cleaner. Probably if I bump the grain
size to 1ms, it will be acceptable. Btw, in either case we need to warn the
user not to call StopTimer in RunParallel.
Just for completeness, a few more alternatives:

    b.RunParallel(func(x *interface{}) {
        buf := (*x).(*bytes.Buffer)
        if buf == nil {
            buf = new(bytes.Buffer)
            *x = buf
        }
        buf.Reset()
        fmt.Fprintf(buf, "abc")
    })

    b.RunParallel(func(x interface{}) interface{} {
        buf := x.(*bytes.Buffer)
        if buf == nil {
            buf = new(bytes.Buffer)
        }
        buf.Reset()
        fmt.Fprintf(buf, "abc")
        return buf
    })

    // The tricky aspect here is that it's too easy to cause
    // false sharing on the array.
    bufs := make([]*bytes.Buffer, runtime.GOMAXPROCS(0))
    b.RunParallel(func(i int) {
        if bufs[i] == nil {
            bufs[i] = new(bytes.Buffer)
        }
        bufs[i].Reset()
        fmt.Fprintf(bufs[i], "abc")
    })

And the existing ones:

    b.RunParallel(func() func() {
        var buf bytes.Buffer
        return func() {
            buf.Reset()
            fmt.Fprintf(&buf, "abc")
        }
    })

    b.RunParallel(func(n int) {
        var buf bytes.Buffer
        for i := 0; i < n; i++ {
            buf.Reset()
            fmt.Fprintf(&buf, "abc")
        }
    })

    type parallelExecute struct {
        tpl *template.Template
        buf *bytes.Buffer
    }

    func (e *parallelExecute) NewWorker() BWorker {
        // Pass the template on to the new worker.
        // There is no local state to set up in this example.
        return &parallelExecute{tpl: e.tpl}
    }

    func (e *parallelExecute) RunOnce() {
        e.buf.Reset()
        e.tpl.Execute(e.buf, "World")
    }

    func BenchmarkExecute(b *testing.B) {
        tpl := template.Must(template.New("test").Parse("Hello, {{.}}!"))
        e := &parallelExecute{tpl: tpl}
        b.RunParallel(1, e)
    }

    func BenchmarkParallelTemplate(b *testing.B) {
        b.Parallel.Init()
        // do global state initialization
        for p := 0; p < b.Parallel.NProcs(1); p++ {
            go func() {
                defer b.Parallel.Done()
                // do local state initialization
                for {
                    n := b.Parallel.N()
                    if n == 0 {
                        break
                    }
                    for i := 0; i < n; i++ {
                        // do work
                    }
                }
            }()
        }
        b.Parallel.Wait()
    }
Here's yet another alternative, tweaking Ian's suggestion. (Let me know if you
tire of this, and I'll pipe down.)

    b.RunParallel(1, func(nn chan int) {
        var buf bytes.Buffer
        for n := range nn {
            for i := 0; i < n; i++ {
                buf.Reset()
                fmt.Fprintf(&buf, "abc")
            }
        }
    })

Channel operations are more expensive than atomics, but this should skew all
benchmarks equally. And I ran a quick-and-dirty test
(https://codereview.appspot.com/60640044) and after switching to buffered
channels, I couldn't see the channel overhead in the benchmarks.
On 2014/02/06 17:29:00, josharian wrote:
> Channel operations are more expensive than atomics, but this should skew all
> benchmarks equally. And I ran a quick-and-dirty test
> (https://codereview.appspot.com/60640044) and after switching to buffered
> channels, I couldn't see the channel overhead in the benchmarks.

Humm... this is an interesting development... Probably we can do something
along the lines of:

    b.RunParallel(1, func(pb *testing.PB) {
        var buf bytes.Buffer
        for pb.Next() {
            buf.Reset()
            fmt.Fprintf(&buf, "abc")
        }
    })

where pb.Next will grab a batch of iterations with an atomic increment and
cache it, so most of the time it will return iterations from the cache. The pb
is local to the goroutine. This avoids having a double nested loop.

Probably we can bolt some other functionality onto this pb. E.g.
pb.StopTimer/StartTimer has at least some chance of working in a reasonable
way.

Btw, RunParallel will also allow us to fix b.Fatalf. Instead of crashing the
program, it can force all goroutines to exit early and then call panic in the
main goroutine.
I like it.
> I like it.

+1
Full example for completeness:

    // Parallel benchmark for text/template.Template.Execute on a single object.
    func BenchmarkTempl(b *testing.B) {
        templ := template.Must(template.New("test").Parse("Hello, {{.}}!"))
        // RunParallel will create GOMAXPROCS goroutines
        // and distribute work among them.
        b.RunParallel(1, func(pb *testing.PB) {
            // Each goroutine has its own private bytes.Buffer.
            var buf bytes.Buffer
            for pb.Next() {
                // The loop body is executed b.N times in total.
                buf.Reset()
                templ.Execute(&buf, "World")
            }
        })
    }

Exactly the same number of lines as the current "func func" version, just:

    s/func() func()/func(pb *testing.PB)/
    s/return func() {/for pb.Next() {/

-----

And returning to Ian's idea of having "func ParallelBenchmark", we can do:

    var templ = template.Must(template.New("test").Parse("Hello, {{.}}!"))

    func ParallelBenchmarkTempl(pb *testing.PB) {
        var buf bytes.Buffer
        for pb.Next() {
            buf.Reset()
            templ.Execute(&buf, "World")
        }
    }

The only downside is that all global state must be, well, global.
But -1 line of code :)

-----

And extending it further, we don't actually need testing.PB. We can extend
testing.B with a Next function, which will also work for non-parallel
benchmarks, so that you can write:

    func BenchmarkTempl(b *testing.B) {
        templ := template.Must(template.New("test").Parse("Hello, {{.}}!"))
        var buf bytes.Buffer
        for b.Next() {
            buf.Reset()
            templ.Execute(&buf, "World")
        }
    }

or:

    var templ = template.Must(template.New("test").Parse("Hello, {{.}}!"))

    func ParallelBenchmarkTempl(b *testing.B) {
        var buf bytes.Buffer
        for b.Next() {
            buf.Reset()
            templ.Execute(&buf, "World")
        }
    }
If PB had a Run method, we'd have a way to set options in the future: setters
on PB.
What do you mean by Run and Setters?
Sorry, was typing on a chairlift.

    b.NewParallel().Run(1, func(pb *testing.PB) {
        ....
    })

Then NewParallel can return a *PB, and later we can add options on PB to
change parameters before we call the Run method.

But I don't like having two pb in scope (the one from NewParallel and the same
one from the func). Will propose something more coherent later.
On Thu, Feb 6, 2014 at 11:47 PM, Brad Fitzpatrick <bradfitz@golang.org> wrote:
> Sorry, was typing on a chairlift.
>
> b.NewParallel().Run(1, func(pb *testing.PB) {
>     ....
> })
>
> Then NewParallel can return a *PB and later we can add options on PB to
> change parameters before we call the Run method.
>
> But I don't like having two pb in scope (the one from NewParallel and the
> same one from the func). Will propose something more coherent later.

A limited form of setters can be:

b.RunParallel(func(pb *testing.PB) {
    // Usually you don't need it, because you get the default of
    // GOMAXPROCS goroutines.
    pb.SetParallelism(16) // request 16*GOMAXPROCS goroutines
    ...
})

This is of course somewhat subtle, because SetParallelism will actually be
called GOMAXPROCS times by different goroutines. But for this idempotent
method, we can resolve it under the hood.
> > b.NewParallel().Run(1, func(pb *testing.PB) {
> >     ....
> > })
> >
> > Then NewParallel can return a *PB and later we can add options on PB to
> > change parameters before we call the Run method.
> >
> > But I don't like having two pb in scope (the one from NewParallel and the
> > same one from the func). Will propose something more coherent later.
>
> A limited form of setters can be:
>
> b.RunParallel(func(pb *testing.PB) {
>     // Usually you don't need it, because you get the default of
>     // GOMAXPROCS goroutines.
>     pb.SetParallelism(16) // request 16*GOMAXPROCS goroutines
>     ...
> })
>
> This is of course somewhat subtle, because SetParallelism will
> actually be called GOMAXPROCS times by different goroutines. But for
> this idempotent method, we can resolve it under the hood.

Pending Brad's more coherent proposal, what's wrong with the naive
approach of simply adding parallelism-related options directly to B?
It'd look like:

func BenchmarkTempl(b *testing.B) {
    templ := template.Must(template.New("test").Parse("Hello, {{.}}!"))
    b.SetParallelism(2) // request 2*GOMAXPROCS goroutines
    // other parallel-ish options?
    b.RunParallel(func(pb *testing.PB) {
        var buf bytes.Buffer
        for pb.Next() {
            buf.Reset()
            templ.Execute(&buf, "World")
        }
    })
}

It'd preclude multiple concurrent calls to b.RunParallel, but that's not
much of a loss.
On Sat, Feb 8, 2014 at 6:03 AM, <josharian@gmail.com> wrote:
> Pending Brad's more coherent proposal, what's wrong with the naive
> approach of simply adding parallelism-related options directly to B?
> It'd look like:
>
> func BenchmarkTempl(b *testing.B) {
>     templ := template.Must(template.New("test").Parse("Hello, {{.}}!"))
>     b.SetParallelism(2) // request 2*GOMAXPROCS goroutines

looks good to me

>     // other parallel-ish options?
>     b.RunParallel(func(pb *testing.PB) {
>         var buf bytes.Buffer
>         for pb.Next() {
>             buf.Reset()
>             templ.Execute(&buf, "World")
>         }
>     })
> }
>
> It'd preclude multiple concurrent calls to b.RunParallel, but that's not
> much of a loss.
>
> https://codereview.appspot.com/57270043/
On Sat, Feb 8, 2014 at 8:12 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Sat, Feb 8, 2014 at 6:03 AM, <josharian@gmail.com> wrote:
> > Pending Brad's more coherent proposal, what's wrong with the naive
> > approach of simply adding parallelism-related options directly to B?
> > It'd look like:
> >
> > func BenchmarkTempl(b *testing.B) {
> >     templ := template.Must(template.New("test").Parse("Hello, {{.}}!"))
> >     b.SetParallelism(2) // request 2*GOMAXPROCS goroutines
>
> looks good to me
>
> > // other parallel-ish options?
> > b.RunParallel(func(pb *testing.PB) {

Likewise LGTM.
On Sat, Feb 8, 2014 at 6:03 AM, <josharian@gmail.com> wrote:
> Pending Brad's more coherent proposal, what's wrong with the naive
> approach of simply adding parallelism-related options directly to B?
> It'd look like:

Ian, what do you think about the following variant (note that the
SetParallelism call is optional)?

> func BenchmarkTempl(b *testing.B) {
>     templ := template.Must(template.New("test").Parse("Hello, {{.}}!"))
>     b.SetParallelism(2) // request 2*GOMAXPROCS goroutines
>     b.RunParallel(func(pb *testing.PB) {
>         var buf bytes.Buffer
>         for pb.Next() {
>             buf.Reset()
>             templ.Execute(&buf, "World")
>         }
>     })
> }
>
> It'd preclude multiple concurrent calls to b.RunParallel, but that's not
> much of a loss.
>
> https://codereview.appspot.com/57270043/
Hi! I've updated the code to use the new interface. PTAL. The comments are more of a stub of what needs to be described there; I am open to suggestions (or complete replacements).
https://codereview.appspot.com/57270043/diff/270001/src/pkg/testing/benchmark.go
File src/pkg/testing/benchmark.go (right):

src/pkg/testing/benchmark.go:353: // PB is a helper type for RunParallel.
Maybe:

// A PB is used by RunParallel for running parallel benchmarks.

Or s/used/provided/

src/pkg/testing/benchmark.go:361: // Next returns whether there is more iterations to execute.
s/returns/reports/ and s/is/are/

src/pkg/testing/benchmark.go:379: // Number of worker goroutines can be overriden with SetParallelism function.
Articles (the) before both Number and SetParallelism

src/pkg/testing/benchmark.go:380: // It's most useful to use RunParallel with go test -cpu option.
with the

or "using go test -cpu=..."

src/pkg/testing/benchmark.go:382: // The body function should not use b.N itself; instead it must execute benchmarking
s/must/should/

Must is much stronger than should and often feels too strong.

src/pkg/testing/benchmark.go:385: // The body function should not use StopTimer, StartTimer and ResetTimer functions,
another "the". And s/and/or/:

should not use the StopTimer, StartTimer or ResetTimer functions

src/pkg/testing/benchmark.go:418: // SetParallelism sets number of worker goroutines for RunParallel to p*GOMAXPROCS.
the number of

Also, mention that the default is 1? Even though it's mentioned indirectly above.
The suggested comment changes below are all just that -- suggested. Take of them what you like.

https://codereview.appspot.com/57270043/diff/270001/src/pkg/testing/benchmark.go
File src/pkg/testing/benchmark.go (right):

src/pkg/testing/benchmark.go:40: lastDuration time.Duration
Is it worth saving both of these, as opposed to just the last ns/op? As written, lastDuration is a bit ambiguous -- was it the duration of the entire run or the duration of a single operation?

Comments for these would be good.

src/pkg/testing/benchmark.go:353: // PB is a helper type for RunParallel.
// PB coordinates the work done by a single goroutine in RunParallel.

Is PB the right name now? In its current incarnation, a PB isn't a ParallelBenchmark, it's more of a worker.

src/pkg/testing/benchmark.go:361: // Next returns whether there is more iterations to execute.
s/returns/reports/
s/is/are/

src/pkg/testing/benchmark.go:386: // because they have global effect.
// RunParallel runs a benchmark in parallel.
// It creates multiple goroutines and distributes N iterations among them.
// The number of goroutines defaults to GOMAXPROCS. To increase parallelism
// for non-CPU-bound benchmarks, use SetParallelism.
// RunParallel is usually used with the go test -cpu flag.
//
// The body function will be run in each goroutine. It should set up any
// goroutine-local state and then iterate until pb.Next returns false.
// It should not use StartTimer, StopTimer, or ResetTimer.

src/pkg/testing/benchmark.go:404: N := uint64(0)
s/N/n/?

src/pkg/testing/benchmark.go:419: // There is usually no need to call this function for CPU-bound benchmarks.
// SetParallelism sets the number of goroutines used by RunParallel to p * GOMAXPROCS.
// There is usually no need to call SetParallelism for CPU-bound benchmarks.
// If p is less than 1, this call will have no effect.

File src/pkg/testing/benchmark_test.go (right):

src/pkg/testing/benchmark_test.go:76: t.Errorf("expected %v procs, got %v", want, procs)
As I understand it, the preferred style is now "got %v, want %v", here and below.

src/pkg/testing/benchmark_test.go:103: // Each goroutine has own private bytes.Buffer.
s/has own private/has its own/

src/pkg/testing/benchmark_test.go:106: // The loop body is executed b.N times total.
s/total./total across all goroutines./
PTAL

https://codereview.appspot.com/57270043/diff/270001/src/pkg/testing/benchmark.go
File src/pkg/testing/benchmark.go (right):

src/pkg/testing/benchmark.go:40: lastDuration time.Duration
On 2014/02/10 20:19:32, josharian wrote:
> Is it worth saving both of these, as opposed to just the last ns/op?

I don't know. I do not see any problem with saving an additional var here. For the current run we save N and duration, so it looks reasonable to save the same for the last run. Also, ns/op will have to be float64, and I fear what I don't understand :)

> As written, lastDuration is a bit ambiguous -- was it the duration of the
> entire run or the duration of a single operation?
>
> Comments for these would be good.

comments are added

src/pkg/testing/benchmark.go:353: // PB is a helper type for RunParallel.
On 2014/02/10 19:37:05, bradfitz wrote:
> Maybe:
>
> // A PB is used by RunParallel for running parallel benchmarks.
>
> Or s/used/provided/

Done.

src/pkg/testing/benchmark.go:353: // PB is a helper type for RunParallel.
On 2014/02/10 20:19:32, josharian wrote:
> // PB coordinates the work done by a single goroutine in RunParallel.
>
> Is PB the right name now? In its current incarnation, a PB isn't a
> ParallelBenchmark, it's more of a worker.

Any better suggestion?

src/pkg/testing/benchmark.go:361: // Next returns whether there is more iterations to execute.
On 2014/02/10 19:37:05, bradfitz wrote:
> s/returns/reports/ and s/is/are/

Done.

src/pkg/testing/benchmark.go:361: // Next returns whether there is more iterations to execute.
On 2014/02/10 20:19:32, josharian wrote:
> s/returns/reports/
> s/is/are/

Done.

src/pkg/testing/benchmark.go:379: // Number of worker goroutines can be overriden with SetParallelism function.
On 2014/02/10 19:37:05, bradfitz wrote:
> Articles (the) before both Number and SetParallelism

Done.

src/pkg/testing/benchmark.go:380: // It's most useful to use RunParallel with go test -cpu option.
On 2014/02/10 19:37:05, bradfitz wrote:
> with the
>
> or "using go test -cpu=..."

Done.

src/pkg/testing/benchmark.go:382: // The body function should not use b.N itself; instead it must execute benchmarking
On 2014/02/10 19:37:05, bradfitz wrote:
> s/must/should/
>
> Must is much stronger than should and often feels too strong.

Done.

src/pkg/testing/benchmark.go:385: // The body function should not use StopTimer, StartTimer and ResetTimer functions,
On 2014/02/10 19:37:05, bradfitz wrote:
> another "the". And s/and/or/:
>
> should not use the StopTimer, StartTimer or ResetTimer functions

Done.

src/pkg/testing/benchmark.go:386: // because they have global effect.
On 2014/02/10 20:19:32, josharian wrote:
> // RunParallel runs a benchmark in parallel.
> // It creates multiple goroutines and distributes N iterations among them.
> // The number of goroutines defaults to GOMAXPROCS. To increase parallelism
> // for non-CPU-bound benchmarks, use SetParallelism.
> // RunParallel is usually used with the go test -cpu flag.
> //
> // The body function will be run in each goroutine. It should set up any
> // goroutine-local state and then iterate until pb.Next returns false.
> // It should not use StartTimer, StopTimer, or ResetTimer.

Now that I've fixed all of Brad's comments, I get to this... OK, let it be this version. I permit myself to do the following changes (mostly per Brad's suggestions):
s/N/b.N/
s/SetParallelism/the SetParallelism function/
s/StartTimer, StopTimer, or ResetTimer/the StartTimer, StopTimer, or ResetTimer functions/
and I've restored "because they have global effect", because this limitation instantly raises a "why?" question.

src/pkg/testing/benchmark.go:404: N := uint64(0)
On 2014/02/10 20:19:32, josharian wrote:
> s/N/n/?

Done.

src/pkg/testing/benchmark.go:418: // SetParallelism sets number of worker goroutines for RunParallel to p*GOMAXPROCS.
On 2014/02/10 19:37:05, bradfitz wrote:
> the number of
>
> Also, mention that the default is 1? Even though it's mentioned indirectly
> above.

Since you will need to fix grammar and wording in my version anyway, can you just give me the complete sentence? :)

src/pkg/testing/benchmark.go:419: // There is usually no need to call this function for CPU-bound benchmarks.
On 2014/02/10 20:19:32, josharian wrote:
> // SetParallelism sets the number of goroutines used by RunParallel to p *
> GOMAXPROCS.
> // There is usually no need to call SetParallelism for CPU-bound benchmarks.
> // If p is less than 1, this call will have no effect.

Done.

File src/pkg/testing/benchmark_test.go (right):

src/pkg/testing/benchmark_test.go:76: t.Errorf("expected %v procs, got %v", want, procs)
On 2014/02/10 20:19:32, josharian wrote:
> As I understand it, the preferred style is now "got %v, want %v", here and
> below.

Done.

src/pkg/testing/benchmark_test.go:103: // Each goroutine has own private bytes.Buffer.
On 2014/02/10 20:19:32, josharian wrote:
> s/has own private/has its own/

Done.

src/pkg/testing/benchmark_test.go:106: // The loop body is executed b.N times total.
On 2014/02/10 20:19:32, josharian wrote:
> s/total./total across all goroutines./

Done.
LGTM

Thanks, Dmitry, this was really fun.

https://codereview.appspot.com/57270043/diff/270001/src/pkg/testing/benchmark.go
File src/pkg/testing/benchmark.go (right):

src/pkg/testing/benchmark.go:353: // PB is a helper type for RunParallel.
On 2014/02/11 11:31:14, dvyukov wrote:
> On 2014/02/10 20:19:32, josharian wrote:
> > // PB coordinates the work done by a single goroutine in RunParallel.
> >
> > Is PB the right name now? In its current incarnation, a PB isn't a
> > ParallelBenchmark, it's more of a worker.
>
> Any better suggestion?

BG, for benchmark goroutine? I don't feel strongly about this.

https://codereview.appspot.com/57270043/diff/280001/src/pkg/testing/benchmark.go
File src/pkg/testing/benchmark.go (right):

src/pkg/testing/benchmark.go:380: // non-CPU-bound benchmarks, use the SetParallelism function.
s/use the SetParallelism function/call SetParallelism before RunParallel/.
I think this just needs an example now.
there is an example

On Tue, Feb 11, 2014 at 7:53 PM, <bradfitz@golang.org> wrote:
> I think this just needs an example now.
>
> https://codereview.appspot.com/57270043/
LGTM

Whoops, I was only reading benchmark.go this latest round.

On Tue, Feb 11, 2014 at 7:54 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> there is an example
>
> On Tue, Feb 11, 2014 at 7:53 PM, <bradfitz@golang.org> wrote:
> > I think this just needs an example now.
> >
> > https://codereview.appspot.com/57270043/
I'll take a look at the API later today.

On Tue, Feb 11, 2014 at 7:57 AM, Brad Fitzpatrick <bradfitz@golang.org> wrote:
> LGTM
>
> Whoops, I was only reading benchmark.go this latest round.
>
> On Tue, Feb 11, 2014 at 7:54 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> there is an example
>>
>> On Tue, Feb 11, 2014 at 7:53 PM, <bradfitz@golang.org> wrote:
>> > I think this just needs an example now.
>> >
>> > https://codereview.appspot.com/57270043/
Thanks! Waiting for you.

On Tue, Feb 11, 2014 at 8:33 PM, Rob Pike <r@golang.org> wrote:
> I'll take a look at the API later today.
>
> On Tue, Feb 11, 2014 at 7:57 AM, Brad Fitzpatrick <bradfitz@golang.org> wrote:
>> LGTM
>>
>> Whoops, I was only reading benchmark.go this latest round.
most of my comments are about documentation.

the API looks really nice.

https://codereview.appspot.com/57270043/diff/280001/src/pkg/testing/benchmark.go
File src/pkg/testing/benchmark.go (right):

src/pkg/testing/benchmark.go:39: lastN int // number of iterations in the last run
.,+1s/last/previous/

src/pkg/testing/benchmark.go:46: parallelism int
// what units?

src/pkg/testing/benchmark.go:355: globalN *uint64
// comments please.

src/pkg/testing/benchmark.go:380: // non-CPU-bound benchmarks, use the SetParallelism function.
On 2014/02/11 15:02:20, josharian wrote:
> s/use the SetParallelism function/call SetParallelism before RunParallel/.

+1

src/pkg/testing/benchmark.go:386: // because they have global effect.
the package comment in testing.go should have a simple parallel example to augment (but not duplicate) this comment.

src/pkg/testing/benchmark.go:388: // Calculate grain size as number of iterations that take ~100us.
s/u/µ/

U+00B5 'µ' is the micro sign. no need to use the parochial u for µ

src/pkg/testing/benchmark.go:389: // 100us is enough to amortize the atomic operation and provide sufficient
not clear what is 'the' atomic operation. maybe just

// 100µs is enough to amortize overhead and provide....

src/pkg/testing/benchmark.go:393: grain = 1e5 * uint64(b.lastN) / uint64(b.lastDuration)
instead of uint64(b.lastDuration) use b.lastDuration.Nanoseconds() and fix up the types (it's int64).

src/pkg/testing/benchmark.go:398: // The inner loop and function call must take ~10ns, so do not do more
"must take ~10ns". i think you mean that on current hardware, we expect that much time is consumed. rephrase.

// We expect the inner loop and function call to take at least 10ns, so do not
// do more than 100µs/10ns=1e4 iterations.

src/pkg/testing/benchmark.go:411: pb := &PB{globalN: &n, grain: grain, bN: uint64(b.N)}
please roll this out onto multiple lines. it's easier to read that way.
Sorry about the burst. The submit button wasn't showing evidence it had succeeded.

On Tue, Feb 11, 2014 at 2:57 PM, <r@golang.org> wrote:
> most of my comments are about documentation.
>
> the API looks really nice.
>
> https://codereview.appspot.com/57270043/
PTAL

https://codereview.appspot.com/57270043/diff/280001/src/pkg/testing/benchmark.go
File src/pkg/testing/benchmark.go (right):

src/pkg/testing/benchmark.go:39: lastN int // number of iterations in the last run
On 2014/02/11 22:57:18, r wrote:
> .,+1s/last/previous/

Done.

src/pkg/testing/benchmark.go:46: parallelism int
On 2014/02/11 22:57:18, r wrote:
> // what units?

Done.

src/pkg/testing/benchmark.go:355: globalN *uint64
On 2014/02/11 22:57:18, r wrote:
> // comments please.

Done.

src/pkg/testing/benchmark.go:380: // non-CPU-bound benchmarks, use the SetParallelism function.
On 2014/02/11 15:02:20, josharian wrote:
> s/use the SetParallelism function/call SetParallelism before RunParallel/.

Done.

src/pkg/testing/benchmark.go:380: // non-CPU-bound benchmarks, use the SetParallelism function.
On 2014/02/11 22:57:18, r wrote:
> On 2014/02/11 15:02:20, josharian wrote:
> > s/use the SetParallelism function/call SetParallelism before RunParallel/.
>
> +1

Done.

src/pkg/testing/benchmark.go:386: // because they have global effect.
On 2014/02/11 22:57:18, r wrote:
> the package comment in testing.go should have a simple parallel example to
> augment (but not duplicate) this comment.

Done.

src/pkg/testing/benchmark.go:388: // Calculate grain size as number of iterations that take ~100us.
On 2014/02/11 22:57:18, r wrote:
> s/u/µ/
>
> U+00B5 'µ' is the micro sign. no need to use the parochial u for µ

Done.

src/pkg/testing/benchmark.go:389: // 100us is enough to amortize the atomic operation and provide sufficient
On 2014/02/11 22:57:18, r wrote:
> not clear what is 'the' atomic operation. maybe just
>
> // 100µs is enough to amortize overhead and provide....

Done.

src/pkg/testing/benchmark.go:393: grain = 1e5 * uint64(b.lastN) / uint64(b.lastDuration)
On 2014/02/11 22:57:18, r wrote:
> instead of uint64(b.lastDuration) use b.lastDuration.Nanoseconds() and fix up
> the types (it's int64).

I check that it's >0, so it must fit into uint64. Do you mean to change the types of all the counters, N, grain, and durations to int64? Why?

src/pkg/testing/benchmark.go:398: // The inner loop and function call must take ~10ns, so do not do more
On 2014/02/11 22:57:18, r wrote:
> "must take ~10ns". i think you mean that on current hardware, we expect that
> much time is consumed. rephrase.
>
> // We expect the inner loop and function call to take at least 10ns, so do not
> // do more than 100µs/10ns=1e4 iterations.

Done.

src/pkg/testing/benchmark.go:411: pb := &PB{globalN: &n, grain: grain, bN: uint64(b.N)}
On 2014/02/11 22:57:18, r wrote:
> please roll this out onto multiple lines. it's easier to read that way.

Done.
Ian, Rob, final blessing?
LGTM

iant is on vacation. i think rsc should take a look too.

https://codereview.appspot.com/57270043/diff/320001/src/pkg/testing/benchmark.go
File src/pkg/testing/benchmark.go (right):

src/pkg/testing/benchmark.go:422: // SetParallelism sets the number of goroutines used by RunParallel to p * GOMAXPROCS.
for loss of ambiguity, please
s/p \* GOMAXPROCS/p*GOMAXPROCS/.
R=rsc@golang.org (assigned by r@golang.org)
waiting for Russ

https://codereview.appspot.com/57270043/diff/320001/src/pkg/testing/benchmark.go
File src/pkg/testing/benchmark.go (right):

src/pkg/testing/benchmark.go:422: // SetParallelism sets the number of goroutines used by RunParallel to p * GOMAXPROCS.
On 2014/02/16 16:22:10, r wrote:
> for loss of ambiguity, please
> s/p \* GOMAXPROCS/p*GOMAXPROCS/.

Done.
LGTM

nice
I agree: nice. Thanks to everyone who pitched in. This is how to do API design: think hard, ask opinions, don't settle for the first thing that comes along. -rob
*** Submitted as https://code.google.com/p/go/source/detail?r=355c9ac57116 ***

testing: ease writing parallel benchmarks

Add b.RunParallel function that captures parallel benchmark boilerplate:
creates worker goroutines, joins worker goroutines, distributes work
among them in an efficient way, auto-tunes grain size.

Fixes issue 7090.

R=bradfitz, iant, josharian, tracey.brendan, r, rsc, gobot
CC=golang-codereviews
https://codereview.appspot.com/57270043