
cmd/compile: per-phase compiler benchmarking #16169

Closed · mdempsky opened this issue Jun 23, 2016 · 16 comments
Labels: FrozenDueToAge, NeedsFix (the path to resolution is known, but the work has not been done)
Milestone: Go1.8Early

@mdempsky (Member)
Right now we can use compilebench to measure the compiler's overall run time, and we can use -cpuprofile to collect fine-grained per-function/per-instruction profiles. But with the interest in continuing to evolve/rewrite individual phases of the compiler, I think it would be nice to track the compiler's performance at per-phase granularity.

Rough proposal sketch:

  1. Add a flag to cmd/compile like -phasebench=prefix:outfile.
  2. Add instrumentation to cmd/compile/internal/gc/main.go so that when phasebench is enabled, it appends benchmark-data-formatted entries of the form "prefix-parse", "prefix-typecheck", etc. to outfile (see the sketch below).
  3. compilebench can then optionally make use of this new flag, and we can start tracking the more detailed information.
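
A minimal sketch of what step 2 could look like, assuming hypothetical helper names (startPhase, writeBench) and the output-line shape described in the proposal:

package main

import (
	"fmt"
	"os"
	"time"
)

var phaseStart = map[string]time.Time{}

// startPhase records the wall-clock start of the named phase.
func startPhase(name string) { phaseStart[name] = time.Now() }

// writeBench appends a benchmark-format line like
// "BenchmarkGoTypes-parse 1 47008869 ns/op" to outfile.
func writeBench(outfile, prefix, phase string) error {
	d := time.Since(phaseStart[phase])
	f, err := os.OpenFile(outfile, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0666)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = fmt.Fprintf(f, "%s-%s\t1\t%d ns/op\n", prefix, phase, d.Nanoseconds())
	return err
}

func main() {
	startPhase("parse")
	// ... run the parser ...
	if err := writeBench("/tmp/benchmark.txt", "BenchmarkGoTypes", "parse"); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}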
@mdempsky mdempsky added this to the Go1.8Early milestone Jun 23, 2016
@josharian (Contributor)

Yes please! I assume you can pass multiple phase:outfile pairs?

How about just -phasebench=prefix and benchmarking all phases, with a phase-specific suffix? I worry that, while optimizing one phase, decisions might get made that impact other phases without being noticed. In particular, frontend/backend interactions can be like this. As a concrete example, I have a CL waiting for 1.8 that removes some empty goto-label pairs introduced as a frontend convenience; the removal cuts overall compiler allocation by 1%.

@mdempsky (Member, Author) commented Jun 23, 2016

To be clear, my intention was that -phasebench=prefix:outfile would measure all phases, writing the benchmarks to outfile, each prefixed by the string prefix. The idea is that compilebench could use -phasebench=BenchmarkGoTypes:/tmp/benchmark.txt, -phasebench=BenchmarkUnicode:/tmp/benchmark.txt, etc., and the outputs would be lines labeled BenchmarkGoTypes-parse, BenchmarkGoTypes-typecheck, BenchmarkUnicode-parse, BenchmarkUnicode-typecheck, etc.

@griesemer (Contributor)

Yes. Funny coincidence: I spent this morning experimenting with adding structured statistics code to the front-end. This is an item on my TODO list.

My thinking at the moment:

  • There's the notion of a Measurement (an interface, abstracts over a time/duration, count, or other measure).
  • There are Groups, which collect multiple Measurements. Depending on what it is, it may accumulate/average the data.
  • There are individual Measurements, such as timings, counts, etc.

The hierarchy permits collecting a lot of data in a structured (and labeled) way. The code augmentation is typically a single extra line per start/stop event.
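
A rough Go sketch of that hierarchy; the names Measurement, Group, and NewGroup come from this and the later comment, while everything else is assumed:

package stats

import "fmt"

// A Measurement abstracts over a time/duration, count, or other measure.
type Measurement interface {
	Label() string
	String() string // formatted value for the final "print"
}

// A Group collects multiple Measurements; since a Group is itself a
// Measurement, groups can nest to form the hierarchy.
type Group struct {
	label string
	items []Measurement
}

func NewGroup(label string) *Group { return &Group{label: label} }

func (g *Group) Add(m Measurement) { g.items = append(g.items, m) }
func (g *Group) Label() string     { return g.label }

// String dumps the group and its measurements in structured, labeled form.
func (g *Group) String() string {
	s := g.label + ":\n"
	for _, m := range g.items {
		s += fmt.Sprintf("  %s: %s\n", m.Label(), m)
	}
	return s
}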

A single "print" at the end of the compile dumps the output in a structured form.

I'm thinking that there should be a JSON variant for dumping the data to a file so that it can be processed later (i.e., so we can collect the data across many runs/times/versions and then produce graphs).

@griesemer (Contributor)

PS: The standard benchmark data format should also be easy to produce, but I think that format would be generated after the individual measurements are collected (e.g., a single compile produces just one data set). It's also not clear to me how one would organize more structured data in that format.

@mdempsky (Member, Author)

Certainly not opposed to a structured API. My immediate idea was to start simple with instrumentation like m := benchmarker() and then calls to m.mark("parse"), m.mark("typecheck"), etc. appropriately scattered around. If you have anything more thought out, I'm all ears. :)
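
For illustration, a minimal sketch of that mark-style instrumentation; benchmarker and mark are the hypothetical names from the comment, and the time since the previous mark is attributed to the named phase:

package main

import (
	"fmt"
	"time"
)

type marker struct{ last time.Time }

func benchmarker() *marker { return &marker{last: time.Now()} }

// mark attributes the time elapsed since the previous mark to the
// named phase, then resets the clock for the next phase.
func (m *marker) mark(phase string) {
	now := time.Now()
	fmt.Printf("%s\t%d ns\n", phase, now.Sub(m.last).Nanoseconds())
	m.last = now
}

func main() {
	m := benchmarker()
	// ... parse ...
	m.mark("parse")
	// ... typecheck ...
	m.mark("typecheck")
}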

As for format, one of my takeaways from @aclements' presentation about github.com/aclements/go-gg was that the benchmark format may be sufficient as long as we adequately structure the benchmark names. E.g., see the examples with names like BenchmarkDecode/text:digits/level:speed/size:1e4/gomaxprocs:8.

/cc @aclements for thoughts on data format

@josharian (Contributor)

One thing to plan for: some phases may be less clear or overlap in a more heavily concurrent compiler. Also, ssa phases already happen in a cycle, once per function.

@griesemer (Contributor)

With "structured API" I didn't mean large framework. At the moment it's about 75 lines of code, with minimal code (one-liners) at measuring points in the front-end. The grouping mostly removes the need to think much about label names: Basically, introducing a new subgroup is a local operation; no need to rename other labels. For instance:

var stats = NewGroup("compilation of ...") // initialize compiler stats, a group of measurements
...
parse := stats.AddTiming("parse") // create and start a new time measurement
// parser invocations
parse.Stop()
...
phase1 := stats.AddTiming("phase1")
...
phase1.Stop()

etc. A bit more code than you suggest.

@griesemer (Contributor) commented Jun 23, 2016

That is, each time measurement gets its own Timing object which may track more than just the time. It may carry the number of bytes processed, or a count (iterations), or other things. Slightly more than just a benchmark map that tracks labels and times. But not a lot more.
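
A small sketch of such a Timing object, under the assumptions in this comment (all names illustrative):

package stats

import "time"

// Timing is one labeled time measurement; besides the elapsed time it
// can carry extra quantities such as bytes processed or a count.
type Timing struct {
	Label string
	start time.Time
	Dur   time.Duration
	Bytes int64 // e.g. source bytes parsed (optional)
	Count int64 // e.g. iterations or functions processed (optional)
}

func (t *Timing) Start() { t.start = time.Now() }
func (t *Timing) Stop()  { t.Dur += time.Since(t.start) }

// BytesPerSec derives a rate like "source bytes parsed per second".
func (t *Timing) BytesPerSec() float64 { return float64(t.Bytes) / t.Dur.Seconds() }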

@mdempsky (Member, Author)

@josharian Agreed. That's why I was thinking of using names like "typecheck" instead of "phase1" or "phase2", so that the names remain meaningful long-term even if we end up reorganizing the specific phase boundaries.

@griesemer Okay, yeah, that's a little more code (explicitly marking begin/end of each phase rather than making it implicit by just marking transitions), but sounds roughly isomorphic to what I was thinking.

@griesemer (Contributor)

@mdempsky Maybe it's overkill to mark each start/end and it's good enough to just track boundaries. But it seems to me that it would be more difficult to track a specific phase's time (say, escape analysis) if some of the other phases move around. It would also be more difficult to attach extra data such as byte counts (to compute things like source bytes parsed per unit of time).

On the other hand, simpler is often better. Maybe it is good enough to just track transition times. Still exploring.

@dr2chase (Contributor)

The ssa phases can all be timed. For example:

go build -gcflags -d=ssa/all/time step_hide.go
# command-line-arguments
./step_hide.go:13:  early phielim   TIME(ns)    9458    foo
./step_hide.go:13:  early copyelim  TIME(ns)    3365    foo
./step_hide.go:13:  early deadcode  TIME(ns)    288821  foo
./step_hide.go:13:  short circuit   TIME(ns)    293607  foo
./step_hide.go:13:  decompose user  TIME(ns)    303039  foo
./step_hide.go:13:  opt TIME(ns)    2593795 foo

I think the format needs work; that seems to not conform to what I recall Austin telling me was the desired form for the benchmarking commands we already have. Right now the function name comes last because sometimes those have exciting syntax in them.

@griesemer (Contributor)

@mdempsky, @josharian Please have a look at https://go-review.googlesource.com/24462. After considering the feedback above I decided to abandon the structured data idea (it seemed like overkill) and go with a pretty simple approach, essentially along the lines of @mdempsky's proposal. I expect this to morph a bit over time, but it's a starting point.

@gopherbot

CL https://golang.org/cl/24462 mentions this issue.

@aclements (Member)

> As for format, one of my takeaways from @aclements' presentation about github.com/aclements/go-gg was that the benchmark format may be sufficient as long as we adequately structure the benchmark names. E.g., see the examples with names like BenchmarkDecode/text:digits/level:speed/size:1e4/gomaxprocs:8.
>
> /cc @aclements for thoughts on data format

I don't know exactly what data you want to represent, but the "extended" benchmark format is pretty flexible. It's not great at hierarchical things, but it's not clear you really want that. And, of course, you'll be able to use all of the tools that know how to speak it (there aren't many yet, but the format is young). You could have a benchmark line for each (file, phase) where the phase is a property, and results for time, allocations, etc. You might just want to name each benchmark after each phase. Something like:

BenchmarkPhielim/file:step_hide.go/gomaxprocs:4 1   9458 ns/op   100 bytes-allocated    2 allocations

Maybe you want the function in there. That may be too detailed, but it would be easy to write a tool that aggregated it in various ways after the fact, e.g., using https://godoc.org/github.com/aclements/go-misc/bench.

@griesemer (Contributor)

The pending CL now enables writing to a file and uses the benchmark data format.

gopherbot pushed a commit that referenced this issue Aug 17, 2016
Timings is a simple data structure that collects times of labeled
Start/Stop events describing timed phases, which later can be written
to a file.

Adjacent phases with common label prefix are automatically collected
in a group together with the accumulated phase time.

Timing data can be appended to a file in benchmark data format
using the new -bench flag:

$ go build -gcflags="-bench=/dev/stdout" -o /dev/null go/types
commit: devel +8847c6b Mon Aug 15 17:51:53 2016 -0700
goos: darwin
goarch: amd64
BenchmarkCompile:go/types:fe:init              1       663292 ns/op      0.07 %
BenchmarkCompile:go/types:fe:loadsys           1      1337371 ns/op      0.14 %
BenchmarkCompile:go/types:fe:parse             1     47008869 ns/op      4.91 %    10824 lines    230254 lines/s
BenchmarkCompile:go/types:fe:typecheck:top1    1      2843343 ns/op      0.30 %
BenchmarkCompile:go/types:fe:typecheck:top2    1       447457 ns/op      0.05 %
BenchmarkCompile:go/types:fe:typecheck:func    1     15119595 ns/op      1.58 %      427 funcs     28241 funcs/s
BenchmarkCompile:go/types:fe:capturevars       1        56314 ns/op      0.01 %
BenchmarkCompile:go/types:fe:inlining          1      9805767 ns/op      1.02 %
BenchmarkCompile:go/types:fe:escapes           1     53598646 ns/op      5.60 %
BenchmarkCompile:go/types:fe:xclosures         1       199302 ns/op      0.02 %
BenchmarkCompile:go/types:fe:subtotal          1    131079956 ns/op     13.70 %
BenchmarkCompile:go/types:be:compilefuncs      1    692009428 ns/op     72.33 %      427 funcs       617 funcs/s
BenchmarkCompile:go/types:be:externaldcls      1        54591 ns/op      0.01 %
BenchmarkCompile:go/types:be:dumpobj           1    133478173 ns/op     13.95 %
BenchmarkCompile:go/types:be:subtotal          1    825542192 ns/op     86.29 %
BenchmarkCompile:go/types:unaccounted          1       106101 ns/op      0.01 %
BenchmarkCompile:go/types:total                1    956728249 ns/op    100.00 %

For #16169.

Change-Id: I93265fe0cb08e47cd413608d0824c5dd35ba7899
Reviewed-on: https://go-review.googlesource.com/24462
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
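
For reference, a much-simplified sketch of a Start/Stop timings collector in the spirit of this CL. The real implementation lives in cmd/compile; all names below are illustrative, and the per-prefix subtotals shown in the output above are omitted for brevity:

package main

import (
	"fmt"
	"io"
	"os"
	"strings"
	"time"
)

type event struct {
	label string // e.g. "fe:parse" or "be:compilefuncs"
	dur   time.Duration
}

// Timings collects the durations of labeled, consecutive phases.
type Timings struct {
	start  time.Time
	label  string
	events []event
}

// Start closes the current phase (if any) and begins a new one whose
// label is the colon-joined list of label parts.
func (t *Timings) Start(labels ...string) {
	now := time.Now()
	if t.label != "" {
		t.events = append(t.events, event{t.label, now.Sub(t.start)})
	}
	t.label = strings.Join(labels, ":")
	t.start = now
}

// Write appends one benchmark-format line per phase plus a total.
func (t *Timings) Write(w io.Writer, prefix string) {
	t.Start() // close the last open phase
	var total time.Duration
	for _, e := range t.events {
		total += e.dur
	}
	for _, e := range t.events {
		pct := 0.0
		if total > 0 {
			pct = 100 * float64(e.dur) / float64(total)
		}
		fmt.Fprintf(w, "%s:%s\t1\t%d ns/op\t%6.2f %%\n", prefix, e.label, e.dur.Nanoseconds(), pct)
	}
	fmt.Fprintf(w, "%s:total\t1\t%d ns/op\t100.00 %%\n", prefix, total.Nanoseconds())
}

func main() {
	var t Timings
	t.Start("fe", "parse")
	// ... parse ...
	t.Start("fe", "typecheck")
	// ... typecheck ...
	t.Write(os.Stdout, "BenchmarkCompile:go/types")
}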
@quentinmit quentinmit added the NeedsFix The path to resolution is known, but the work has not been done. label Oct 6, 2016
@rsc (Contributor) commented Oct 17, 2016

This seems to be done.

@rsc rsc closed this as completed Oct 17, 2016
@golang golang locked and limited conversation to collaborators Oct 17, 2017