cmd/compile: support for LBR profiles in PGO #59616

Open
prattmic opened this issue Apr 13, 2023 · 0 comments
Labels: compiler/runtime, NeedsInvestigation, Performance

prattmic commented Apr 13, 2023

Branch profiles, also known as Last Branch Record (LBR) profiles, provide a sample-based profile of taken branches. At each sample point, the LBR hardware provides a snapshot of the last N taken branches. Consecutive records in a snapshot can be combined to derive basic block execution counts as well as branch counts. For example, one derived sample could be: the basic block from startpc to endpc executed, then jumped to dstpc.
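As a rough illustration of how records might be combined, here is a minimal sketch. It assumes a snapshot is a list of (from, to) taken-branch records ordered oldest to newest, so the code between one branch's target and the next branch's source ran straight through. The `Branch`, `Range`, and `Accumulate` names are hypothetical and do not correspond to any existing compiler, runtime, or pprof API.

```go
package main

import "fmt"

// Branch is one hypothetical LBR record: a taken branch from From to To.
type Branch struct {
	From, To uint64
}

// Range is a straight-line run of code from Start to End that executed
// with no taken branch in between.
type Range struct {
	Start, End uint64
}

// Accumulate folds one LBR snapshot into execution counts.
// Assumption: snapshot is ordered oldest to newest.
func Accumulate(snapshot []Branch, ranges map[Range]int, branches map[Branch]int) {
	for i, b := range snapshot {
		branches[b]++
		if i == 0 {
			continue // nothing is known about what ran before the oldest record
		}
		prev := snapshot[i-1]
		// After landing at prev.To, execution fell through until the
		// next taken branch at b.From.
		if b.From >= prev.To {
			ranges[Range{Start: prev.To, End: b.From}]++
		}
	}
}

func main() {
	// A loop body at 0x100..0x120 that jumps back twice, then exits to 0x200.
	snapshot := []Branch{
		{From: 0x120, To: 0x100},
		{From: 0x120, To: 0x100},
		{From: 0x120, To: 0x200},
	}
	ranges := map[Range]int{}
	branches := map[Branch]int{}
	Accumulate(snapshot, ranges, branches)

	for r, n := range ranges {
		fmt.Printf("executed %#x..%#x: %d times\n", r.Start, r.End, n)
	}
	for b, n := range branches {
		fmt.Printf("branch %#x -> %#x: %d times\n", b.From, b.To, n)
	}
}
```

Each range paired with the branch that ends it corresponds to the "startpc to endpc, then jumped to dstpc" form described above. Real hardware records carry more detail (for example, misprediction flags) that this sketch ignores.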

Today we use CPU profiles for PGO because they are easy to collect (no special hardware support required), while LBR profiles require special hardware support (notably, not exposed by most major cloud VM providers). However, LBR profiles can often be better than CPU profiles for PGO. For instance:

  • Iterative stability: a PGO optimization that reduces the cost of a call will make that call use less CPU and thus get fewer samples in the next CPU profile. This could lead to the next compile not identifying that code as hot anymore and no longer performing the optimization. On the other hand, an LBR profile will report the same number of calls despite the optimizations.

  • Cycle skew: while CPU cycles and number of calls/executions are generally correlated, there can be significant skew, particularly for basic-block-level optimizations. Basic blocks tend not to have many instructions, so more expensive instructions in a basic block (e.g., MUL vs ADD) can skew results, making the MUL block look hotter even if the ADD block is executed more often.
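Both points can be illustrated with a toy model. This is only a sketch: the block names, execution counts, and per-execution cycle costs are made-up numbers, and `Block`, `Hottest`, `ByCycles`, and `ByCount` are hypothetical helpers rather than anything in the Go toolchain.

```go
package main

import "fmt"

// Block is a hypothetical basic block with illustrative numbers.
type Block struct {
	Name       string
	Executions int // how often the block ran (what an LBR profile approximates)
	CyclesPer  int // assumed cost of one execution, in cycles
}

// Cycles is the total CPU cost, which is what a CPU profile approximates.
func (b Block) Cycles() int { return b.Executions * b.CyclesPer }

// ByCycles and ByCount are the two weightings being compared.
func ByCycles(b Block) int { return b.Cycles() }
func ByCount(b Block) int  { return b.Executions }

// Hottest returns the name of the block with the largest weight,
// or "" if blocks is empty.
func Hottest(blocks []Block, weight func(Block) int) string {
	name, best := "", -1
	for _, b := range blocks {
		if w := weight(b); w > best {
			name, best = b.Name, w
		}
	}
	return name
}

func main() {
	// Skew: the MUL block runs half as often but costs 3x per execution.
	blocks := []Block{
		{Name: "mul", Executions: 1000, CyclesPer: 3},
		{Name: "add", Executions: 2000, CyclesPer: 1},
	}
	fmt.Println("hottest by cycles:", Hottest(blocks, ByCycles))
	fmt.Println("hottest by count:", Hottest(blocks, ByCount))

	// Iterative stability: an optimization makes a call cheaper.
	// Its cycle weight drops, but its execution count does not.
	call := Block{Name: "call", Executions: 1000, CyclesPer: 10}
	before := call.Cycles()
	call.CyclesPer = 2 // assumed effect of the optimization
	fmt.Println("cycle weight before/after:", before, call.Cycles())
	fmt.Println("execution count after:", call.Executions)
}
```

Under these assumed numbers, the cycle-based weighting ranks the MUL block hottest while the count-based weighting ranks the ADD block hottest, and the optimized call loses most of its cycle weight while its count is unchanged.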

One blocker to native LBR support in the compiler is that the pprof format has no canonical way of encoding LBR samples. Either we pick a custom interpretation that the compiler recognizes, pprof adds an official form, or we use a different file format entirely (such as LLVM's PGO format).

cc @cherrymui @aclements @rajbarik @jinlin-bayarea @hoeppi-google

@prattmic added the Performance and NeedsInvestigation labels Apr 13, 2023
@prattmic added this to the Backlog milestone Apr 13, 2023
@prattmic self-assigned this Apr 13, 2023
@gopherbot added the compiler/runtime label Apr 13, 2023