Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: PGO does not devirtualize methods in another package (log: "no function body") #60561

Closed
thepudds opened this issue Jun 1, 2023 · 12 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@thepudds
Copy link
Contributor

thepudds commented Jun 1, 2023

What version of Go are you using (go version)?

$ gotip version
go version devel go1.21-c7b2f649 Tue May 30 17:30:28 2023

Does this issue reproduce with the latest release?

Feature only present in tip.

What did you do?

When experimenting with the new PGO-driven indirect call specialization from #59959, I did not see devirtualization happening on methods in another package, even when those methods are directly referenced in the current package.

CL 492436 notes that there are limitations in the first version of this, including:

  • Callees not directly referenced in the current package can be missed (even if they are in the transitive dependences).

However, I do not see devirtualization even after attempting to take that into account, though I might be holding it wrong.

I created a simplified example (playground link), which includes this main.go:

import (
	// ...
	"example/speak"
)

func main() {
	// configure profiling
	// ...

	s1 := &speak.Speaker1{}
	s2 := &speak.Speaker2{}
	println(s1.Speak())
	println(s2.Speak())

	for i := 0; i < 10_000; i++ {
		_ = f()
	}
}

func f() string {
	var s speak.Speaker // interface
	s = &speak.Speaker1{}
	if rand.Int()%10 == 0 {
		s = &speak.Speaker2{}
	}
	return s.Speak()
}

I then create a cpu profile:

$ gotip run . -cpuprofile=cpu.out

And use that cpu profile with pgo:

$ gotip build -pgo=cpu.out -gcflags='-m=99 -d=pgodebug=99' &> debug-build.out

The log shows should not PGO devirtualize (*Speaker1).Speak: no function body:

./main.go:42:16: PGO devirtualize considering call s.Speak()
./main.go:42:16: edge main.f:6 -> example/speak.(*Speaker1).Speak (weight 186): hottest so far
./main.go:42:16: edge main.f:6 -> example/speak.(*Speaker2).Speak (weight 34): too cold (hottest 186)
./main.go:42:16 call main.f:6: hottest callee example/speak.(*Speaker1).Speak (weight 186)
./speak/speak.go:9:6: should not PGO devirtualize (*Speaker1).Speak: no function body

If I comment out the only explicit uses of Speaker1.Speak and Speaker2.Speak in package main:

	// s1 := &speak.Speaker1{}
	// s2 := &speak.Speaker2{}
	// println(s1.Speak())
	// println(s2.Speak())

I then get a different log message saying missing IR for (*Speaker1).Speak:

$ gotip build -pgo=cpu.out -gcflags='-m=99 -d=pgodebug=99' &> debug-build-2.out

./main.go:42:16: PGO devirtualize considering call s.Speak()
./main.go:42:16: edge main.f:6 -> example/speak.(*Speaker1).Speak (weight 186) (missing IR): hottest so far
./main.go:42:16: edge main.f:6 -> example/speak.(*Speaker2).Speak (weight 34): too cold (hottest 186)
./main.go:42:16 call main.f:6: hottest callee example/speak.(*Speaker1).Speak (weight 186)

I suspect this second message of missing IR might be expected given the limitations listed in the CL because commenting out those lines means there are no explicit references to the Speak callees in package main, but I'm not sure that the first message of no function body is expected when those lines are not commented out.

The no function body reason seems to be coming from inline.InlineImpossible(fn):

	// If fn has no body (is defined outside of Go), cannot inline it.
	if len(fn.Body) == 0 {
		reason = "no function body"
		return reason
	}

Finally, regular inlining does appear to work on those methods (here using the original main.go, without having commented anything out):

$ gotip build -gcflags=-m
...
./main.go:28:18: inlining call to speak.(*Speaker1).Speak
./main.go:29:18: inlining call to speak.(*Speaker2).Speak

What did you expect to see?

PGO-based devirtualization on methods in another package when those methods are directly referenced in the current package.

What did you see instead?

No devirtualization, with the log messages above.

Sorry in advance if this is just not supported yet, or if I've made a mistake.

CC @prattmic

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jun 1, 2023
@thepudds thepudds changed the title cmd/compile: PGO does not devirtualize methods in another package (log: "should not PGO devirtualize ...: no function body") cmd/compile: PGO does not devirtualize methods in another package (log: "...: no function body") Jun 1, 2023
@thepudds thepudds changed the title cmd/compile: PGO does not devirtualize methods in another package (log: "...: no function body") cmd/compile: PGO does not devirtualize methods in another package (log: "... no function body") Jun 1, 2023
@mknyszek mknyszek added this to the Go1.21 milestone Jun 1, 2023
@mknyszek mknyszek added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jun 1, 2023
@prattmic
Copy link
Member

prattmic commented Jun 1, 2023

https://go.dev/cl/497175 is my WIP CL to address the limitation you quoted by explicitly looking for missing functions in export data. Could you try cherry-picking that CL and see if it makes a difference?

Note that you need to pass -d=pgodevirtualize=2 to enable the change in that CL.

@prattmic
Copy link
Member

prattmic commented Jun 1, 2023

At first glance, I agree with your assessment. https://go.dev/cl/497175 should address the "missing IR" case, but the "no function body" case is surprising to me.

Thanks for the detailed bug report.

@thepudds
Copy link
Contributor Author

thepudds commented Jun 1, 2023

Brief comment for now, but will hopefully circle back tonight to try that CL.

I suspect the "no function body" is causing it to fail unnecessarily, including it if encounters that error, it has already successfully gotten past the e.Dst.AST == nil check in findHotConcreteCallee:

	for _, e := range callerNode.OutEdges {
        // ...
		if e.Dst.AST == nil {

I had a very simple test fix that seemed to work (in exactly one test 😅):

./main.go:42:16: PGO devirtualizing call to speak.(*Speaker1).Speak

I can send that as a proof-of-concept CL, but maybe not needed if CL 497175 is the real fix.

@gopherbot
Copy link

Change https://go.dev/cl/500155 mentions this issue: cmd/compile/internal/devirtualize: devirtualize methods in other packages if current package has a concrete reference

@thepudds
Copy link
Contributor Author

thepudds commented Jun 2, 2023

Hi @prattmic, if I try your CL 497175 (ps 1, fa555c12) on my original uncommented example:

$ gotip version
go version devel go1.21-fa555c12 Mon May 22 16:58:07 2023 -0400

$ gotip build -pgo=cpu.out -gcflags='-m=99 -d=pgodebug=99 -d=pgodevirtualize=2'

I still get the no function body message in the log and no devirtualization:

./main.go:42:16: PGO devirtualize considering call s.Speak()
./main.go:42:16: edge main.f:6 -> example/speak.(*Speaker1).Speak (weight 186): hottest so far
./main.go:42:16: edge main.f:6 -> example/speak.(*Speaker2).Speak (weight 34): too cold (hottest 186)
./main.go:42:16 call main.f:6: hottest callee example/speak.(*Speaker1).Speak (weight 186)
./speak/speak.go:9:6: should not PGO devirtualize (*Speaker1).Speak: no function body

I also sent CL 500155, which was a quick exploratory hack. That does seem to successfully allow devirtualization:

./main.go:42:16: PGO devirtualize considering call s.Speak()
./main.go:42:16: edge main.f:6 -> example/speak.(*Speaker1).Speak (weight 186): hottest so far
./main.go:42:16: edge main.f:6 -> example/speak.(*Speaker2).Speak (weight 34): too cold (hottest 186)
./main.go:42:16 call main.f:6: hottest callee example/speak.(*Speaker1).Speak (weight 186)
./main.go:42:16: PGO devirtualizing call to speak.(*Speaker1).Speak

Is your CL 497175 something that might be merged for 1.21, or is that more likely 1.22 at this point?

If a more targeted fix is in order for 1.21, I'd be interested in trying to mature my CL 500155 into a real fix and add tests. (It is likely possible to do better than a string compare 😅). That said, I'd also of course understand if you'd prefer to address this yourself via CL 497175 or another CL.

Thanks, and the PGO devirtualization work seems very promising!

@thepudds thepudds changed the title cmd/compile: PGO does not devirtualize methods in another package (log: "... no function body") cmd/compile: PGO does not devirtualize methods in another package (log: "no function body") Jun 2, 2023
@cherrymui
Copy link
Member

If I understand correctly, for imported inlineable functions, its fn.Body is nil but fn.Inl is not, and fn.Inl.Body is filled during inlining. During inlining, InlineImpossible is only called for locally defined functions. For imported function the inliner just checks for fn.Inl. Maybe we should do that in PGO. Maybe we could use typecheck.HaveInlineBody? But for locally defined functions fn.Inl is not populated until inlining, so we probably need to check both...

@gopherbot
Copy link

Change https://go.dev/cl/500395 mentions this issue: cmd/compile: allow PGO-devirtualize call to imported function

@thepudds
Copy link
Contributor Author

thepudds commented Jun 2, 2023

Hi @cherrymui, thanks for the feedback!

I updated my CL 500155 to use typecheck.HaveInlineBody, and it seemed to work in limited testing so far, including it seems to handle my example from this issue.

@prattmic
Copy link
Member

prattmic commented Jun 2, 2023

@cherrymui Great find, thanks. I'm not sure how I missed this before.

Is your CL 497175 something that might be merged for 1.21, or is that more likely 1.22 at this point?

No, it will go in 1.22, it is too big for the freeze.

@thepudds
Copy link
Contributor Author

thepudds commented Jun 2, 2023

Hi @cherrymui, FWIW, I'm also working on a update to tests for CL 500155 and I think I'm close to having that working.

@prattmic
Copy link
Member

prattmic commented Jun 2, 2023

For reference, when building cmd/compile with PGO I see:

At tip: 35 interface calls devirtualized (12.02% of interface call weight)
With this fix: 63 interface calls devirtualized (12.92% of interface call weight)
With this fix and https://go.dev/cl/497175: 98 interface calls devirtualized (16.1% of interface call weight)

So this is a nice boost in ability to devirtualize, but probably doesn't impact performance much (at least for cmd/compile). I'll run a full sweet run once we merge one of these fix CLs.

@prattmic
Copy link
Member

prattmic commented Jun 5, 2023

Results from my benchmarking run:

                     │ goroot.results │    goroot.pgo.baseline.results     │   goroot.pgo.devirtbase.results    │    goroot.pgo.devirtfix.results    │  goroot.pgo.devirtexport.results   │
                     │     sec/op     │   sec/op     vs base               │   sec/op     vs base               │   sec/op     vs base               │   sec/op     vs base               │
BiogoIgor                  13.98 ± 0%    13.74 ± 1%  -1.69% (p=0.000 n=30)    13.78 ± 0%  -1.42% (p=0.000 n=30)    13.72 ± 1%  -1.86% (p=0.000 n=30)    13.76 ± 1%  -1.54% (p=0.000 n=30)
BiogoKrishna               12.51 ± 1%    12.57 ± 1%       ~ (p=0.070 n=30)    12.53 ± 0%       ~ (p=0.582 n=30)    12.58 ± 1%       ~ (p=0.134 n=30)    12.62 ± 0%  +0.90% (p=0.006 n=30)
BleveIndexBatch100         4.770 ± 1%    4.565 ± 1%  -4.30% (p=0.000 n=30)    4.578 ± 1%  -4.04% (p=0.000 n=30)    4.584 ± 1%  -3.90% (p=0.000 n=30)    4.561 ± 1%  -4.39% (p=0.000 n=30)
EtcdPut                   32.78m ± 1%   32.41m ± 1%  -1.12% (p=0.001 n=30)   32.40m ± 0%  -1.17% (p=0.000 n=30)   31.62m ± 1%  -3.52% (p=0.000 n=30)   31.75m ± 1%  -3.14% (p=0.000 n=30)
EtcdSTM                   147.6m ± 0%   140.7m ± 0%  -4.62% (p=0.000 n=30)   140.4m ± 0%  -4.84% (p=0.000 n=30)   134.6m ± 0%  -8.77% (p=0.000 n=30)   134.3m ± 0%  -8.96% (p=0.000 n=30)
GoBuildKubelet             51.80 ± 0%    50.91 ± 0%  -1.72% (p=0.000 n=30)    50.90 ± 0%  -1.74% (p=0.000 n=30)    50.96 ± 0%  -1.62% (p=0.000 n=30)    50.95 ± 0%  -1.64% (p=0.000 n=30)
GoBuildKubeletLink         7.312 ± 1%    7.116 ± 1%  -2.68% (p=0.000 n=30)    7.131 ± 0%  -2.48% (p=0.000 n=30)    7.146 ± 1%  -2.27% (p=0.000 n=30)    7.133 ± 1%  -2.46% (p=0.000 n=30)
GoBuildIstioctl            44.44 ± 0%    43.81 ± 0%  -1.43% (p=0.000 n=30)    43.85 ± 0%  -1.33% (p=0.000 n=30)    43.83 ± 0%  -1.38% (p=0.000 n=30)    43.78 ± 0%  -1.50% (p=0.000 n=30)
GoBuildIstioctlLink        7.763 ± 1%    7.641 ± 0%  -1.56% (p=0.000 n=30)    7.631 ± 1%  -1.70% (p=0.000 n=30)    7.636 ± 1%  -1.63% (p=0.000 n=30)    7.645 ± 1%  -1.52% (p=0.000 n=30)
GoBuildFrontend            16.07 ± 0%    15.97 ± 0%  -0.63% (p=0.000 n=30)    15.98 ± 0%  -0.60% (p=0.000 n=30)    16.00 ± 0%  -0.47% (p=0.002 n=30)    15.93 ± 0%  -0.87% (p=0.000 n=30)
GoBuildFrontendLink        1.276 ± 1%    1.229 ± 1%  -3.73% (p=0.000 n=30)    1.226 ± 1%  -3.91% (p=0.000 n=30)    1.228 ± 1%  -3.81% (p=0.000 n=30)    1.219 ± 1%  -4.46% (p=0.000 n=30)
GopherLuaKNucleotide       21.77 ± 1%    20.52 ± 1%  -5.71% (p=0.000 n=30)    20.37 ± 1%  -6.40% (p=0.000 n=30)    20.26 ± 1%  -6.93% (p=0.000 n=30)    20.25 ± 1%  -6.99% (p=0.000 n=30)
MarkdownRenderXHTML       259.4m ± 2%   247.5m ± 1%  -4.60% (p=0.000 n=30)   249.4m ± 3%  -3.85% (p=0.004 n=30)   249.6m ± 2%  -3.79% (p=0.004 n=30)   244.4m ± 3%  -5.78% (p=0.000 n=30)
Tile38QueryLoad           512.9µ ± 0%   506.6µ ± 0%  -1.24% (p=0.000 n=30)   509.0µ ± 0%  -0.77% (p=0.000 n=30)   506.0µ ± 0%  -1.35% (p=0.000 n=30)   499.1µ ± 0%  -2.70% (p=0.000 n=30)
geomean                    2.095         2.043       -2.48%                   2.043       -2.45%                   2.033       -2.94%                   2.027       -3.25%

goroot.results: No PGO
goroot.pgo.baseline.results: PGO, no devirtualization
goroot.pgo.devirtbase.results: PGO, with devirtualization, prior to https://go.dev/cl/500155
goroot.pgo.devirtfix.results: PGO, with devirtualization, including https://go.dev/cl/500155
goroot.pgo.devirtexport.results: PGO, with devirtualization, including https://go.dev/cl/500155 and https://go.dev/cl/497175

It is hard to notice in the full list, but this fix did have quite a large impact on a few of these benchmarks:

                     │ goroot.results │    goroot.pgo.baseline.results     │   goroot.pgo.devirtbase.results    │    goroot.pgo.devirtfix.results    │  goroot.pgo.devirtexport.results   │
                     │     sec/op     │   sec/op     vs base               │   sec/op     vs base               │   sec/op     vs base               │   sec/op     vs base               │
EtcdPut                   32.78m ± 1%   32.41m ± 1%  -1.12% (p=0.001 n=30)   32.40m ± 0%  -1.17% (p=0.000 n=30)   31.62m ± 1%  -3.52% (p=0.000 n=30)   31.75m ± 1%  -3.14% (p=0.000 n=30)
EtcdSTM                   147.6m ± 0%   140.7m ± 0%  -4.62% (p=0.000 n=30)   140.4m ± 0%  -4.84% (p=0.000 n=30)   134.6m ± 0%  -8.77% (p=0.000 n=30)   134.3m ± 0%  -8.96% (p=0.000 n=30)
GopherLuaKNucleotide       21.77 ± 1%    20.52 ± 1%  -5.71% (p=0.000 n=30)    20.37 ± 1%  -6.40% (p=0.000 n=30)    20.26 ± 1%  -6.93% (p=0.000 n=30)    20.25 ± 1%  -6.99% (p=0.000 n=30)
Tile38QueryLoad           512.9µ ± 0%   506.6µ ± 0%  -1.24% (p=0.000 n=30)   509.0µ ± 0%  -0.77% (p=0.000 n=30)   506.0µ ± 0%  -1.35% (p=0.000 n=30)   499.1µ ± 0%  -2.70% (p=0.000 n=30)
geomean                   85.72m        82.99m       -3.19%                  82.88m       -3.32%                  81.28m       -5.19%                  81.02m       -5.48%

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
5 participants