Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile/internal/devirtualize: PGO devirtualization lost when inlining function containing devirtualized call #62245

Open
korniltsev opened this issue Aug 23, 2023 · 8 comments
Assignees
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@korniltsev
Copy link
Contributor

korniltsev commented Aug 23, 2023

What version of Go are you using (go version)?

$ go version
go version devel go1.22-081d27a2e5 Wed Aug 23 15:49:39 2023 +0000 linux/amd64
go version go1.21.0 linux/amd64

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=''
GOARCH='amd64'
GOBIN=''
GOCACHE='/home/korniltsev/.cache/go-build'
GOENV='/home/korniltsev/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/korniltsev/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/korniltsev/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/home/korniltsev/github/go-linux-amd64-bootstrap'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/home/korniltsev/github/go-linux-amd64-bootstrap/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='devel go1.22-081d27a2e5 Wed Aug 23 15:49:39 2023 +0000'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/home/korniltsev/pyro/pgo/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build943153879=/tmp/go-build -gno-record-gcc-switches'

What did you do?

package main

import (
	"flag"
	"fmt"
	"os"
	"runtime/pprof"
	"time"
)

type checker interface {
	check([]byte) int
}

type data struct {
	data    []byte
	checker checker
}

func (d *data) checkStatic() int {
	return len(d.data)
}

func (d *data) checkVirtual() int {
	return d.checker.check(d.data)
}

func (d *data) checkManualVirtual() int {
	c1, ok := d.checker.(*c1)
	if ok {
		return c1.check(d.data)
	}
	return d.checker.check(d.data)
}

type c1 struct {
}

func (c *c1) check(bytes []byte) int {
	return len(bytes)
}

type c2 struct {
}

func (c *c2) check(bytes []byte) int {
	return len(bytes)
}

var sz = flag.Int("sz", 0x100, "")
var n = flag.Int("n", 1000000000, "")
var typ = flag.Bool("typ", true, "")
var profile = flag.String("profile", "", "")

var blackHole = 0

//go:noinline
func loopStatic(d *data, n int) {
	for i := 0; i < n; i++ {
		blackHole += d.checkStatic()
	}
}

//go:noinline
func loopVirtual(d *data, n int) {
	for i := 0; i < n; i++ {
		blackHole += d.checkVirtual()
	}
}

//go:noinline
func loopManualVirtual(d *data, n int) {
	for i := 0; i < n; i++ {
		blackHole += d.checkManualVirtual()
	}
}

func main() {
	flag.Parse()
	if *profile != "" {
		fd, _ := os.Create(*profile)
		pprof.StartCPUProfile(fd)
		defer pprof.StopCPUProfile()
	}
	var d = new(data)
	d.data = make([]byte, *sz)
	if *typ {
		d.checker = new(c1)
	} else {
		d.checker = new(c2)
	}
	blackHole += new(c1).check(d.data)
	blackHole += new(c2).check(d.data)
	n := *n
	t1 := time.Now()
	loopVirtual(d, n)
	t2 := time.Now()
	loopManualVirtual(d, n)
	t3 := time.Now()
	loopStatic(d, n)
	t4 := time.Now()
	fmt.Printf("virtual        %+v\n", t2.Sub(t1))
	fmt.Printf("manualVirtual  %+v\n", t3.Sub(t2))
	fmt.Printf("static         %+v\n", t4.Sub(t3))
	fmt.Println(blackHole)
}
# collect profile
/home/korniltsev/github/go-linux-amd64-bootstrap/bin/go build -o nopgo main.go
./nopgo -profile=profile.pgo
# recompile with pgo
/home/korniltsev/github/go-linux-amd64-bootstrap/bin/go build -pgo=profile.pgo -gcflags '-m=2 -d=pgodebug=3' -o pgo main.go
./pgo
debug compiler output


# command-line-arguments
./main.go:25:24: PGO devirtualize considering call d.checker.check(d.data)
./main.go:25:24: edge main.(*data).checkVirtual:1 -> main.(*c1).check (weight 20): hottest so far
./main.go:25:24: edge main.(*data).checkVirtual:1 -> runtime.asyncPreempt (weight 1): too cold (hottest 20)
./main.go:25:24 call main.(*data).checkVirtual:1: hottest callee main.(*c1).check (weight 20)
./main.go:25:24: PGO devirtualizing d.checker.check to (*c1).check
PGO devirtualizing call to PTR-*c1. After: 
.   INLCALL int tc(1) # main.go:25:24
.   INLCALL-Body
.   .   IF-init
.   .   .   AS2 tc(1) # main.go:25:24
.   .   .   AS2-Lhs
.   .   .   .   NAME-main..autotmp_3 esc(N) Class:PAUTO Offset:0 AutoTemp OnStack Used main.checker tc(1) # main.go:50:5
.   .   .   .   NAME-main..autotmp_4 esc(N) Class:PAUTO Offset:0 AutoTemp OnStack Used SLICE-[]byte tc(1) # main.go:50:5
.   .   .   AS2-Rhs
.   .   .   .   DOTPTR main.checker main.checker tc(1) # main.go:25:10
.   .   .   .   .   NAME-main.d Class:PPARAM Offset:0 OnStack Used PTR-*data tc(1) # main.go:24:7
.   .   .   .   DOTPTR main.data SLICE-[]byte tc(1) # main.go:25:26
.   .   .   .   .   NAME-main.d Class:PPARAM Offset:0 OnStack Used PTR-*data tc(1) # main.go:24:7
.   .   .   AS2DOTTYPE tc(1) # main.go:25:24
.   .   .   AS2DOTTYPE-Lhs
.   .   .   .   NAME-main..autotmp_5 esc(N) Class:PAUTO Offset:0 AutoTemp OnStack Used PTR-*c1 tc(1) # main.go:50:5
.   .   .   .   NAME-main..autotmp_6 esc(N) Class:PAUTO Offset:0 AutoTemp OnStack Used bool tc(1) # main.go:50:5
.   .   .   AS2DOTTYPE-Rhs
.   .   .   .   DOTTYPE2 PTR-*c1 tc(1) # main.go:25:24
.   .   .   .   .   NAME-main..autotmp_3 esc(N) Class:PAUTO Offset:0 AutoTemp OnStack Used main.checker tc(1) # main.go:50:5
.   .   IF Likely tc(1) # main.go:25:24
.   .   IF-Cond
.   .   .   NAME-main..autotmp_6 esc(N) Class:PAUTO Offset:0 AutoTemp OnStack Used bool tc(1) # main.go:50:5
.   .   IF-Body
.   .   .   AS2 tc(1) # main.go:25:24
.   .   .   AS2-Lhs
.   .   .   .   NAME-main..autotmp_2 esc(N) Class:PAUTO Offset:0 AutoTemp OnStack Used int tc(1) # main.go:50:5
.   .   .   AS2-Rhs
.   .   .   .   CALLFUNC int tc(1) # main.go:25:24
.   .   .   .   .   METHEXPR main.check FUNC-func(*c1, []byte) int tc(1) # main.go:25:24
.   .   .   .   .   .   TYPE <S> type PTR-*c1 tc(1)
.   .   .   .   CALLFUNC-Args
.   .   .   .   .   NAME-main..autotmp_5 esc(N) Class:PAUTO Offset:0 AutoTemp OnStack Used PTR-*c1 tc(1) # main.go:50:5
.   .   .   .   .   NAME-main..autotmp_4 esc(N) Class:PAUTO Offset:0 AutoTemp OnStack Used SLICE-[]byte tc(1) # main.go:50:5
.   .   IF-Else
.   .   .   AS2 tc(1) # main.go:25:24
.   .   .   AS2-Lhs
.   .   .   .   NAME-main..autotmp_2 esc(N) Class:PAUTO Offset:0 AutoTemp OnStack Used int tc(1) # main.go:50:5
.   .   .   AS2-Rhs
.   .   .   .   CALLINTER int tc(1) # main.go:25:24
.   .   .   .   .   DOTINTER main.check FUNC-method(*struct {}) func([]byte) int tc(1) # main.go:25:18
.   .   .   .   .   .   DOTPTR main.checker main.checker tc(1) # main.go:25:10
.   .   .   .   .   .   .   NAME-main.d Class:PPARAM Offset:0 OnStack Used PTR-*data tc(1) # main.go:24:7
.   .   .   .   CALLINTER-Args
.   .   .   .   .   NAME-main..autotmp_4 esc(N) Class:PAUTO Offset:0 AutoTemp OnStack Used SLICE-[]byte tc(1) # main.go:50:5
.   INLCALL-ReturnVars
.   .   NAME-main..autotmp_2 esc(N) Class:PAUTO Offset:0 AutoTemp OnStack Used int tc(1) # main.go:50:5
{"Pkg":"main","Pos":"/home/korniltsev/pyro/pgo/main.go:25:24","Caller":"main.(*data).checkVirtual","Direct":false,"Interface":true,"Weight":21,"Hottest":"main.(*c1).check","HottestWeight":20,"Devirtualized":"main.(*c1).check","DevirtualizedWeight":20}
{"Pkg":"main","Pos":"/home/korniltsev/pyro/pgo/main.go:31:18","Caller":"main.(*data).checkManualVirtual","Direct":true,"Interface":false,"Weight":0,"Hottest":"main.(*c1).check","HottestWeight":0,"Devirtualized":"","DevirtualizedWeight":0}
./main.go:33:24: PGO devirtualize considering call d.checker.check(d.data)
./main.go:33:24: call main.(*data).checkManualVirtual:5: no hot callee
{"Pkg":"main","Pos":"/home/korniltsev/pyro/pgo/main.go:33:24","Caller":"main.(*data).checkManualVirtual","Direct":false,"Interface":true,"Weight":0,"Hottest":"","HottestWeight":0,"Devirtualized":"","DevirtualizedWeight":0}
{"Pkg":"main","Pos":"/home/korniltsev/pyro/pgo/main.go:60:29","Caller":"main.loopStatic","Direct":true,"Interface":false,"Weight":19,"Hottest":"main.(*data).checkStatic","HottestWeight":19,"Devirtualized":"","DevirtualizedWeight":0}
{"Pkg":"main","Pos":"/home/korniltsev/pyro/pgo/main.go:67:30","Caller":"main.loopVirtual","Direct":true,"Interface":false,"Weight":90,"Hottest":"main.(*data).checkVirtual","HottestWeight":90,"Devirtualized":"","DevirtualizedWeight":0}
{"Pkg":"main","Pos":"/home/korniltsev/pyro/pgo/main.go:74:36","Caller":"main.loopManualVirtual","Direct":true,"Interface":false,"Weight":83,"Hottest":"main.(*data).checkManualVirtual","HottestWeight":83,"Devirtualized":"","DevirtualizedWeight":0}
{"Pkg":"main","Pos":"/home/korniltsev/pyro/pgo/main.go:79:12","Caller":"main.main","Direct":true,"Interface":false,"Weight":0,"Hottest":"flag.Parse","HottestWeight":0,"Devirtualized":"","DevirtualizedWeight":0}
{"Pkg":"main","Pos":"/home/korniltsev/pyro/pgo/main.go:81:21","Caller":"main.main","Direct":true,"Interface":false,"Weight":0,"Hottest":"os.Create","HottestWeight":0,"Devirtualized":"","DevirtualizedWeight":0}
{"Pkg":"main","Pos":"/home/korniltsev/pyro/pgo/main.go:82:24","Caller":"main.main","Direct":true,"Interface":false,"Weight":0,"Hottest":"runtime/pprof.StartCPUProfile","HottestWeight":0,"Devirtualized":"","DevirtualizedWeight":0}
{"Pkg":"main","Pos":"/home/korniltsev/pyro/pgo/main.go:83:29","Caller":"main.main","Direct":true,"Interface":false,"Weight":0,"Hottest":"runtime/pprof.StopCPUProfile","HottestWeight":0,"Devirtualized":"","DevirtualizedWeight":0}
{"Pkg":"main","Pos":"/home/korniltsev/pyro/pgo/main.go:92:28","Caller":"main.main","Direct":true,"Interface":false,"Weight":0,"Hottest":"main.(*c1).check","HottestWeight":0,"Devirtualized":"","DevirtualizedWeight":0}
{"Pkg":"main","Pos":"/home/korniltsev/pyro/pgo/main.go:93:28","Caller":"main.main","Direct":true,"Interface":false,"Weight":0,"Hottest":"main.(*c2).check","HottestWeight":0,"Devirtualized":"","DevirtualizedWeight":0}
{"Pkg":"main","Pos":"/home/korniltsev/pyro/pgo/main.go:95:16","Caller":"main.main","Direct":true,"Interface":false,"Weight":0,"Hottest":"time.Now","HottestWeight":0,"Devirtualized":"","DevirtualizedWeight":0}
{"Pkg":"main","Pos":"/home/korniltsev/pyro/pgo/main.go:96:13","Caller":"main.main","Direct":true,"Interface":false,"Weight":111,"Hottest":"main.loopVirtual","HottestWeight":111,"Devirtualized":"","DevirtualizedWeight":0}
{"Pkg":"main","Pos":"/home/korniltsev/pyro/pgo/main.go:97:16","Caller":"main.main","Direct":true,"Interface":false,"Weight":0,"Hottest":"time.Now","HottestWeight":0,"Devirtualized":"","DevirtualizedWeight":0}
{"Pkg":"main","Pos":"/home/korniltsev/pyro/pgo/main.go:98:19","Caller":"main.main","Direct":true,"Interface":false,"Weight":30,"Hottest":"main.loopManualVirtual","HottestWeight":30,"Devirtualized":"","DevirtualizedWeight":0}
{"Pkg":"main","Pos":"/home/korniltsev/pyro/pgo/main.go:99:16","Caller":"main.main","Direct":true,"Interface":false,"Weight":0,"Hottest":"time.Now","HottestWeight":0,"Devirtualized":"","DevirtualizedWeight":0}
{"Pkg":"main","Pos":"/home/korniltsev/pyro/pgo/main.go:100:12","Caller":"main.main","Direct":true,"Interface":false,"Weight":28,"Hottest":"main.loopStatic","HottestWeight":28,"Devirtualized":"","DevirtualizedWeight":0}
{"Pkg":"main","Pos":"/home/korniltsev/pyro/pgo/main.go:101:16","Caller":"main.main","Direct":true,"Interface":false,"Weight":0,"Hottest":"time.Now","HottestWeight":0,"Devirtualized":"","DevirtualizedWeight":0}
{"Pkg":"main","Pos":"/home/korniltsev/pyro/pgo/main.go:102:43","Caller":"main.main","Direct":true,"Interface":false,"Weight":0,"Hottest":"time.Time.Sub","HottestWeight":0,"Devirtualized":"","DevirtualizedWeight":0}
{"Pkg":"main","Pos":"/home/korniltsev/pyro/pgo/main.go:102:12","Caller":"main.main","Direct":true,"Interface":false,"Weight":0,"Hottest":"fmt.Printf","HottestWeight":0,"Devirtualized":"","DevirtualizedWeight":0}
{"Pkg":"main","Pos":"/home/korniltsev/pyro/pgo/main.go:103:43","Caller":"main.main","Direct":true,"Interface":false,"Weight":0,"Hottest":"time.Time.Sub","HottestWeight":0,"Devirtualized":"","DevirtualizedWeight":0}
{"Pkg":"main","Pos":"/home/korniltsev/pyro/pgo/main.go:103:12","Caller":"main.main","Direct":true,"Interface":false,"Weight":0,"Hottest":"fmt.Printf","HottestWeight":0,"Devirtualized":"","DevirtualizedWeight":0}
{"Pkg":"main","Pos":"/home/korniltsev/pyro/pgo/main.go:104:43","Caller":"main.main","Direct":true,"Interface":false,"Weight":0,"Hottest":"time.Time.Sub","HottestWeight":0,"Devirtualized":"","DevirtualizedWeight":0}
{"Pkg":"main","Pos":"/home/korniltsev/pyro/pgo/main.go:104:12","Caller":"main.main","Direct":true,"Interface":false,"Weight":0,"Hottest":"fmt.Printf","HottestWeight":0,"Devirtualized":"","DevirtualizedWeight":0}
{"Pkg":"main","Pos":"/home/korniltsev/pyro/pgo/main.go:105:13","Caller":"main.main","Direct":true,"Interface":false,"Weight":0,"Hottest":"fmt.Println","HottestWeight":0,"Devirtualized":"","DevirtualizedWeight":0}
{"Pkg":"main","Pos":"/home/korniltsev/pyro/pgo/main.go:50:18","Caller":"main.init","Direct":true,"Interface":false,"Weight":0,"Hottest":"flag.Int","HottestWeight":0,"Devirtualized":"","DevirtualizedWeight":0}
{"Pkg":"main","Pos":"/home/korniltsev/pyro/pgo/main.go:51:17","Caller":"main.init","Direct":true,"Interface":false,"Weight":0,"Hottest":"flag.Int","HottestWeight":0,"Devirtualized":"","DevirtualizedWeight":0}
{"Pkg":"main","Pos":"/home/korniltsev/pyro/pgo/main.go:52:20","Caller":"main.init","Direct":true,"Interface":false,"Weight":0,"Hottest":"flag.Bool","HottestWeight":0,"Devirtualized":"","DevirtualizedWeight":0}
{"Pkg":"main","Pos":"/home/korniltsev/pyro/pgo/main.go:53:26","Caller":"main.init","Direct":true,"Interface":false,"Weight":0,"Hottest":"flag.String","HottestWeight":0,"Devirtualized":"","DevirtualizedWeight":0}
hot-callsite-thres-from-CDF=4.973821989528796
hot-cg before inline in dot format:
digraph G {
forcelabels=true;
"main.(*c1).check" [color=black, style=solid, label="main.(*c1).check"];
"main.loopVirtual" [color=black, style=solid, label="main.loopVirtual"];
"flag.Parse" [color=black, style=solid, label="flag.Parse,inl_cost=62"];
"flag.Bool" [color=black, style=solid, label="flag.Bool,inl_cost=63"];
"main.(*data).checkVirtual" [color=black, style=solid, label="main.(*data).checkVirtual"];
"main.(*data).checkStatic" [color=black, style=solid, label="main.(*data).checkStatic"];
"main.(*c2).check" [color=black, style=solid, label="main.(*c2).check"];
"main.main" [color=black, style=solid, label="main.main"];
"time.Time.Sub" [color=black, style=solid, label="time.Time.Sub"];
"fmt.Printf" [color=black, style=solid, label="fmt.Printf,inl_cost=73"];
"os.Create" [color=black, style=solid, label="os.Create,inl_cost=72"];
"flag.String" [color=black, style=solid, label="flag.String,inl_cost=63"];
"runtime.asyncPreempt" [color=black, style=dashed, label="runtime.asyncPreempt"];
"main.loopManualVirtual" [color=black, style=solid, label="main.loopManualVirtual"];
"main.(*data).checkManualVirtual" [color=black, style=solid, label="main.(*data).checkManualVirtual"];
"runtime/pprof.StartCPUProfile" [color=black, style=solid, label="runtime/pprof.StartCPUProfile"];
"flag.Int" [color=black, style=solid, label="flag.Int,inl_cost=63"];
"main.loopStatic" [color=black, style=solid, label="main.loopStatic"];
"time.Now" [color=black, style=solid, label="time.Now"];
"fmt.Println" [color=black, style=solid, label="fmt.Println,inl_cost=72"];
"runtime/pprof.StopCPUProfile" [color=black, style=solid, label="runtime/pprof.StopCPUProfile"];
"main.init" [color=black, style=solid, label="main.init"];
edge [color=black, style=dashed];
"main.(*data).checkVirtual" -> "runtime.asyncPreempt" [label="0.26"];
edge [color=red, style=solid];
"main.(*data).checkVirtual" -> "main.(*c1).check" [label="5.24"];
edge [color=black, style=solid];
"main.(*data).checkManualVirtual" -> "main.(*c1).check" [label="0.00"];
edge [color=black, style=solid];
"main.loopStatic" -> "main.(*data).checkStatic" [label="4.97"];
edge [color=red, style=solid];
"main.loopVirtual" -> "main.(*data).checkVirtual" [label="23.56"];
edge [color=red, style=solid];
"main.loopManualVirtual" -> "main.(*data).checkManualVirtual" [label="21.73"];
edge [color=black, style=solid];
"main.main" -> "main.(*c2).check" [label="0.00"];
edge [color=black, style=solid];
"main.main" -> "time.Now" [label="0.00"];
edge [color=black, style=solid];
"main.main" -> "fmt.Printf" [label="0.00"];
edge [color=black, style=solid];
"main.main" -> "time.Now" [label="0.00"];
edge [color=black, style=solid];
"main.main" -> "fmt.Printf" [label="0.00"];
edge [color=black, style=solid];
"main.main" -> "fmt.Printf" [label="0.00"];
edge [color=black, style=solid];
"main.main" -> "flag.Parse" [label="0.00"];
edge [color=black, style=solid];
"main.main" -> "os.Create" [label="0.00"];
edge [color=black, style=solid];
"main.main" -> "main.(*c1).check" [label="0.00"];
edge [color=red, style=solid];
"main.main" -> "main.loopManualVirtual" [label="7.85"];
edge [color=black, style=solid];
"main.main" -> "time.Now" [label="0.00"];
edge [color=black, style=solid];
"main.main" -> "time.Time.Sub" [label="0.00"];
edge [color=black, style=solid];
"main.main" -> "time.Time.Sub" [label="0.00"];
edge [color=black, style=solid];
"main.main" -> "fmt.Println" [label="0.00"];
edge [color=black, style=solid];
"main.main" -> "runtime/pprof.StartCPUProfile" [label="0.00"];
edge [color=black, style=solid];
"main.main" -> "time.Now" [label="0.00"];
edge [color=red, style=solid];
"main.main" -> "main.loopVirtual" [label="29.06"];
edge [color=black, style=solid];
"main.main" -> "runtime/pprof.StopCPUProfile" [label="0.00"];
edge [color=red, style=solid];
"main.main" -> "main.loopStatic" [label="7.33"];
edge [color=black, style=solid];
"main.main" -> "time.Time.Sub" [label="0.00"];
edge [color=black, style=solid];
"main.init" -> "flag.Int" [label="0.00"];
edge [color=black, style=solid];
"main.init" -> "flag.Int" [label="0.00"];
edge [color=black, style=solid];
"main.init" -> "flag.Bool" [label="0.00"];
edge [color=black, style=solid];
"main.init" -> "flag.String" [label="0.00"];
}
hot-node enabled increased budget=2000 for func=main.(*data).checkStatic
./main.go:20:6: can inline (*data).checkStatic with cost 4 as: method(*data) func() int { return len(d.data) }
hot-node enabled increased budget=2000 for func=main.(*c1).check
./main.go:39:6: can inline (*c1).check with cost 3 as: method(*c1) func([]byte) int { return len(bytes) }
hot-node enabled increased budget=2000 for func=main.(*data).checkVirtual
./main.go:24:6: can inline (*data).checkVirtual with cost 89 as: method(*data) func() int { return .autotmp_2 }
./main.go:25:24: inlining call to (*c1).check
hot-node enabled increased budget=2000 for func=main.(*data).checkManualVirtual
./main.go:28:6: can inline (*data).checkManualVirtual with cost 88 as: method(*data) func() int { c1, ok := (*c1)(.autotmp_4), .autotmp_5; if ok { return (*c1).check(c1, d.data) }; return d.checker.check(d.data) }
./main.go:31:18: inlining call to (*c1).check
./main.go:46:6: can inline (*c2).check with cost 3 as: method(*c2) func([]byte) int { return len(bytes) }
./main.go:58:6: cannot inline loopStatic: marked go:noinline
./main.go:60:29: inlining call to (*data).checkStatic
./main.go:65:6: cannot inline loopVirtual: marked go:noinline
hot-budget check allows inlining for call main.(*data).checkVirtual (cost 89) at ./main.go:67:30 in function main.loopVirtual
./main.go:67:30: inlining call to (*data).checkVirtual
./main.go:72:6: cannot inline loopManualVirtual: marked go:noinline
hot-budget check allows inlining for call main.(*data).checkManualVirtual (cost 88) at ./main.go:74:36 in function main.loopManualVirtual
./main.go:74:36: inlining call to (*data).checkManualVirtual
./main.go:74:36: inlining call to (*c1).check
./main.go:78:6: cannot inline main: unhandled op DEFER
./main.go:79:12: inlining call to flag.Parse
./main.go:81:21: inlining call to os.Create
./main.go:92:28: inlining call to (*c1).check
./main.go:93:28: inlining call to (*c2).check
./main.go:102:12: inlining call to fmt.Printf
./main.go:103:12: inlining call to fmt.Printf
./main.go:104:12: inlining call to fmt.Printf
./main.go:105:13: inlining call to fmt.Println
./main.go:50:18: inlining call to flag.Int
./main.go:51:17: inlining call to flag.Int
./main.go:52:20: inlining call to flag.Bool
./main.go:53:26: inlining call to flag.String
./main.go:20:7: d does not escape
./main.go:24:7: parameter d leaks to {heap} with derefs=1:
./main.go:24:7:   flow: {heap} = *d:
./main.go:24:7:     from d.checker (dot of pointer) at ./main.go:25:10
./main.go:24:7:     from d.checker.check(.autotmp_4) (call parameter) at ./main.go:25:24
./main.go:24:7: leaking param content: d
./main.go:28:7: parameter d leaks to {heap} with derefs=1:
./main.go:28:7:   flow: {heap} = *d:
./main.go:28:7:     from d.checker (dot of pointer) at ./main.go:33:10
./main.go:28:7:     from d.checker.check(d.data) (call parameter) at ./main.go:33:24
./main.go:28:7: leaking param content: d
./main.go:39:7: c does not escape
./main.go:39:20: bytes does not escape
./main.go:46:7: c does not escape
./main.go:46:20: bytes does not escape
./main.go:58:17: d does not escape
./main.go:65:18: parameter d leaks to {heap} with derefs=1:
./main.go:65:18:   flow: d = d:
./main.go:65:18:     from d := d (assign-pair) at ./main.go:67:30
./main.go:65:18:   flow: {heap} = *d:
./main.go:65:18:     from d.checker (dot of pointer) at ./main.go:67:30
./main.go:65:18:     from d.checker.check(d.data) (call parameter) at ./main.go:67:30
./main.go:65:18: leaking param content: d
./main.go:72:24: parameter d leaks to {heap} with derefs=1:
./main.go:72:24:   flow: d = d:
./main.go:72:24:     from d := d (assign-pair) at ./main.go:74:36
./main.go:72:24:   flow: {heap} = *d:
./main.go:72:24:     from d.checker (dot of pointer) at ./main.go:74:36
./main.go:72:24:     from d.checker.check(d.data) (call parameter) at ./main.go:74:36
./main.go:72:24: leaking param content: d
./main.go:86:15: make([]byte, *sz) escapes to heap:
./main.go:86:15:   flow: {heap} = &{storage for make([]byte, *sz)}:
./main.go:86:15:     from make([]byte, *sz) (spill) at ./main.go:86:15
./main.go:86:15:     from d.data = make([]byte, *sz) (assign) at ./main.go:86:9
./main.go:88:18: new(c1) escapes to heap:
./main.go:88:18:   flow: {heap} = &{storage for new(c1)}:
./main.go:88:18:     from new(c1) (spill) at ./main.go:88:18
./main.go:88:18:     from new(c1) (interface-converted) at ./main.go:88:18
./main.go:88:18:     from d.checker = new(c1) (assign) at ./main.go:88:13
./main.go:90:18: new(c2) escapes to heap:
./main.go:90:18:   flow: {heap} = &{storage for new(c2)}:
./main.go:90:18:     from new(c2) (spill) at ./main.go:90:18
./main.go:90:18:     from new(c2) (interface-converted) at ./main.go:90:18
./main.go:90:18:     from d.checker = new(c2) (assign) at ./main.go:90:13
./main.go:86:15: make([]byte, *sz) escapes to heap:
./main.go:86:15:   flow: {heap} = &{storage for make([]byte, *sz)}:
./main.go:86:15:     from make([]byte, *sz) (non-constant size) at ./main.go:86:15
./main.go:105:14: blackHole escapes to heap:
./main.go:105:14:   flow: {storage for ... argument} = &{storage for blackHole}:
./main.go:105:14:     from blackHole (spill) at ./main.go:105:14
./main.go:105:14:     from ... argument (slice-literal-element) at ./main.go:105:13
./main.go:105:14:   flow: fmt.a = &{storage for ... argument}:
./main.go:105:14:     from ... argument (spill) at ./main.go:105:13
./main.go:105:14:     from fmt.a := ... argument (assign-pair) at ./main.go:105:13
./main.go:105:14:   flow: {heap} = *fmt.a:
./main.go:105:14:     from fmt.Fprintln(os.Stdout, fmt.a...) (call parameter) at ./main.go:105:13
./main.go:104:43: time.Time.Sub(t4, t3) escapes to heap:
./main.go:104:43:   flow: {storage for ... argument} = &{storage for time.Time.Sub(t4, t3)}:
./main.go:104:43:     from time.Time.Sub(t4, t3) (spill) at ./main.go:104:43
./main.go:104:43:     from ... argument (slice-literal-element) at ./main.go:104:12
./main.go:104:43:   flow: fmt.a = &{storage for ... argument}:
./main.go:104:43:     from ... argument (spill) at ./main.go:104:12
./main.go:104:43:     from fmt.format, fmt.a := "static         %+v\n", ... argument (assign-pair) at ./main.go:104:12
./main.go:104:43:   flow: {heap} = *fmt.a:
./main.go:104:43:     from fmt.Fprintf(os.Stdout, fmt.format, fmt.a...) (call parameter) at ./main.go:104:12
./main.go:103:43: time.Time.Sub(t3, t2) escapes to heap:
./main.go:103:43:   flow: {storage for ... argument} = &{storage for time.Time.Sub(t3, t2)}:
./main.go:103:43:     from time.Time.Sub(t3, t2) (spill) at ./main.go:103:43
./main.go:103:43:     from ... argument (slice-literal-element) at ./main.go:103:12
./main.go:103:43:   flow: fmt.a = &{storage for ... argument}:
./main.go:103:43:     from ... argument (spill) at ./main.go:103:12
./main.go:103:43:     from fmt.format, fmt.a := "manualVirtual  %+v\n", ... argument (assign-pair) at ./main.go:103:12
./main.go:103:43:   flow: {heap} = *fmt.a:
./main.go:103:43:     from fmt.Fprintf(os.Stdout, fmt.format, fmt.a...) (call parameter) at ./main.go:103:12
./main.go:102:43: time.Time.Sub(t2, t1) escapes to heap:
./main.go:102:43:   flow: {storage for ... argument} = &{storage for time.Time.Sub(t2, t1)}:
./main.go:102:43:     from time.Time.Sub(t2, t1) (spill) at ./main.go:102:43
./main.go:102:43:     from ... argument (slice-literal-element) at ./main.go:102:12
./main.go:102:43:   flow: fmt.a = &{storage for ... argument}:
./main.go:102:43:     from ... argument (spill) at ./main.go:102:12
./main.go:102:43:     from fmt.format, fmt.a := "virtual        %+v\n", ... argument (assign-pair) at ./main.go:102:12
./main.go:102:43:   flow: {heap} = *fmt.a:
./main.go:102:43:     from fmt.Fprintf(os.Stdout, fmt.format, fmt.a...) (call parameter) at ./main.go:102:12
./main.go:85:13: new(data) does not escape
./main.go:86:15: make([]byte, *sz) escapes to heap
./main.go:88:18: new(c1) escapes to heap
./main.go:90:18: new(c2) escapes to heap
./main.go:92:18: new(c1) does not escape
./main.go:93:18: new(c2) does not escape
./main.go:102:12: ... argument does not escape
./main.go:102:43: time.Time.Sub(t2, t1) escapes to heap
./main.go:103:12: ... argument does not escape
./main.go:103:43: time.Time.Sub(t3, t2) escapes to heap
./main.go:104:12: ... argument does not escape
./main.go:104:43: time.Time.Sub(t4, t3) escapes to heap
./main.go:105:13: ... argument does not escape
./main.go:105:14: blackHole escapes to heap
./main.go:12:8: parameter .anon0 leaks to {heap} with derefs=0:
./main.go:12:8:   flow: {heap} = .anon0:
./main.go:12:8:     from .this.check(.anon0) (call parameter) at <autogenerated>:1
<autogenerated>:1: parameter .this leaks to {heap} with derefs=0:
<autogenerated>:1:   flow: {heap} = .this:
<autogenerated>:1:     from .this.check(.anon0) (call parameter) at <autogenerated>:1

go tool objdump -s '^main' pgo
go tool objdump -s '^main' pgo
TEXT main.(*c1).check(SB) /home/korniltsev/pyro/pgo/main.go
  main.go:39            0x4ab940                48895c2410              MOVQ BX, 0x10(SP)
  main.go:40            0x4ab945                4889c8                  MOVQ CX, AX
  main.go:40            0x4ab948                c3                      RET

TEXT main.(*c2).check(SB) /home/korniltsev/pyro/pgo/main.go
  main.go:46            0x4ab960                48895c2410              MOVQ BX, 0x10(SP)
  main.go:47            0x4ab965                4889c8                  MOVQ CX, AX
  main.go:47            0x4ab968                c3                      RET

TEXT main.loopStatic(SB) /home/korniltsev/pyro/pgo/main.go
  main.go:59            0x4ab980                eb15                    JMP 0x4ab997
  main.go:60            0x4ab982                488b0dffff0f00          MOVQ main.blackHole(SB), CX
  main.go:21            0x4ab989                48034808                ADDQ 0x8(AX), CX
  main.go:59            0x4ab98d                48ffcb                  DECQ BX
  main.go:60            0x4ab990                48890df1ff0f00          MOVQ CX, main.blackHole(SB)
  main.go:59            0x4ab997                4885db                  TESTQ BX, BX
  main.go:59            0x4ab99a                7fe6                    JG 0x4ab982
  main.go:62            0x4ab99c                c3                      RET

TEXT main.loopVirtual(SB) /home/korniltsev/pyro/pgo/main.go
  main.go:65            0x4ab9a0                493b6610                CMPQ SP, 0x10(R14)
  main.go:65            0x4ab9a4                7653                    JBE 0x4ab9f9
  main.go:65            0x4ab9a6                55                      PUSHQ BP
  main.go:65            0x4ab9a7                4889e5                  MOVQ SP, BP
  main.go:65            0x4ab9aa                4883ec28                SUBQ $0x28, SP
  main.go:65            0x4ab9ae                4889442438              MOVQ AX, 0x38(SP)
  main.go:66            0x4ab9b3                eb39                    JMP 0x4ab9ee
  main.go:25            0x4ab9b5                488b5018                MOVQ 0x18(AX), DX
  main.go:25            0x4ab9b9                488b7020                MOVQ 0x20(AX), SI
  main.go:25            0x4ab9bd                488b5218                MOVQ 0x18(DX), DX
  main.go:25            0x4ab9c1                4c8b00                  MOVQ 0(AX), R8
  main.go:25            0x4ab9c4                488b4808                MOVQ 0x8(AX), CX
  main.go:25            0x4ab9c8                488b7810                MOVQ 0x10(AX), DI
  main.go:66            0x4ab9cc                4c8d4bff                LEAQ -0x1(BX), R9
  main.go:66            0x4ab9d0                4c894c2420              MOVQ R9, 0x20(SP)
  main.go:25            0x4ab9d5                4c89c3                  MOVQ R8, BX
  main.go:25            0x4ab9d8                4889f0                  MOVQ SI, AX
  main.go:25            0x4ab9db                ffd2                    CALL DX
  main.go:67            0x4ab9dd                480105a4ff0f00          ADDQ AX, main.blackHole(SB)
  main.go:25            0x4ab9e4                488b442438              MOVQ 0x38(SP), AX
  main.go:66            0x4ab9e9                488b5c2420              MOVQ 0x20(SP), BX
  main.go:66            0x4ab9ee                4885db                  TESTQ BX, BX
  main.go:66            0x4ab9f1                7fc2                    JG 0x4ab9b5
  main.go:69            0x4ab9f3                4883c428                ADDQ $0x28, SP
  main.go:69            0x4ab9f7                5d                      POPQ BP
  main.go:69            0x4ab9f8                c3                      RET
  main.go:65            0x4ab9f9                4889442408              MOVQ AX, 0x8(SP)
  main.go:65            0x4ab9fe                48895c2410              MOVQ BX, 0x10(SP)
  main.go:65            0x4aba03                e87824fbff              CALL runtime.morestack_noctxt.abi0(SB)
  main.go:65            0x4aba08                488b442408              MOVQ 0x8(SP), AX
  main.go:65            0x4aba0d                488b5c2410              MOVQ 0x10(SP), BX
  main.go:65            0x4aba12                eb8c                    JMP main.loopVirtual(SB)

TEXT main.loopManualVirtual(SB) /home/korniltsev/pyro/pgo/main.go
  main.go:72            0x4aba20                493b6610                CMPQ SP, 0x10(R14)
  main.go:72            0x4aba24                766e                    JBE 0x4aba94
  main.go:72            0x4aba26                55                      PUSHQ BP
  main.go:72            0x4aba27                4889e5                  MOVQ SP, BP
  main.go:72            0x4aba2a                4883ec28                SUBQ $0x28, SP
  main.go:72            0x4aba2e                4889442438              MOVQ AX, 0x38(SP)
  main.go:73            0x4aba33                eb0b                    JMP 0x4aba40
  main.go:74            0x4aba35                4801154cff0f00          ADDQ DX, main.blackHole(SB)
  main.go:73            0x4aba3c                48ffcb                  DECQ BX
  main.go:73            0x4aba3f                90                      NOPL
  main.go:73            0x4aba40                4885db                  TESTQ BX, BX
  main.go:73            0x4aba43                7e49                    JLE 0x4aba8e
  main.go:29            0x4aba45                488b5018                MOVQ 0x18(AX), DX
  main.go:29            0x4aba49                488d35587d0400          LEAQ go:itab.*main.c1,main.checker(SB), SI
  main.go:29            0x4aba50                4839f2                  CMPQ DX, SI
  main.go:29            0x4aba53                7506                    JNE 0x4aba5b
  main.go:31            0x4aba55                488b5008                MOVQ 0x8(AX), DX
  main.go:74            0x4aba59                ebda                    JMP 0x4aba35
  main.go:73            0x4aba5b                48895c2420              MOVQ BX, 0x20(SP)
  main.go:29            0x4aba60                488b7020                MOVQ 0x20(AX), SI
  main.go:33            0x4aba64                488b5218                MOVQ 0x18(DX), DX
  main.go:33            0x4aba68                488b18                  MOVQ 0(AX), BX
  main.go:33            0x4aba6b                488b4808                MOVQ 0x8(AX), CX
  main.go:33            0x4aba6f                488b7810                MOVQ 0x10(AX), DI
  main.go:33            0x4aba73                4889f0                  MOVQ SI, AX
  main.go:33            0x4aba76                ffd2                    CALL DX
  main.go:73            0x4aba78                488b5c2420              MOVQ 0x20(SP), BX
  main.go:73            0x4aba7d                488d35247d0400          LEAQ go:itab.*main.c1,main.checker(SB), SI
  main.go:74            0x4aba84                4889c2                  MOVQ AX, DX
  main.go:29            0x4aba87                488b442438              MOVQ 0x38(SP), AX
  main.go:74            0x4aba8c                eba7                    JMP 0x4aba35
  main.go:76            0x4aba8e                4883c428                ADDQ $0x28, SP
  main.go:76            0x4aba92                5d                      POPQ BP
  main.go:76            0x4aba93                c3                      RET
  main.go:72            0x4aba94                4889442408              MOVQ AX, 0x8(SP)
  main.go:72            0x4aba99                48895c2410              MOVQ BX, 0x10(SP)
  main.go:72            0x4aba9e                6690                    NOPW
  main.go:72            0x4abaa0                e8db23fbff              CALL runtime.morestack_noctxt.abi0(SB)
  main.go:72            0x4abaa5                488b442408              MOVQ 0x8(SP), AX
  main.go:72            0x4abaaa                488b5c2410              MOVQ 0x10(SP), BX
  main.go:72            0x4abaaf                e96cffffff              JMP main.loopManualVirtual(SB)

What did you expect to see?

Debug compiler output says

./main.go:25:24: PGO devirtualizing d.checker.check to (*c1).check

In the objdump of main.loopVirtual I expected to see an itab check and conditional direct call (and maybe inlined function body instead of direct method call).

What did you see instead?

In the objdump of main.loopVirtual I see an unconditional indirect call via DX register

@prattmic prattmic self-assigned this Aug 25, 2023
@prattmic prattmic added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. compiler/runtime Issues related to the Go compiler and/or runtime. labels Aug 25, 2023
@mknyszek mknyszek added this to the Backlog milestone Aug 30, 2023
@prattmic
Copy link
Member

prattmic commented Sep 1, 2023

I can reproduce this as well, thanks.

main.(*data).checkVirtual is inlined into main.loopVirtual and at first glance it looks like the devirtualization is lost at that point; the original version of the function is inlined. I'm not sure why yet.

@prattmic
Copy link
Member

prattmic commented Sep 1, 2023

I suppose this is a fundamental mismatch between the way PGO devirtualization works and the way inlining works with unified IR.

PGO devirtualization edits the function IR to replace interface calls with a conditional direct or indirect call based on a type assertion.

This runs prior to inlining so that the direct call is eligible for inlining.

But during inlining, the inlining itself is done with unifiedInlineCall, which "implements inline.NewInline by re-reading the function body from its Unified IR export data." This means that any edits made to the function body prior to inlining are lost. Specifically, if a function was edited to PGO devirtualize a call, that devirtualization will be lost if that function is inlined (note that this is the opposite case of our goal above).

This theoretically impacts any edits to the IR between IR/export data creation and inlining, but as far as I can tell PGO devirtualization is the only such editing.

For PGO devirtualization, I'm not sure that there is a good quick fix for this. We could run PGO devirtualization again after inlining (which is also when static devirtualization occurs), but that is only a band-aid: devirtualization that occurs at that point won't get the inlining we desire.

I think the best long-term solution would be to combine inlining and devirtualization (static and PGO) into the same pass, so we can iteratively apply both: inlining can enable devirtualization, which can in turn enable more inlining, and so on. This idea is related to #52193.

cc @mdempsky

@prattmic prattmic changed the title cmd/compile/internal/devirtualize: PGO devirtualizing not found in objdump cmd/compile/internal/devirtualize: PGO devirtualize lost when inlining function containing devirtualized call Sep 1, 2023
@prattmic prattmic changed the title cmd/compile/internal/devirtualize: PGO devirtualize lost when inlining function containing devirtualized call cmd/compile/internal/devirtualize: PGO devirtualization lost when inlining function containing devirtualized call Sep 1, 2023
@prattmic
Copy link
Member

prattmic commented Sep 1, 2023

cc @cherrymui @aclements

@prattmic
Copy link
Member

prattmic commented Sep 1, 2023

cc @thanm

@prattmic
Copy link
Member

prattmic commented Sep 1, 2023

@mdempsky for the immediate term: do you foresee any correctness problems with the current behavior of the primary ir.Func.Body differing from what gets inlined via export data? It seems unlikely to cause correctness issues to me given that the inlined body is likely to be changed by later optimizations anyway. But perhaps there are some assumptions in unifiedInlineCall itself that the passed ir.Func agrees with what is in the export data?

@cherrymui
Copy link
Member

Is it possible to update the "Unified IR export data" after devirtualization? (Perhaps we only need to update inlineable functions, to reduce unnecessary work.)

@prattmic
Copy link
Member

prattmic commented Sep 1, 2023

That's a good question, I'm not sure how complex that is.

Just as a clarification, #62245 (comment) refers to inlining within the same package. I think if inlining occurs across packages it should be OK, since the function body containing PGO devirtualization will get written out to the disk version of the export data.

@mdempsky
Copy link
Member

mdempsky commented Sep 1, 2023

I think the best long-term solution would be to combine inlining and devirtualization (static and PGO) into the same pass, so we can iteratively apply both: inlining can enable devirtualization, which can in turn enable more inlining, and so on.

Ack, I'm looking into this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
Development

No branches or pull requests

5 participants