Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: optimize recursive closure calls #59708

Open
adonovan opened this issue Apr 19, 2023 · 5 comments
Open

cmd/compile: optimize recursive closure calls #59708

adonovan opened this issue Apr 19, 2023 · 5 comments
Assignees
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. FeatureRequest NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. Performance
Milestone

Comments

@adonovan
Copy link
Member

It is often convenient to define recursive functions local to some outer function, either because you don't want to pollute the package namespace with helper functions, or because the recursive function wants to close over locals of the outer function. Of course Go doesn't have recursive function literals so you have to fake it by using the var trick. The escape analysis is smart enough to avoid heap-allocating the closure in such cases, but unfortunately the actual recursive call is still dynamic.

type Cell struct {
	Next *Cell
}

func f(c *Cell) {
	var visit func(c *Cell)
	visit = func(c *Cell) {
		if c.Next != nil {
			visit(c.Next) // generates CALL (R1) on ARM
		}
	}
	visit(c)
}

Could the compiler be made smart enough to notice that the visit variable is assigned exactly once to a func literal, and thus it is safe to turn the whole thing into a letrec so that the call to visit becomes static?

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Apr 19, 2023
@thanm
Copy link
Contributor

thanm commented Apr 19, 2023

This seems doable to me, other folks should probably weigh in as well. We have to be pretty careful making changes to ir.StaticValue, but it seems to me that we could check for this specific case (variable declaration followed immediately by simple assignment).

@randall77
Copy link
Contributor

Yes, I think this is doable.

What's the end goal? To just replace the dynamic call with a static call? Or to know the callee for escape analysis? For the former, it should be pretty easy to do in SSA. But I'm skeptical just changing the call from indirect to direct will save much time. The latter might be much more helpful but that means we have to do it earlier in the compiler, which is more difficult.

@adonovan
Copy link
Member Author

What's the end goal?
I'm skeptical just changing the call from indirect to direct will save much time.

I keep seeing recursive calls as hotspots in profiles and I wanted to avoid the dynamic call overhead, but I just benchmarked it below (on an M1) and as you predicted, it didn't show a great deal of gain: only about 0.6ns per call. Is that a sign that the M1's branch and return prediction is really good?

Or to know the callee for escape analysis? For the former, it should be pretty easy to do in SSA. The latter might be much more helpful but that means we have to do it earlier in the compiler, which is more difficult.

Well, synergy is good. It shouldn't be that hard to detect the pattern during the traversal over typed syntax: you have a var, a single assignment from a literal, and a body that only references the var in function call position.

But the contribution to the profile is less than I expected.

$ cat virt/virt_test.go 
package virt_test

import "testing"

type Cell struct {
	Next *Cell
}

var theList *Cell

func init() {
	var c *Cell
	for i := 0; i < 16; i++ {
		c = &Cell{c}
	}
	theList = c
}

func BenchmarkDynamic(b *testing.B) {
	var visit func(c *Cell)
	visit = func(c *Cell) {
		if c.Next != nil {
			visit(c.Next) // dynamic call
		}
	}

	for i := 0; i < b.N; i++ {
		visit(theList)
	}
}

func BenchmarkStatic(b *testing.B) {
	for i := 0; i < b.N; i++ {
		visit(theList)
	}
}

func visit(c *Cell) {
	if c.Next != nil {
		visit(c.Next) // static call
	}
}
xtools$ go test -bench=. ./virt
BenchmarkDynamic-8   	53538361	        22.34 ns/op
BenchmarkStatic-8    	93342842	        12.85 ns/op

@prattmic prattmic added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Apr 21, 2023
@prattmic prattmic added this to the Backlog milestone Apr 21, 2023
@prattmic
Copy link
Member

cc @golang/compiler

@adonovan
Copy link
Member Author

adonovan commented Mar 5, 2024

Another benchmark comparing (a) a closure and dynamic call with (b) a non-closure with explicit parameter. The difference is about 1ns per call, or about 22% of the recursion cost. (Changing the Anon variant to put 'total' in the closure instead of a parameter adds +2%.)

package recursion_test

import "testing"

type Tree struct {
	Left, Right *Tree
}

const N = 10

var root = makeTree(N)

func makeTree(depth int) *Tree {
	if depth == 0 {
		return nil
	}
	return &Tree{
		Left:  makeTree(depth - 1),
		Right: makeTree(depth - 1),
	}
}

// -- recursion using anonymous function --

func BenchmarkAnon(b *testing.B) {
	for range b.N {
		sum := SumAnon()
		if sum != 1<<N-1 {
			b.Fatal("oops", sum)
		}
	}
}

func SumAnon() (total int) {
	// sum is a closure, but its only free variable is 'sum'
	var sum func(t *Tree, p *int)
	sum = func(t *Tree, p *int) {
		if t != nil {
			*p++
			sum(t.Left, p)
			sum(t.Right, p)
		}
	}
	sum(root, &total)
	return
}

// -- recursion using named function --

func BenchmarkNamed(b *testing.B) {
	for range b.N {
		sum := SumNamed(root)
		if sum != 1<<N-1 {
			b.Fatal("oops", sum)
		}
	}
}

func SumNamed(t *Tree) (total int) {
	sumNamed(t, &total)
	return
}

func sumNamed(t *Tree, p *int) {
	if t != nil {
		*p++
		sumNamed(t.Left, p)
		sumNamed(t.Right, p)
	}
}
xtools$ go test -bench=. ./bench_test.go 
goos: darwin
goarch: arm64
BenchmarkAnon-8    	  295784	      4061 ns/op
BenchmarkNamed-8   	  375530	      3187 ns/op

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. FeatureRequest NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. Performance
Projects
Development

No branches or pull requests

6 participants