Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

go/types: Implements is an allocation hotspot #61162

Open
adonovan opened this issue Jul 3, 2023 · 4 comments
Open

go/types: Implements is an allocation hotspot #61162

adonovan opened this issue Jul 3, 2023 · 4 comments
Assignees
Labels
NeedsFix The path to resolution is known, but the work has not been done. Performance
Milestone

Comments

@adonovan
Copy link
Member

adonovan commented Jul 3, 2023

While profiling the rapid type analysis (golang.org/x/tools/go/callgraph/rta) I noticed that it spends a majority of its time in the types.Implements function. It's not entirely surprising as it's a key part of the algorithm, but we could easily make the analysis several times faster by optimizing this step. Allocation of the next slice in types.lookupFieldOrMethod is a major cost: 43s out of 141s CPU.

Here's an overview of the profile:
pprof001

You can reproduce it by building cmd/deadcode in https://go.dev/cl/507738 and running it on k8s.io/cmd/kubelet.

@gopherbot
Copy link

Change https://go.dev/cl/507855 mentions this issue: go/callgraph/rta: optimise 'implements' operation

@adonovan
Copy link
Member Author

adonovan commented Jul 4, 2023

To be fair, the RTA algorithm calls this function too much, so I've changed it to use the Bloom filter technique to reject obvious candidates, which made it 10x faster. Implements still uses 10% of CPU though, which seems high.

pprof001

^ click this vast empty space to see the profile

@findleyr findleyr added this to the Go1.22 milestone Jul 6, 2023
gopherbot pushed a commit to golang/tools that referenced this issue Jul 6, 2023
Before, RTA spent most of its time in types.Implements.
This change uses the fingerprint technique (similar to
a Bloom filter) from CL 452060 to quickly reject most
candidates.

The running time of the deadcode command on cmd/kubelet
dropped from 92s to 10s.

Updates golang/go#61162

Change-Id: Iacfba027270613e943a617129ff056c629b11f91
Reviewed-on: https://go-review.googlesource.com/c/tools/+/507855
Run-TryBot: Alan Donovan <adonovan@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Robert Findley <rfindley@google.com>
Auto-Submit: Alan Donovan <adonovan@google.com>
@seankhliao seankhliao added Performance NeedsFix The path to resolution is known, but the work has not been done. labels Jul 17, 2023
@griesemer
Copy link
Contributor

Agreed that this function could use some TLC.

@findleyr
Copy link
Contributor

I looked into this briefly last week, and didn't quite reproduce what Alan saw. I think we could do better here, for example by reusing slices, but it doesn't seem urgent. Moving to 1.23.

@findleyr findleyr modified the milestones: Go1.22, Go1.23 Nov 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsFix The path to resolution is known, but the work has not been done. Performance
Projects
None yet
Development

No branches or pull requests

5 participants