Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/tools/go/types/objectpath: add Encoder type, to amortize Scope.Names #58668

Closed
adonovan opened this issue Feb 23, 2023 · 18 comments
Closed

x/tools/go/types/objectpath: add Encoder type, to amortize Scope.Names #58668

adonovan opened this issue Feb 23, 2023 · 18 comments
Labels
Proposal Proposal-Accepted Tools This label describes issues relating to any tools in the x/tools repository.
Milestone

Comments

@adonovan
Copy link
Member

adonovan commented Feb 23, 2023

The objectpath.For function calls Scope.Names, which allocates and sorts a slice of all package members. If For is called for every member of a package (or for every declaration within a package, which is much more), then it incurs an amount of allocation that is quadratic in the number of package members. Some packages (e.g. for CPU instructions sets) are very large, and objectpath loops become so slow they appear to be stuck.

I propose we add a new Encoder type with a For method so that new(Encoder).For gives the old behavior, but re-using the encoder across multiple calls for objects in the same package amortizes the expensive steps like Scope.Names.

See https://go.dev/cl/470679 for the expedient workaround in gopls (merged).
See https://go.dev/cl/470678 for the change to the public API.

@gopherbot gopherbot added the Tools This label describes issues relating to any tools in the x/tools repository. label Feb 23, 2023
@gopherbot gopherbot added this to the Unreleased milestone Feb 23, 2023
@gopherbot
Copy link

Change https://go.dev/cl/470679 mentions this issue: go/types/objectpath: add encoder type, to amortize Scope.Names

@gopherbot
Copy link

Change https://go.dev/cl/470678 mentions this issue: go/types/objectpath: add Encoder type, to amortize Scope.Names

gopherbot pushed a commit to golang/tools that referenced this issue Feb 25, 2023
This change adds a new encoder type with For method, that
is functionally equivalent to the old For function but avoids
the surprising cost of repeated calls to Scope.Names and
canonicalize (now called namedMethods), both of which
allocate and sorts a slice.

The former canonicalize function was previously applied to
interface types too, but this was costly and unnecessary
as they are already sorted.

The public API change will have to wait for proposal 58668
to be accepted and thus fix issue 51017; this change uses
the linkname hack to expose the function to x/tools to fix a
pressing bug in gopls.

See golang/go#58668
Updates golang/go#51017

Change-Id: Id3e8726517d31371b9376b0e47e320f8b6c5e085
Reviewed-on: https://go-review.googlesource.com/c/tools/+/470679
TryBot-Result: Gopher Robot <gobot@golang.org>
gopls-CI: kokoro <noreply+kokoro@google.com>
Run-TryBot: Alan Donovan <adonovan@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
@findleyr
Copy link
Contributor

To add additional context: we simply wouldn't be able to use the objectpath package in gopls without this optimization.

Furthermore, I think Encoder is a good name.

@rsc
Copy link
Contributor

rsc commented Mar 8, 2023

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc
Copy link
Contributor

rsc commented Mar 15, 2023

The API is

package objectpath // import "golang.org/x/tools/go/types/objectpath"

func For(obj types.Object) (Path, error) {
	return new(Encoder).For(obj)
}

// An Encoder amortizes the cost of encoding the paths of multiple objects.
// The zero value of an Encoder is ready to use.
type Encoder struct {
	...
}

func (enc *Encoder) For(obj types.Object) (Path, error) 

Are there any concerns about this API?

@rsc
Copy link
Contributor

rsc commented Mar 29, 2023

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

@timothy-king
Copy link
Contributor

Overall I am in favor of this. I have one minor concern I might as well voice.

func (enc *Encoder) For(obj types.Object) (Path, error) suggests each path can be independently serialized by the encoder. An encoding scheme that shares object path prefixes would be more compact for very large packages but does require a notion of encoding many objs at once to get a benefit. For example instead of serializing a list of paths {"T.T1O", "T.T1CM0"} as {"T.T1O", "T.T1CM0"}, we can serialize it as a shared prefix map plus a list {"T.T1"}, {"0O","0CM0"}. Fact serialization for unitchecker is a bulk operation like this.

It is possible that if objectpath has such a scheme like this in the future, it might be more ergonomic for func (enc *Encoder) For(obj types.Object) (Path, error) to generate a relative path instead. But I don't think this is a serious objection as the compression can be done at the file level later (say gzip) or Encoder can be extended with other functions in the future func (enc *Encoder) All(objs []types.Object) (Table, []Path, error).

@rsc
Copy link
Contributor

rsc commented Apr 6, 2023

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

@adonovan
Copy link
Member Author

adonovan commented Apr 7, 2023

@timothy-king: func (enc *Encoder) For(obj types.Object) (Path, error) suggests each path can be independently serialized by the encoder.

Yes, and that's the existing contract of the objectpath package. Think of it schematically as:

types.Object₁ - types.Package₁ = objectpath.Path
objectpath.Path + types.Package₂ = types.Object₂

If you need some other data structure (Table in your example) to make sense of the Path then either the Path is incomplete, or the Table is just something that can be derived from the Package by an analogous Decoder interface.

You're right that there may be more efficient encodings of sets of Paths, but that's not a compatible evolution of this API.

gopherbot pushed a commit to golang/tools that referenced this issue Apr 8, 2023
This change adds a new Encoder type with For method, that
is functionally equivalent to the old For method but avoids
the surprising cost of repeated calls to Scope.Names, which
allocates and sorts a slice.

See golang/go#58668
Fixes golang/go#51017

Change-Id: I328e60c60f9bc312af253a0509aa029c41960d41
Reviewed-on: https://go-review.googlesource.com/c/tools/+/470678
gopls-CI: kokoro <noreply+kokoro@google.com>
Reviewed-by: Tim King <taking@google.com>
Run-TryBot: Alan Donovan <adonovan@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
@adonovan
Copy link
Member Author

This was fixed by https://go.dev/cl/470678.

@gopherbot
Copy link

Change https://go.dev/cl/496875 mentions this issue: internal/typesinternal: remove NewObjectpathFunc

gopherbot pushed a commit to golang/tools that referenced this issue May 22, 2023
Updates golang/go#58668
Fixes golang/go#60330

Change-Id: I06bf739e9278028cbafb174b93699fbdfe98882f
Reviewed-on: https://go-review.googlesource.com/c/tools/+/496875
Run-TryBot: Alan Donovan <adonovan@google.com>
gopls-CI: kokoro <noreply+kokoro@google.com>
Auto-Submit: Alan Donovan <adonovan@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Robert Findley <rfindley@google.com>
@dominikh
Copy link
Member

@adonovan Wouldn't objectpath.Object also benefit from memoization? In Staticcheck I am seeing
image

@adonovan
Copy link
Member Author

adonovan commented Jul 11, 2023

There has been a memoized version of namedMethods since CL 470679. Of course you have to use an objectpath.Encoder, but it's an easy change.

[Edit: ignore me, replying before coffee.]

@adonovan
Copy link
Member Author

I question whether it's necessary to sort named-type methods at all. So long as the export/import process preserves whatever order was there in the original types.Named produced from syntax, it doesn't actually matter what order that is. Perhaps we could just delete the expectation of Id-ordered method lists and use "whatever order go/types gives us". We should probably add some normative text to the Named.Method doc comment ("arbitrary but deterministic order"), and to the export/import package ("preserves order"), but then we could just remove the sort method. I see Rob just filed an issue.

@findleyr
Copy link
Contributor

@adonovan unfortunately the sorting was added in https://go.dev/cl/354433 to fix https://go.dev/issue/44195. At the time, we didn't understand the implications of the fix.

We're going to have to do something else to avoid the sort. It is prohibitively expensive when doing modular analysis of certain repositories.

@adonovan
Copy link
Member Author

Fortunately that change didn't promise anything in the API, so I think we can simply use whatever unspecified yet deterministic order the type checker produces (and the exporter and importer preserve) and delete the sort.

@findleyr
Copy link
Contributor

@adonovan that CL description cites parse independence as one of the main goals. Probably this requires more investigation.

@adonovan
Copy link
Member Author

The CL description talks about "parse order" as if it is something different from "what's in the source file". I don't know why we would need or want to be independent of that. The attached issue is about the fact that gcexportdata used to sort and thus lose the order, but it has since been fixed so that it now preserves order.

If go/types promises that method order is deterministic, and gcexportdata promises that it preserves method order, then there's nothing more to do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Proposal Proposal-Accepted Tools This label describes issues relating to any tools in the x/tools repository.
Projects
Status: Accepted
Development

No branches or pull requests

6 participants