x/tools/go/packages: Load in NeedName|NeedFiles mode can be sped up significantly because providing answer doesn't require internet use #31893
This issue is related to #29452 and #31087, but is more narrowly focused on the performance of the packages.NeedName | packages.NeedFiles mode.
Background
The Go specification defines a Go source file organization that is very conducive to extracting partial information with high performance and efficiency.
Specifically, each .go file always has 3 categories of content in well-specified order. The package clause is always first, followed by a possibly empty set of import declarations, and then finally the rest of its content.
This means that to find the package name in a .go file, a Go parser can stop after it parses the package clause. To find all imported packages in a .go file, the parser can stop after it reaches the first non-import declaration.
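As a rough illustration of this early-stopping property, the sketch below parses an in-memory source file twice with go/parser: once in PackageClauseOnly mode to get the package name, and once in ImportsOnly mode to get the import paths. The sample source and the helper names packageName and importPaths are made up for this example and are not taken from the report.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
)

// sampleSrc is a hypothetical .go file used only for illustration.
const sampleSrc = `// +build linux darwin

package target

import (
	"fmt"
	"os"
)

func Hello() { fmt.Fprintln(os.Stdout, "hello") }
`

// packageName asks the parser to stop right after the package clause
// and returns the declared package name.
func packageName(src string) (string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "sample.go", src, parser.PackageClauseOnly)
	if err != nil {
		return "", err
	}
	return f.Name.Name, nil
}

// importPaths asks the parser to stop after the import declarations
// and returns the unquoted import paths in source order.
func importPaths(src string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "sample.go", src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var paths []string
	for _, spec := range f.Imports {
		p, err := strconv.Unquote(spec.Path.Value)
		if err != nil {
			return nil, err
		}
		paths = append(paths, p)
	}
	return paths, nil
}

func main() {
	name, err := packageName(sampleSrc)
	if err != nil {
		panic(err)
	}
	imports, err := importPaths(sampleSrc)
	if err != nil {
		panic(err)
	}
	fmt.Println(name, imports) // prints: target [fmt os]
}
```

Neither call needs anything beyond the bytes of the file itself, which is the property the rest of this issue relies on.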
The Go parser provided by the go/parser package provides fine control over when to stop parsing a .go file via its Mode type, which includes the PackageClauseOnly and ImportsOnly flags.
The Go specification also describes the concept of a Go package as follows:
The go command defines a fourth interesting category of information: build tags, or constraints. They are also highly constrained, in that they can appear only at the top of a .go source file. More precisely, from the go/build documentation:
The go/build package makes good use of these properties when loading a Go package. The resolved package provides information about the package name, Go and other source files, and imports. To acquire that information, it invokes the Go parser in parser.ImportsOnly|parser.ParseComments mode. It stops parsing after import declarations, and it needs to parse comments in order to find the build constraint comments. At no point is it necessary to use the internet, since all the necessary information is already available in the provided .go files.
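To make the point about local-only loading concrete, here is a minimal sketch that writes a single source file into a temporary directory and loads it with build.ImportDir. The file contents, the directory, and the helper name loadLocal are assumptions made for this example; they are not the target package or test command from this report.

```go
package main

import (
	"fmt"
	"go/build"
	"os"
	"path/filepath"
)

// targetSrc is a hypothetical source file, not the one from the report.
const targetSrc = `package target

import "net/http"

var Client = http.DefaultClient
`

// loadLocal writes targetSrc into a fresh temporary directory and
// loads that directory with go/build, reading only files on disk.
func loadLocal() (*build.Package, error) {
	dir, err := os.MkdirTemp("", "target")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	file := filepath.Join(dir, "target.go")
	if err := os.WriteFile(file, []byte(targetSrc), 0o644); err != nil {
		return nil, err
	}

	// Mode 0 requests the normal set of information. Import paths are
	// listed as written in the source; they are not resolved or fetched.
	return build.ImportDir(dir, 0)
}

func main() {
	pkg, err := loadLocal()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(pkg.Name, pkg.GoFiles, pkg.Imports)
}
```

The returned Package carries the package name, the list of Go files, and the import paths, all derived from the directory contents alone.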
API of go/build
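The go/build API does not offer a lot of control over the process of loading a package. It has an ImportMode type, which can be either 0 to load Go packages normally (almost all information), or FindOnly, which stops after locating the directory that should contain the package sources and does not read any files in it.
API of go/packages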
In contrast, the go/packages API provides a much greater degree of freedom in requesting information about the loaded package. See the LoadMode type, whose flags (NeedName, NeedFiles, NeedImports, and so on) each request one category of information.
As a result, it should be possible to achieve equal or better performance with the go/packages API when the user requests as little information as just NeedName or NeedFiles (or both) without NeedImports, since it wouldn't be necessary to parse the .go files beyond the package clause.
Environment
All timing information was collected on a 2017 MacBook Pro with 256 GB flash storage.
Issue
The performance of the go/packages package in packages.NeedName | packages.NeedFiles mode is significantly worse than that of the go/build package.
To reproduce the issue, create a target package to be imported in a temporary directory /tmp/target with the following files:
Then, inside another directory, create the following test command:
Run it as follows:
Notice that during the first invocation, go/build took 1.7 milliseconds to produce correct results, while go/packages took 1.8 seconds.
On the second invocation, go/build took 1.5 milliseconds (perhaps due to a warmer disk cache), and go/packages took 322 milliseconds. Successive runs produce similar results.
go/packages became much faster on the second run because the github.com/google/go-github and github.com/google/go-querystring modules were downloaded from the internet and extracted into the local module cache the first time, which doesn't need to happen the second time.
However, ~300 milliseconds is still significantly slower than the ~2 milliseconds that go/build takes to produce the same correct results for this query. Additionally, all the information needed to answer a query in NeedName|NeedFiles mode is available in the .go files on disk, so nothing needs to be downloaded from the internet to return the correct result.
/cc @matloob per owners.