Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/tools/gopls: move analyzers to an extensible sidecar #59869

Open
adonovan opened this issue Apr 27, 2023 · 11 comments
Open

x/tools/gopls: move analyzers to an extensible sidecar #59869

adonovan opened this issue Apr 27, 2023 · 11 comments
Labels
Analysis Issues related to static analysis (vet, x/tools/go/analysis) FeatureRequest gopls/analysis Issues related to running analysis in gopls gopls Issues related to the Go language server, gopls. Thinking Tools This label describes issues relating to any tools in the x/tools repository.
Milestone

Comments

@adonovan
Copy link
Member

adonovan commented Apr 27, 2023

We've talked a lot about how users can extend the set of analyzers that gopls runs so it can include org-specific checkers. Historically the challenge was that users would have to re-build gopls with the extended set of analyzers (which is not difficult but is inconvenient), or gopls would have to communicate with a separate process that contains the analyzers, which means it would have to do type checking yet again. (gopls already does--and must do--type checking of each package twice: once for its main index, and again for analysis.) But today @findleyr pointed out that, as of v0.12.0, there's no real efficiency reason not to move all of gopls' analysis into the sidecar process, so that the net number of invocations of the type checker is unchanged.

In essence, gopls would fork+exec a long-lived child process and communicate with it over a pipe. (The child process wouldn't need any other capabilities.) A request would send the metadata, source files, facts, and export data required to analyze a package, and the child would do the work and return serialized facts and diagnostics over the pipe. By default the child program would be a mode of gopls itself (e.g. 'gopls analyze'), but users could specify an alternative program that implements the same interface, analogous to the way 'go vet -vettool=...' runs an alternative unitchecker-based tool.

This approach still requires the user to build an executable containing both gopls code (the driver and core analyzers) and user code (their analyzers), but this could be done automatically by gopls: it would generate a main file from the user's configuration and then execute it, a bit like 'go test'. But perhaps this approach (and the potential for version skew of both the go toolchain and the gopls source) is more complex than the simple "re-build" approach. Something to think about.

@gopherbot gopherbot added Tools This label describes issues relating to any tools in the x/tools repository. gopls Issues related to the Go language server, gopls. labels Apr 27, 2023
@gopherbot gopherbot added this to the Unreleased milestone Apr 27, 2023
@adonovan adonovan added the Analysis Issues related to static analysis (vet, x/tools/go/analysis) label Apr 27, 2023
@findleyr
Copy link
Contributor

Per in-person discussion, we need to decide whether this has significant advantages over simply making it easier to recompile gopls itself.

One primary advantage is isolation: analyzers can't mutate other global state that may affect gopls. But on the other hand, does this limit gopls extensibility to analyers? If so, is that sufficient?

@findleyr findleyr modified the milestones: Unreleased, gopls/later Apr 28, 2023
@findleyr findleyr added gopls/analysis Issues related to running analysis in gopls Thinking labels Apr 28, 2023
@adonovan
Copy link
Member Author

Yes, @hyangah raised a similar question: the user still has to build something containing their analyzers; why should that not just be gopls itself? There are fewer moving parts that way.

The only real advantage I can think of is isolation: a sidecar could make gopls robust to panics and crashes in the analyzers, and with some bisection logic it could automatically identify and retire the culprits. Also, we could run completely untrusted analyzer code safely if it ran in a sandbox (e.g. something like seccomp) with no capabilites. But that does seem like a lot of work, and presumably users trust their own org's analyzers.

The security angle is more interesting in the context of an idea that @griesemer brought up yesterday, of allowing each module to define its own analyzers that are automatically run on packages that import it. This would allow a module with a complex API to provide a set of checkers to catch misuses of it. Checkers could run as part of 'go test' (which is already running untrusted code), or they could be run by a hypothetical 'go build -analyze' command if the checkers can be sandboxed so that they run in an unprivileged process that can't access the file system or network.

Also, @ianthehat mentioned that in an earlier experiment with extensible sets of analyzers, there were significant user interface challenges with too many error messages, and cross-talk between analyzers. I don't have details but perhaps he can elaborate.

@findleyr
Copy link
Contributor

findleyr commented Apr 28, 2023

Another big advantage of an analyzer sidecar is that it could be dynamic.

An idea that has been thrown around is having codebase-specific analyzers (e.g. I have a local analyzer that I use to catch when I forget to add a Go copyright header). If there were a configurable "workspaceAnalyzers" setting that pointed to a workspace-relative directory with a pre-determined layout similar to x/tools/go/analysis/passes, we could recompile an analyzer binary for each workspace, and allow users to run their own analyzers for their codebase, for each workspace.

@ianthehat
Copy link

Another potential advantage is the ability to run multiple sidecars and cleanly merge the results

@findleyr
Copy link
Contributor

I look forward to a future where gopls hotreloads analyzers in x/tools while I edit them.

@timothy-king
Copy link
Contributor

A goplschecker.Main analog to *checker.Main seems like it would be easy for others to adapt a list of analysis.Analyzers to talk the gopls analyze protocol. Then a minimum but flexible configuration for external checkers is just a list of external binaries and flags. These could be compiled ahead of time. This only really requires an external_analyzer setting.

A part of the protocol could be a handshake to check for version skew (+config skew). gopls can warn when a a tool is disabled due to version skew.

More flexible users could just give a list of packages and Analyzer variable and the goplschecker + go.mod could be generated relatively easily. gopls could compile and recompile this on demand. It seems like this would be done fairly infrequently. This could just be another external_analyzer. My gut instinct is that a provided external_analyzer binary would be more attractive to security sensitive organizations.

@leighmcculloch
Copy link
Contributor

Historically the challenge was that users would have to re-build gopls with the extended set of analyzers (which is not difficult but is inconvenient)...

The difficulty, or inconvenience, for me is not the action of recompiling gopls, but that it is an unintuitive workflow. Setting an analyzer occurs like a config, and it's pretty uncommon when configuring an IDE tool to need to recompile the tool to collect the config.

Is it possible for gopls to call out to a binary that is a singlechecker or multichecker instead of a custom gopls API? Then users can build analysis tools as they do today, and build their multichecker, and use it with gopls.

The checkers have a -json option. If it isn't sufficient, can we improve the API of the checkers so they're usable with gopls?

@adonovan
Copy link
Member Author

adonovan commented Jul 7, 2023

The difficulty, or inconvenience, for me is not the action of recompiling gopls, but that it is an unintuitive workflow. Setting an analyzer occurs like a config, and it's pretty uncommon when configuring an IDE tool to need to recompile the tool to collect the config.

I agree, it's confusing and inconvenient, but we should keep in mind that there may be solutions to this other than the sidecar-based approach, for example a gopls that can rebuild itself based on configuration changes. (Something like this is already being discussed in the context of #61185, not that this approach is by any means trivial.)

Is it possible for gopls to call out to a binary that is a singlechecker or multichecker instead of a custom gopls API?

No, it's not possible today, and I don't think it's even the right interface for a gopls analysis sidecar because {single,multi}checker uses go/packages to load a complete program from source, whereas gopls should be in charge of dependency analysis and invalidation (and indeed providing source files, which may contain unsaved edits). The 'unitchecker' model of separate analysis is much closer to what we want, though again I would not use that interface exactly.

But the idea of building a multichecker-like tool that consists of a special main function plus a bunch of analyzers---and nothing else---is exactly what this proposal is about. It's just that the main function would be designed around the channel to gopls.

@leighmcculloch
Copy link
Contributor

leighmcculloch commented Jul 7, 2023

But could we build the gopls interface into the {single,multi}checker applications as a flag, -gopls, that enables this different mode, so that checkers by default are compatible with the language server? It seems more useful if the community of checkers are naturally and efficiently compatible with other ecosystem tools.

(Or the flag could be called -pipe or -stream or something else that isn't gopls specific if that notion of coupling to gopls specifically isn't welcome.)

@adonovan
Copy link
Member Author

adonovan commented Jul 7, 2023

But could we build the gopls interface into the {single,multi}checker applications as a flag, -gopls, that enables this different mode, so that checkers by default are compatible with the language server? It seems more useful if the community of checkers are naturally and efficiently compatible with other ecosystem tools.

(Or the flag could be called -pipe or -stream or something else that isn't gopls specific if that notion of coupling to gopls specifically isn't welcome.)

That's a really interesting idea. Yes, I think it should be possible to define a system-level interface to a process that runs a set of analyzers and communicates entirely over a pipe or socket (and has no other capabilities), without reference to any code in the gopls module.

@leighmcculloch
Copy link
Contributor

leighmcculloch commented Jul 7, 2023

Some usability advantages of bundling the logic into a single build for checkers:

  • Developers of checkers don't need to be educated about gopls' (or other tools like gopls) unique requirements.
  • Developers of checkers only need to build and distribute one binary for folks to use in a variety of places.
  • Users of checkers don't need to choose between different builds of multicheckers for use in different places.
  • Users of checkers who use both the regular and gopls multichecker modes in different places do not have to deal with problems such as having different versions of the two binaries leading to inconsistent errors in CI vs IDE vs console. (For an example of where this type of inconsistency pain with IDE vs terminal usage can be so painful, see rust-analyzer and rustc.)
  • Developers of other tools similar to gopls can also support the pipe mechanism with any community checkers.

(The different "places" I'm thinking of being:

  • CI, which would likely use the regular multichecker mode.
  • IDE, which would likely use the gopls / pipe variety of multichecker mode.
  • Console/terminal, which would likely use the regular multichecker mode.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Analysis Issues related to static analysis (vet, x/tools/go/analysis) FeatureRequest gopls/analysis Issues related to running analysis in gopls gopls Issues related to the Go language server, gopls. Thinking Tools This label describes issues relating to any tools in the x/tools repository.
Projects
None yet
Development

No branches or pull requests

6 participants