Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/tools/gopls: integrate test discovery #59445

Open
firelizzard18 opened this issue Apr 5, 2023 · 25 comments
Open

x/tools/gopls: integrate test discovery #59445

firelizzard18 opened this issue Apr 5, 2023 · 25 comments
Labels
FeatureRequest gopls Issues related to the Go language server, gopls. Tools This label describes issues relating to any tools in the x/tools repository.
Milestone

Comments

@firelizzard18
Copy link
Contributor

I would like to migrate the test discovery services of vscode-go into gopls so they can be implemented in Go and utilize the rich type information provided by the go/* packages instead of the current JavaScript implementation which relies on recursive directory walks and analyzing document symbols. Moving test discovery into gopls should improve performance and would bake it far easier to implement improvements such as detecting subtests in cases where that can be done deterministically.

CC @hyangah @suzmue

Related: golang/vscode-go#2445, golang/vscode-go#1602

@gopherbot gopherbot added Tools This label describes issues relating to any tools in the x/tools repository. gopls Issues related to the Go language server, gopls. labels Apr 5, 2023
@gopherbot gopherbot added this to the Unreleased milestone Apr 5, 2023
@firelizzard18
Copy link
Contributor Author

@gopherbot, please add label FeatureRequest

@findleyr
Copy link
Contributor

findleyr commented Apr 5, 2023

See also golang/vscode-go#2719.

We should do this. The actual analysis is not that hard: there are examples of recognizing test functions in the tests analyzer and subtests in the loopclosure analyzer.

We already have a run test code lens that could be intercepted in the vs code middleware. @hyangah @firelizzard18 would it be sufficient to extend that code lens to subtests?

@findleyr findleyr modified the milestones: Unreleased, gopls/v0.13.0 Apr 5, 2023
@firelizzard18
Copy link
Contributor Author

firelizzard18 commented Apr 5, 2023

@findleyr from an information perspective that would be sufficient if the code lens messages include some additional information (listed below). Otherwise we'd still need document symbols or some other mechanism for extracting that. However it seems like that might be problematic from a usability perspective. A user may wish to use the test explorer exclusively and disable the test code lens - if the user disables test code lenses does that tell gopls to stop sending those messages?

Off the top of my head I think these are the components we need:

  1. List packages within the workspace/module that contain tests/test files.
    • This seems like something best implemented as a separate call to gopls.
  2. List files within a package that contain tests/are test files.
    • It's not hard to do this in VSCode/JS via filesystem calls but presumably gopls already has that information loaded into memory so it would be faster to ask gopls.
    • Could this be merged with (1)?
  3. List tests within a file.
    • This requires the test name, the range in the file (of the TestXxx function or of the subtest callback), and whether it's a child of another test. Can we reasonably pack all of that into CodeLens.data?
    • This would involve interrogating files that are not open. How difficult is it to request code lenses for a file that has not been opened? What is the performance impact of doing so?

With respect to (1) and (2), testing for a _test.go suffix is sufficient (that's the current behavior) but it would be preferable to verify that the file actually contains tests. I sometimes have util_test.go files that contain shared test harnessing/utilities/etc and do not contain any tests. With the current behavior, those appear in the list and then disappear once they are expanded and discovered to contain no tests. If gopls can quickly and efficiently determine if a file contains tests, using gopls to implement (1) and (2) would improve the user experience.

A big motivation for moving (1) into gopls is to address performance issues on large projects (see golang/vscode-go#2504). The current implementation recursively walks the filesystem, which does not perform well on large projects (and likely performs badly if the project is on a remote filesystem). I realize that gopls will also have to walk the filesystem, but it must already be doing that (to supply type information) and I theorize that using the build metadata gopls collects will perform much better than the current JS implementation.

@firelizzard18
Copy link
Contributor Author

I am more than willing to work on this myself but I don't know when I'll have the time.

@findleyr
Copy link
Contributor

findleyr commented Apr 5, 2023

@firelizzard18 thanks for the details.

It sounds like it may be cleanest to have a custom command, as code lenses in general require type information and we may be able to implement this query without fully type-checking. This command may also have a life-cycle that is different from code lenses, typically.

If we do choose to implement this with type information, we can extract test information as part of our package post-processing, and serialize them in our file cache, so that they will be quickly available at startup. This aspect of the change would be tricker for an external contributor.

I have slated this for gopls@v0.13.0, as this keeps cropping up.

@firelizzard18
Copy link
Contributor Author

@findleyr I think we can separate the major areas of improvement into three groups:

  1. Package discovery
    • The current implementation (walking the filesystem) is ok but it's not very clean and it performs badly for large projects.
  2. Test discovery
    • The current implementation (using document symbols) is ok but it's not very clean.
    • I am pretty sure this can be done without type checking, all it really needs to do is locate func Test{name}(*testing.T) declarations.
  3. Static subtest discovery.
    • This is not currently supported except in an approximate way for stretchr test suites.
    • It seems to me this would be much easier to implement using complete type information. There are some cases where it's straight-forward to identify subtests by walking the AST, but type information would be very useful for more complex cases.

I think (1) and (2) are orthogonal and could be implemented separately, if that is more convenient. Whether (3) is dependent on (2) depends on the implementation. (3) could be implemented as an improvement to the normal test discovery; alternatively it could be implemented as a second pass.

Static subtest discovery (3) is the part I find most technically interesting. To be clear I know it is impossible to statically discover all sub tests (the halting problem) so IMO the target should be "statically discover subtests in cases where it is feasible to do so" as a companion to dynamic subtest discovery (that is, analyzing test output). I have functionally zero knowledge of how gopls works internally, so maybe I should wait until the team has implemented some form of static subtest discovery and then I can provide contributions building on that.

@mnoah1
Copy link

mnoah1 commented Apr 6, 2023

Package discovery

  • The current implementation (walking the filesystem) is ok but it's not very clean and it performs badly for large projects.

Can this improvement be done in away that it considers (or at least leaves the option open for) an option to infer relevant packages from the user's current open workspace state? As we were discussing on golang/vscode-go#2504, to support a large monorepo environment it becomes not just an issue of indexing speed, but also of relevance.

When many teams use one monorepo, "Show everything" (the current behavior) becomes too broad and shows thousands of irrelevant packages, and switching to nested mode becomes very deep. On the other hand, "Show only the package of the file I have open" may be too narrow, but is consistent, intuitive, and doesn't require the user to maintain any separate config with relevant paths. Perhaps the solution could be something like limiting the display to packages within "x" levels of the current directory.

@adonovan
Copy link
Member

adonovan commented Dec 5, 2023

Detecting tests from the syntax alone is possible, but there's a lot of potential for doing better by using typed syntax to infer subtests, at least in the simple cases. It would make most sense architecturally for test inference to be a step executed just after type checking, like references/methodsets/xrefs, to populate an index of tests. This would avoid the potential for requests such as "run all the tests in this module" to entail an expensive operation such as "first type-check all the packages in this module, sequentially". An index-based approach would make this operation parallel, fast, and cached, just like a references query.

@firelizzard18
Copy link
Contributor Author

@adonovan I would be happy to work on subtest discovery once the initial implementation is in place. That seems like a rather interesting problem. I’m also interested in contributing to the initial implementation but it sounds like that might be challenging due to indexing/caching/etc.

@adonovan
Copy link
Member

adonovan commented Dec 6, 2023

@adonovan I would be happy to work on subtest discovery once the initial implementation is in place. That seems like a rather interesting problem. I’m also interested in contributing to the initial implementation but it sounds like that might be challenging due to indexing/caching/etc.

Thanks. I don't think the gopls aspects are that tricky: just follow the model used by methodsets and xrefs, which are both analyses that run immediately after type checking and populate a persistent index. The tricky part is the meat of it--inferring names and locations of tests given typed syntax--which is as it should be.

@gopherbot
Copy link

Change https://go.dev/cl/548675 mentions this issue: gopls/internal/lsp: test discovery

@firelizzard18
Copy link
Contributor Author

firelizzard18 commented Dec 10, 2023

Change https://go.dev/cl/548675 mentions this issue: gopls/internal/lsp: test discovery

@adonovan @findleyr How's this for a proof of concept? Issues:

  • I'm not sure gopls/fileTests is the right method to use for the request
  • We also need "get all tests for package X" and maybe a couple others
  • I'm manually updating a generated file to add the custom LSP requests
  • The request to list all tests in a package requires the URI of a *_test.go file in the package, which is problematic

@adonovan
Copy link
Member

Thanks @firelizzard18 for the CL! I will take a look at it tomorrow.

@firelizzard18
Copy link
Contributor Author

Note to self

  • For parity with the existing vscode-go feature set, gopls would need to recognize stretchr tests, otherwise transitioning to gopls-based discovery would lead to a feature regression. That might be ok, as long as its temporary.
  • Tests returned by gopls need metadata. Special cases like stretchr tests need to be identified, since isolating a specific stretchr test requires special flags.
  • Return values from gopls test discovery methods should include enough information (i.e. module info) to construct the entire test tree.
  • vscode-go's test controller will need significant changes to switch to gopls-based test discovery. The controller should support discovery via expanding test explorer nodes, or opening files, or both. If the user does not expand any nodes or has that feature disabled and opens a file with tests, the controller should build the sub-section of the tree required to hold the tests in the file (and opening more files in the same or different packages/modules should work as expected).
  • Combining the last two points, it should be possible to configure vscode-go to only discover tests in open files, in packages with open files, or modules with open files.
  • Dynamically discovered sub-tests should be specifically marked, so that they can be cleared separately from statically discovered sub-tests.
  • Should dynamically discovered sub-tests should be purged when a test is run, instead of any time the file is updated (current behavior, which is can be annoy)?
  • Dynamic sub-test discovery needs work. Instead of splitting on certain characters (e.g. slashes):
    • When the test start event for a/b/c is received, search for tests matching /a(\/b?)?/
    • Attach a/b/c as a child to the test with the longest prefix
    • If no test is a prefix, add it to the package as an orphan?

@adonovan
Copy link
Member

adonovan commented Dec 14, 2023

@hyangah and I discussed the protocol and agreed that the API below should be sufficient for a starting point. The Packages method below would be a new command (see command.Interface in gopls), and its implementation would require type checking (similar to https://go.dev/cl/548675).

// The Packages command returns information about the set of Go packages
// associated with the specified files or directories, plus additional information such
// as the names of any tests within them.
func Packages(PackagesParams) (PackagesResult, error)

type PackagesParams struct {
	File []protocol.URI // file or directory (directories are recursively enumerated)
}

type PackagesResult struct {
        Packages []Package
}

// Package describes a directory that is a Go package (not an empty parent).
type Package struct {
	Path string // Go package path
	Dir protocol.URI // package directory
        Root protocol.URI // root of view folder (an enclosing directory)

	// TestFiles describes each file that defines a test, and the names of those tests.
        TestFiles []TestFile  // may be empty
}

type TestFile struct {
	File  protocol.URI // a *_test.go file
	Tests []Test
}

// Test represents a tree of test suites and test cases,
// preserving the structure of Go subtests, where possible.
type Test struct {
	// Name is the complete name of the test (Test, Benchmark, Example, or Fuzz)
        // as it appears in the output of go test.
	// The server may attempt to infer names of subtests by static
        // analysis; if so, it should aim to simulate the actual computed
        // name of the test, including any disambiguating suffix such as "#01".
        // Clients should quote this name using "^" + regexp.QuoteMeta(Name) + "$"
        // before passing it to go test -test.run=...
	Name string // e.g. "TestToplevel/Inner"

	// Range enclosing the entire test. This is used to place
        // the gutter marker and group tests based on location.
        // For subtests whose test names can be determined statically,
        // this can be either t.Run or the test data table
        // for table-driven setup.
	Range protocol.Range

	// Nested subtests.
	// Note that this is not an exhaustive list.
	// For example, if subtests are dynamically generated or
	// their names cannot be statically determined, those tests
	// may not be included in this list.
	Subtests []Test
}

@firelizzard18
Copy link
Contributor Author

@adonovan @hyangah I think recursion should be optional. There are scenarios where recursion is helpful, and others where it is not. Especially for large workspaces (#59445 (comment), golang/vscode-go#2504), the user may not want all tests to appear, even if gopls's performance is acceptable. The scenarios I see are:

  • Small workspace/user wants tests to be eagerly loaded:
    • Call Packages with the list of workspace roots, with recursion.
  • Medium workspace/user wants tests to be lazily loaded:
    • List workspace roots as test items, but do not populate.
    • When the user expands a workspace root test item, call Packages with that root, with recursion.
  • Large workspace/user only wants tests loaded for packages they're working on:
    • Do not populate the test view at all, until the user opens a file.
    • When the user opens a file, call Packages with that file or its directory ("show tests for open files" vs "show tests for packages with open files"), without recursion.

It might be better to use modules as the roots, instead of workspace folders. That would require some "find all Go modules given these workspace folders" logic, which would be easier with gopls than with JavaScript. But it might also make sense just to use workspace folders, since that will mirror the folder structure.

@hyangah
Copy link
Contributor

hyangah commented Dec 15, 2023

Thanks @firelizzard18

Here is a little bit of the background behind the API sketch @adonovan shared.

The upcoming zero-config gopls work (#57979) will introduce a significant change in how gopls models a workspace. Basically, gopls won't create a view until users open certain go-related files (go.* files, *.go files, go template files). If a go-related file opens, then gopls attempts to find the appropriate workspace boundary for the file. That can be a module root, or a workspace root, or a gopath directory or the current directory. It solves many scalability and performance issues. A similar model had been used in other editors and we think the experience will be better in a large workspace with multiple modules, or a workspace with go code in nested modules.

With this new model, we consider to adjust vscode-go test explorer's UX a bit. Gopls doesn't analyze modules or packages until the corresponding views about them are created. Test explorer needs to honor that new UX too, instead of attempting to eagerly load all modules and packages in the workspace. (Otherwise, either gopls won't return anything, or trigger creating views unnecessarily and cancelling out the benefit of zero config design).

We originally attempted to create package trees with multiple view roots. But we are afraid that leaks too much of the gopls's internal data structure. The API in #59445 (comment) instead attempts to simplify the problem by returning a flat package list. If the client wants hierarchical presentation, it can still reconstruct the hierarchy based on the package paths and the package directory paths & root (the view folder).

The three cases you listed were also discussed, and we think a client can implement the three cases with the API.

PackagesParams's File is meant to be all the open files.

  • Eagerly loading everything under the editor's workspace folder: this involves the editor plugin (vscode-go) to walk and find all go.mod files and opens them. That will let gopls load corresponding views. Then, query with the list of all open files.

  • Lazy package loading: that's the default behavior. If the client wants to show only the tests in the subfolders of the currently open files, it can still do so by just scanning the returned Package list and filtering out based on the package directory and root info when creating Test UI API TestItems.

  • Lazy test loading: the client can do filtering based on the package directory info and show only what users want.

For subtests, we try to preserve the hierarchies found from the source code. However, as you also pointed out, "/" in the test name is not escaped, so reconstructing the hierarchy for dynamically discovered subtests reliably is still difficult. I think for now, vscode-go test explorer can place all the dynamically discovered subtests under the test function.

I was originally afraid of the message sizes (and client-side overhead). But from a quick analysis of some large repos, the volume of returned results won't be too bad compared to other LSP messages. I hope we can try with this minimal API, and then if that becomes an issue we can consider to include some filters for optimization.

@firelizzard18
Copy link
Contributor Author

@hyangah The API sketch @adonovan shared looks good to me. I only have two very minor reservations:

  • Recursive subtests (Test.Subtests). I think it might be more natural to expose subtests as just another test and let the caller/vscode-go decide how to present them. Especially for weird cases like stretchr test suites, my gut feeling is that the recursive structure could get in the way. But I might be totally wrong, and we can always change it.
  • Modules. The proposed API (and currently my implementation) pretty much totally ignores modules. That might be fine, but it might not be the desired UX in some cases. For example, if a directory is loaded that contains a nested submodule, I believe the current implementation will ignore that and treat it as just another package. In that case, the user may wish the nested module to be displayed as a separate root in the test explorer. We can make that the client's problem, but it would be easier for the client to do if Package included the module path.

@firelizzard18
Copy link
Contributor Author

For visibility, here's a WIP CL for integrating this into vscode-go: https://go-review.googlesource.com/c/vscode-go/+/550555. It's the minimum viable change to switch to using gopls for test discovery. I plan to do more, but the full integration will require a lot of refactoring so I thought it would be best to do the basic integration in a small CL.

@hyangah
Copy link
Contributor

hyangah commented Dec 18, 2023

Thanks for the investigation and prototype @firelizzard18

  • Recursive subtests: I still think placing all subtests under a test under their enclosing test function (func Test*) seems natural. Whether to lay out subtests in a test function hierarchically is questionable, and I think it's ok to flatten them as a list. A nice thing about stretchr is that its framework is now built around go's standard subtest structures. So, instead of handling it to invoke -testify.m flags, I think we should construct the test items just using the go's standard subtest naming scheme. That way, we can avoid a dangling test item issue (see the picture).
    Screenshot 2023-12-18 at 8 29 42 AM

  • Modules: the idea is the Dir and the Root in the result Package provides the information about which module the package belongs too. Currently, we are still refining the meaning of the 'Root' though. So the protocol proposal we sketched may change.

Currently all of our team members are busy finishing up planned work for 2023. We are planning to work on testing UX in 2024 H1 (as part of the planning discussion we revived this thread).

@firelizzard18
Copy link
Contributor Author

@hyangah I agree that subtests should be displayed nested - I was thinking purely about the backend implementation. vscode-go needs a function for "find the parent of this subtest given the subtest name" for dynamic subtests. So vscode-go could easily handle receiving a flattened list of subtests from gopls; it would just look for a slash in the name and then run it through that function.

From a backend implementation POV, I am thinking flattening the gopls.packages command response might make it easier to implement, since it reduces the data structure complexity.

A nice thing about stretchr is that its framework is now built around go's standard subtest structures. So, instead of handling it to invoke -testify.m flags, I think we should construct the test items just using the go's standard subtest naming scheme.

I didn't realize that was an option, though I guess I just assumed I had to use -testify.m since that's what vscode-go was doing before.

That way, we can avoid a dangling test item issue (see the picture).

This has been a continual irritation to me and is definitely something I intend to eliminate. The existing code is supposed to realize that the dynamically discovered test matches the static one, but that obviously isn't working.

Modules: the idea is the Dir and the Root in the result Package provides the information about which module the package belongs too. Currently, we are still refining the meaning of the 'Root' though. So the protocol proposal we sketched may change.

I'll ignore it for now; we can always add a new field later if it makes sense.

@firelizzard18
Copy link
Contributor Author

We are planning to work on testing UX in 2024 H1 (as part of the planning discussion we revived this thread).

@hyangah @adonovan Any ETA on when you'll be ready to look at this issue again?

@firelizzard18
Copy link
Contributor Author

@hyangah @adonovan?

@hyangah
Copy link
Contributor

hyangah commented Apr 12, 2024

Sorry for the delay. As you noticed our team was busy closing gopls v0.15.x bugs and handling regressions related to the latest vscode release. I think we need more time to think about the problems 1 and 2 in your problem description #59445 (comment) (affected by gopls's zero config change, and it's very vscode-specific). However, the problem 3 is concrete enough. I will get back to you early next week.

@hyangah
Copy link
Contributor

hyangah commented Apr 17, 2024

@firelizzard18 @adonovan @findleyr

I have the updated API sketch in cl/579438

Testify case is a bit more complicated. I hope we can eliminate the need for use of -testify.m, but we will learn more. The use of RunCmd protocol.Command in @firelizzard18 's cl/579438 is interesting. If we are open to the idea of teaching gopls to understand testify's flag structure, we can potentially encode the info in RunCmd.

Let's discuss details in cl/579438.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
FeatureRequest gopls Issues related to the Go language server, gopls. Tools This label describes issues relating to any tools in the x/tools repository.
Projects
Status: No status
Development

No branches or pull requests

6 participants