Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/tools/gopls: poor performance and crash on google-cloud-go #60089

Closed
codyoss opened this issue May 9, 2023 · 34 comments
Closed

x/tools/gopls: poor performance and crash on google-cloud-go #60089

codyoss opened this issue May 9, 2023 · 34 comments
Assignees
Labels
gopls/incremental gopls/performance Issues related to gopls performance (CPU, memory, etc). gopls Issues related to the Go language server, gopls. Tools This label describes issues relating to any tools in the x/tools repository.
Milestone

Comments

@codyoss
Copy link
Member

codyoss commented May 9, 2023

gopls version: v0.12.0-pre.1 (go1.20.3)
gopls flags:
update flags: proxy
extension version: 0.38.0
go version: 1.20.3
environment: Visual Studio Code darwin
initialization error: undefined
issue timestamp: Tue, 09 May 2023 14:45:46 GMT
restart history:
Tue, 09 May 2023 14:43:43 GMT: activation (enabled: true)

ATTENTION: PLEASE PROVIDE THE DETAILS REQUESTED BELOW.

Describe what you observed.

gopls crashed while in google-cloud-go. I believe I was searching to open a file before this happened.

Please attach the stack trace from the crash.
A window with the error message should have popped up in the lower half of your screen.
Please copy the stack trace and error messages from that window and paste it in this issue.

https://gist.github.com/codyoss/f19310e0db7499861e7f3d92ad056625

Failed to auto-collect gopls trace: unrecognized crash pattern.

Also got the following message post crash: extension-output-golang.go-#3-gopls (server): tokenization, wrapping and folding have been turned off for this large file in order to reduce memory usage and avoid freezing or crashing. and Connection to server got closed. Server will not be restarted.

@codyoss
Copy link
Member Author

codyoss commented May 9, 2023

Editor and computer came pretty unusable post crash. Downgrading for now to make things better again. We recently added a go.work to this repository which I think makes gopls maybe try to do too much extra work. On the previous version(0.11.X) I noticed things were better since I did not need to open editor at sub-module roots -- but the perf is starting to make me think it is not worth the tradeoff of having a go.work file. I think this is a great repo to test out gopls/go.work in because it has so many modules and so much code :)

@findleyr
Copy link
Contributor

findleyr commented May 9, 2023

Thanks Cody.

It looks like the critical top portion of your crash is cut-off. Do you still have it?

@codyoss
Copy link
Member Author

codyoss commented May 9, 2023

Apologies, I assumed select all would have selected all 😿 I no longer do. I will see if I can reproduce it later today if I have a couple of minutes.

@codyoss
Copy link
Member Author

codyoss commented May 9, 2023

Tried for a couple to make it crash again but was unable to. The experience in the pre version does seem noticeably slower than 0.11.0 for this repo though.

  • I noticed jumping to a reference to take several seconds consistently after jumping around a bit.
  • I was unable to open files from the file explorer of vscode after some time, but could if I searched them with cmd + P.
  • In the file explorer I also noticed it need time to load the filesystem a few times -- opening a folder with one file in it took several seconds to load the contents of the folder.
  • Got the spinning ball of death a few times on my mac

If there is nothing actionable from the report feel free to close, I don't think I can keep the version installed to try to reproduce though as it affects my daily dev too much.

More info on the type of machine I am on:
Chip: m1 pro
OS: MacOS 13.3.1
Memory: 16GB

@findleyr
Copy link
Contributor

findleyr commented May 9, 2023

@codyoss if it's too slow for your daily dev, that's actionable.

I'll take a look.

@findleyr findleyr changed the title gopls: automated issue report (crash) x/tools/gopls: poor performance and crash on google-cloud-go May 9, 2023
@findleyr
Copy link
Contributor

findleyr commented May 9, 2023

Well, it's not that bad for me (I suspect due to the linux filesystem). Using coc.nvim, performs about the same as v0.11.0 and uses ~1/6th of the memory. I am sure that there is a problem here, it just doesn't yet reproduce in my environment.

However, I grabbed a profile while editing this repo interactively, and the results are astonishing: top -cum lists typeCheckImpl as constituting 2.8% of CPU. Meanwhile, snapshot.IgnoredFile constitutes 48% of CPU(!), due to the expense of span.URIFromPath. IgnoredFile is extremely inefficient, as it is O(files*modules), and this repo contains 144 modules. This should be easy to fix.

So, the good news is that there is some very low hanging fruit here. The bad new is that I think this fruit exists in v0.11.0 as well, and I don't yet understand the regression.

I think making gopls work well on this repository could be a good exercise before the second prerelease, as it has so many workspace modules.

Moving this issue to the Go issue tracker, since it is (probably) a gopls issue.

CC @adonovan

@findleyr findleyr transferred this issue from golang/vscode-go May 9, 2023
@gopherbot gopherbot added Tools This label describes issues relating to any tools in the x/tools repository. gopls Issues related to the Go language server, gopls. labels May 9, 2023
@gopherbot gopherbot added this to the Unreleased milestone May 9, 2023
@findleyr findleyr modified the milestones: Unreleased, gopls/v0.12.0 May 9, 2023
@findleyr findleyr added gopls/incremental gopls/performance Issues related to gopls performance (CPU, memory, etc). labels May 9, 2023
@gopherbot
Copy link

Change https://go.dev/cl/494096 mentions this issue: gopls: add google-cloud-go as a benchmark repo

@gopherbot
Copy link

Change https://go.dev/cl/494097 mentions this issue: gopls/internal/lsp: optimize checks for ignored files

@findleyr
Copy link
Contributor

The CLs above set up some benchmarks for this repo, and then fix the low hanging fruit mentioned above to optimize the time it takes to diagnose the workspace from ~1.5s->300ms.

I'll note that the initial workspace load on my machine went from 2 minutes@v0.11.0 (!!) to 17s@v0.12.0-pre.1, and the in-use memory went from 6-7GB to 1GB. Gopls should perform significantly better overall on v0.12.0-pre.1, and in fact this is what I observe on my linux laptop. I am curious to test more on Mac+VS Code.

I am very suspicious of system calls: I observed that even on linux, 20%+ of CPU was spent in system calls related to the file cache. I expect this could be much worse on darwin. As an experiment, I put a 100MB in-memory LRU cache in front of our file-based cache, in https://go.dev/cl/494100, which significantly reduces the system call overhead. I'll test this on darwin tomorrow.

@adonovan
Copy link
Member

I am very suspicious of system calls: [...] even on linux, 20%+ of CPU was spent in system calls related to the file cache

That does seem like an awful lot, for Linux. We shouldn't be making that many filecache system calls in total, and there should be very little contention in flock (and whatever contention there is should be minimally CPU-bound: all it does is walk a very short linked list before going to sleep). It would be interesting to run gopls under perf in system mode with kernel symbols and see what shows up. (We could also compare against a rename(2)-based implementation of the cache, since it's atomic on Linux, but it really shouldn't be necessary to maintain a second implementation.)

For macOS, I could totally believe it because of #13841, a kernel bug which causes a frustrating profile skew towards system calls that do minimal CPU work (like open(2)).

gopherbot pushed a commit to golang/tools that referenced this issue May 10, 2023
This repo is large, and exercises use-cases with many workspace modules.

For golang/go#60089

Change-Id: I93b9c99ce52bd7ae25f65fdb9bf0568c25375411
Reviewed-on: https://go-review.googlesource.com/c/tools/+/494096
Reviewed-by: Alan Donovan <adonovan@google.com>
gopls-CI: kokoro <noreply+kokoro@google.com>
Run-TryBot: Robert Findley <rfindley@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
gopherbot pushed a commit to golang/tools that referenced this issue May 10, 2023
Optimize checking for ignored files to avoid unnecessary checks, and
only build prefixes once. Along the way, fix a bug where path segments
were not handled correctly in the ignore check. Encapsulate the check to
make this easy to test.

As a result, the DiagnoseChange/google-cloud-go benchmark improved ~5x
from ~1.5s to 300ms.

Also remove span.Dir, which tended to lead to unnecessary
filepath->span->filepath conversions. Inline it in the one place where
it was correct.

For golang/go#60089

Change-Id: Id24d05b504b43e6a6d9b77b5b578583e1351de31
Reviewed-on: https://go-review.googlesource.com/c/tools/+/494097
Reviewed-by: Alan Donovan <adonovan@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Robert Findley <rfindley@google.com>
gopls-CI: kokoro <noreply+kokoro@google.com>
@findleyr
Copy link
Contributor

@codyoss I have a few more thoughts about this, after reflecting on your description a bit more:

  1. You should not have to open submodule roots: opening the repo root should work fine, thanks to your go.work file. Is this not what you see?
  2. gopls performance should never cause vs code to freeze in the way you describe, as gopls runs as a separate process. However, I have an idea: sometimes when I am using gopls in VS Code I open the output window to the gopls logs, and I've noticed that after a short amount of time the VS Code UI becomes incredibly laggy, due to poor scaling of the output buffer, and the chattiness of the LSP. Is it possible that you opened this window when testing v0.12.0-pre.1, and left it open?

With the optimizations above, it would be great if you could try again using gopls@master, being sure not to leave the gopls logs open in the output window. It really should be faster than v0.11.0, and if it's not I'd love to jump on a VC to pair debug.

To install gopls@master, just check out go.googlesource.com/tools, cd tools/gopls && go install.

@codyoss
Copy link
Member Author

codyoss commented May 10, 2023

  1. I may have been conflating things. As mentioned above we recently switched to go.work and that is why I no longer needed to open at sub-module roots. Sorry if that was not clear in the comment above. I can open at the repo root just fine now.
  2. I don't think I did. But maybe it was another process on my computer that was causing the lockups. I have seen gopls in the past take 400% CPU usage in macs activity monitor.

Building from HEAD I saw little to no issue. I did have a minute or two where my editor, vscode, would not jump to refs or show things again. But after a period of time it started working again. gopls CPU usage does seem down quite a bit as well. In the past I was consistently noticed it as the top process, but no longer.

I will keep this version installed as it seems to be working well for my use cases. Thanks

@findleyr
Copy link
Contributor

Thanks for testing again! I'm going to close this issue as done for now, but please comment if you encounter problems again, and I'll reopen.

don't think I did. But maybe it was another process on my computer that was causing the lockups. I have seen gopls in the past take 400% CPU usage in macs activity monitor.

gopls will use a lot of CPU when building its indexes, particularly at startup but also after changes. However, such periods should be brief, and overall CPU usage should be down. It may be the case that gopls so successfully saturates CPUs that your computer freezes. In that case, please let us know. We may need to tweak gopls to use less parallelism.

@codyoss
Copy link
Member Author

codyoss commented May 15, 2023

@findleyr I did have another crash from my build of HEAD. The issue is the stacktrace is too long for vscode to capture in output to see that beginning important bit -- my terminal captured about 315k loc and that was not enough. Any suggestions about how to get more info? Does gopls write a log anywhere or anything that would be useful to share?

@codyoss
Copy link
Member Author

codyoss commented May 15, 2023

Actually on subsequent startup I got another that was small enough: https://gist.github.com/codyoss/d21403933fe6c1f5802d3bf831424bd3

@findleyr
Copy link
Contributor

Thanks Cody. Huh, this looks like a runtime issue.

Could someone from @golang/runtime take a look at Cody's crash?

Cody: could you confirm that you're still on Go 1.20.3?

@findleyr findleyr reopened this May 15, 2023
@findleyr
Copy link
Contributor

Perhaps I spoke too soon: as @adonovan points out there are ~4000 threads in our file cache blocked on opening a locked file, which is exhausting the file limit.

So this does look like a gopls bug. We have ideas for how to fix this.

@codyoss
Copy link
Member Author

codyoss commented May 15, 2023

Will do!

@codyoss
Copy link
Member Author

codyoss commented May 15, 2023

No crashes but I did have another really long pause on an operation. I was searching go.mod files in the repo and it took 3-5 minutes to get back the query -- during which time it seems gopls was very hungry and unresponsive:
Screenshot 2023-05-15 at 4 21 27 PM

@codyoss
Copy link
Member Author

codyoss commented May 15, 2023

Screenshot 2023-05-15 at 4 21 43 PM

@codyoss
Copy link
Member Author

codyoss commented May 15, 2023

@adonovan another crash: https://gist.github.com/codyoss/866fe1c54ca191e40d6b6438bff8bb58

@gopherbot
Copy link

Change https://go.dev/cl/495055 mentions this issue: gopls/internal/lsp/filecache: limit Get parallelism too

gopherbot pushed a commit to golang/tools that referenced this issue May 16, 2023
This change causes Get to be subject to the same
concurrency-limiting semaphore just added to Set, as a user
(predictably) encountered thread exhaustion
in this operation too.

Oddly, the Get benchmark doesn't seem to be limited
by the value of 128, or even smaller values.

Updates golang/go#60089

Change-Id: Ie4632cbc4cdd6536558b2067a3d115cf4fa17fae
Reviewed-on: https://go-review.googlesource.com/c/tools/+/495055
Run-TryBot: Alan Donovan <adonovan@google.com>
gopls-CI: kokoro <noreply+kokoro@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
@gopherbot
Copy link

Change https://go.dev/cl/494100 mentions this issue: gopls/internal/lsp/filecache: front with a 100MB in-memory LRU cache

gopherbot pushed a commit to golang/tools that referenced this issue May 16, 2023
Put a 100MB in-memory LRU cache in front of the filecache, to reduce I/O
for repeated cache access, such as we observe with workspace diagnostics
or implements queries.

Updates golang/go#60089
For golang/go#57987

Change-Id: I01a1fcca7dcf26416d4cdb578a7a2674765c9f08
Reviewed-on: https://go-review.googlesource.com/c/tools/+/494100
Run-TryBot: Robert Findley <rfindley@google.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
gopls-CI: kokoro <noreply+kokoro@google.com>
@codyoss
Copy link
Member Author

codyoss commented May 17, 2023

I still do believe there are perf issue with HEAD gopls + google-cloud-go's large codebase + using go.work -- at least on my machine. I reverted today to v0.11.0 and gopls does idle at much higher mem usuage(~12GB) but after a period of time everything seems to stabilize perf wise in my editor. On the latest gopls I still have consistent freeze ups where basic operations like searching the whole repo take a considerable amount of time to perform while it feels like gopls indexing of symbols or something and is causing other operations not to complete.

I think for now the solution for us is going to have to be removing our workspace files as they seems to tax gopls too much. I would like to eventually restore our these files because I do see an benefit of having them for other operations like syncing deps between all of our various modules, but I don't think it is currently worth the productivity hit in vscode.

I would be curious if other large workspace repos have experienced the same issues, but I don't personally know of other projects as large as google-cloud-go that are trying to use this feature.

@findleyr
Copy link
Contributor

I think for now the solution for us is going to have to be removing our workspace files as they seems to tax gopls too much. I would like to eventually restore our these files because I do see an benefit of having them for other operations like syncing deps between all of our various modules, but I don't think it is currently worth the productivity hit in vscode.

That's really concerning to hear. You should absolutely not need to downgrade to v0.11.0, nor should you need to remove the go.work file. We'll keep digging into this.

On the latest gopls I still have consistent freeze ups where basic operations like searching the whole repo take a considerable amount of time to perform

Can you clarify what you mean by "searching the whole repo"? Do you mean symbol search, or something not related to gopls, like textual search?

@findleyr
Copy link
Contributor

Discussing with @codyoss, we believe this is related to our go mod tidy logic.

Historically, gopls has conservatively re-run go mod tidy when files are saved (because saved files may theoretically change the tidy result, before overlay support).

This runs for each go.mod file in the workspace, and eagerly parses Go files in the module to analyze the tidy result.

Mix in the VS Code setting that auto-saves files after a delay, gopls@v0.12.0's more aggressive eviction of parsed files, and the new google-cloud-go go.work file with 145 modules, and you have a recipe for disaster.

Fortunately, this is yet again easy to fix...

@findleyr findleyr reopened this May 17, 2023
@findleyr findleyr self-assigned this May 17, 2023
@gopherbot
Copy link

Change https://go.dev/cl/496186 mentions this issue: gopls/internal/lsp: split file patterns

@gopherbot
Copy link

Change https://go.dev/cl/496435 mentions this issue: gopls/internal/regtest/bench: add a benchmark for diagnosing saves

@gopherbot
Copy link

Change https://go.dev/cl/496439 mentions this issue: gopls/internal/lsp/mod: optimizations for mod tidy diagnostics

@gopherbot
Copy link

Change https://go.dev/cl/496442 mentions this issue: gopls/internal/lsp: keep track of overlays on the files map

gopherbot pushed a commit to golang/tools that referenced this issue May 19, 2023
As we discovered while investigating golang/go#60089, mod tidy
operations can significantly affect the amount of time it takes gopls to
diagnose a saved file.

Add a benchmark for this operation. For reference, this new benchmark
takes 8s+ on google-cloud-go, vs 300ms for DiagnoseChange (without the
save).

Updates golang/go#60089

Change-Id: Ie88bd63dd7d205b8629173e7f84aa1aa9858016b
Reviewed-on: https://go-review.googlesource.com/c/tools/+/496435
TryBot-Result: Gopher Robot <gobot@golang.org>
gopls-CI: kokoro <noreply+kokoro@google.com>
Run-TryBot: Robert Findley <rfindley@google.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
@gopherbot
Copy link

Change https://go.dev/cl/496595 mentions this issue: gopls/internal/lsp/cache: only delete the most relevant mod tidy handle

gopherbot pushed a commit to golang/tools that referenced this issue May 19, 2023
Run mod tidy diagnostics in parallel, and don't parse files to determine
missing imports if there are no missing requires.

BenchmarkDiagnoseSave: 8s->1.8s

For golang/go#60089

Change-Id: I5d41827914e4eb9264b16ed14af323c017eb327c
Reviewed-on: https://go-review.googlesource.com/c/tools/+/496439
TryBot-Result: Gopher Robot <gobot@golang.org>
gopls-CI: kokoro <noreply+kokoro@google.com>
Run-TryBot: Robert Findley <rfindley@google.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
gopherbot pushed a commit to golang/tools that referenced this issue May 19, 2023
The overlays method showed up as a hot spot in the google-cloud-go repo
(via reloadOrphanedFiles), because it walks all files to find the small
number of overlays.

We already have a filesMap abstraction; use it to keep track of overlays
in a separate map.

For golang/go#60089

Change-Id: I62c6c688d012beaa4b0f255225993da961cb9dad
Reviewed-on: https://go-review.googlesource.com/c/tools/+/496442
Reviewed-by: Alan Donovan <adonovan@google.com>
Run-TryBot: Robert Findley <rfindley@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
gopls-CI: kokoro <noreply+kokoro@google.com>
gopherbot pushed a commit to golang/tools that referenced this issue May 19, 2023
For workspaces with a lot of modules, deleting every mod tidy handle on
every save is too expensive. Approximate the correct behavior by
deleting only the most relevant mod file. See the comments in the code
for an explanation of why this is an approximation, and why is is
probably acceptable.

This decreases the DiagnoseSave benchmark for google-cloud-go to 550ms
(from 1.8s).

For golang/go#60089

Change-Id: I94bea0b00b13468f73f921db789292cfa2b8d3e9
Reviewed-on: https://go-review.googlesource.com/c/tools/+/496595
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Robert Findley <rfindley@google.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
@gopherbot
Copy link

Change https://go.dev/cl/496835 mentions this issue: gopls: add a new "subdirWatchPatterns" setting

gopherbot pushed a commit to golang/tools that referenced this issue May 22, 2023
As discovered in golang/go#60089, file watching patterns behave very
differently in different clients. We avoided a bad client-side bug in VS
Code by splitting our subdirectory watch pattern, but this appears to be
very expensive in other clients (notably coc.nvim, or any vim client
that uses watchman).

The subdirectory watch patterns were only known to be necessary for VS
Code, due to microsoft/vscode#109754. Other clients work as expected
when we watch e.g. **/*.go. For that reason, let's revert all other
clients to just use simple watch patterns, and only specialize to have
subdirectory watch patterns for VS Code.

It's truly unfortunate to have to specialize in this way. To paper over
this hole in the wall, add an internal setting that allows clients to
configure this behavior explicitly. The new "subdirWatchPatterns"
setting may accepts the following values:
 - "on": request watch patterns for each subdirectory (as before)
 - "off": do not request subdirectory watch patterns
 - "auto": same as "on" for VS Code, "off" for all others, based on the
   provided initializeParams.clientInfo.Name.

Includes some minor cleanup for the fake editor, and fixes some stale
comments.

Updates golang/go#golang/go#60089
Fixes golang/go#59635

Change-Id: I1eab5c08790bd86a5910657169edcb20511c0280
Reviewed-on: https://go-review.googlesource.com/c/tools/+/496835
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Alan Donovan <adonovan@google.com>
gopls-CI: kokoro <noreply+kokoro@google.com>
Run-TryBot: Robert Findley <rfindley@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
gopls/incremental gopls/performance Issues related to gopls performance (CPU, memory, etc). gopls Issues related to the Go language server, gopls. Tools This label describes issues relating to any tools in the x/tools repository.
Projects
None yet
Development

No branches or pull requests

4 participants