x/tools/gopls: regtest flakes due to hanging go commands #54461
Comments
Two observations:
|
The only reason the kill system call can fail (at least in this situation) is when the child process has already exited, so failure of kill is unlikely to be the culprit. More likely kill terminated the go process itself, but not the tree of processes rooted at it. If one of them (a test?) retains an open file descriptor to the stdout pipe created by os/exec then the cmd.Run operation will hang indefinitely. To dig further, we could add logic to run during the failure (on linux) that does |
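The pipe-retention hypothesis above can be reproduced in isolation. The following is a minimal sketch, not code from gopls. It assumes a Unix-like system with `sh` and `sleep` on PATH, and it uses `Cmd.WaitDelay` (added to os/exec in Go 1.20, so after this discussion took place) to bound how long `Run` waits for an orphaned grandchild to release the stdout pipe. The helper name `runBounded` is hypothetical.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"os/exec"
	"time"
)

// runBounded (hypothetical helper) runs a shell whose background child
// inherits stdout and outlives it. The context kills only the shell; the
// orphaned sleep keeps the write end of the stdout pipe open.
func runBounded(timeout, waitDelay time.Duration) (time.Duration, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	cmd := exec.CommandContext(ctx, "sh", "-c", "sleep 5 & wait")
	var out bytes.Buffer
	cmd.Stdout = &out // not an *os.File, so os/exec creates a pipe and a copying goroutine

	// Without WaitDelay, Run would block until the orphaned sleep exits and
	// closes the pipe. With it, os/exec closes the pipe once waitDelay has
	// elapsed after the context is done, and Run returns.
	cmd.WaitDelay = waitDelay

	start := time.Now()
	err := cmd.Run()
	return time.Since(start), err
}

func main() {
	elapsed, err := runBounded(100*time.Millisecond, 200*time.Millisecond)
	fmt.Printf("Run returned after %v: %v\n", elapsed.Round(100*time.Millisecond), err)
}
```

Setting `cmd.WaitDelay` to zero in this sketch should reproduce the hang for the lifetime of the background `sleep`, which is the behavior the hypothesis describes.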
Interesting, I was debugging this in https://go.dev/cl/424075. On Windows, our call to Process.Kill() is failing with "invalid argument". A bit of googling suggests that this is because we can't kill subprocesses on Windows. @bcmills any advice for how to properly kill the go command on Windows? |
After reading the source a bit more: this is EINVAL, which appears to mean that the Process.wait() has exited and the handle released, so this is a race, although it is surprising that we hit it so reliably. |
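For comparison, on Unix the analogous situation (calling Kill on a process whose Wait has already returned) surfaces as `os.ErrProcessDone` rather than EINVAL. The sketch below is illustrative only: it assumes a Unix-like system with `true` on PATH, and `killTolerant` and `killAfterExit` are hypothetical helpers, not part of gopls.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
)

// killTolerant (hypothetical helper) kills p but treats "already finished"
// as success, since the goal (the process is gone) has been met.
func killTolerant(p *os.Process) error {
	if err := p.Kill(); err != nil && !errors.Is(err, os.ErrProcessDone) {
		return err
	}
	return nil
}

// killAfterExit runs a short command to completion, then tries to kill it.
// It returns the raw error from Kill and the error from killTolerant.
func killAfterExit() (raw, tolerant error) {
	cmd := exec.Command("true")
	if err := cmd.Run(); err != nil {
		return err, err
	}
	return cmd.Process.Kill(), killTolerant(cmd.Process)
}

func main() {
	raw, tolerant := killAfterExit()
	fmt.Println("raw Kill error:     ", raw)
	fmt.Println("tolerant Kill error:", tolerant)
}
```

Whether a failed Kill should be ignored depends on the caller; here the assumption is that a process that has already exited needs no further action.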
Change https://go.dev/cl/424075 mentions this issue: |
Can't be done without creating a whole extra process group, unfortunately. (Probably we should add a side-channel — perhaps an open file descriptor or a pidfile? — to request clean shutdown on Windows.) |
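On Unix-like systems, the "extra process group" approach might look like the sketch below. It is illustrative only and is not what gopls does: it relies on `syscall.SysProcAttr.Setpgid` and `syscall.Kill`, neither of which is available on Windows, it assumes `sh` and `sleep` are on PATH, and `runInGroup` is a hypothetical helper.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
	"syscall"
	"time"
)

// runInGroup (hypothetical helper, Unix only) starts a shell in its own
// process group and, after timeout, kills the whole group so that no
// descendant is left holding the stdout pipe open.
func runInGroup(timeout time.Duration) (time.Duration, error) {
	cmd := exec.Command("sh", "-c", "sleep 5 & wait")
	var out bytes.Buffer
	cmd.Stdout = &out
	// Setpgid with a zero Pgid makes the child the leader of a new group
	// whose ID equals the child's PID; its descendants inherit that group.
	cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}

	start := time.Now()
	if err := cmd.Start(); err != nil {
		return 0, err
	}
	timer := time.AfterFunc(timeout, func() {
		// A negative PID signals every process in the group.
		_ = syscall.Kill(-cmd.Process.Pid, syscall.SIGKILL)
	})
	defer timer.Stop()

	err := cmd.Wait()
	return time.Since(start), err
}

func main() {
	elapsed, err := runInGroup(100 * time.Millisecond)
	fmt.Printf("Wait returned after %v: %v\n", elapsed.Round(100*time.Millisecond), err)
}
```

Because the background `sleep` dies together with the shell, the write end of the pipe is closed and Wait returns shortly after the timeout instead of after five seconds.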
When a go command hangs during gopls regression tests, print out additional information about processes and file descriptors. For golang/go#54461 Change-Id: I92aa4665e9056d15a274c154fce2783bed79718e Reviewed-on: https://go-review.googlesource.com/c/tools/+/424075 gopls-CI: kokoro <noreply+kokoro@google.com> Reviewed-by: Alan Donovan <adonovan@google.com> Run-TryBot: Robert Findley <rfindley@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
Change https://go.dev/cl/431075 mentions this issue: |
Add a TODO and wait for a shorter period of time following Kill, per post-submit advice from bcmills on CL 424075. For golang/go#54461 Change-Id: Ia0e388c0119660844dad32629ebca4f122fded12 Reviewed-on: https://go-review.googlesource.com/c/tools/+/431075 TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Robert Findley <rfindley@google.com> gopls-CI: kokoro <noreply+kokoro@google.com> Reviewed-by: Bryan Mills <bcmills@google.com>
Nice. Well, that test process seems very much alive, falsifying my hypothesis. |
Looks like the hanging go command is in the middle of a compile. Wish we had the full subprocess command line -- I'll look into that. Not sure how to interpret the fstat output. |
That's a dead cmd/compile process: there's no command because argv has been destroyed along with the rest of the address space. Perhaps the go list parent simply hasn't called waitpid yet, so the process table entry has to be retained. I suspect the problem is in go list. |
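The "dead but not yet waited for" state described above can be observed directly. The following sketch is Linux-only (it reads `/proc/<pid>/stat`, whereas the failures in this thread were on other systems) and assumes `true` is on PATH; `procState` and `zombieState` are hypothetical helpers written for this illustration.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
	"time"
)

// procState returns the one-letter process state from /proc/<pid>/stat.
func procState(pid int) (string, error) {
	data, err := os.ReadFile(fmt.Sprintf("/proc/%d/stat", pid))
	if err != nil {
		return "", err
	}
	// The format is "pid (comm) state ...". comm may itself contain spaces
	// or parentheses, so split after the last ')'.
	s := string(data)
	i := strings.LastIndexByte(s, ')')
	if i < 0 {
		return "", fmt.Errorf("unexpected stat format: %q", s)
	}
	fields := strings.Fields(s[i+1:])
	if len(fields) == 0 {
		return "", fmt.Errorf("unexpected stat format: %q", s)
	}
	return fields[0], nil
}

// zombieState starts a child that exits immediately and reports its state
// while the parent has not yet called Wait.
func zombieState() (string, error) {
	cmd := exec.Command("true")
	if err := cmd.Start(); err != nil {
		return "", err
	}
	defer cmd.Wait() // reap the child once we are done looking at it

	deadline := time.Now().Add(2 * time.Second)
	for {
		st, err := procState(cmd.Process.Pid)
		if err != nil {
			return "", err
		}
		if st == "Z" || time.Now().After(deadline) {
			return st, nil
		}
		time.Sleep(10 * time.Millisecond)
	}
}

func main() {
	st, err := zombieState()
	fmt.Println("state before Wait:", st, err)
}
```

A state of `Z` means the kernel is retaining only the process table entry until the parent collects the exit status, which is why tools such as `ps` have no command line to show.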
Aha, thanks (excuse my ps noobness). Note that we instrumented this panic in two places: once before |
That one is |
2022-09-17T02:56:51-4d18923-cc1b20e/netbsd-amd64-9_0 Still only netbsd. Posting the |
Ooh, nice! https://go.dev/issue/55323#issuecomment-1254107802 has a |
Found new dashboard test flakes for:
2025-02-21 21:16 x_tools-gotip-openbsd-amd64 tools@1c52ccd3 go@f24b299d x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
2025-02-21 21:16 x_tools-gotip-openbsd-amd64 tools@6e3d8bca go@f24b299d x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
2025-02-21 21:47 x_tools-gotip-openbsd-amd64 tools@3d7c2e28 go@f24b299d x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
|
Found new dashboard test flakes for:
2025-02-21 21:47 x_tools-go1.24-openbsd-amd64 tools@3d7c2e28 release-branch.go1.24@0f7b7600 x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
2025-02-21 22:37 x_tools-go1.24-openbsd-amd64 tools@5299dcb7 release-branch.go1.24@0f7b7600 x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
|
Found new dashboard test flakes for:
2025-02-21 22:37 x_tools-go1.23-openbsd-amd64 tools@5299dcb7 release-branch.go1.23@22fdd35c x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
|
Found new dashboard test flakes for:
2025-02-21 23:41 x_tools-gotip-openbsd-amd64 tools@274b2375 go@f062d7b1 x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
|
Found new dashboard test flakes for:
2025-02-23 14:00 x_tools-gotip-openbsd-amd64 tools@739a5af4 go@fba83cdf x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
2025-02-24 14:40 x_tools-gotip-openbsd-amd64 tools@2b2a44ed go@fba83cdf x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
2025-02-24 17:19 x_tools-go1.24-openbsd-amd64 tools@d2fcd360 release-branch.go1.24@af236716 x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
|
Found new dashboard test flakes for:
2025-02-24 17:21 x_tools-gotip-openbsd-amd64 tools@3e76cae7 go@dceee2e9 x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
|
Found new dashboard test flakes for:
2025-02-24 19:57 x_tools-go1.24-openbsd-amd64 tools@bf9e2a81 release-branch.go1.24@af236716 x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
2025-02-25 14:43 x_tools-go1.24-openbsd-amd64 tools@6d4af1e1 release-branch.go1.24@af236716 x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
|
Found new dashboard test flakes for:
2025-02-25 21:13 x_tools-gotip-openbsd-amd64 tools@6f7906b2 go@b38b0c00 x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
2025-02-25 21:57 x_tools-go1.24-openbsd-amd64 tools@7fed2a4a release-branch.go1.24@af236716 x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
|
Found new dashboard test flakes for:
2025-02-26 03:50 x_tools-go1.23-openbsd-amd64 tools@5dc980c6 release-branch.go1.23@2aaa3889 x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
2025-02-26 03:50 x_tools-gotip-openbsd-amd64 tools@5dc980c6 go@1b1c6b83 x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
|
Found new dashboard test flakes for:
2025-02-26 14:50 x_tools-gotip-openbsd-amd64 tools@779331ac go@4c756718 x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
2025-02-26 17:49 x_tools-gotip-openbsd-amd64 tools@d740adf9 go@8b8bff7b x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
|
Found new dashboard test flakes for:
2025-02-26 03:50 x_tools-go1.24-openbsd-amd64 tools@5dc980c6 release-branch.go1.24@f5c38831 x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
2025-02-26 17:49 x_tools-go1.24-openbsd-amd64 tools@d740adf9 release-branch.go1.24@7f375e2c x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
2025-02-26 19:45 x_tools-go1.24-openbsd-amd64 tools@63229bc7 release-branch.go1.24@949eae84 x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
|
Found new dashboard test flakes for:
2025-02-26 19:45 x_tools-go1.23-openbsd-amd64 tools@63229bc7 release-branch.go1.23@e4772831 x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
|
Found new dashboard test flakes for:
2025-02-27 18:07 x_tools-go1.24-openbsd-amd64 tools@8f4b8cd6 release-branch.go1.24@5d692084 x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
|
Found new dashboard test flakes for:
2025-02-27 18:07 x_tools-go1.23-openbsd-amd64 tools@8f4b8cd6 release-branch.go1.23@e4772831 x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
2025-02-28 02:32 x_tools-go1.24-openbsd-amd64 tools@ff03c59f release-branch.go1.24@5d692084 x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
|
Found new dashboard test flakes for:
2025-02-28 02:50 x_tools-go1.23-openbsd-amd64 tools@66eb306a release-branch.go1.23@e4772831 x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
|
Found new dashboard test flakes for:
2025-02-28 20:03 x_tools-go1.23-openbsd-amd64 tools@608d370d release-branch.go1.23@e4772831 x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
2025-02-28 20:03 x_tools-go1.24-openbsd-amd64 tools@608d370d release-branch.go1.24@5d692084 x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
|
Found new dashboard test flakes for:
2025-02-28 20:28 x_tools-go1.24-openbsd-amd64 tools@5f02a3e8 release-branch.go1.24@5d692084 x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
2025-03-01 19:02 x_tools-go1.24-openbsd-amd64 tools@d1414997 release-branch.go1.24@5d692084 x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
|
Found new dashboard test flakes for:
2025-03-01 19:02 x_tools-go1.23-openbsd-amd64 tools@d1414997 release-branch.go1.23@e4772831 x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
2025-03-03 14:37 x_tools-go1.23-openbsd-amd64 tools@0efa5e51 release-branch.go1.23@e4772831 x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
|
Found new dashboard test flakes for:
2025-03-03 20:04 x_tools-go1.23-openbsd-amd64 tools@0ffdb82e release-branch.go1.23@e4772831 x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
2025-03-03 20:07 x_tools-go1.24-openbsd-amd64 tools@2b1f5503 release-branch.go1.24@5d692084 x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
|
Found new dashboard test flakes for:
2025-03-04 21:25 x_tools-gotip-openbsd-amd64 tools@8d38122b go@6f90ae36 x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
|
Found new dashboard test flakes for:
2025-03-05 16:17 x_tools-go1.23-openbsd-amd64 tools@340f21a4 release-branch.go1.23@45aade7f x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
2025-03-05 16:48 x_tools-go1.23-openbsd-amd64 tools@ece9e9ba release-branch.go1.23@45aade7f x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
2025-03-05 18:38 x_tools-go1.23-openbsd-amd64 tools@db6008cb release-branch.go1.23@45aade7f x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
2025-03-05 20:18 x_tools-go1.23-openbsd-amd64 tools@6a5b66be release-branch.go1.23@45aade7f x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
|
Found new dashboard test flakes for:
2025-03-06 18:34 x_tools-go1.23-openbsd-amd64 tools@b08c7a26 release-branch.go1.23@45aade7f x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
|
Found new dashboard test flakes for:
2025-03-06 18:34 x_tools-go1.24-openbsd-amd64 tools@b08c7a26 release-branch.go1.24@0ace2d8a x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
2025-03-07 17:40 x_tools-go1.24-openbsd-amd64 tools@7435a814 release-branch.go1.24@0ace2d8a x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
2025-03-07 21:08 x_tools-go1.24-openbsd-amd64 tools@29f81e9d release-branch.go1.24@0ace2d8a x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
2025-03-08 17:41 x_tools-go1.24-openbsd-amd64 tools@8fa586e1 release-branch.go1.24@0ace2d8a x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
|
Found new dashboard test flakes for:
2025-03-11 21:41 x_tools-go1.24-openbsd-amd64 tools@381d68d8 release-branch.go1.24@0ace2d8a x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
|
Found new dashboard test flakes for:
2025-03-12 17:21 x_tools-go1.23-openbsd-amd64 tools@4ee50fe6 release-branch.go1.23@45aade7f x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
2025-03-12 20:41 x_tools-go1.23-openbsd-amd64 tools@e59d6c5d release-branch.go1.23@45aade7f x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
|
Found new dashboard test flakes for:
2025-03-12 20:41 x_tools-go1.24-openbsd-amd64 tools@e59d6c5d release-branch.go1.24@0ace2d8a x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
2025-03-12 20:44 x_tools-go1.24-openbsd-amd64 tools@40f8cca0 release-branch.go1.24@c2a34bed x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
|
Found new dashboard test flakes for:
2025-03-14 01:02 x_tools-go1.23-openbsd-amd64 tools@6c3e542d release-branch.go1.23@45aade7f x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
2025-03-14 01:02 x_tools-go1.24-openbsd-amd64 tools@6c3e542d release-branch.go1.24@c2a34bed x/tools/gopls/internal/test/integration/codelens.TestRegenerateCgo/default [ABORT] (log)
|
Change https://go.dev/cl/658015 mentions this issue: |
The new openbsd/amd64 7.6 builder is generally working well everywhere but this one place. Add a skip for now to buy time to investigate this issue. Note that the previous openbsd/amd64 7.2 builder was also running into problems with these tests, as tracked in go.dev/issue/54461, though it wasn't happening as consistently as it is now. For golang/go#72145. For golang/go#54461. Change-Id: I6dd34fcdcca99c90282f0b9119936efa6bebf458 Cq-Include-Trybots: luci.golang.try:x_tools-gotip-openbsd-amd64 Reviewed-on: https://go-review.googlesource.com/c/tools/+/658015 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Alan Donovan <adonovan@google.com> Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2022-08-15T17:42:12-987de34-1f833e4/darwin-amd64-12_0
2022-08-14T00:06:23-35f806b-59865f1/netbsd-amd64-9_0
2022-08-12T20:40:05-bebd890-2f6783c/netbsd-386-9_0
2022-08-12T18:15:28-bebd890-b6f87b0/netbsd-amd64-9_0
2022-08-12T12:39:26-88d981e-f67c766/netbsd-amd64-9_0
2022-08-12T00:04:29-c4ec74a-a5cd894/netbsd-386-9_0
2022-08-11T19:05:54-c4ec74a-62654df/netbsd-amd64-9_0
2022-08-11T17:53:50-37a81b6-a526ec1/netbsd-amd64-9_0
2022-08-11T16:19:14-37a81b6-2340d37/netbsd-amd64-9_0
2022-08-11T12:53:59-b2156b5-3c200d6/netbsd-386-9_0
2022-08-10T22:22:48-b2156b5-6b80b62/netbsd-amd64-9_0
2022-08-10T17:41:25-0ad49fd-f19f6c7/netbsd-amd64-9_0
2022-08-10T15:08:24-3950865-c81dfdd/netbsd-amd64-9_0
2022-08-10T02:14:09-6fa767d-5531838/plan9-386-0intro
2022-08-09T14:33:24-92d58ea-0981d9f/openbsd-386-70
2022-08-09T14:12:01-92d58ea-662a729/netbsd-amd64-9_0
2022-08-09T13:39:27-92d58ea-9e8020b/netbsd-386-9_0
2022-08-09T11:28:56-92d58ea-0f8dffd/netbsd-amd64-9_0
2022-08-08T18:10:56-fff6d6d-4bcc138/netbsd-amd64-9_0
2022-08-08T15:33:45-06d96ee-0581d69/netbsd-amd64-9_0
2022-08-08T15:07:46-06d96ee-cd54ef1/netbsd-amd64-9_0
2022-08-08T14:12:21-06d96ee-e761556/netbsd-amd64-9_0
2022-08-08T06:16:59-06d96ee-0f6ee42/darwin-amd64-11_0
2022-08-08T06:16:59-06d96ee-0f6ee42/netbsd-386-9_0
2022-08-06T15:20:00-06d96ee-0c4db1e/plan9-386-0intro
2022-08-05T19:51:08-06d96ee-4fb7e22/plan9-386-0intro
2022-08-04T20:05:03-81c7dc4-39728f4/netbsd-386-9_0
2022-08-04T20:05:03-81c7dc4-39728f4/netbsd-amd64-9_0
2022-08-04T20:04:16-3519aa2-39728f4/netbsd-386-9_0
2022-08-04T19:57:25-763f65c-39728f4/netbsd-386-9_0
2022-08-04T18:51:46-99fd76f-39728f4/openbsd-386-70
2022-08-04T17:05:18-3e0a503-fb1bfd4/netbsd-amd64-9_0
2022-08-04T15:50:11-3e0a503-fcdd099/netbsd-386-9_0
2022-08-04T15:50:11-3e0a503-44ff9bf/netbsd-amd64-9_0
2022-08-04T15:31:49-87f47bb-44ff9bf/plan9-386-0intro
2022-08-04T14:58:59-87f47bb-4345620/netbsd-386-9_0
2022-08-04T10:32:51-3e0a503-a10afb1/linux-386-buster
2022-08-03T21:02:27-8b9a1fb-f28fa95/plan9-386-0intro
2022-08-03T21:02:27-8b9a1fb-4345620/netbsd-386-9_0
2022-08-03T18:07:40-d08f5dc-fcdd099/netbsd-386-9_0
2022-08-03T13:50:38-ddb90ec-c6a2dad/dragonfly-amd64-622
2022-08-03T13:50:38-ddb90ec-c6a2dad/plan9-386-0intro
2022-08-03T12:09:24-ddb90ec-29b9a32/plan9-386-0intro
2022-08-02T18:52:36-0d04f65-29b9a32/plan9-386-0intro
2022-08-02T18:19:01-d025cce-be59153/netbsd-amd64-9_0
2022-08-02T18:16:22-10cb435-d723df7/netbsd-amd64-9_0
2022-08-02T18:07:14-4d0b383-d723df7/netbsd-386-9_0
2022-08-02T18:07:14-4d0b383-d723df7/netbsd-amd64-9_0
2022-08-02T17:23:42-4d0b383-1b7e71e/darwin-amd64-nocgo
2022-08-02T16:05:48-4d0b383-f2a9f3e/netbsd-amd64-9_0
2022-07-29T20:19:23-9580c84-9240558/windows-arm64-11
2022-07-28T20:06:00-8ea5687-d9242f7/darwin-amd64-nocgo
2022-07-27T15:04:58-39a4e36-4248146/freebsd-386-13_0
2022-07-26T18:43:08-6c8a6c4-d9242f7/aix-ppc64
2022-07-25T20:44:49-2a6393f-24dc27a/darwin-amd64-10_14
2022-07-25T18:11:01-4375b29-795a88d/plan9-386-0intro
2022-07-25T14:16:17-178fdf9-64f2829/plan9-386-0intro
2022-07-22T20:12:19-7b605f4-c5da4fb/plan9-386-0intro
2022-07-21T20:11:06-ec1f924-c4a6d30/plan9-386-0intro
2022-07-15T15:11:26-22d1494-2aa473c/windows-386-2008-newcc
2022-07-15T14:27:36-1a4e02f-4651ebf/windows-arm64-10
2022-07-15T14:20:24-db8f89b-4651ebf/windows-arm64-10
2022-07-14T21:03:14-db8f89b-783ff7d/windows-arm64-11
2022-07-14T21:01:58-db8f89b-aa80228/darwin-arm64-11
2022-07-14T19:05:09-db8f89b-a906d3d/windows-arm64-10
2022-07-14T15:54:36-db8f89b-266c70c/windows-arm64-10
2022-07-14T01:47:39-db8f89b-558785a/windows-arm64-11
We recently started waiting for all go command invocations when shutting down gopls regtests. It appears that sometimes we kill the go command and still don't get a result from cmd.Wait(). For example, here: https://build.golang.org/log/00046e0b005c7660d676a3a415561950048f756a
In that failure, the test runner looks otherwise healthy (other tests ran fast), and yet the goroutine stack clearly shows a go command hanging for 9 minutes here:
https://cs.opensource.google/go/x/tools/+/master:internal/gocommand/invoke.go;l=260;drc=f38573358cbedf46d64c5759ef41b72afcf0c5c0
@bcmills do you happen to have any idea of what might cause this?