x/build: time out on LUCI Linux/PPC64x builder #65171
Comments
Found new dashboard test flakes for:
2023-12-06 21:50 gotip-linux-ppc64le go@e914671f cmd/cgo/internal/testshared.TestIssue62277 (log)
2023-12-12 20:05 gotip-linux-ppc64le go@962dade4 cmd/cgo/internal/testshared.TestIssue44031 (log)
2023-12-15 18:30 gotip-linux-ppc64le go@5e939b3a cmd/cgo/internal/testcarchive.TestPIE (log)
2023-12-15 18:30 gotip-linux-ppc64le go@5e939b3a cmd/cgo/internal/testshared.TestGlobal (log)
2023-12-15 20:51 gotip-linux-ppc64le go@f8170cc0 cmd/go.TestScript (log)
|
I wonder if LUCI needs to set |
Yeah, cc @mknyszek |
I'll send a CL for this today. Thanks. |
Change https://go.dev/cl/557857 mentions this issue: |
This change adds a baseline test time timeout scale that's derived from the host to better align with the old infrastructure. Such hosts are listed in SLOW_HOSTS along with their timeout scale.

This CL also changes the way timeout scaling factors compose. Instead of taking the highest one, this change multiplies them. Although the only case where this matters seems to mostly work fine with taking the maximum (-longtest-race) it seems easier to reason about the composition of scaling factors this way.

Fixes golang/go#65171.

Change-Id: If95fc8a036c2a735396c854cbcf09af7cba0c9f3
Reviewed-on: https://go-review.googlesource.com/c/build/+/557857
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
TryBot-Bypass: Michael Knyszek <mknyszek@google.com>
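To make the difference between the two composition rules concrete, here is a minimal sketch in Python (the real builder configuration is not reproduced here). Only the name SLOW_HOSTS comes from the commit message; the host entry, the per-run-mod table `RUN_MOD_SCALES`, all numeric factors, and both function names are hypothetical and chosen purely for illustration.

```python
from math import prod

# Hypothetical host table: the entry and its factor are illustrative only.
SLOW_HOSTS = {
    "linux-ppc64le": 2,
}

# Hypothetical per-run-mod factors, also illustrative only.
RUN_MOD_SCALES = {
    "longtest": 5,
    "race": 2,
}


def _factors(host, run_mods):
    # Hosts and run mods that are not listed contribute a neutral factor of 1.
    return [SLOW_HOSTS.get(host, 1)] + [RUN_MOD_SCALES.get(m, 1) for m in run_mods]


def timeout_scale_max(host, run_mods):
    """Compose factors by taking the largest one."""
    return max(_factors(host, run_mods))


def timeout_scale_product(host, run_mods):
    """Compose factors by multiplying them together."""
    return prod(_factors(host, run_mods))
```

With these made-up numbers, a longtest-race run on a slow host gets a scale of 5 under the maximum rule and 20 under the product rule, so the host baseline is no longer absorbed by a larger run-mod factor.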
Found new dashboard test flakes for:
2023-12-15 20:30 gotip-linux-ppc64le go@3313bbb4 runtime.TestCgoPprofPIE (log)
|
Closing. Should be fixed by the CL above. (Apparently gopherbot doesn't do it because it is checked in to a branch.) |
Found new dashboard test flakes for:
2023-12-19 15:57 go1.22-linux-ppc64le release-branch.go1.22@fb23428a cmd/cgo/internal/testshared.TestIssue44031 [ABORT] (log)
|
Found new dashboard test flakes for:
2024-01-31 21:44 x_tools-go1.20-linux-ppc64le tools@5f906919 release-branch.go1.20@746a0727 x/tools/gopls/internal/test/integration/workspace.TestReloadOnlyOnce/default [ABORT] (log)
2024-01-31 21:44 x_tools-go1.21-linux-ppc64le tools@5f906919 release-branch.go1.21@916e6cdd x/tools/gopls/internal/test/integration/misc.TestHoverBrokenImport_Issue60592/default [ABORT] (log)
2024-02-02 15:22 gotip-linux-ppc64le go@149db960 cmd/cgo/internal/testcarchive.TestPreemption [ABORT] (log)
2024-02-02 15:22 gotip-linux-ppc64le go@149db960 cmd/cgo/internal/testshared.TestTestInstalledShared [ABORT] (log)
2024-02-02 15:22 gotip-linux-ppc64le go@149db960 cmd/go.TestScript (log)
|
Is there more context to be found about what test is suspected of hanging, and what the timeout is? I am rebuilding the container images to include lsof. |
Found new dashboard test flakes for:
2024-02-07 20:08 gotip-linux-ppc64le go@6abeffb1 cmd/go.TestScript (log)
2024-02-07 20:54 gotip-linux-ppc64le go@c0984005 cmd/cgo/internal/testshared.TestIssue44031 [ABORT] (log)
2024-02-07 21:04 gotip-linux-ppc64le go@1ffc1104 cmd/cgo/internal/testshared.TestGeneratedHash [ABORT] (log)
2024-02-07 21:04 gotip-linux-ppc64le go@1ffc1104 cmd/go.TestScript (log)
2024-02-07 21:50 gotip-linux-ppc64le go@adbf71eb cmd/cgo/internal/testcarchive.TestCachedInstall [ABORT] (log)
2024-02-07 21:50 gotip-linux-ppc64le go@adbf71eb cmd/cgo/internal/testplugin.TestIssue62430 [ABORT] (log)
2024-02-07 21:50 gotip-linux-ppc64le go@adbf71eb cmd/cgo/internal/testshared.TestImplicitInclusion [ABORT] (log)
2024-02-07 23:54 gotip-linux-ppc64le go@cde38c96 cmd/cgo/internal/testcarchive.TestSignalForwardingGo [ABORT] (log)
2024-02-07 23:54 gotip-linux-ppc64le go@cde38c96 cmd/cgo/internal/testerrors.TestMallocCrashesOnNil [ABORT] (log)
2024-02-07 23:54 gotip-linux-ppc64le go@cde38c96 cmd/cgo/internal/testplugin.TestIssue25756 [ABORT] (log)
2024-02-08 03:02 gotip-linux-ppc64le go@1400b268 cmd/cgo/internal/testcarchive.TestCompileWithoutShared [ABORT] (log)
2024-02-08 03:02 gotip-linux-ppc64le go@1400b268 cmd/cgo/internal/testshared.TestGlobal [ABORT] (log)
2024-02-08 03:02 gotip-linux-ppc64le go@1400b268 cmd/go.TestScript (log)
2024-02-08 17:31 gotip-linux-ppc64le go@210f051d cmd/cgo/internal/testshared.TestIssue44031 [ABORT] (log)
2024-02-08 17:31 gotip-linux-ppc64le go@210f051d runtime.TestCgoPprofPIE [ABORT] (log)
2024-02-08 18:16 gotip-linux-ppc64le go@4e91c569 cmd/cgo/internal/testshared.TestIssue47873 [ABORT] (log)
|
Hrm, I wonder if this is timing out more frequently because the jobs are not running on a tmpfs. The only tmpfs I have mounted in the containers is /workdir (as was used by the old CI), and it does not appear to be used. Is there an option to move the work/caches to a tmpfs? My initial thought is to mount /home/swarming as a tmpfs. |
Found new dashboard test flakes for:
2024-01-30 14:41 gotip-linux-ppc64le go@b89ad464 go/printer.TestFiles/statements.input (log)
|
Change https://go.dev/cl/563396 mentions this issue: |
Factor the timeout_scale (int) → GO_TEST_TIMEOUT_SCALE env (string) conversion out of make_run_mod, so that it has effect on slow hosts whose builders don't have any run mods.

Also mark the netbsd-arm64 host as slow. It's timing out during some tests, and the old dashboard had a timeout scale value of 10 set for it. To confirm that high of a value is needed, start with 2 and bump it further as needed.

For golang/go#65171.

Change-Id: Ib1206bf603502ba951cee025da0db24e4342214e
Reviewed-on: https://go-review.googlesource.com/c/build/+/563396
Reviewed-by: Benny Siegert <bsiegert@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
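The refactoring described in this commit message can be sketched roughly as follows, in Python rather than the builder configuration's own language. Only GO_TEST_TIMEOUT_SCALE and the idea of an integer timeout scale come from the message; the function names `timeout_env` and `builder_env`, and the choice to emit no variable when the scale is 1, are assumptions made for illustration.

```python
def timeout_env(timeout_scale):
    """Convert an integer timeout scale into environment entries.

    Assumption: a scale of 1 (or less) needs no override, so nothing is set.
    """
    if timeout_scale <= 1:
        return {}
    return {"GO_TEST_TIMEOUT_SCALE": str(timeout_scale)}


def builder_env(host_scale, run_mod_scales):
    """Compute the environment for one builder (hypothetical helper).

    The int -> string conversion happens once here, outside any per-run-mod
    code, so it also runs when run_mod_scales is empty.
    """
    scale = host_scale
    for s in run_mod_scales:
        scale *= s
    return timeout_env(scale)
```

In this shape, a slow host with no run mods, for example `builder_env(2, [])`, still ends up with the environment variable set. That is the case the commit message says was previously missed.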
Reports above seem to be because the |
I've also changed the docker configuration to mount /home/swarming as a tmpfs. |
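One way to confirm from inside a container that a work directory really sits on a tmpfs is to look it up in /proc/mounts. The following is a minimal sketch, not something taken from this thread: the helper name `fs_type_for` and the sample mount table are hypothetical, and it assumes the usual Linux /proc/mounts layout (device, mount point, filesystem type, options) with no escaped spaces in mount points.

```python
import os

# Hypothetical mount table, shaped like /proc/mounts, for illustration only.
SAMPLE_MOUNTS = """\
overlay / overlay rw 0 0
tmpfs /workdir tmpfs rw 0 0
tmpfs /home/swarming tmpfs rw 0 0
"""


def fs_type_for(path, mounts_text):
    """Return the filesystem type of the deepest mount point containing path."""
    path = os.path.normpath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the longest mount point; on ties the later entry wins,
            # since a later mount shadows an earlier one at the same path.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type
```

On a real host this could be called as `fs_type_for("/home/swarming", open("/proc/mounts").read())`, which should report `tmpfs` once the mount is in place.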
Found new dashboard test flakes for:
2024-02-13 04:23 x_tools-go1.22-linux-ppc64le tools@afd84280 release-branch.go1.22@20107e05 x/tools/gopls/internal/test/integration/misc.TestRunGovulncheckStd/default [ABORT] (log)
2024-02-13 04:23 x_tools-go1.22-linux-ppc64le tools@afd84280 release-branch.go1.22@20107e05 x/tools/gopls/internal/util/safetoken.TestGoplsSourceDoesNotCallTokenFileMethods [ABORT] (log)
2024-02-14 20:25 gotip-linux-ppc64-power10 go@d90a57ff cmd/go.TestBadCommandLines [ABORT] (log)
2024-02-14 20:25 gotip-linux-ppc64-power10 go@d90a57ff cmd/go/internal/modfetch/codehost.TestLatest [ABORT] (log)
2024-02-14 21:42 x_tools-gotip-linux-ppc64-power10 tools@fef8b627 go@d90a57ff x/tools/gopls/internal/test/integration/misc.TestRunGovulncheckStd [ABORT] (log)
2024-02-14 21:42 x_tools-gotip-linux-ppc64-power10 tools@fef8b627 go@d90a57ff x/tools/gopls/internal/test/marker.Test [ABORT] (log)
2024-02-14 21:42 x_tools-gotip-linux-ppc64-power10 tools@fef8b627 go@d90a57ff x/tools/gopls/internal/util/safetoken.TestGoplsSourceDoesNotCallTokenFileMethods [ABORT] (log)
|
The latest batch of failures is the same issue as #65725. I only adjusted the RAM limits on the ppc64le LUCI containers. Looking through the logs, the ppc64-power10 LUCI containers also need more RAM. |
Found new dashboard test flakes for:
2024-02-15 17:22 x_tools-go1.21-linux-ppc64-power10 tools@7240af8b release-branch.go1.21@b214108e x/tools/gopls/internal/test/integration/misc.TestInconsistentVendoring [ABORT] (log)
2024-02-15 17:22 x_tools-go1.21-linux-ppc64-power10 tools@7240af8b release-branch.go1.21@b214108e x/tools/gopls/internal/test/marker.Test [ABORT] (log)
2024-02-15 17:22 x_tools-go1.21-linux-ppc64-power10 tools@7240af8b release-branch.go1.21@b214108e x/tools/gopls/internal/util/safetoken.TestGoplsSourceDoesNotCallTokenFileMethods [ABORT] (log)
|
Found new dashboard test flakes for:
2024-02-08 16:18 go1.22-linux-ppc64le release-branch.go1.22@20107e05 cmd/cgo/internal/testcarchive.TestDeepStack [ABORT] (log)
2024-02-08 16:18 go1.22-linux-ppc64le release-branch.go1.22@20107e05 cmd/cgo/internal/testshared.TestIssue62277 [ABORT] (log)
2024-02-16 15:53 go1.22-linux-ppc64-power10 release-branch.go1.22@fb86598c cmd/go.TestBadCommandLines [ABORT] (log)
2024-02-16 15:53 go1.22-linux-ppc64-power10 release-branch.go1.22@fb86598c cmd/go/internal/modfetch/codehost.TestReadFile [ABORT] (log)
2024-02-16 15:53 go1.22-linux-ppc64-power10 release-branch.go1.22@fb86598c cmd/go/internal/vcweb/vcstest.TestScripts [ABORT] (log)
2024-02-16 17:49 x_tools-go1.22-linux-ppc64le tools@c61f99f1 release-branch.go1.22@fb86598c x/tools/gopls/internal/test/marker.Test [ABORT] (log)
2024-02-19 07:24 gotip-linux-ppc64le go@35fa852d cmd/go.TestBadCommandLines [ABORT] (log)
2024-02-19 07:24 gotip-linux-ppc64le go@35fa852d cmd/link.TestCGOLTO [ABORT] (log)
|
Found new dashboard test flakes for:
2024-04-05 19:48 x_tools-gotip-linux-ppc64le_power8 tools@cb3eb43c go@d186dde8 x/tools/gopls/internal/test/integration/misc.TestRenameInTestVariant [ABORT] (log)
2024-04-05 19:48 x_tools-gotip-linux-ppc64le_power8 tools@cb3eb43c go@d186dde8 x/tools/gopls/internal/test/marker.Test [ABORT] (log)
2024-04-05 19:48 x_tools-gotip-linux-ppc64le_power8 tools@cb3eb43c go@d186dde8 x/tools/gopls/internal/util/safetoken.TestGoplsSourceDoesNotCallTokenFileMethods [ABORT] (log)
2024-04-08 13:17 x_tools-gotip-linux-ppc64le_power9 tools@8a0c6e2d go@20f052c8 x/tools/gopls/internal/test/integration/diagnostics.TestResolveImportCycle/default [ABORT] (log)
2024-04-08 13:17 x_tools-gotip-linux-ppc64le_power9 tools@8a0c6e2d go@20f052c8 x/tools/gopls/internal/test/integration/misc.TestTelemetryPrompt_Conditions/telemetryPrompt=false/initial_mode=on/default [ABORT] (log)
2024-04-08 16:03 x_tools-gotip-linux-ppc64le_power10 tools@de6db989 go@e8f5c04c x/tools/gopls/internal/test/integration/workspace.TestStandaloneFiles [ABORT] (log)
2024-04-09 14:12 x_tools-go1.22-linux-ppc64le_power9 tools@3520955d release-branch.go1.22@a65a2bbd x/tools/gopls/internal/test/integration/diagnostics.TestRenamePackage/default [ABORT] (log)
2024-04-09 14:12 x_tools-go1.22-linux-ppc64le_power9 tools@3520955d release-branch.go1.22@a65a2bbd x/tools/gopls/internal/test/integration/misc.TestTelemetryPrompt_Conditions/telemetryPrompt=true/initial_mode=off/default [ABORT] (log)
|