Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/build/cmd/coordinator: apply GO_TEST_TIMEOUT_SCALE to scale go test timeout for x-repo tests #56968

Closed
bcmills opened this issue Nov 28, 2022 · 10 comments
Assignees
Labels
Builders x/build issues (builders, bots, dashboards) FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@bcmills
Copy link
Contributor

bcmills commented Nov 28, 2022

Many of the slower Go builders (especially reverse builders) set GO_TEST_TIMEOUT_SCALE to some positive factor to compensate for a slow host.

For the main go repo, that scale factor is currently applied in cmd/dist:
https://cs.opensource.google/go/go/+/master:src/cmd/dist/test.go;l=207-212;drc=0a8055ef3f84673dc3a70ce5143bdf9817986dea

However, I don't see any place where we apply it for the x-repo tests:
https://cs.opensource.google/search?q=GO_TEST_TIMEOUT_SCALE&sq=&ss=go%2Fx%2Fbuild

And, empirically it does not seem to be applied for x repos at all. In https://build.golang.org/log/a2a8de1767171071d4551d937474538b0ad961c6, we see GO_TEST_TIMEOUT_SCALE=10 in the environment, but a test timed out after 10m0s — which is exactly equal to the default timeout for go test!

Perhaps we should update (*cmd/coordinator.buildStatus).runSubrepoTests to also apply the scale factor? It seems unduly optimistic to assume that x tests will run quickly even when std tests do not.

@gopherbot gopherbot added the Builders x/build issues (builders, bots, dashboards) label Nov 28, 2022
@gopherbot gopherbot added this to the Unreleased milestone Nov 28, 2022
@bcmills
Copy link
Contributor Author

bcmills commented Nov 28, 2022

In combination with #47257 (which has the side-effect of causing tests that need export data for std to become much more expensive), this is causing frequent test timeouts on at least one of the reverse builders (#56967).

(CC @aclements @golang/release)

@aclements
Copy link
Member

I don't suppose we'd be willing to apply GO_TEST_TIMEOUT_SCALE in go test itself?

@bcmills
Copy link
Contributor Author

bcmills commented Nov 28, 2022

At the very least, I think that would require a proposal (and perhaps a freeze exception). 😅

@aclements
Copy link
Member

Fair enough.

I'll note that in addition to processing GO_TEST_TIMEOUT_SCALE, dist also has a default timeout scale for a few architectures. I'm not sure that's something we'd want to duplicate in go test, but that logic in dist could instead set GO_TEST_TIMEOUT_SCALE and let go test apply it. That would keep all.bash working for users on those architectures.

(The whole thing is weird, maybe we should be running a quick benchmark at the beginning of dist test and setting a timeout scale based on that!)

@Santanathegreat

This comment was marked as duplicate.

@bcmills
Copy link
Contributor Author

bcmills commented Nov 29, 2022

dist also has a default timeout scale for a few architectures.

Could we just get rid of that default and set GO_TEST_TIMEOUT_SCALE explicitly on the builders for those architectures? I don't think we should be in the business of assuming the performance characteristics of machines based on CPU family, especially given the performance overlap between modern ARM and x86 (and between ARM and ARM64).

(The whole thing is weird, maybe we should be running a quick benchmark at the beginning of dist test and setting a timeout scale based on that!)

Long-term (once we implement #48157), I wonder if the per-test-binary timeouts are even all that useful. But I guess even with per-test timeouts we'd still need a way to scale them.

I'm not sure that running a benchmark would be all that helpful: some builders are slow because of CPU performance, while others are slow because of disk performance (which disproportionately affects toolchain tests, since they tend to be more I/O intensive), and yet others are slow because of low memory (resulting in contention for file buffer cache and/or swap space). To get a reliable benchmark, we would probably need to measure all three — and we'd still need to figure out a way to incorporate that into go test in general.

@gopherbot
Copy link

Change https://go.dev/cl/453975 mentions this issue: cmd/coordinator: scale go test timeout by GO_TEST_TIMEOUT_SCALE

@dmitshur dmitshur added the NeedsFix The path to resolution is known, but the work has not been done. label Nov 29, 2022
@dmitshur dmitshur changed the title x/build: GO_TEST_TIMEOUT_SCALE not applied for x-repo tests x/build/cmd/coordinator: apply GO_TEST_TIMEOUT_SCALE to scale go test timeout for x-repo tests Nov 29, 2022
@bcmills
Copy link
Contributor Author

bcmills commented Dec 5, 2022

In combination with #47257 (which has the side-effect of causing tests that need export data for std to become much more expensive), this is causing frequent test timeouts on at least one of the reverse builders (#56967).

We've resolved #56967 in a different way (via CL 454497 and CL 454499), so I believe this is no longer contributing to ongoing test timeouts.

However, I do still think we should fix it at some point: I suspect that honoring GO_TEST_TIMEOUT_SCALE would help us reduce the technical debt embodied in golang.org/x/tools/internal/testenv.ExitifSmallMachine.

@gopherbot
Copy link

Change https://go.dev/cl/455521 mentions this issue: dashboard: apply cmd/dist's timeout scale for arm, mips* architectures

@gopherbot
Copy link

Change https://go.dev/cl/455520 mentions this issue: dashboard: explicitly set timeout scale in {freebsd,plan9}/arm builders

gopherbot pushed a commit to golang/build that referenced this issue Dec 6, 2022
These two builders have GO_TEST_TIMEOUT_SCALE values set in their local
environment. Since those variables affect the builder, and will soon be
needed for golang.org/x test timeout scale, set them explicitly in the
dashboard package. That makes more of the builder configuration visible
and removes the need to maintain local configuration on those builders.

For golang/go#56968.

Change-Id: I206cbddab91c3628f43716e8b05fd52d94fba7f2
Reviewed-on: https://go-review.googlesource.com/c/build/+/455520
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
gopherbot pushed a commit to golang/build that referenced this issue Dec 6, 2022
These values come from cmd/dist doing it by default based on GOARCH.
We want to factor out that logic from cmd/dist, so apply it explicitly
in x/build to appropriate builders. CL 455518 removes it from cmd/dist.

The appropriate builders were found by iterating over all builders in
this package and filtering on their GOARCH value.

For golang/go#56968.
For golang/go#57117.

Change-Id: I1bccb7144d9ae13ca17e5f12169924d0fb89e341
Reviewed-on: https://go-review.googlesource.com/c/build/+/455521
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
@golang golang locked and limited conversation to collaborators Dec 6, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Builders x/build issues (builders, bots, dashboards) FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Projects
Archived in project
Development

No branches or pull requests

5 participants