x/build: shard and scale the longtest SlowBots #37439
Comments
Change https://golang.org/cl/268037 mentions this issue: |
The longtest builders are currently primarily post-submit builders, where it's okay for them to be as slow as they need to be in order to provide additional test coverage. In this context, whether they take 40 minutes or 50 makes little difference.

The longtest builders are also sometimes requested via SlowBots for changes that are riskier than usual, or otherwise desire additional coverage beyond the normal TryBots. They're also always enabled for CLs to release branches. In such contexts, speeding up SlowBot runs from 40 minutes to 20 or less would be appreciated and in turn help people use longtest SlowBots more frequently.

Longtest builders are already configured to use sharded tests. Configure them to use additional helpers to speed up test execution. Try out 3, 5, and 9 helpers to see how much it helps before settling.

For golang/go#37439.

Change-Id: I425bc0257b7a54bb32c0eb1719fea7ba3f4fd461
Reviewed-on: https://go-review.googlesource.com/c/build/+/268037
Trust: Dmitri Shuralyov <dmitshur@golang.org>
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Alexander Rakoczy <alex@golang.org>
Reviewed-by: Carlos Amedee <carlos@golang.org>
I sent https://golang.org/cl/268037 for this. I had a chance to try out all 3 values for additional TryBot helpers (3, 5, and 9) in at least one CL and the times so far were:
It seems even just 3 additional helpers go a long way toward speeding up the longtest SlowBot. I'll collect some more timing data next. (It's not a completely fair comparison since the builders are different, but it's enough to get some idea.)
Change https://golang.org/cl/279513 mentions this issue: |
I missed a Windows test failure in CL 220645 because I forgot to run it against the `windows-amd64-longtest` SlowBot. I forgot to run it against that SlowBot because I'm not in the habit of doing so. I'm not in the habit of running that SlowBot because it is currently much too slow. To pick some relevant runs:
In contrast, a regular TryBot typically caps out around 10 minutes (#32632), and we consider runs that take longer than 20 minutes to be unacceptably slow (#36629, #36482).
Since there is nothing particularly special about the hardware needed to run the `longtest` builds (they're just large VMs), I think we should adjust the builder configuration to run the `-longtest` SlowBots with 4 or more shards each. That way, the end-to-end latency impact of adding one of these bots to a CL will be minimal, and we will not only have less of a disincentive to using them, but also have much faster feedback in order to inform revert-or-fix decisions when one breaks.

CC @golang/osp-team