x/build/cmd/coordinator: write proper scheduler #19178

Open
bradfitz opened this issue Feb 18, 2017 · 14 comments
Labels
Builders x/build issues (builders, bots, dashboards) NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.

@bradfitz
Contributor

If N builds are waiting on a buildlet type (due to lack of machines or quota), the current implementation is a bunch of goroutines fighting over a mutex.

It's random who wins and gets the buildlet.

We should have them register interest and, when a buildlet becomes available, pick the highest-priority one.

Example priorities in order:

  • trybot run with a +2 already
  • trybot run
  • normal run of a build at tip of a release branch
  • normal run of a build at tip of master branch
  • normal run of a build at tip of dev branch
  • normal run of a build not at tip
  • idle flake detection (x/build: auto-detect flakes #19177)

etc.
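
A minimal sketch of the "register interest, then pick the highest priority" idea, assuming a single buildlet type and a `container/heap` priority queue. Every name here (`Priority`, `Scheduler`, `Register`, `BuildletAvailable`) is hypothetical and invented for illustration. None of it is taken from the coordinator's code, and the real scheduler may order work differently.

```go
package main

import (
	"container/heap"
	"fmt"
)

// Priority ranks a waiting build; lower values are served first.
// The constants are hypothetical and mirror the example list above.
type Priority int

const (
	TryBotPlusTwo      Priority = iota // trybot run with a +2 already
	TryBot                             // trybot run
	ReleaseBranchTip                   // tip of a release branch
	MasterTip                          // tip of master branch
	DevBranchTip                       // tip of dev branch
	NotAtTip                           // build not at tip
	IdleFlakeDetection                 // idle flake detection
)

// waiter is one build that has registered interest in a buildlet.
type waiter struct {
	name string
	pri  Priority
	seq  uint64 // registration order, used to break ties first-come first-served
}

// waitQueue implements heap.Interface, ordered by priority and then by seq.
type waitQueue []*waiter

func (q waitQueue) Len() int { return len(q) }
func (q waitQueue) Less(i, j int) bool {
	if q[i].pri != q[j].pri {
		return q[i].pri < q[j].pri
	}
	return q[i].seq < q[j].seq
}
func (q waitQueue) Swap(i, j int) { q[i], q[j] = q[j], q[i] }
func (q *waitQueue) Push(x any)   { *q = append(*q, x.(*waiter)) }
func (q *waitQueue) Pop() any {
	old := *q
	n := len(old)
	w := old[n-1]
	old[n-1] = nil // drop the reference so it can be collected
	*q = old[:n-1]
	return w
}

// Scheduler holds the waiters for one buildlet type.
// It is not safe for concurrent use; a real one would need locking.
type Scheduler struct {
	q    waitQueue
	next uint64
}

// Register records that a build wants a buildlet.
func (s *Scheduler) Register(name string, pri Priority) {
	heap.Push(&s.q, &waiter{name: name, pri: pri, seq: s.next})
	s.next++
}

// BuildletAvailable hands the free buildlet to the highest-priority
// waiter and reports its name, or ok=false if nobody is waiting.
func (s *Scheduler) BuildletAvailable() (name string, ok bool) {
	if s.q.Len() == 0 {
		return "", false
	}
	return heap.Pop(&s.q).(*waiter).name, true
}

func main() {
	var s Scheduler
	s.Register("old-commit", NotAtTip)
	s.Register("master-tip", MasterTip)
	s.Register("trybot-a", TryBot)
	s.Register("trybot-b", TryBot)
	s.Register("trybot-approved", TryBotPlusTwo)
	for {
		name, ok := s.BuildletAvailable()
		if !ok {
			break
		}
		fmt.Println(name)
	}
}
```

Run as written, this prints the waiters in the order trybot-approved, trybot-a, trybot-b, master-tip, old-commit. The `seq` tie-break is a design choice of this sketch so that equal-priority builds are not reordered at random.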

/cc @danp @kevinburke

@bradfitz bradfitz added the Builders x/build issues (builders, bots, dashboards) label Feb 18, 2017
@bradfitz bradfitz self-assigned this Feb 18, 2017
@bradfitz
Contributor Author

And gomote access is in there somewhere too.

@gopherbot

CL https://golang.org/cl/38306 mentions this issue.

gopherbot pushed a commit to golang/build that referenced this issue Apr 12, 2017
Currently all builds start and think they're running, but most are
just fighting over a mutex to grab a builder. That will be fixed, but
in the meantime it's nice to see what's actually working vs what's
waiting on e.g. arm5 hardware which won't be available for hours.

This is a baby step towards more monitoring. Currently this is just HTML
output, but the same data could be exported via JSON or something else later
for graphing.

Updates golang/go#19178 (add a buildlet scheduler)
Updates golang/go#15760 (monitor everything)

Change-Id: I36e16ea0919afe8023fe7fedd981f2e857f0d6df
Reviewed-on: https://go-review.googlesource.com/40397
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
@gopherbot

CL https://golang.org/cl/40397 mentions this issue.

gopherbot pushed a commit to golang/build that referenced this issue Apr 21, 2017
Benchmarks are treated as unit tests and distributed to the test
helpers, which allows them to fit in our 5m trybot budget.

Currently we only run the go1 and x/benchmarks. Running package
benchmarks is a TODO.

This feature is disabled by default, and is enabled by the
"farmer-run-bench" project attribute.

Updates golang/go#19178
Updates golang/go#19871

Change-Id: I9c3a14da60c3662e7e2cb4e71953060915cc4364
Reviewed-on: https://go-review.googlesource.com/38306
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
@gopherbot

Change https://golang.org/cl/70430 mentions this issue: cmd/coordinator: attempt to run newer work before older work on buildlets

gopherbot pushed a commit to golang/build that referenced this issue Oct 20, 2017
…lets

The problem is especially severe on the slow builders,
like darwin/arm64, which can sometimes not have run
any commit from the past 12 hours and still pick an old
commit for its next trial. It would be far better for it to
prefer new work.

It's possible that this new logic should be disabled for
some of the auto-scaling builders, but for now it seems
like we can try it for all of them and see if that's OK.

Update golang/go#19178.

Change-Id: I32cc67c0c2c84130b40b250675b40aadb4a0a681
Reviewed-on: https://go-review.googlesource.com/70430
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Sarah Adams <shadams@google.com>
@dmitshur dmitshur self-assigned this Aug 24, 2018
@gopherbot

Change https://golang.org/cl/132076 mentions this issue: cmd/coordinator: start of a scheduler, not yet enabled

gopherbot pushed a commit to golang/build that referenced this issue Sep 26, 2018
Updates golang/go#19178

Change-Id: I24aa368df01a85259b53d6cfb08de7ab3a80e4fe
Reviewed-on: https://go-review.googlesource.com/132076
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
@gopherbot

Change https://golang.org/cl/205078 mentions this issue: cmd/coordinator: finish the scheduler code, at least mostly

gopherbot pushed a commit to golang/build that referenced this issue Nov 13, 2019
Optimizations and tuning remain, but this should be tons better than
what we had before (random).

Updates golang/go#19178

Change-Id: Idb483a4c4209a012814322cc8b37b966ee4681de
Reviewed-on: https://go-review.googlesource.com/c/build/+/205078
Reviewed-by: Bryan C. Mills <bcmills@google.com>
codebien pushed a commit to codebien/build that referenced this issue Nov 13, 2019
Optimizations and tuning remain, but this should be tons better than
what we had before (random).

Updates golang/go#19178

Change-Id: Idb483a4c4209a012814322cc8b37b966ee4681de
Reviewed-on: https://go-review.googlesource.com/c/build/+/205078
Reviewed-by: Bryan C. Mills <bcmills@google.com>
@gopherbot

Change https://golang.org/cl/207079 mentions this issue: cmd/coordinator: add tests for the scheduler, and resulting fixes

gopherbot pushed a commit to golang/build that referenced this issue Nov 13, 2019
It now passes thousands of iterations in race mode.

Updates golang/go#19178

Change-Id: I210277abd084bdfd3c7ada538189722ff9543e3f
Reviewed-on: https://go-review.googlesource.com/c/build/+/207079
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
@gopherbot

Change https://golang.org/cl/207178 mentions this issue: cmd/coordinator: delete reverse buildlet pool's high priority mechanism

@gopherbot

Change https://golang.org/cl/207179 mentions this issue: cmd/coordinator: clean up HTML status, add scheduler status

@gopherbot

Change https://golang.org/cl/207180 mentions this issue: cmd/coordinator: enable the scheduler

gopherbot pushed a commit to golang/build that referenced this issue Nov 14, 2019
The HTML status for the coordinator was way too long. For pending
builds, only show a single line, and render their state as
"waiting_for_machine" rather than "running". And for active builds,
only show the last few lines of status on the home page. People can
click for details.

Then add a scheduler status section too.

I'm also stashing away a build's SchedItem for now (with a little
refactoring to break up a long method), so a future CL can tell people
where a build is in line to get a buildlet.

Updates golang/go#19178

Change-Id: I2f37982ea3c7ee4a6581464117ae533499eba6a4
Reviewed-on: https://go-review.googlesource.com/c/build/+/207179
Reviewed-by: Bryan C. Mills <bcmills@google.com>
gopherbot pushed a commit to golang/build that referenced this issue Nov 14, 2019
This is in its own CL for now so it's easy to revert.

Updates golang/go#19178

Change-Id: I2eb66e3c8a6e75a8039077401231577c8e7bfdc8
Reviewed-on: https://go-review.googlesource.com/c/build/+/207180
Reviewed-by: Bryan C. Mills <bcmills@google.com>
gopherbot pushed a commit to golang/build that referenced this issue Nov 14, 2019
…code, TODOs

Updates golang/go#19178

Change-Id: Id4d8b016c41f57bfaba5ee4ab046285607047493
Reviewed-on: https://go-review.googlesource.com/c/build/+/207178
Reviewed-by: Bryan C. Mills <bcmills@google.com>
@gopherbot

Change https://golang.org/cl/207418 mentions this issue: cmd/coordinator: omit scheduler progress times when missing or irrelevant

gopherbot pushed a commit to golang/build that referenced this issue Nov 15, 2019
…vant

Updates golang/go#19178

Change-Id: Ib59eabafc589ce66948c91c7f15cbccae2d2733d
Reviewed-on: https://go-review.googlesource.com/c/build/+/207418
Reviewed-by: Bryan C. Mills <bcmills@google.com>
@gopherbot

Change https://golang.org/cl/207464 mentions this issue: cmd/coordinator: favor trybot work over post-submit work on process start

gopherbot pushed a commit to golang/build that referenced this issue Nov 15, 2019
…tart

Updates golang/go#19178

Change-Id: I054c852db398a9474e53254c42eb5f77c44fe348
Reviewed-on: https://go-review.googlesource.com/c/build/+/207464
Reviewed-by: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@gopherbot

Change https://golang.org/cl/208277 mentions this issue: cmd/coordinator: plumb commit time and branch from findWork down into scheduler

gopherbot pushed a commit to golang/build that referenced this issue Nov 21, 2019
… scheduler

The branch is not yet used in this CL, but the scheduler has it now
and can use it easily in the future.

Updates golang/go#19178

Change-Id: I6abab826a8668cb091d0face8184f28d08421722
Reviewed-on: https://go-review.googlesource.com/c/build/+/208277
Reviewed-by: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
@dmitshur dmitshur added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Mar 16, 2020
@dmitshur
Contributor

CL 223381 is related. It's a request to start running post-submit builders on new commits on dev.* branches.
