Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/build/cmd/relui: add automatic retries #53886

Closed
heschi opened this issue Jul 14, 2022 · 3 comments
Closed

x/build/cmd/relui: add automatic retries #53886

heschi opened this issue Jul 14, 2022 · 3 comments
Assignees
Labels
Builders x/build issues (builders, bots, dashboards) FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@heschi
Copy link
Contributor

heschi commented Jul 14, 2022

We frequently experience flakes in build steps, particularly ones that involve buildlets: test flakes, network connectivity problems, etc. cmd/releasebot had automatic retries which were extremely valuable; relui should too.

Not all tasks are safe to retry: if we fail mid-email or mid-Tweet we don't know if it was sent. (We do make an attempt to deduplicate emails but it is inherently racy.)

Since idempotence and retriability are properties of the task function, I think a reasonable API would be along the lines of (*testing.T).Parallel, where the function calls (*workflow.TaskContext).AllowRetries or such to indicate that it supports retry. (We can decide whether it should be opt-in or opt-out; I lean toward opt-out given the set of tasks we have today.)

cc @golang/release

@heschi heschi added the NeedsFix The path to resolution is known, but the work has not been done. label Jul 14, 2022
@heschi heschi added this to the Unreleased milestone Jul 14, 2022
@gopherbot gopherbot added the Builders x/build issues (builders, bots, dashboards) label Jul 14, 2022
@heschi heschi self-assigned this Jul 14, 2022
@gopherbot
Copy link

Change https://go.dev/cl/417589 mentions this issue: internal/workflow: add automatic retry support

@gopherbot
Copy link

Change https://go.dev/cl/417590 mentions this issue: internal/relui: persist retry-related fields

@gopherbot
Copy link

Change https://go.dev/cl/417591 mentions this issue: internal/workflow: enable automatic retries

gopherbot pushed a commit to golang/build that referenced this issue Jul 19, 2022
Add automatic retries to the workflow package. Most of our workflow
actions are idempotent and at least potentially flaky, and there's no
point making the user hit retry over and over.

Since we do have some non-idempotent tasks, allow the task function to
disable retries by calling (*TaskContext).DisableRetries. This means we
don't know if retries are allowed until after the first one, which makes
for a slightly clumsy implementation but a nice easy API.

Retries are currently off by default while I finish implementing up the
stack, but enabled in all the tests.

For golang/go#53886.

Change-Id: I15c57cf7b23785560238bfd6a45bd7744ba29b32
Reviewed-on: https://go-review.googlesource.com/c/build/+/417589
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Jenny Rakoczy <jenny@golang.org>
Run-TryBot: Heschi Kreinick <heschi@google.com>
gopherbot pushed a commit to golang/build that referenced this issue Jul 19, 2022
Store the retry count in the database and test that it all works.

I removed the definition argument from start; we can read it just as
well from the holder, and it makes it harder to forget to register it.

For golang/go#53886.

Change-Id: I4c3d48a96bae87c31629f76067e75ccd524381b0
Reviewed-on: https://go-review.googlesource.com/c/build/+/417590
Reviewed-by: Jenny Rakoczy <jenny@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Heschi Kreinick <heschi@google.com>
gopherbot pushed a commit to golang/build that referenced this issue Jul 19, 2022
Mark tasks that should not be retried and enable automatic retries by
default.

For golang/go#53886.

Change-Id: I6c2b48cff83477c7c3ed7cf8b753b1646831e233
Reviewed-on: https://go-review.googlesource.com/c/build/+/417591
Auto-Submit: Heschi Kreinick <heschi@google.com>
Run-TryBot: Heschi Kreinick <heschi@google.com>
Reviewed-by: Jenny Rakoczy <jenny@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
@heschi heschi closed this as completed Jul 26, 2022
@heschi heschi added this to Done in Go Release Team Jul 26, 2022
@golang golang locked and limited conversation to collaborators Jul 26, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Builders x/build issues (builders, bots, dashboards) FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Projects
Archived in project
Development

No branches or pull requests

2 participants