x/build: need to be able to rerun TryBots on other people's CLs (and my own) #38620

Open
dr2chase opened this issue Apr 23, 2020 · 9 comments
Labels
Builders x/build issues (builders, bots, dashboards) NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.

@dr2chase
Contributor

And, also, the UI is misleading, see below.

What did you do?

In Gerrit, I first run TryBots on a CL (mine or someone else's) with "Reply", and then I click the "+1" button next to "Run-TryBot". This runs TryBots.

The TryBots run, but with errors that look probably unrelated to the CL.
Since I ran TryBots by clicking "+1" in the reply pane, "obviously" the way to do this is to reset the value to zero (the "0" is not dimmed, therefore this is permitted, right?) and then click "+1" again (worked once, why not again?).

What did you expect to see?

A second run of the TryBots.

What did you see instead?

NOT a second run of the TryBots.

I understand that what's actually missing is that (1) we've decided that people need special permission for this operation (2) I lack this permission (3) I should know this by noticing the absence of a button that I never knew existed.

I think we ought to fix the policy unless there's a really good reason not to let people do this, and if we can't fix the policy, we need to fix the UI to avoid people intuiting an unproductive way to spend their time not rerunning TryBots. What I actually want is "Rerun failed TryBots" and ideally someone is monitoring that to infer flakiness over time.

My workaround for this is to attempt to do the rerun by hand using a gomote, which is its own fiasco of stale documentation and read-all-the-docs-maybe-you'll-find-a-hint (oops, the hint has gone stale also).

/cc @golang/osp-team

@gopherbot gopherbot added this to the Unreleased milestone Apr 23, 2020
@gopherbot gopherbot added the Builders x/build issues (builders, bots, dashboards) label Apr 23, 2020
@bcmills
Contributor

bcmills commented Apr 23, 2020

Do you have the trashcan symbol on the TryBot-Result line? That's the thing you need to click to get a rerun for the same patch set.

@dmitshur
Contributor

Thank you for reporting this @dr2chase and providing detail. I agree this is a usability problem. If someone has permissions to start trybots, it seems reasonable that they should also have the permission to restart them.

We should investigate what is a good way to resolve this issue.

Issue #38235 is related, and should also be resolved to improve the usability of trybot restarts.

@dmitshur dmitshur added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Apr 23, 2020
@dr2chase
Contributor Author

I do have that trash can icon; not sure if I didn't notice it or if it appeared recently.
I think I'd still argue this is a usability problem, since obviously I missed it.

@dmitshur
Contributor

dmitshur commented Apr 23, 2020

Great! You should be able to start using it to delete the TryBot-Result vote and cause the trybot to restart.

@dr2chase To confirm, do you have that delete button visible both on a CL you are the author of, and on another CL where you are not the author?

To make progress on this issue, I think the next steps here are:

  1. Confirm that everyone who is in the may-start-trybots group (documented at https://golang.org/wiki/GerritAccess#trybot-access-may-start-trybots) has access to delete the TryBot-Result vote on all CLs, and if not, make it so.

  2. Document the existing process for restarting a TryBot run (i.e., it requires pressing the trash can icon and deleting the TryBot-Result vote), if it's not already documented somewhere. That way, we can at least point people to it when they run into this issue.

  3. Consider what more, if anything, needs to be done to improve the usability.

@polinasok

I just ran into the same issue trying to figure out how to rerun failed trybots. I poked around. I did see the trash icon in a hover, but it did not occur to me that it would help. I expected to see something like a looping arrow. After reading this issue I removed TryBot-Result-1. It doesn't seem to work. https://go-review.googlesource.com/c/vscode-go/+/394136

@ianlancetaylor
Contributor

My understanding is that there is an ordering issue. You need to clear Trybot+1, then clear Trybot-Result-1, then set Trybot+1.

I tried setting Trybot+1 on CL 394136, and it looks like somebody uploaded a new patchset anyhow. Perhaps it will work now. We'll see.

@polinasok

Setting Trybot+1 did not help me. I uploaded a new patchset as a way to rerun things.

@ianlancetaylor
Contributor

OK, sorry for the noise.

@dr2chase
Contributor Author

dr2chase commented Apr 6, 2022

Here's the recipe I use, which has worked for me in the last month:

(1) you have to wait several minutes from the last TryBot-Result vote or else the coordinator gets confused
(2) you have to, in order, remove any previous Run-TryBot+1 votes and then TryBot-Result (click the trash can)
(3) you have to add the TRY= comment in the same reply as the new Run-TryBot+1 vote.

For a slowbot, include e.g.
TRY=linux-loong64,linux-amd64
to request a builder (but linux-amd64 always runs).
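
For anyone scripting the text of that reply, here is a minimal sketch in Go of building the TRY= line from a list of builder names. The helper name tryComment is hypothetical and is not part of x/build or Gerrit; it only formats the comment text in the shape shown above, and it does not cast the Run-TryBot+1 vote or reflect how the coordinator actually parses the comment.

```go
package main

import (
	"fmt"
	"strings"
)

// tryComment is a hypothetical helper (not part of x/build) that formats
// a "TRY=builder1,builder2" line from the given builder names.
// Blank names are dropped; with no usable names it returns "".
func tryComment(builders ...string) string {
	var names []string
	for _, b := range builders {
		if b = strings.TrimSpace(b); b != "" {
			names = append(names, b)
		}
	}
	if len(names) == 0 {
		return ""
	}
	return "TRY=" + strings.Join(names, ",")
}

func main() {
	// Builder names taken from the example in this comment.
	fmt.Println(tryComment("linux-loong64", "linux-amd64"))
}
```

The resulting line still has to go into the same Gerrit reply as the new Run-TryBot+1 vote, per step (3).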
