x/build: trybots not running due to maintner resource temporarily unavailable errors #23705
Comments
Can't even bash in to debug:
Process limit? FD leak?
Killed container and let it get restarted. Then ran bash in it and:
This may have happened again, as https://golang.org/cl/92155 has had no trybot initiation after ten minutes.
The trybots did start running on CL 92155 after 32 minutes.
Well, @andybons updated our GKE cluster from 1.5.6 to 1.8 (whatever the latest GKE option is), and the errors persist, but they've changed. Now I see:
Not sure what |
Looks like something's leaking. Bunch of defunct processes:
I have no idea where these defunct processes are coming from. I audited all the os/exec callers in golang.org/x/build/maintner/... and don't see anything suspicious. @ianlancetaylor, any debugging tips?
Sorry, no good ideas. Just the obvious one of double-checking that the parent process is what you think it is.
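One way to double-check the parent of each defunct process is to read /proc directly rather than rely on ps. The following is only a minimal sketch, assuming a Linux /proc filesystem inside the container; `procInfo`, `parseStat` and `listZombies` are hypothetical names for illustration and are not part of golang.org/x/build.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// procInfo holds the few /proc/<pid>/stat fields needed here.
type procInfo struct {
	PID   int
	Comm  string
	State string // "Z" means zombie (defunct)
	PPID  int
}

// parseStat parses one /proc/<pid>/stat line: "pid (comm) state ppid ...".
// comm may itself contain spaces or parentheses, so the fields after it
// are located relative to the last ')'.
func parseStat(line string) (procInfo, error) {
	start := strings.IndexByte(line, '(')
	end := strings.LastIndexByte(line, ')')
	if start < 0 || end < start {
		return procInfo{}, fmt.Errorf("malformed stat line: %q", line)
	}
	pid, err := strconv.Atoi(strings.TrimSpace(line[:start]))
	if err != nil {
		return procInfo{}, err
	}
	rest := strings.Fields(line[end+1:])
	if len(rest) < 2 {
		return procInfo{}, fmt.Errorf("truncated stat line: %q", line)
	}
	ppid, err := strconv.Atoi(rest[1])
	if err != nil {
		return procInfo{}, err
	}
	return procInfo{PID: pid, Comm: line[start+1 : end], State: rest[0], PPID: ppid}, nil
}

// listZombies returns every process currently in state Z.
func listZombies() ([]procInfo, error) {
	paths, err := filepath.Glob("/proc/[0-9]*/stat")
	if err != nil {
		return nil, err
	}
	var zombies []procInfo
	for _, p := range paths {
		data, err := os.ReadFile(p)
		if err != nil {
			continue // the process went away between Glob and ReadFile
		}
		info, err := parseStat(string(data))
		if err != nil {
			continue
		}
		if info.State == "Z" {
			zombies = append(zombies, info)
		}
	}
	return zombies, nil
}

func main() {
	zombies, err := listZombies()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, z := range zombies {
		fmt.Printf("pid=%d ppid=%d comm=%s\n", z.PID, z.PPID, z.Comm)
	}
}
```

If the reported ppid is 1 rather than the PID of the binary that called os/exec, the zombies were orphaned and re-parented to the container's PID 1, so the missing wait is in whatever runs as PID 1 and not necessarily in the audited callers.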
Updates from research & chats with others. This is a GKE issue. Apparently GKE 1.7 automatically reaped zombie processes like an init would, using a "pause container", but that was reverted in GKE 1.8 and hasn't been fixed in GKE 1.8 yet. Some links: https://news.ycombinator.com/item?id=14524253
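For readers unfamiliar with what "reaping like an init would" means: a zombie stays in the process table until its parent waits on it. The sketch below illustrates the mechanism only and is not the fix that was deployed for this issue. It assumes Linux and a `true` binary on PATH; `reapOnce` and `demo` are hypothetical names.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"syscall"
	"time"
)

// reapOnce collects every child that has already exited, without blocking,
// and returns how many were reaped.
func reapOnce() int {
	reaped := 0
	for {
		var status syscall.WaitStatus
		pid, err := syscall.Wait4(-1, &status, syscall.WNOHANG, nil)
		if err == syscall.EINTR {
			continue
		}
		if err != nil || pid <= 0 {
			// ECHILD: no children left; pid == 0: children exist but none has exited.
			return reaped
		}
		reaped++
	}
}

// demo starts a child and deliberately never calls cmd.Wait, so the child
// lingers as a zombie after it exits until reapOnce collects it.
func demo() (int, error) {
	cmd := exec.Command("true") // assumption: a `true` binary is on PATH
	if err := cmd.Start(); err != nil {
		return 0, err
	}
	total := 0
	deadline := time.Now().Add(2 * time.Second)
	for total == 0 && time.Now().Before(deadline) {
		time.Sleep(10 * time.Millisecond)
		total += reapOnce()
	}
	return total, nil
}

func main() {
	n, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, "start failed:", err)
		os.Exit(1)
	}
	fmt.Printf("reaped %d abandoned child process(es)\n", n)
}
```

A real init would typically run this loop whenever SIGCHLD arrives instead of polling. Note that Wait4 with pid -1 collects any child, so mixing it with cmd.Wait in the same process can make cmd.Wait fail because the exit status was already taken; that is one reason this job is usually left to a dedicated PID 1.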
Change https://golang.org/cl/93077 mentions this issue:
Updates golang/go#23705 Change-Id: If0e7826ab75aae46dc7b79306d36d6cd3f07a041 Reviewed-on: https://go-review.googlesource.com/93077 Reviewed-by: Andrew Bonventre <andybons@golang.org>
Deployed:
Change https://golang.org/cl/93082 mentions this issue:
Updates golang/go#23705 Change-Id: Ia66e3c16bda3357daf4796f3eafb50b2eb019223 Reviewed-on: https://go-review.googlesource.com/93082 Reviewed-by: Andrew Bonventre <andybons@golang.org>
gitmirror is next to need love:
Change https://golang.org/cl/93755 mentions this issue:
Updates golang/go#23705 Change-Id: I1fa42b2767d1a9780672458fd06d64ed998b9e04 Reviewed-on: https://go-review.googlesource.com/93755 Reviewed-by: Andrew Bonventre <andybons@golang.org>
Change https://golang.org/cl/94075 mentions this issue:
Updates golang/go#23705 Change-Id: I9c483efa491e2d9f705850d81ea94feda9f9d5a4 Reviewed-on: https://go-review.googlesource.com/94075 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
I believe this particular issue is resolved, so closing. Please re-open if not, or if there's something actionable to do here.
Trybots aren't running because maintner isn't noticing that there's new stuff on Gerrit, due to:
This smells just like #23686 in a different binary.
Both of these were just updated for the LetsEncrypt ACME changes.
So what changed in the meantime?
What resource isn't available?
/cc @andybons @ianlancetaylor