x/build: reverse pool locking problem in the coordinator #10750
In the reverse buildlet healthcheck, this channel receive is blocking for 33+ minutes while holding the mutex:

    // reverseHealthCheck requests the status page of each idle buildlet.
    // If the buildlet fails to respond promptly, it is removed from the pool.
    func (p *reverseBuildletPool) reverseHealthCheck() {
        p.mu.Lock()
        responses := make(map[*reverseBuildlet]chan error)
        for _, b := range p.buildlets {
            if b.inUseAs == "health" { // sanity check
                panic("previous health check still running")
            }
            if b.inUseAs != "" {
                continue // skip busy buildlets
            }
            b.inUseAs = "health"
            res := make(chan error, 1)
            responses[b] = res
            client := b.client
            go func() {
                _, err := client.Status()
                res <- err
            }()
        }
        p.mu.Unlock()
        time.Sleep(5 * time.Second) // give buildlets time to respond
        p.mu.Lock()
        var buildlets []*reverseBuildlet
        for _, b := range p.buildlets {
            res := responses[b]
            if b.inUseAs != "health" || res == nil {
                // buildlet skipped or registered after health check
                buildlets = append(buildlets, b)
                continue
            }
            b.inUseAs = ""
            err, done := <-res // <------------ HERE (that final line)
CL https://golang.org/cl/9851 mentions this issue.
@crawshaw,
farmer.golang.org is hanging. Interesting stack goroutine: