x/build/cmd/coordinator: us-central1-f out of resources #35987

Closed
bradfitz opened this issue Dec 5, 2019 · 9 comments
Labels
Builders: x/build issues (builders, bots, dashboards)
FrozenDueToAge
NeedsFix: The path to resolution is known, but the work has not been done.
Soon: This needs to be done soon. (regressions, serious bugs, outages)


@bradfitz
Contributor

bradfitz commented Dec 5, 2019

The coordinator's logs are full of:

2019/12/05 06:23:58 failed to create instance buildlet-freebsd-11-2-rn1118db6 in zone us-central1-f: ZONE_RESOURCE_POOL_EXHAUSTED 
2019/12/05 06:23:58 sched.getbuildlet: finish_create_gce_instance, after 2m34.1s; err=Error creating instance: &{Code:ZONE_RESOURCE_POOL_EXHAUSTED Location: Message:The zone 'projects/symbolic-datum-552/zones/us-central1-f' does not have enough resources available to fulfill the request.  Try a different zone, or try again later. ForceSendFields:[] NullFields:[]}; buildlet-freebsd-11-2-rn1118db6 
2019/12/05 06:23:58 Failed to create VM for host-freebsd-11_2: Error creating instance: &{Code:ZONE_RESOURCE_POOL_EXHAUSTED Location: Message:The zone 'projects/symbolic-datum-552/zones/us-central1-f' does not have enough resources available to fulfill the request.  Try a different zone, or try again later. ForceSendFields:[] NullFields:[]} 

And the scheduler status shows tons of builds waiting for long times:

https://farmer.golang.org/#sched

ZONE_RESOURCE_POOL_EXHAUSTED apparently means that the zone is out of resources, not our quota.

So, yay.

I guess we need to change zones or make it pick a random zone or something, or just ask for a VM in the region.
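
A minimal sketch of the "try a different zone" idea, not the coordinator's actual code. The names `apiError`, `createFunc`, and `createInAnyZone` are hypothetical; `apiError` only mirrors the `Code` and `Message` fields visible in the logged error above. Zones are tried in random order, and only a `ZONE_RESOURCE_POOL_EXHAUSTED` code moves on to the next zone.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// zoneExhausted is the error code quoted in the coordinator logs.
const zoneExhausted = "ZONE_RESOURCE_POOL_EXHAUSTED"

// apiError is a hypothetical stand-in for the error value printed in the
// logs. It carries only the Code and Message fields shown there.
type apiError struct {
	Code    string
	Message string
}

func (e *apiError) Error() string { return e.Code + ": " + e.Message }

// createFunc is a hypothetical stand-in for whatever call creates a VM in
// the given zone.
type createFunc func(zone string) error

// createInAnyZone tries the candidate zones in random order and returns the
// zone in which creation succeeded. It moves on to the next zone only when
// the error code is ZONE_RESOURCE_POOL_EXHAUSTED; any other error (for
// example a quota error) is returned immediately.
func createInAnyZone(zones []string, create createFunc) (string, error) {
	if len(zones) == 0 {
		return "", errors.New("no zones configured")
	}
	var lastErr error
	for _, i := range rand.Perm(len(zones)) {
		zone := zones[i]
		err := create(zone)
		if err == nil {
			return zone, nil
		}
		var apiErr *apiError
		if !errors.As(err, &apiErr) || apiErr.Code != zoneExhausted {
			return "", fmt.Errorf("creating instance in %s: %w", zone, err)
		}
		lastErr = err
	}
	return "", fmt.Errorf("all %d zones exhausted: %w", len(zones), lastErr)
}
```

Stopping on any other error code is a deliberate choice in this sketch: a project-level problem such as an exceeded quota would likely fail the same way in every zone of the region, so retrying elsewhere would only add delay.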

/cc @andybons @toothrot @dmitshur @cagedmantis @bcmills

@bradfitz added the NeedsFix and Soon labels Dec 5, 2019
@gopherbot added this to the Unreleased milestone Dec 5, 2019
@gopherbot added the Builders label Dec 5, 2019
@gopherbot

Change https://golang.org/cl/209968 mentions this issue: cmd/debugnewvm: add --zone flag

@bradfitz
Contributor Author

bradfitz commented Dec 5, 2019

The good news is that legacy networks at least permit us to create VMs in any zone and they can all communicate.

So should be an easy fix.

@bradfitz
Contributor Author

bradfitz commented Dec 5, 2019

So currently we create VMs only in one zone:

https://godoc.org/golang.org/x/build/buildenv#Environment.Zone

But I guess we used to create them in multiple zones since we have this ZonesToClean []string field:

https://godoc.org/golang.org/x/build/buildenv#Environment.ZonesToClean

Maybe just unify those into one field "Zones" and have new instances pick a random zone.

Some helpers like this would need to take a zone string argument, though:

// MachineTypeURI returns the URI for the environment's Machine Type.                                                                                                                                        
func (e Environment) MachineTypeURI() string {
        return e.ComputePrefix() + "/zones/" + e.Zone + "/machineTypes/" + e.MachineType
}
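
The proposal above could look roughly like the following. This is a sketch under assumptions, not the change that was merged: the `Environment` type here is a cut-down, hypothetical version of the one in `buildenv`, the `Zones` field and `RandomZone` method are illustrative names taken from the comment's wording, and `ComputePrefix` is assumed to return the standard Compute Engine v1 project URL.

```go
package main

import "math/rand"

// Environment is a cut-down, hypothetical version of buildenv.Environment
// holding only the fields this sketch needs.
type Environment struct {
	ProjectName string
	MachineType string
	// Zones is the hypothetical unified field replacing Zone and
	// ZonesToClean: every zone in which VMs may be created.
	Zones []string
}

// ComputePrefix returns the URI prefix for the project's Compute Engine
// resources. The exact form is an assumption of this sketch.
func (e Environment) ComputePrefix() string {
	return "https://www.googleapis.com/compute/v1/projects/" + e.ProjectName
}

// RandomZone returns one of the configured zones chosen at random, or the
// empty string if none are configured.
func (e Environment) RandomZone() string {
	if len(e.Zones) == 0 {
		return ""
	}
	return e.Zones[rand.Intn(len(e.Zones))]
}

// MachineTypeURI returns the URI for the environment's machine type in the
// given zone. The zone is now an argument rather than a field lookup.
func (e Environment) MachineTypeURI(zone string) string {
	return e.ComputePrefix() + "/zones/" + zone + "/machineTypes/" + e.MachineType
}
```

A caller would pick the zone once per instance, for example `zone := env.RandomZone()`, and pass that same value to every zone-dependent helper so that all URIs for one VM agree.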

gopherbot pushed a commit to golang/build that referenced this issue Dec 5, 2019
Updates golang/go#35987

Change-Id: I8338da1a317ddfb47ceafd0b22d40a21fcfb2bdd
Reviewed-on: https://go-review.googlesource.com/c/build/+/209968
Reviewed-by: Carlos Amedee <carlos@golang.org>
@toothrot assigned toothrot and cagedmantis and unassigned toothrot Dec 5, 2019
@toothrot
Contributor

toothrot commented Dec 5, 2019

I believe @cagedmantis is looking at this today.

@gopherbot

Change https://golang.org/cl/210237 mentions this issue: cmd/coordinator: support deploying vms in multiple GCE zones

@gopherbot

Change https://golang.org/cl/210498 mentions this issue: internal/buildgo: make basepin disks in all zones

gopherbot pushed a commit to golang/build that referenced this issue Dec 9, 2019
Follow-up to CL 210237.

Updates golang/go#35987

Change-Id: Ib00873123926863ba3d419fb5863adf7aaf4a41e
Reviewed-on: https://go-review.googlesource.com/c/build/+/210498
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alexander Rakoczy <alex@golang.org>
@gopherbot

Change https://golang.org/cl/210541 mentions this issue: cmd/coordinator, all: fix more things related to multi-zone buildlets

gopherbot pushed a commit to golang/build that referenced this issue Dec 9, 2019
This fixes stuff in CL 210498 and CL 210237.

I renamed the Zone field to ControlZone both to make it more clear and
to force compilation errors wherever Zone was used previously, which
revealed some things that were missed.

Updates golang/go#35987

Change-Id: I2f890727ece86d093a90a3b47701caa58de6ccbc
Reviewed-on: https://go-review.googlesource.com/c/build/+/210541
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alexander Rakoczy <alex@golang.org>
@bradfitz
Contributor Author

bradfitz commented Dec 9, 2019

Now we're out of SSD quota in us-central1:

Error: &{Code:QUOTA_EXCEEDED Location: Message:Quota 'SSD_TOTAL_GB' exceeded.  Limit: 8192.0 in region us-central1. ForceSendFields:[] NullFields:[]}

The easy fix is to bump our quota, but really we should delete a ton of our old VM images.
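
As an illustration of the cleanup idea, here is a sketch of selecting disks that are both unattached and older than a cutoff, and totalling the space that deleting them would give back against the 8192 GB limit in the error above. The `Disk` type and `staleDisks` function are hypothetical; a real cleanup would list disks through the Compute Engine API and would need its own rule for what counts as unused.

```go
package main

import "time"

// Disk is a hypothetical, simplified description of a persistent disk.
type Disk struct {
	Name    string
	SizeGB  int64
	Created time.Time
	InUse   bool // true if the disk is attached to at least one instance
}

// staleDisks returns the disks that are not in use and were created before
// cutoff, along with the total number of GB that deleting them would free.
func staleDisks(disks []Disk, cutoff time.Time) (stale []Disk, reclaimGB int64) {
	for _, d := range disks {
		if d.InUse || !d.Created.Before(cutoff) {
			continue
		}
		stale = append(stale, d)
		reclaimGB += d.SizeGB
	}
	return stale, reclaimGB
}
```

Passing the cutoff in, rather than computing it from the current time inside the function, keeps the selection deterministic and easy to check before anything is deleted.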

@bradfitz reopened this Dec 9, 2019
@gopherbot

Change https://golang.org/cl/210542 mentions this issue: internal/buildgo, cmd/updatedisks: clean up old, unused VM disks

@golang locked and limited conversation to collaborators Dec 8, 2020