Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/build/internal/cloud: add an option to limit the requests made to the EC2 API #40950

Closed
cagedmantis opened this issue Aug 21, 2020 · 4 comments
Labels
Builders x/build issues (builders, bots, dashboards) FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@cagedmantis
Copy link
Contributor

There should exist a way to limit the number of requests made to the AWS API. Not limiting the request rate can lead to errors when we have a burst of request. The API request throttling is documented at: https://docs.aws.amazon.com/AWSEC2/latest/APIReference/throttling.html

/cc @andybons @dmitshur @toothrot

@cagedmantis cagedmantis added Builders x/build issues (builders, bots, dashboards) NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Aug 21, 2020
@cagedmantis cagedmantis added this to the Unplanned milestone Aug 21, 2020
@cagedmantis cagedmantis self-assigned this Aug 21, 2020
@dmitshur
Copy link
Contributor

limit the number of requests

I understand this is about limiting the rate of requests, not the total number.

when we have a burst of request

To help me understand this better, what is an example of a known situation when this happens?

@cagedmantis
Copy link
Contributor Author

I understand this is about limiting the rate of requests, not the total number.

This is about limiting the rate at which we make requests. We do not currently throttle requests made against the AWS API. In the link posted above, they have listed explicit request and resource rate limits. When a request fails because of us exceeding one of those limits, we fail the build.

Any throttling that we do on the coordinator side should be paired with some retry logic (with exponential backoffs) when a transient failure is encountered. That work will be included in a follow up issue.

To help me understand this better, what is an example of a known situation when this happens?

We issue a number of requests to a cloud provider in various situations:

  • A new builder is brought online and the coordinator will attempt to create builds for each missing cell for that builder.
  • Any CL submission that causes the all of the x repos to trigger a build.

Another scenario to consider: The reduction of capacity for a provider which would lead to a reduction in quota for an account. This would lower the number of simultaneous builds the coordinator can execute.

@gopherbot
Copy link

Change https://golang.org/cl/267901 mentions this issue: internal/cloud, cmd/coordinator: add a rate limiter for the AWS client

@gopherbot
Copy link

Change https://golang.org/cl/268697 mentions this issue: internal/cloud: do not use canceled context in rate limiter

gopherbot pushed a commit to golang/build that referenced this issue Nov 10, 2020
When errgroup is used in the rate limiter, a canceled context is
passed to the next method in the interceptor. A context is canceled
once the Wait() method returns. This change uses the same context
throughout the method instead of creating a new one just for the
errgroup.

For golang/go#40950

Change-Id: I64462394298a6b849187c288e5f76e92630572f9
Reviewed-on: https://go-review.googlesource.com/c/build/+/268697
Trust: Carlos Amedee <carlos@golang.org>
Run-TryBot: Carlos Amedee <carlos@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Alexander Rakoczy <alex@golang.org>
@golang golang locked and limited conversation to collaborators Nov 10, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Builders x/build issues (builders, bots, dashboards) FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

3 participants