x/build/cmd/coordinator: revdial is consuming too much memory, causing GKE evictions #31639

Closed
bradfitz opened this issue Apr 23, 2019 · 3 comments
Labels
Builders x/build issues (builders, bots, dashboards) FrozenDueToAge

@bradfitz
Contributor

The coordinator keeps getting evicted by GKE due to memory usage, mostly caused by the revdial package.

Quick bug to attach future CLs.

/cc @rsc @dmitshur

@gopherbot gopherbot added this to the Unreleased milestone Apr 23, 2019
@gopherbot gopherbot added the Builders x/build issues (builders, bots, dashboards) label Apr 23, 2019
@gopherbot

Change https://golang.org/cl/173517 mentions this issue: cmd/coordinator, buildenv: bound the number of reverse buildlets for now

gopherbot pushed a commit to golang/build that referenced this issue Apr 26, 2019
To mitigate bug in revdial that's retaining way too much memory and
causing coordinator evictions.

Also double memory limit for now.

Updates golang/go#31639

Change-Id: I19b6df92df9905b087c29884140096946037fa38
Reviewed-on: https://go-review.googlesource.com/c/build/+/173517
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
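
As a rough illustration of what bounding the number of reverse buildlets could look like, here is a minimal sketch that uses a buffered channel as a counting semaphore. The names `reverseLimiter`, `newReverseLimiter`, `tryAcquire`, and `release` are hypothetical and are not taken from the coordinator's code; the actual CL may implement the bound quite differently.

```go
package main

import "fmt"

// reverseLimiter caps how many reverse buildlets may be admitted at once.
// Hypothetical type for illustration only; not the coordinator's real code.
type reverseLimiter struct {
	slots chan struct{} // one buffered element per admitted buildlet
}

func newReverseLimiter(max int) *reverseLimiter {
	return &reverseLimiter{slots: make(chan struct{}, max)}
}

// tryAcquire reports whether another reverse buildlet may be admitted.
// It never blocks: when the bound is reached it returns false.
func (l *reverseLimiter) tryAcquire() bool {
	select {
	case l.slots <- struct{}{}:
		return true
	default:
		return false
	}
}

// release frees one slot, for example when a reverse buildlet disconnects.
// Calling it with no slots held is a no-op.
func (l *reverseLimiter) release() {
	select {
	case <-l.slots:
	default:
	}
}

func main() {
	lim := newReverseLimiter(2)
	fmt.Println(lim.tryAcquire(), lim.tryAcquire(), lim.tryAcquire()) // true true false
	lim.release()
	fmt.Println(lim.tryAcquire()) // true
}
```

Rejecting new connections once the bound is hit caps memory growth at the cost of temporarily refusing extra reverse buildlets, which matches the "for now" framing of the mitigation.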
@gopherbot

Change https://golang.org/cl/174082 mentions this issue: revdial/v2: add new simpler, non-multiplexing revdial implementation

gopherbot pushed a commit to golang/build that referenced this issue Apr 29, 2019
The old revdial has a simple multiplexing protocol that was like
HTTP/2 but without flow control, etc. But it was too simple (no flow
control) and too complex. Instead, just use one TCP connection per
reverse dialed connection. For now, the NAT'ed machine needs to go
re-connect for each incoming connection, but in practice that's just
once.

The old implementation is retained for now until all the buildlets are
updated.

Updates golang/go#31639

Change-Id: Id94c98d2949e695b677531b1221a827573543085
Reviewed-on: https://go-review.googlesource.com/c/build/+/174082
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
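
The non-multiplexing design described in the commit message above can be sketched roughly as follows: every dial on the coordinator side results in the NAT'ed machine opening one fresh TCP connection back. This is a simplified sketch under stated assumptions, not the revdial/v2 API. The names `reverseDialer`, `natSide`, and `roundTrip` are hypothetical, and a Go channel stands in for whatever control connection the real implementation uses to ask the NAT'ed side to reconnect.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
)

// reverseDialer hands out one fresh TCP connection per Dial call.
// Hypothetical type; the real revdial/v2 package may look different.
type reverseDialer struct {
	ln            net.Listener  // where the NAT'ed machine connects back
	pleaseConnect chan struct{} // stand-in for the control connection
}

// Dial asks the NAT'ed side to connect back, then accepts that connection.
func (d *reverseDialer) Dial() (net.Conn, error) {
	d.pleaseConnect <- struct{}{}
	return d.ln.Accept()
}

// natSide plays the NAT'ed machine: for each request it opens a new TCP
// connection to the coordinator and serves on it. No multiplexing is needed.
func natSide(addr string, requests <-chan struct{}) {
	for range requests {
		c, err := net.Dial("tcp", addr)
		if err != nil {
			return
		}
		go func() {
			defer c.Close()
			fmt.Fprintln(c, "hello from buildlet")
		}()
	}
}

// roundTrip wires both sides together on loopback and reads one line.
func roundTrip() (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer ln.Close()
	d := &reverseDialer{ln: ln, pleaseConnect: make(chan struct{})}
	go natSide(ln.Addr().String(), d.pleaseConnect)

	c, err := d.Dial()
	if err != nil {
		return "", err
	}
	defer c.Close()
	return bufio.NewReader(c).ReadString('\n')
}

func main() {
	line, err := roundTrip()
	if err != nil {
		panic(err)
	}
	fmt.Print(line)
}
```

Because each logical connection gets its own TCP connection, flow control is left to TCP itself rather than reimplemented in a custom multiplexing layer.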
@gopherbot

Change https://golang.org/cl/174325 mentions this issue: cmd/coordinator: only bound old revdial builds

@golang golang locked and limited conversation to collaborators Apr 28, 2020