x/build: linux-arm builder can't finish an all.bash run when test sharding isn't used #40872
Comments
How do the trybots succeed? Is it because they shard out the tests to multiple machines?
Can you please include links to CLs where you've seen this? That way we'll have more information (e.g., exactly which commit was being tested). How often is this happening? Did it start recently?
The linux-arm builder is defined at https://github.com/golang/build/blob/148ff27ab5b70970002d390c9e1da4b861f6da9f/dashboard/builders.go#L1736-L1756. These builders run on Scaleway (also see here), so adjusting resources will be limited to what's available there (we might already be maxed out, but we need to look again to be more confident).
I see that linux-arm trybots are currently disabled because of other issues:
tryBot: nil, // Issue 22748, Issue 22749
Is this issue about that builder when requested via SlowBots, or something else?
/cc @cagedmantis @toothrot @andybons per builder owners.
This happens when using gomote to run all.bash manually:
Sorry, I guess I'm using the term "trybot" to mean both the thing that tests CLs as well as manual gomotes. I mean the latter (except in the context of my second comment).
@cagedmantis Do you expect #36841 will be able to help with this (by enabling a linux-arm builder with bigger limits)?
@dmitshur Yes, I'm actively working on the linux-arm-aws builder with more resources. I will assign myself to this issue.
Oh, I believe this is the same issue as #35628. /cc @cherrymui I'll close it in favor of that one, and move your assignment @cagedmantis if you don't mind.
This is not exactly the same. #35628 is about trybots; this issue is about "when test sharding isn't used", e.g. manual gomote runs. The trybot issue has weird STALE errors, whereas this one is OOMing or running out of disk space. About disk space: if I remember correctly, last time I looked, the machine actually has reasonably sizable disk space, but we're running on a very small partition.
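To check the "small partition" theory on a gomote instance, something like the following could be used. It is only a rough sketch: it assumes a Unix-like builder where syscall.Statfs is available, and the paths it inspects (the temp dir and the current directory) are guesses at where a build might write, not the builder's actual work directory.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// diskUsage reports the total and available bytes on the filesystem
// that holds path, as seen by an unprivileged process.
func diskUsage(path string) (total, free uint64, err error) {
	var st syscall.Statfs_t
	if err := syscall.Statfs(path, &st); err != nil {
		return 0, 0, err
	}
	bsize := uint64(st.Bsize)
	return uint64(st.Blocks) * bsize, uint64(st.Bavail) * bsize, nil
}

func main() {
	// Assumption: these are the locations worth checking. Substitute the
	// real work directory if it is known.
	for _, p := range []string{os.TempDir(), "."} {
		total, free, err := diskUsage(p)
		if err != nil {
			fmt.Printf("%s: %v\n", p, err)
			continue
		}
		fmt.Printf("%s: %d MiB free of %d MiB\n", p, free>>20, total>>20)
	}
}
```

If the numbers for the temp dir are far smaller than the machine's overall disk, that would be consistent with the build running on a small partition.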
We can re-open this if it'd be helpful to confirm this issue is fixed when #35628 is fixed, but as I understand it, this builder is broken in all contexts other than as a post-submit builder (on build.golang.org).
Change https://golang.org/cl/249420 mentions this issue: |
The current linux-arm builder is known to have trouble when used as a SlowBot. Start warning about it when the builder is requested via the TRY= SlowBot UI.

I've considered also removing or disabling the "arm" SlowBot alias, but that would make it easier to miss that there's an issue, since SlowBots don't warn about unknown builders: If you specify an unknown TRY= token, it'll just ignore it and won't report an error.

We can consider making further changes as this situation evolves. The goal here is to start notifying about a known problem sooner.

For golang/go#35628.
For golang/go#40872.

Change-Id: Ibc1205720c44ec4823c632c04fc2f887368258c1
Reviewed-on: https://go-review.googlesource.com/c/build/+/249420
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Alexander Rakoczy <alex@golang.org>
Change https://golang.org/cl/270517 mentions this issue: |
The linux-arm-aws builder was initially labeled with a known issue because it was experimental. The builder has been tested and is no longer considered experimental.

Fixes golang/go#41867
Updates golang/go#40872
Updates golang/go#35628

Change-Id: I61f43f2c2651c26d3f5d4db01b779686ddb6a92b
Reviewed-on: https://go-review.googlesource.com/c/build/+/270517
Trust: Carlos Amedee <carlos@golang.org>
Run-TryBot: Carlos Amedee <carlos@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Also ran into a similar |
Instead of dedicating more time to the linux-arm builder, which is hosted on Scaleway, I think it may be best to replace it with the one hosted on AWS. The new builders have additional resources, which should resolve all of these issues. Please comment if you disagree with this plan.
Change https://golang.org/cl/303230 mentions this issue: |
When I run a trybot on linux/arm, I get "out of memory" or "no space left on device" errors.
Is there anything we can do to fix this? Can we get more memory and/or disk space on these builders?
I could work on making cmd/compile/internal/ssa tests take less memory, perhaps. Not sure how much we could save.