Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net: deflake more tests using backoff #24580

Closed
josharian opened this issue Mar 28, 2018 · 9 comments
Closed

net: deflake more tests using backoff #24580

josharian opened this issue Mar 28, 2018 · 9 comments
Labels
FrozenDueToAge help wanted Suggested Issues that may be good for new contributors looking for work to do. Testing An issue that has been verified to require only test changes, not just a test failure.
Milestone

Comments

@josharian
Copy link
Contributor

In https://golang.org/cl/102397, I added some retry with backoff loops to some flaky net tests. It appears to have helped the build dashboard. I mentioned in the commit message that we could use the same approach for other flaky tests that we see. Looking at build.golang.org, it looks like TestLookupCNAME could use the same treatment.

This would be a great starter fix for someone looking to contribute. Or maybe @odeke-em? :)

@josharian josharian added Suggested Issues that may be good for new contributors looking for work to do. Testing An issue that has been verified to require only test changes, not just a test failure. help wanted labels Mar 28, 2018
@josharian josharian added this to the Go1.11 milestone Mar 28, 2018
@josharian
Copy link
Contributor Author

Hmmm. Actually, looking at the dashboard, it appears that when the network gets really flaky, enough of these backoffs trigger that the package net tests time out instead. So it helps a little...but not enough. Sigh. We could increase the timeout for package net, perhaps? Better a slow pass than a fast flaky failure.

@bradfitz any opinions?

@mvdan
Copy link
Member

mvdan commented Mar 29, 2018

If these individual tests take very long due to a flaky network, I wonder if a higher -parallel number would help. That is, making it so more tests that need to retry multiple times can run at once.

@josharian
Copy link
Contributor Author

Good idea. And none of these tests are marked as parallel at all. So two things to do here: Add retries to TestLookupCNAME, and mark all these tests as parallel.

@gopherbot
Copy link

Change https://golang.org/cl/103655 mentions this issue: net: deflake TestLookupCNAME

@mikioh
Copy link
Contributor

mikioh commented Mar 30, 2018

The back off approach is fine. I'd also like to see some isolation between tests using the infrastructure and tests using the local system. FWIW, I have a plan to improve testing on DNS stub resolver after landing CL 102875, perhaps in Go 1.12 development cycle. It will use mock DNS servers for testing on connection setup and RR lookup APIs and will drop unnecessary dependence on the infrastructure.

gopherbot pushed a commit that referenced this issue Mar 30, 2018
Apply the same approach as in CL 102397.

Updates #24580

Change-Id: I65955f62a70807c87216519d03f3643a8f214dee
Reviewed-on: https://go-review.googlesource.com/103655
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
@gopherbot
Copy link

Change https://golang.org/cl/103975 mentions this issue: net: mark tests with retry as parallel

@bradfitz
Copy link
Contributor

bradfitz commented Apr 1, 2018

Reverted. Reopening.

@bradfitz bradfitz reopened this Apr 1, 2018
@josharian
Copy link
Contributor Author

@feliixx thanks for working on this. Are you up for diagnosing and fixing the data race?

@gopherbot
Copy link

Change https://golang.org/cl/104677 mentions this issue: net: mark tests with retry as parallel

@golang golang locked and limited conversation to collaborators Apr 12, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge help wanted Suggested Issues that may be good for new contributors looking for work to do. Testing An issue that has been verified to require only test changes, not just a test failure.
Projects
None yet
Development

No branches or pull requests

5 participants