net: TestListenCloseListen failures #65175
Found new dashboard test flakes for:
2024-01-08 18:26 gotip-solaris-amd64 go@75984918 net.TestListenCloseListen (log)
This test is inherently racy: depending on the other tests running on the machine, the machine may have already reused the port for some other listener, perhaps even in some other test process. Perhaps we could keep the port from being reused by leaving an open connection on it, but I wonder if that would also prevent the (CC @ianlancetaylor @neild)
Change https://go.dev/cl/557177 mentions this issue:
Found new dashboard test flakes for:
2024-01-22 16:54 linux-arm64-race go@4ca1caf4 net.TestListenCloseListen (log)
Found new dashboard test flakes for:
2024-01-22 16:30 windows-amd64-race go@41c05ea4 net.TestListenCloseListen (log)
2024-01-22 16:50 windows-amd64-race go@846bb475 net.TestListenCloseListen (log)
2024-01-22 17:00 linux-amd64-race go@558919b4 net.TestListenCloseListen (log)
Change https://go.dev/cl/557536 mentions this issue:
In CL 557177, I attempted to fix a logical race in this test (#65175). However, I introduced a data race in the process (#65209).

The race was reported on the windows-amd64-race builder. When I tried to reproduce it on linux/amd64, I added a time.Sleep in the Accept loop. However, that Sleep causes the test to fail outright with EADDRINUSE, which suggests that my earlier guess about the open Conn preventing reuse of the port was, in fact, incorrect.

On some platforms we could instead use SO_REUSEPORT and avoid closing the first Listener entirely, but that wouldn't be even remotely in the spirit of the original test.

Since I don't see a way to preserve the test in a way that is not inherently flaky / racy, I suggest that we just delete it. It was originally added as a regression test for a bug in the nacl port, which no longer exists anyway. (Some of that code may live on in the wasm port, but it doesn't seem worth maintaining a flaky port-independent test to maintain a regression test for a bug specific to secondary platforms.)

Fixes #65209.
Updates #65175.

Change-Id: I32f9da779d24f2e133571f0971ec460cebe7820a
Cq-Include-Trybots: luci.golang.try:gotip-windows-amd64-race
Reviewed-on: https://go-review.googlesource.com/c/go/+/557536
Run-TryBot: Bryan Mills <bcmills@google.com>
Auto-Submit: Bryan Mills <bcmills@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Jonathan Amsterdam <jba@google.com>
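For readers unfamiliar with how EADDRINUSE surfaces in Go, here is a rough sketch. It is not the code from either CL: it simply listens on a port that is still held by an open listener and checks the resulting error with `errors.Is`. The function name `listenWhileOpen` is hypothetical, and the check assumes a Unix-like platform such as Linux, where the `net` package's error unwraps to `syscall.EADDRINUSE`; other platforms may report a different error value.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"syscall"
)

// listenWhileOpen (hypothetical name) tries to listen on an address that is
// still held by an open listener. It reports whether the second Listen failed
// with EADDRINUSE; err is set only for unexpected failures.
func listenWhileOpen() (inUse bool, err error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return false, err
	}
	defer ln.Close()

	ln2, err := net.Listen("tcp", ln.Addr().String())
	if err == nil {
		// Unexpected on Linux, but possible where port sharing is permitted.
		ln2.Close()
		return false, nil
	}
	// *net.OpError wraps *os.SyscallError, which wraps syscall.Errno,
	// so errors.Is can see through to the underlying errno.
	if errors.Is(err, syscall.EADDRINUSE) {
		return true, nil
	}
	return false, err
}

func main() {
	inUse, err := listenWhileOpen()
	if err != nil {
		fmt.Println("unexpected error:", err)
		return
	}
	fmt.Println("second Listen hit EADDRINUSE:", inUse)
}
```

A test that depends on a port staying free after Close would hit the same error whenever another socket claims the port first, and no check in the test itself can rule that out.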
Also make it flakier in longtest mode by burning through more ephemeral ports. (Burning through the ports raised the failure rate for me locally enough to reliably reproduce the failure in golang#65175 with -count=10.)

Fixes golang#65175 (I hope).

Change-Id: I5f5b68b6bf6a6aa92e66f0288078817041656a3e
Reviewed-on: https://go-review.googlesource.com/c/go/+/557177
Reviewed-by: Damien Neil <dneil@google.com>
Auto-Submit: Bryan Mills <bcmills@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Fixed (by deleting) in https://go.dev/cl/557536.
Issue created automatically to collect these failures.
Example (log):
— watchflakes