Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net: TestDialParallel spuriously fails due to hard-coded timeout in dialClosedPort #46884

Closed
bcmills opened this issue Jun 23, 2021 · 3 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Testing An issue that has been verified to require only test changes, not just a test failure.
Milestone

Comments

@bcmills
Copy link
Contributor

bcmills commented Jun 23, 2021

We know from past experience that hard-coded timeouts lead to flaky tests. The dialClosedPort helper function has one such hard-coded timeout:

go/src/net/dial_test.go

Lines 159 to 168 in 82517ac

// Estimate the expected time for this platform.
// On Windows, dialing a closed port takes roughly 1 second,
// but other platforms should be instantaneous.
if runtime.GOOS == "windows" {
expected = 1500 * time.Millisecond
} else if runtime.GOOS == "darwin" || runtime.GOOS == "ios" {
expected = 150 * time.Millisecond
} else {
expected = 95 * time.Millisecond
}

The test provides no rationale for why we expect 1500ms to be a firm upper bound on Windows. For that matter, it provides no rationale as to why we expect 95 or 150ms to be firm upper bounds on other platforms.

And, indeed, it isn't a firm upper bound on those other platforms either:

go/src/net/dial_test.go

Lines 177 to 179 in 82517ac

// On OpenBSD, interference from TestTCPSelfConnect is mysteriously
// causing the first attempt to hang for a few seconds, so we throw
// away the first result and keep the second.


So it comes as no surprise that the test consistently fails on my new Surface Book 3:

PS C:\Users\bryan\src\go\src> ..\bin\go version
go version devel go1.17-0ebd5a8de0 Tue Jun 22 16:59:10 2021 +0000 windows/amd64
PS C:\Users\bryan\src\go\src> ..\bin\go env
set GO111MODULE=
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\bryan\AppData\Local\go-build
set GOENV=C:\Users\bryan\AppData\Roaming\go\env
set GOEXE=.exe
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOINSECURE=
set GOMODCACHE=C:\Users\bryan\go\pkg\mod
set GONOPROXY=
set GONOSUMDB=
set GOOS=windows
set GOPATH=C:\Users\bryan\go
set GOPRIVATE=
set GOPROXY=https://proxy.golang.org,direct
set GOROOT=C:\Users\bryan\src\go
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLDIR=C:\Users\bryan\src\go\pkg\tool\windows_amd64
set GOVCS=
set GOVERSION=devel go1.17-0ebd5a8de0 Tue Jun 22 16:59:10 2021 +0000
set GCCGO=gccgo
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=C:\Users\bryan\src\go\src\go.mod
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0 -fdebug-prefix-map=C:\Users\bryan\AppData\Local\Temp\go-build2503220696=/tmp/go-build -gno-record-gcc-switches
PS C:\Users\bryan\src\go\src> ..\bin\go test net
--- FAIL: TestDialParallel (12.92s)
    dial_test.go:188: dialClosedPort: measured delay 2.0038828s
    dial_test.go:203: got 2.0038828s; want <= 1.5s
FAIL
FAIL    net     45.755s
FAIL
PS C:\Users\bryan\src\go\src>

Probably the tests should be reworked not to rely on hard-coded timeouts at all.

@bcmills bcmills added Testing An issue that has been verified to require only test changes, not just a test failure. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Jun 23, 2021
@bcmills bcmills added this to the Backlog milestone Jun 23, 2021
@bcmills
Copy link
Contributor Author

bcmills commented Jun 23, 2021

See previously #35616, #22062. (CC @ianlancetaylor @neild)

@bcmills
Copy link
Contributor Author

bcmills commented Jun 23, 2021

The hard-coded timeout was originally added sometime before patchset 8 in CL 8768 (CC @pmarks-net @mikioh @bradfitz). I don't see any comments in the code or that code review explaining why aggressive timeouts are necessary or helpful to the test.

@gopherbot
Copy link

Change https://golang.org/cl/330250 mentions this issue: net: remove hard-coded timeout in dialClosedPort test helper

@golang golang locked and limited conversation to collaborators Jun 24, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Testing An issue that has been verified to require only test changes, not just a test failure.
Projects
None yet
Development

No branches or pull requests

2 participants