Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/http: TestServerCancelsReadHeaderTimeoutWhenIdle failing frequently since 2022-09-05 #54891

Closed
bcmills opened this issue Sep 6, 2022 · 10 comments
Assignees
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. release-blocker Soon This needs to be done soon. (regressions, serious bugs, outages) Testing An issue that has been verified to require only test changes, not just a test failure.
Milestone

Comments

@bcmills bcmills added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Soon This needs to be done soon. (regressions, serious bugs, outages) labels Sep 6, 2022
@bcmills bcmills added this to the Go1.20 milestone Sep 6, 2022
@bcmills
Copy link
Contributor Author

bcmills commented Sep 6, 2022

This is the regression test added for #54784 (CC @hawkw).

@aclements
Copy link
Member

Just to add some detail: this test was added 2022-09-05 in CL 426895. So this test has been flaky from the start.

@hawkw
Copy link
Contributor

hawkw commented Sep 7, 2022

Interesting...the test exercises behavior related to timeouts, is it possible it's flaky on CI for that reasons?

@hawkw
Copy link
Contributor

hawkw commented Sep 7, 2022

It looks like in all the failures linked in the issue, the test fails on the first request, before waiting for the idle timeout: https://github.com/hawkw/go/blob/09332743ad6d5a9eb1137adaade2810c583d38ca/src/net/http/serve_test.go#L5878-L5880

Sending this initial request is necessary to trigger the incorrect behavior prior to #54784, but isn't actually testing the change itself. So, this isn't a regression, but an issue with the test itself.

This test passes for me locally in the following environment:

$ uname -a
Linux noctis 5.18.19 #1-NixOS SMP PREEMPT_DYNAMIC Sun Aug 21 13:18:56 UTC 2022 x86_64 GNU/Linux

so I wonder if there's something specific to CI environment that's causing it to be flaky? As a first-time contributor, I'm not super familiar with the environment these tests are run in, so any details that effect tests like this would be very helpful!

@mknyszek
Copy link
Contributor

mknyszek commented Sep 8, 2022

I came upon this recently in a trybot (https://storage.googleapis.com/go-build-log/3a262583/openbsd-amd64-70_38a0ba2b.log) and it blocked an auto-submit.

I too tried to reproduce this locally with stress and was unable to. Any objections to skipping this test on the platforms we've seen the failures on?

@bcmills
Copy link
Contributor Author

bcmills commented Sep 8, 2022

The failures don't seem to be platform-specific.

I suggest that we either revert the CL that introduced the test (which can be resent and merged once the test is fixed), or skip the entire test always using testenv.SkipFlaky until it can be fixed.

@bcmills
Copy link
Contributor Author

bcmills commented Sep 8, 2022

@hawkw, generally any test that has specific timeout behavior will be flaky on CI machines, because they tend to be much more heavily loaded that developer workstations.

I took a look at the test and left some more specific comments on https://go.dev/cl/426895.

@toothrot
Copy link
Contributor

toothrot commented Sep 13, 2022

@gopherbot
Copy link

Change https://go.dev/cl/430955 mentions this issue: net/http: deflake TestServerCancelsReadHeaderTimeoutWhenIdle

@bcmills
Copy link
Contributor Author

bcmills commented Sep 14, 2022

I got tired of restarting TryBots for this, so I mailed https://go.dev/cl/430955 with a fix.

@bcmills bcmills self-assigned this Sep 14, 2022
@bcmills bcmills added Testing An issue that has been verified to require only test changes, not just a test failure. NeedsFix The path to resolution is known, but the work has not been done. labels Sep 14, 2022
@gopherbot gopherbot removed the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Sep 14, 2022
@golang golang locked and limited conversation to collaborators Sep 16, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. release-blocker Soon This needs to be done soon. (regressions, serious bugs, outages) Testing An issue that has been verified to require only test changes, not just a test failure.
Projects
None yet
Development

No branches or pull requests

7 participants