net/http/pprof: occasional failures in TestDeltaProfile #38544

Closed
bcmills opened this issue Apr 20, 2020 · 13 comments
Labels
FrozenDueToAge, NeedsInvestigation (Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.)
Milestone

Comments

@bcmills
Contributor

bcmills commented Apr 20, 2020

TestDeltaProfile, added in CL 147598 for #23401, seems to be flaky on some of the arm builders.

(Perhaps it is relying on some property of scheduler latency that does not hold on the very slowest builders?)

2020-04-18T20:01:46-f5291cf/plan9-arm
2020-04-18T18:07:52-2a20f5c/android-arm-corellium

CC @hyangah @odeke-em @bradfitz

Marking as release-blocker for 1.15 since the test appears to be new for that release.

@bcmills bcmills added the NeedsInvestigation and release-blocker labels Apr 20, 2020
@bcmills bcmills added this to the Go1.15 milestone Apr 20, 2020
@bcmills
Contributor Author

bcmills commented Apr 22, 2020

@bcmills bcmills changed the title from "net/http/pprof: TestDeltaProfile failures on arm builders" to "net/http/pprof: occasional failures in TestDeltaProfile" Apr 22, 2020
@hyangah
Contributor

hyangah commented Apr 22, 2020

Any objection to disabling this test on all platforms other than Linux?

@bcmills
Contributor Author

bcmills commented Apr 22, 2020

Is there a reason to suspect that it is not flaky on Linux?

(In my experience, the unusual builders are often just more timing-sensitive than the mainline builders — tests that are flaky on more than one of those platforms also tend to be flaky on linux, just at a lower rate.)

@hyangah
Contributor

hyangah commented Apr 22, 2020

Or maybe we delete this test completely?
I don't know what to make of testing this on platforms where even time.Sleep(time.Second) ends up sleeping for more than 2 seconds.

@gopherbot

Change https://golang.org/cl/229498 mentions this issue: net/http/pprof: make TestDeltaProfile less flaky by retrying

gopherbot pushed a commit that referenced this issue Apr 22, 2020
In some slow environments, the goroutine for mutexHog2 may not run
within 1 second. So, retry with increasing seconds parameters,
and declare failure only if the test still fails with the longest
duration parameter (32 seconds).

Also, relax the test condition: previously we expected the
profile's duration to be within 0.5 to 2 seconds. But in some
slow environments, even that is not guaranteed. Just check that we
get a non-zero duration in the result.

Update #38544

Change-Id: Ia9b0d51429a2093e6c9eb92cf463ff6952ef3e10
Reviewed-on: https://go-review.googlesource.com/c/go/+/229498
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@hyangah
Contributor

hyangah commented Apr 28, 2020

I don't see any more test failures after the CL. Closing it. Thanks!

@hyangah hyangah closed this as completed Apr 28, 2020
@bcmills
Contributor Author

bcmills commented Jul 27, 2020

@bcmills bcmills reopened this Jul 27, 2020
@andybons andybons modified the milestones: Go1.15, Go1.16 Aug 11, 2020
@odeke-em
Member

odeke-em commented Feb 6, 2021

Hasn’t been flaky for the past 6 months, so punting to Go1.17.

@odeke-em odeke-em modified the milestones: Go1.16, Go1.17 Feb 6, 2021
@ianlancetaylor
Contributor

No new failures, calling this fixed.

@bcmills
Contributor Author

bcmills commented Jun 8, 2021

Apparently still not fixed, although the failure rate does still seem to be lower than it was.
2021-06-04T17:33:24-831f937/openbsd-arm-jsing
2020-09-03T21:50:25-612b119/openbsd-arm-jsing

@bcmills bcmills reopened this Jun 8, 2021
@mknyszek mknyszek modified the milestones: Go1.17, Backlog Aug 18, 2021
@bcmills
Contributor Author

bcmills commented Dec 16, 2021

greplogs --dashboard -md -l -e 'FAIL: TestDeltaProfile ' --since=2021-06-08

2021-12-16T00:34:10-7f23145/openbsd-arm-jsing
2021-11-25T00:02:52-b2a5a37/openbsd-arm-jsing

Looks like this failure mode is specific to openbsd-arm-jsing. I'll file a separate issue.

@bcmills bcmills closed this as completed Dec 16, 2021
8 participants