Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime/pprof: TestGoroutinesCounts is flaky #15156

Closed
bradfitz opened this issue Apr 6, 2016 · 6 comments
Closed

runtime/pprof: TestGoroutinesCounts is flaky #15156

bradfitz opened this issue Apr 6, 2016 · 6 comments
Labels
FrozenDueToAge help wanted Testing An issue that has been verified to require only test changes, not just a test failure.
Milestone

Comments

@bradfitz
Copy link
Contributor

bradfitz commented Apr 6, 2016

https://storage.googleapis.com/go-build-log/4d227eb8/openbsd-amd64-gce58_b43ea4d4.log

--- FAIL: TestGoroutineCounts (0.04s)
    pprof_test.go:607: expected sorted goroutine counts:
        goroutine profile: total 102
        49 @ 0x426a61 0x426b34 0x403f2d 0x403c6b 0x4706e6 0x44f441
        #   0x4706e5    runtime/pprof_test.func3+0x35   /tmp/workdir/go/src/runtime/pprof/pprof_test.go:584
        #   0x0

        40 @ 0x426a61 0x426b34 0x403f2d 0x403c6b 0x470696 0x44f441
        #   0x470695    runtime/pprof_test.func2+0x35   /tmp/workdir/go/src/runtime/pprof/pprof_test.go:583
        #   0x0

        10 @ 0x426a61 0x426b34 0x403f2d 0x403c6b 0x470646 0x44f441
        #   0x470645    runtime/pprof_test.func1+0x35   /tmp/workdir/go/src/runtime/pprof/pprof_test.go:582
        #   0x0

        1 @ 0x403c70 0x403c6b 0x4706e6 0x44f441
        #   0x4706e5    runtime/pprof_test.func3+0x35   /tmp/workdir/go/src/runtime/pprof/pprof_test.go:584
        #   0x0

        1 @ 0x426a61 0x426b34 0x403f2d 0x403c6b 0x4628a0 0x465bbd 0x46263a 0x4636d6 0x462c18 0x401244 0x42660a 0x44f441
        #   0x46289f    testing.(*T).Run+0x21f      /tmp/workdir/go/src/testing/testing.go:579
        #   0x465bbc    testing.RunTests.func1+0x5c /tmp/workdir/go/src/testing/testing.go:724
        #   0x462639    testing.tRunner+0x79        /tmp/workdir/go/src/testing/testing.go:547
        #   0x4636d5    testing.RunTests+0x2b5      /tmp/workdir/go/src/testing/testing.go:730
        #   0x462c17    testing.(*M).Run+0x77       /tmp/workdir/go/src/testing/testing.go:675
        #   0x401243    main.main+0xb3          runtime/pprof/_test/_testmain.go:72
        #   0x426609    runtime.main+0x209      /tmp/workdir/go/src/runtime/proc.go:188
        #   0x0

        1 @ 0x46ac28 0x46aa41 0x4672c9 0x47085c 0x46263a 0x44f441
        #   0x46ac27    runtime/pprof.writeRuntimeProfile+0x97      /tmp/workdir/go/src/runtime/pprof/pprof.go:555
        #   0x46aa40    runtime/pprof.writeGoroutine+0x90       /tmp/workdir/go/src/runtime/pprof/pprof.go:517
        #   0x4672c8    runtime/pprof.(*Profile).WriteTo+0x318      /tmp/workdir/go/src/runtime/pprof/pprof.go:236
        #   0x47085b    runtime/pprof_test.TestGoroutineCounts+0x15b    /tmp/workdir/go/src/runtime/pprof/pprof_test.go:603
        #   0x462639    testing.tRunner+0x79                /tmp/workdir/go/src/testing/testing.go:547
        #   0x0

FAIL
FAIL    runtime/pprof   1.683s
ok      runtime/trace   18.945s
@bradfitz bradfitz added this to the Unplanned milestone Apr 6, 2016
gopherbot pushed a commit that referenced this issue Apr 6, 2016
Flaky tests are a distraction and cover up real problems.

File bugs instead and mark them as flaky.

This moves the net/http flaky test flagging mechanism to internal/testenv.

Updates #15156
Updates #15157
Updates #15158

Change-Id: I0e561cd2a09c0dec369cd4ed93bc5a2b40233dfe
Reviewed-on: https://go-review.googlesource.com/21614
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
@gopherbot
Copy link

CL https://golang.org/cl/42953 mentions this issue.

@chlunde
Copy link
Contributor

chlunde commented May 16, 2017

It looks like it's still flaky

I suspect it might be related to a busy OS scheduler, I can reproduce it within 500 runs when running the new test using tip when I have ~10 other busy processes on a Linux system. Sometimes the slower goroutine seems to be in or just return from runtime.morestack_noctxt.

The flakyness goes away (under the same load) with 40 000 runs if I add

defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(1))

@aclements would you like me to add a CL, or is there a better way to synchronize for this test?

PS! Looks like profile.String() here has an unrelated issue: pprof_test.go:643: expected count profile to contain goroutines with counts [50 40 10 1], got %!v(PANIC=runtime error: invalid memory address or nil pointer dereference)

@bradfitz bradfitz changed the title runtime/pprof: TestGoroutinesCounts is flaky on OpenBSD runtime/pprof: TestGoroutinesCounts is flaky May 16, 2017
@bradfitz bradfitz added help wanted Testing An issue that has been verified to require only test changes, not just a test failure. and removed OS-OpenBSD labels May 16, 2017
@bradfitz bradfitz modified the milestones: Go1.9, Unplanned May 16, 2017
@bradfitz bradfitz reopened this May 16, 2017
@aclements
Copy link
Member

@chlunde, sure, that seems like a fine change to that test. Please send a CL.

PS! Looks like profile.String() here has an unrelated issue

I tried to reproduce this and get a stack using testing/quick, but the only thing it found is that passing a nil *Profile crashes, but that would have crashed the real test earlier. (Unfortunately, testing/quick itself often crashes when it tries to set the value of an unexported field in Profile!)

@gopherbot
Copy link

CL https://golang.org/cl/43590 mentions this issue.

@chlunde
Copy link
Contributor

chlunde commented May 17, 2017

@aclements I see there's an uncommited change for the panic on https://go-review.googlesource.com/c/43152/

@gopherbot
Copy link

CL https://golang.org/cl/43630 mentions this issue.

gopherbot pushed a commit that referenced this issue May 17, 2017
profileBuilder.locForPC returns 0 to mean "no location" because 0 is
an invalid location index. However, the code to build count profiles
doesn't check the result of locForPC, so this 0 location index ends up
in the profile's location list. This, in turn, causes problems later
when we decode the profile because it puts a nil *Location in the
sample's location slice, which can later lead to a nil pointer panic.

Fix this by making printCountProfile correctly discard the result of
locForPC if it returns 0. This makes this call match the other two
calls of locForPC.

Updates #15156.

Change-Id: I4492b3652b513448bc56f4cfece4e37da5e42f94
Reviewed-on: https://go-review.googlesource.com/43630
Reviewed-by: Michael Matloob <matloob@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
@golang golang locked and limited conversation to collaborators May 17, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge help wanted Testing An issue that has been verified to require only test changes, not just a test failure.
Projects
None yet
Development

No branches or pull requests

5 participants