Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime/pprof: TestProfilerStackDepth/heap failures #70112

Closed
gopherbot opened this issue Oct 29, 2024 · 5 comments
Closed

runtime/pprof: TestProfilerStackDepth/heap failures #70112

gopherbot opened this issue Oct 29, 2024 · 5 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.

Comments

@gopherbot
Copy link
Contributor

#!watchflakes
default <- pkg == "runtime/pprof" && test == "TestProfilerStackDepth/heap"

Issue created automatically to collect these failures.

Example (log):

=== RUN   TestProfilerStackDepth/heap
    pprof_test.go:2545: Profile = PeriodType: space bytes
        Period: 1
        Time: 2024-10-29 09:29:44.147981 -0700 PDT
        Samples:
        alloc_objects/count alloc_space/bytes inuse_objects/count inuse_space/bytes
                  0          0          0          0: 1 2 3 4 5 
                  0          0          0          0: 6 5 
                  0          0          0          0: 7 8 5 
                  0          0          0          0: 9 10 8 5 
...
           763: 0x100af4087 M=1 runtime.mstart1 /Volumes/Work/s/w/ir/x/w/goroot/src/runtime/proc.go:1852 s=1818
           764: 0x100af3fb3 M=1 runtime.mstart0 /Volumes/Work/s/w/ir/x/w/goroot/src/runtime/proc.go:1802 s=1776
           765: 0x100b2bdff M=1 runtime.mstart /Volumes/Work/s/w/ir/x/w/goroot/src/runtime/asm_arm64.s:166 s=165
           766: 0x100afd49f M=1 runtime.procresize /Volumes/Work/s/w/ir/x/w/goroot/src/runtime/proc.go:5752 s=5694
           767: 0x100af2067 M=1 runtime.schedinit /Volumes/Work/s/w/ir/x/w/goroot/src/runtime/proc.go:873 s=793
           768: 0x100b2bd8b M=1 runtime.rt0_go /Volumes/Work/s/w/ir/x/w/goroot/src/runtime/asm_arm64.s:106 s=16
        Mappings
        1: 0x100ab8000/0x100c7c000/0x0 /Volumes/Work/s/w/ir/x/t/go-build3551365136/b1697/pprof.test  [FN]
    pprof_test.go:2556: want stack depth = 128, got 121
--- FAIL: TestProfilerStackDepth/heap (0.14s)

watchflakes

@gopherbot gopherbot added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Oct 29, 2024
@gopherbot
Copy link
Contributor Author

Found new dashboard test flakes for:

#!watchflakes
default <- pkg == "runtime/pprof" && test == "TestProfilerStackDepth/heap"
2024-10-29 16:24 gotip-darwin-arm64_13 go@f5526b56 runtime/pprof.TestProfilerStackDepth/heap (log)
=== RUN   TestProfilerStackDepth/heap
    pprof_test.go:2545: Profile = PeriodType: space bytes
        Period: 1
        Time: 2024-10-29 09:29:44.147981 -0700 PDT
        Samples:
        alloc_objects/count alloc_space/bytes inuse_objects/count inuse_space/bytes
                  0          0          0          0: 1 2 3 4 5 
                  0          0          0          0: 6 5 
                  0          0          0          0: 7 8 5 
                  0          0          0          0: 9 10 8 5 
...
           763: 0x100af4087 M=1 runtime.mstart1 /Volumes/Work/s/w/ir/x/w/goroot/src/runtime/proc.go:1852 s=1818
           764: 0x100af3fb3 M=1 runtime.mstart0 /Volumes/Work/s/w/ir/x/w/goroot/src/runtime/proc.go:1802 s=1776
           765: 0x100b2bdff M=1 runtime.mstart /Volumes/Work/s/w/ir/x/w/goroot/src/runtime/asm_arm64.s:166 s=165
           766: 0x100afd49f M=1 runtime.procresize /Volumes/Work/s/w/ir/x/w/goroot/src/runtime/proc.go:5752 s=5694
           767: 0x100af2067 M=1 runtime.schedinit /Volumes/Work/s/w/ir/x/w/goroot/src/runtime/proc.go:873 s=793
           768: 0x100b2bd8b M=1 runtime.rt0_go /Volumes/Work/s/w/ir/x/w/goroot/src/runtime/asm_arm64.s:106 s=16
        Mappings
        1: 0x100ab8000/0x100c7c000/0x0 /Volumes/Work/s/w/ir/x/t/go-build3551365136/b1697/pprof.test  [FN]
    pprof_test.go:2556: want stack depth = 128, got 121
--- FAIL: TestProfilerStackDepth/heap (0.14s)

watchflakes

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Oct 29, 2024
@nsrip-dd
Copy link
Contributor

nsrip-dd commented Nov 1, 2024

I don't think this is darwin/arm64 specific? This is failing because of the elided runtime frames at the end of the stack. If I dump the runtime.MemProfile stacks on failure, which don't trim runtime frames, we sometimes can get a stack like this:

runtime.acquireSudog /Users/nick.ripley/repos/go/src/runtime/proc.go:484
runtime.semacquire1 /Users/nick.ripley/repos/go/src/runtime/sema.go:149
runtime.semacquire /Users/nick.ripley/repos/go/src/runtime/sema.go:129
runtime.gcMarkDone /Users/nick.ripley/repos/go/src/runtime/mgc.go:827
runtime.gcAssistAlloc /Users/nick.ripley/repos/go/src/runtime/mgcmark.go:552
runtime.deductAssistCredit /Users/nick.ripley/repos/go/src/runtime/malloc.go:1679
runtime.mallocgc /Users/nick.ripley/repos/go/src/runtime/malloc.go:1044
runtime.convTslice /Users/nick.ripley/repos/go/src/runtime/iface.go:443
runtime/pprof.allocDeep /Users/nick.ripley/repos/go/src/runtime/pprof/pprof_test.go:2601
runtime/pprof.allocDeep /Users/nick.ripley/repos/go/src/runtime/pprof/pprof_test.go:2598
runtime/pprof.allocDeep /Users/nick.ripley/repos/go/src/runtime/pprof/pprof_test.go:2598
runtime/pprof.allocDeep /Users/nick.ripley/repos/go/src/runtime/pprof/pprof_test.go:2598
... etc ... 

Those runtime.* frames count against the stack depth limit when recording the stack, but get removed from the pprof output. We can probably change the test to check to account for this. Right now it only checks the first stack with allocDeep frames, which might be this one. We could check all the allocDeep stacks, since at least one should correspond to the large allocation the test is really looking for.

@gopherbot
Copy link
Contributor Author

Change https://go.dev/cl/623998 mentions this issue: runtime/pprof: relax TestProfilerStackDepth

@felixge
Copy link
Contributor

felixge commented Nov 2, 2024

Woah, I didn't realize mallocgc was allowed to allocate inside of itself. But upon closer inspection it turns out that's indeed possible until mp.mallocing is set, which makes sense. Thanks for figuring this out and submitting a fix. I reviewed the CL - LGTM 🙇.

Also: I think we've been bitten by the elided runtime frames several times now when it comes to writing reliable tests for profiling. This alone may not be sufficient reason to get rid of them, but I also feel that in situations like these they actually hide important allocation information from the user. At least personally, I'd very much like to know if something is allocating in the runtime. Especially if it explains unexpected allocation sizes being report in pprof. The way this is currently presented to the user is very confusing IMO.

@prattmic
Copy link
Member

prattmic commented Nov 4, 2024

At least personally, I'd very much like to know if something is allocating in the runtime.

Me too. :)

Personally, I think we should consider if we always include these frames in profiles, but perhaps have the pprof tool itself hide them by default.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
Archived in project
Development

No branches or pull requests

4 participants