runtime/pprof: memory profiler over-accounts allocations of size >= sampling rate #26618
Comments
Cool, so perhaps the fix is just to remove the rate check? Last time I touched the Go heapz sampling I wrote a test case to validate the accuracy of the unsampling, but sadly it does so only up to 64k :/
I hoped to find the rationale behind this always-sample-large-objects condition but failed to find an explanation. The condition has been there since the malloc.cgo era (cc @rsc). Meanwhile @rauls5382 implemented a more robust sampling implementation based on a Poisson process, but this part of the code remained untouched, which I think is a bug. Internally, the pprof team at Google verified that the C++ code, which uses a similar sampling mechanism, doesn't have such a rate check. I tried removing the rate check and got a more reasonable pprof result.
Yes, I think that the always-sample-large-objects condition is a bug and should be removed. Feel free to assign this to me if you want me to prepare a commit for it.
Change https://golang.org/cl/129117 mentions this issue:
Remove an unnecessary check in the heap sampling code that forced sampling of all heap allocations larger than the sampling rate. These allocations need to follow a Poisson process so that they can be correctly unsampled. Maintain a check for MemProfileRate==1 to provide a mechanism for full sampling, as documented in https://golang.org/pkg/runtime/#pkg-variables. Additional testing for this change is on cl/129117. Fixes golang#26618
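To make the change described above concrete, here is a simplified model of the sampling decision before and after the fix. It is only a sketch based on the description in this thread, not the actual runtime code: the function names and the `nextSample` countdown parameter are hypothetical, and `rate` is assumed to be positive (profiling enabled).

```go
package main

import "fmt"

// shouldSampleOld models the pre-fix decision (hypothetical helper):
// any allocation at least as large as the sampling rate is always sampled,
// regardless of the Poisson countdown in nextSample.
func shouldSampleOld(size, nextSample, rate int64) bool {
	return size >= rate || size >= nextSample
}

// shouldSampleNew models the post-fix decision (hypothetical helper):
// only the Poisson countdown decides, except that rate == 1 still
// forces every allocation to be sampled.
func shouldSampleNew(size, nextSample, rate int64) bool {
	return rate == 1 || size >= nextSample
}

func main() {
	const rate = 512 * 1024 // sampling rate used in this issue, in bytes

	// A 600 KiB allocation while the countdown still has 1 MiB to go.
	size, nextSample := int64(600*1024), int64(1024*1024)
	fmt.Println("old:", shouldSampleOld(size, nextSample, rate)) // forced by the rate check
	fmt.Println("new:", shouldSampleNew(size, nextSample, rate)) // left to the countdown

	// With rate == 1 every allocation is sampled.
	fmt.Println("rate 1:", shouldSampleNew(16, nextSample, 1))
}
```

In this model the two versions differ only for allocations with `size >= rate` that arrive before the countdown expires, which is exactly the case the unsampling step cannot account for.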
Change https://golang.org/cl/158337 mentions this issue:
The test program and its output are given below. In the test that allocates the data in 500 KiB chunks, 95.367 GiB were allocated in total and pprof reports a matching "95.77GB". But in the test that uses 512 KiB chunks (equal to the default memory profiler sampling rate), 97.656 GiB were allocated while pprof reports "154.49GB", which is 58% greater.
The issue is likely caused by the fact that malloc.go samples allocations >= the sampling rate with probability 1, while the unsampling in pprof/protomem.go is applied unconditionally. For an allocation whose size equals the sampling rate, the unsampling factor is 1 / (1 - 1/e) = 1 / (1 - 1/2.71828) ≈ 1.58, which matches the observed overcount.
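The arithmetic can be checked with a short sketch. It assumes the unsampling factor has the form 1 / (1 - exp(-size/rate)), which is consistent with the 1 / (1 - 1/e) figure above, and models the pre-fix sampling probability as 1 for sizes >= rate. The function names are illustrative and are not taken from the runtime or pprof sources.

```go
package main

import (
	"fmt"
	"math"
)

// unsampleScale returns the factor applied to a sampled allocation of the
// given size when unsampling, assuming Poisson sampling with a mean
// interval of rate bytes: 1 / (1 - exp(-size/rate)).
func unsampleScale(size, rate float64) float64 {
	return 1 / (1 - math.Exp(-size/rate))
}

// sampleProbBeforeFix models the sampling probability described in this
// issue: allocations with size >= rate are always sampled, smaller ones
// follow the Poisson process.
func sampleProbBeforeFix(size, rate float64) float64 {
	if size >= rate {
		return 1
	}
	return 1 - math.Exp(-size/rate)
}

// reportedGiB estimates the expected total that the profile reports for
// actualGiB of allocations made in chunks of the given size.
func reportedGiB(actualGiB, size, rate float64) float64 {
	return actualGiB * sampleProbBeforeFix(size, rate) * unsampleScale(size, rate)
}

func main() {
	const rate = 512 * 1024 // sampling rate used in this issue, in bytes

	fmt.Printf("500 KiB chunks: %.2f GiB expected\n", reportedGiB(95.367, 500*1024, rate))
	fmt.Printf("512 KiB chunks: %.2f GiB expected\n", reportedGiB(97.656, 512*1024, rate))
}
```

Under these assumptions the 500 KiB case comes out at about 95.37 GiB, since the sampling probability and the scale factor cancel, while the 512 KiB case comes out at about 154.49 GiB, in line with the number pprof reported.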
@rauls5382 @hyangah