-
Notifications
You must be signed in to change notification settings - Fork 17.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runtime/pprof: 'Total samples' less than expected #45983
Comments
There may be an obvious explanation for this that I'm forgetting, but this seems off to me. I can reproduce with a simple program: package main
import (
"runtime/pprof"
"os"
"time"
)
func main() {
f, err := os.Create("/tmp/cpu.pprof")
if err != nil {
panic(err)
}
defer f.Close()
if err := pprof.StartCPUProfile(f); err != nil {
panic(err)
}
defer pprof.StopCPUProfile()
for i := 0; i < 8; i++ {
go func() {
for {}
}()
}
time.Sleep(10*time.Second)
}
Linux One thing that came to mind is that we may not be keeping up and be dropping samples, but it looks like we add fake |
@prattmic The discrepancy seems to pertain to environments, I happen to have two environments, one with discrepancy, it's where the pprof data in the problem description came from, while the other one's pprof data matches what I expect, that is 'total samples' roughly equal to "Duration * 100 * CPU usage". I tried your sample case in the two envs as well.
|
I used to add a counter in 'sighandler' to count '_SIGPROF' and turned out the number always equals to 'total samples' (sample_index=samples) in both environments I mentioned before. A wild guess is somehow some _SIGPROF fails to land. |
@shawn-xdji what do you mean by 'environment'? Different machines? Or different configurations on the same machine?
I did this too, with the same result. :) One possibility is that the kernel is dropping some of the signals. Roughly, the way ITIMER_PROF works is:
I believe SIGPROF is masked when calling |
Looking at the kernel code a bit more, there is also a similar and more likely chance for the signal to be dropped. Note that SIGPROFs are queued on the thread-group-wide signal queue (PIDTYPE_TGID): https://elixir.bootlin.com/linux/v5.12.1/source/kernel/signal.c#L1111
This leaves a small window between queuing the signal and the current thread dequeuing it where any additional timer expirations would drop their SIGPROFs. This should still be a pretty small race, but seems a bit more reasonable than the previous case. |
This is not a Go specific problem. From the Go program above, I get ~2800 SIGPROF signals. #include <pthread.h>
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/time.h>
#include <unistd.h>
_Atomic uint64_t sigprof_count = 0;
void handler(int sig, siginfo_t* info, void* priv) {
sigprof_count++;
}
void* spin(void* priv) {
while (1) {
}
return NULL;
}
int main(int argc, char** argv) {
struct sigaction sa = {
.sa_sigaction = handler,
.sa_flags = SA_SIGINFO,
};
int ret = sigaction(SIGPROF, &sa, NULL);
if (ret < 0) {
perror("sigaction");
return 1;
}
sigset_t set;
sigemptyset(&set);
sigaddset(&set, SIGPROF);
ret = sigprocmask(SIG_UNBLOCK, &set, NULL);
if (ret < 0) {
perror("sigprocmask");
return 1;
}
struct itimerval it = {
.it_interval = {
.tv_usec = 10000,
},
.it_value = {
.tv_usec = 10000,
},
};
ret = setitimer(ITIMER_PROF, &it, NULL);
if (ret < 0) {
perror("setitimer");
return 1;
}
for (int i = 0; i < 8; i++) {
pthread_t id;
ret = pthread_create(&id, NULL, spin, NULL);
if (ret < 0) {
perror("pthread_create");
return 1;
}
}
sleep(10);
printf("sigprof_count = %llu\n", sigprof_count);
return 0;
} |
I believe this is a duplicate of #35057. |
They are of different machines and different OSes (ubuntu vs. debian). |
What version of Go are you using (
go version
)?Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (
go env
)?go env
OutputWhat did you do?
on a machine with 16-core CPU (i7-10700K)
% go test -c runtime
4-core, multiple-round benchmarking delivers stable pprof numbers
% GOMAXPROCS=4 ./runtime.test -test.bench=ChanContended -test.run=none -test.benchtime=20s -test.cpuprofile=/tmp/p
% go tool pprof /tmp/p
File: runtime.test
Type: cpu
Time: May 6, 2021 at 10:48am (CST)
Duration: 28.53s, Total samples = 71.09s (249.20%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof)
8-core, multiple-round benchmarking delivers stable pprof numbers
% GOMAXPROCS=8 ./runtime.test -test.bench=ChanContended -test.run=none -test.benchtime=20s -test.cpuprofile=/tmp/p
% go tool pprof /tmp/p
File: runtime.test
Type: cpu
Time: May 6, 2021 at 10:41am (CST)
Duration: 31.09s, Total samples = 106.05s (341.07%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof)
16-core, multiple-round benchmarking delivers stable pprof numbers
% ./runtime.test -test.bench=ChanContended -test.run=none -test.benchtime=20s -test.cpuprofile=/tmp/p
% go tool pprof /tmp/p
File: runtime.test
Type: cpu
Time: May 6, 2021 at 10:47am (CST)
Duration: 37.10s, Total samples = 92.34s (248.92%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof)
What did you expect to see?
The 'Total samples' is near to GOMAXPROCS * 100% as the benchmark occupies all available cores.
If my assumption (expected 'total samples' near to GOMAXPROCS * 100%) is wrong, I'd like to learn what's the logic behind the gap and the variance.
What did you see instead?
'Total samples' is far less than expected.
The text was updated successfully, but these errors were encountered: