runtime/metrics: inconsistency can be observed in /cpu/classes metrics #66212
Labels
compiler/runtime
Go version
go version 1.23-dev-c8c46e746b778c39727c588adf79aff34ab6f151
Output of `go env` in your module/workspace:
What did you do?
Run the following program over and over until it fails. It compares the result of fetching `/cpu/classes/gc/total:cpu-seconds` with the result of calculating it from its parts, and also prints how many ULPs apart the numbers are ([good article on this](https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/)). The program makes a little effort to entice a sheared update (successfully on my workstation). This is perhaps the reason the current tests need a 2% fudge factor to pass. Once fixed, I recommend adjusting the tests to be much stricter, allowing only a couple of ULPs of difference.
What did you see happen?
What did you expect to see?
I expected the diff to be much smaller, on the order of `4*ULP(total)`. But there's a very large difference (in float space) between these values. After investigating together with @mknyszek, we found the issue: most of the values that serve as the source for these metrics are updated in `accumulate` (go/src/runtime/mstats.go, lines 953 to 976 in 61d6817) under the same (STW) lock. But `gcPauseTime` and `gcTotalTime` are updated outside of that lock, somewhere else (go/src/runtime/mgc.go, lines 751 to 752 in 61d6817).
It is possible that a metric read happens in between these calls, and sheared results are observed. I've verified that this is what happens by manually changing `sweepTermCpu` to a large value and re-running the test. Exactly this change then appears in the diff.