proposal: runtime: add HeapMarked to MemStats #51966

doujiang24 · 2022-03-26T14:04:47Z

Background:
We want to find the cause of memory spikes.

I found that this may be a workable approach:

1. Install a GC finalizer.
2. In the finalizer, check HeapMarked after every GC termination.

For example:

1. HeapMarked usually stays at around 100MB.
2. If HeapMarked grows quickly and reaches 150MB after one GC termination, dump a heap profile, e.g. a.profile.
3. Dump another heap profile after the next GC termination, e.g. b.profile.

We can then find out why the heap grew by 50MB with this command:

go tool pprof b.profile -base a.profile

This can work because a heap profile may be up to two garbage collection cycles old:

go/src/runtime/mprof.go

Line 560 in 8c73f80

// The returned profile may be up to two garbage collection cycles old.

We can roughly estimate HeapMarked as NextGC / (1+GCPercent/100).
I think it would be useful to add HeapMarked to MemStats, so that we can get the exact value.
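The rough estimate can be sketched as follows. The helper name `estimateHeapMarked` is hypothetical, and the GC percent is passed in by the caller on the assumption that it matches the GOGC setting in effect. This is only the approximation from the post, not a value the runtime reports.

```go
package main

import (
	"fmt"
	"runtime"
)

// estimateHeapMarked applies the rough formula from the post:
// NextGC / (1 + GCPercent/100).
// A negative gcPercent means the GC is disabled, so no estimate is returned.
func estimateHeapMarked(nextGC uint64, gcPercent int) (uint64, bool) {
	if gcPercent < 0 {
		return 0, false
	}
	return uint64(float64(nextGC) / (1 + float64(gcPercent)/100)), true
}

func main() {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)

	// Assumption: GOGC is at its default of 100.
	if est, ok := estimateHeapMarked(ms.NextGC, 100); ok {
		fmt.Printf("NextGC=%d bytes, estimated HeapMarked=%d bytes\n", ms.NextGC, est)
	}
}
```

For instance, with a GC percent of 100 and a NextGC of 200 MiB, the estimate is 100 MiB.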

ianlancetaylor · 2022-03-26T23:33:07Z

CC @mknyszek @golang/runtime

mknyszek · 2022-03-27T16:49:10Z

A few thoughts:

1. We will not be adding more fields to MemStats. For all future metrics, runtime/metrics is the new package they'll be added to. It's more efficient, and gives us the option to properly deprecate and drop metrics in the future (certainly not often though, so we still want to be careful; it's not a dumping ground).
2. With runtime/debug: soft memory limit #48409, which I still plan to land this cycle, the amount of memory marked live is a lot less meaningful if there's a soft memory limit in effect. As an example, NextGC / (1+GCPercent/100) will definitely be inaccurate (which means HeapMarked * (1+GCPercent/100) will also be inaccurate) once that lands.
3. Finalizers may be run at an arbitrary time after their associated objects die, so there definitely isn't a 1:1 correspondence between finalizer execution and GC cycles.
4. Live heap spikes can happen for a lot of reasons that may be otherwise benign or unavoidable (and a heap profile won't reveal or point out anything in particular). For example, if the application suddenly begins allocating more rapidly, the GC mark phase will begin earlier to give the GC more runway and time to catch up without assists. The result is more memory being allocated in an already-marked state, ultimately leading to a higher live heap for the next cycle.

More fundamentally, if (3) isn't a problem (and it seems like it would be OK in your use-case to be a little loose), why isn't HeapInuse enough? While HeapInuse doesn't reveal the sawtooth curve of the GC cycle, it does still scale with... well, how much of the heap is in use. :) I think that's a reasonable expectation to have going forward.

I'm not fundamentally opposed to something like HeapMarked being added to runtime/metrics, but I think it needs some thought as to why this isn't already exposed and what the implications are with respect to GC implementations. As a general rule, the more internal state of the GC you reveal, the more users come to rely on that state, making it harder and harder to change the implementation.
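For readers who want to try the runtime/metrics route, here is a minimal sketch of reading GC-related metrics by name. The helper `readUint64Metric` is hypothetical. `/gc/heap/live:bytes` is the name from the follow-up issue #56857 linked from this thread, and whether it is available depends on the Go release, so the code checks the sample kind instead of assuming the metric exists.

```go
package main

import (
	"fmt"
	"runtime/metrics"
)

// readUint64Metric reads one uint64-valued runtime metric by name. It reports
// false when the running Go version does not support the metric (the sample
// comes back as KindBad) or when the metric is not a uint64.
func readUint64Metric(name string) (uint64, bool) {
	sample := []metrics.Sample{{Name: name}}
	metrics.Read(sample)
	if sample[0].Value.Kind() != metrics.KindUint64 {
		return 0, false
	}
	return sample[0].Value.Uint64(), true
}

func main() {
	names := []string{
		"/gc/heap/live:bytes", // may be unsupported on older Go releases
		"/gc/heap/goal:bytes",
	}
	for _, name := range names {
		if v, ok := readUint64Metric(name); ok {
			fmt.Printf("%s = %d\n", name, v)
		} else {
			fmt.Printf("%s is not supported by this Go version\n", name)
		}
	}
}
```

Checking the kind before calling `Uint64` matters because calling it on a sample of another kind panics.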

doujiang24 · 2022-03-28T02:53:40Z

Thanks.

For (1), got it, thanks.
For (2), #48409 sounds like a really good and useful proposal.
For (3), yes, that is acceptable for us. It is fine as long as the finalizer is invoked before the next GC cycle, which usually leaves plenty of time.
However, HeapInuse is not good enough, since it is always growing, and we cannot assume that the delay from GC termination to finalizer invocation is very small.

For (4), we just want to track down the spikes that are unexpected. We have run into this case before:
Some bad code paths allocate lots of GC objects and hold them for a while, ultimately leading to a large GC goal.
Sometimes that GC goal is large enough to cause an OOM, which happens before the next GC cycle.
We want to pinpoint the bad code paths accurately by using a GC finalizer plus heap profiles.
(We used to do this with heap profiles plus a time.Ticker, but that could not capture the bad code path accurately.)
By the way, #48409 also helps in this case.

> As a general rule, the more internal state of the GC you reveal, the more users come to rely on that state, making it harder and harder to change the implementation.

Yeah, totally understand.

rsc · 2022-09-07T17:47:46Z

Adding to minutes but it seems like a likely decline for next week.

rsc · 2022-09-07T19:02:04Z

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

rsc · 2022-09-21T18:02:43Z

Based on the discussion above, this proposal seems like a likely decline.
— rsc for the proposal review group

rsc · 2022-09-28T20:04:40Z

No change in consensus, so declined.
— rsc for the proposal review group

doujiang24 added the Proposal label Mar 26, 2022

gopherbot added this to the Proposal milestone Mar 26, 2022

rsc moved this to Incoming in Proposals Aug 10, 2022

rsc added this to Proposals Aug 10, 2022

rsc moved this from Incoming to Active in Proposals Sep 7, 2022

julieqiu added this to Go Security Sep 8, 2022

julieqiu removed this from Go Security Sep 8, 2022

rsc moved this from Active to Likely Decline in Proposals Sep 21, 2022

rsc added the Proposal-FinalCommentPeriod label Sep 21, 2022

rsc removed the Proposal-FinalCommentPeriod label Sep 28, 2022

rsc moved this from Likely Decline to Declined in Proposals Sep 28, 2022

rsc closed this as completed Sep 28, 2022

mknyszek mentioned this issue Nov 21, 2022

runtime/metrics: add /gc/heap/live:bytes #56857

Closed

golang locked and limited conversation to collaborators Sep 28, 2023

gopherbot added the FrozenDueToAge label Sep 28, 2023

rsc removed this from Proposals Oct 3, 2023
