Skip to content

runtime: heapdump and GC are inconsistent #7820

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
randall77 opened this issue Apr 18, 2014 · 10 comments
Closed

runtime: heapdump and GC are inconsistent #7820

randall77 opened this issue Apr 18, 2014 · 10 comments
Milestone

Comments

@randall77
Copy link
Contributor

Objects which look unreachable in the heap dump are being retained by the garbage
collector.  Unclear whether this is a bug in the heapdumper or a bug in GC, but we
suspect the GC as these objects should not be reachable in the source program.

The patch he mentions is 89360043, which cleans up some false retentions due to
conservative scanning of Gs.

Alan's reproducing instructions:

with your patch, running my script on a dump made after a GC shows no paths from the
rootset to an ssa object. Yet the memory usage reported by runtime.MemStats is large,
indicating that it includes the ssa objects. The heap dump contains all the ssa objects,
even though they don't appear to be reachable from the root, and jhat, which doesn't
check reachability, shows all those objects.  I think this indicates that the GC is
following an edge that is not recorded in the heapdump.  This is really two problems:
(1) the GC and heapdump aren't consistent and (2) the GC is too conservative and the
missing edge is spurious. 

Can I give you my test case for you to investigate?  I'm not really making any progress.

Patch your "/heapdump" handler so that it logs MemStats.Alloc just after GC. 
Build code.google.com/p/go.tools/cmd/godoc and run it like so:
% GOPATH= ./godoc -analysis=type -http :8888
Dump the heap after the message "Extracting type info complete" appears, i.e.
analysis is done.  The memory usage will be around 450MB, although I believe it ought to
be <<50MB.

Build and run the following program:
/home/adonovan/got4/src/github.com/randall77/hprof/heapsearch/main.go
% ./heapsearch dump godoc 

Without your runtime patches to zero out pointers, this will report a path via (I think)
allg[4].schedlink, or some variation.
With your patches, it reports nothing.  Yet running jhat on the same dump (converted to
hprof) shows plenty of ssa objects.
@randall77
Copy link
Contributor Author

Comment 2:

I am unable to reproduce this.  
Note that MemStats.Alloc is logged in the heap dump output, you don't need to log it
separately.  Just use d.Memstats.Alloc if d is a *read.Dump.
I get an MemStats.Alloc of 124MB.  There are only three objects in the dump with "ssa"
in their type name, and they are all ssa.Const.
There are a few objects in the dump which are unreachable.  I need to find out why that
is.  But they are all untyped (probably pointerless or runtime internal) and there are
only a few thousand of them using a total of 1MB.

@randall77
Copy link
Contributor Author

Comment 3:

Ok, I figured out what these unreachable objects are.  They are the MSpan's types arrays
that are freed during sweeping.  Sweeping doesn't happen during runtime.GC() so the heap
dumper does it.  It is harmless and does not indicate an inconsistency between GC and
the dumper.
Maybe an explicit runtime.GC() should sweep as well as mark?  Something to think about.

@minux
Copy link
Member

minux commented Apr 26, 2014

Comment 4:

i agree. i think there are other issue about making explicit runtime.GC() run full mark
and sweep.

@gopherbot
Copy link
Contributor

Comment 5:

CL https://golang.org/cl/89360043 mentions this issue.

@dvyukov
Copy link
Member

dvyukov commented Apr 26, 2014

Comment 6:

> Maybe an explicit runtime.GC() should sweep as well as mark?  Something to think about.
It already does. It triggers GC two times, two GCs necessary include full sweeping in
between.

@randall77
Copy link
Contributor Author

Comment 7:

I see.  Then I don't understand how MSpan.types.data arrays are unreachable but not
collected after I do a runtime.GC().
In heapdump, I do:
    for(i = 0; i < runtime·mheap.nspan; i++) {
        s = runtime·mheap.allspans[i];
        if(s->state == MSpanInUse)
            runtime·MSpan_EnsureSwept(s);
    }
If I start a heap dump right after calling runtime.GC(), this loop triggers lots of
MSpans to get freed, leaving the type data arrays as garbage.  If runtime.GC() does 2
gcs, shouln't all that freeing have happened already?

@randall77
Copy link
Contributor Author

Comment 8:

Or maybe we're just seeing remnants.  The first gc/sweep frees all the regular objects
and makes some type data arrays unreachable.  The second gc collects those type data
arrays.  If those objects were the last live objects in their MSpans, then more type
data arrays become unreachable after a second sweep.

@dvyukov
Copy link
Member

dvyukov commented Apr 26, 2014

Comment 9:

Types are freed during sweeping, so they would require 3 GCs. As you noted, this process
is actually transitive.

@randall77
Copy link
Contributor Author

Comment 10:

This issue was updated by revision 29d1b21.

LGTM=rsc
R=rsc, dvyukov
CC=golang-codereviews
https://golang.org/cl/89360043

@randall77
Copy link
Contributor Author

Comment 11:

This was probably partially fixed with CL 90440043, and the rest being the lingering
types data needing a few GCs to fully collect.

Status changed to Fixed.

@rsc rsc added this to the Go1.3 milestone Apr 14, 2015
@rsc rsc removed the release-go1.3 label Apr 14, 2015
@golang golang locked and limited conversation to collaborators Jun 25, 2016
This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants