runtime: garbage collection ineffective on 32-bit #909

Closed
peterGo opened this issue Jul 8, 2010 · 83 comments

@peterGo
Contributor

peterGo commented Jul 8, 2010

GOARCH=386 Garbage Collection Is Ineffectual: mmap: errno=0xc

package main

import (
    "fmt"
    "runtime"
)

func fs() []float64 {
    r := make([]float64, 923521)
    return r
}

func main() {
    s := fs()
    for i := 0; i < 1000; i++ {
        s = fs()
        m := runtime.MemStats
        fmt.Printf("i %d; Alloc %d; TotalAlloc %d\n", i, m.Alloc, m.TotalAlloc)
    }
    m := runtime.MemStats
    fmt.Printf("end; Alloc %d; TotalAlloc %d\n", m.Alloc, m.TotalAlloc)
    _ = s
}

Expected results obtained with 6g, GOARCH=amd64, and GOOS=linux, for hg id 5af6f6656531
tip, using 6.0GB real memory:
i 0; Alloc 15086544; TotalAlloc 15141352
i 1; Alloc 15156656; TotalAlloc 23020488
i 2; Alloc 22545888; TotalAlloc 30798840
i 3; Alloc 30004752; TotalAlloc 38646824
i 4; Alloc 15295920; TotalAlloc 46520408
i 5; Alloc 22685152; TotalAlloc 54298760
i 6; Alloc 30074384; TotalAlloc 62077112
 . .
i 996; Alloc 30074384; TotalAlloc 7771093592
i 997; Alloc 15295920; TotalAlloc 7778897544
i 998; Alloc 22685152; TotalAlloc 7786675896
i 999; Alloc 30074384; TotalAlloc 7794454248
  end; Alloc 30074432; TotalAlloc 7794838296

Actual results obtained with 8g, GOARCH=386, and GOOS=linux, for hg id 5af6f6656531 tip,
using 0.5GB real memory:
i 0; Alloc 14975824; TotalAlloc 16730200
i 1; Alloc 22402248; TotalAlloc 24448472
i 2; Alloc 29828344; TotalAlloc 32166408
i 3; Alloc 37254440; TotalAlloc 39884344
i 4; Alloc 44680536; TotalAlloc 47602280
i 5; Alloc 44717256; TotalAlloc 55320216
i 6; Alloc 52106488; TotalAlloc 63001288
. . .
i 427; Alloc 2814607576; TotalAlloc 3310449848
i 428; Alloc 2822033672; TotalAlloc 3318167784
i 429; Alloc 2829459768; TotalAlloc 3325885720
i 430; Alloc 2836885864; TotalAlloc 3333603656
mmap: errno=0xc

For a complete description of the problem, see the golang-nuts thread "conjugate gradient method out of memory":
http://groups.google.com/group/golang-nuts/browse_thread/thread/6fb3e3b7ae04d42a
@peterGo
Contributor Author

peterGo commented Jul 8, 2010

Comment 1:

If you reduce the allocation size to a tenth or a hundredth, i.e. 'make([]float64,
92352)' or 'make([]float64, 9235)', the garbage collector becomes effective.

@adg
Contributor

adg commented Jul 9, 2010

Comment 2:

Owner changed to r...@golang.org.

Status changed to Accepted.

@rsc
Contributor

rsc commented Sep 10, 2010

Comment 3:

Issue #1091 has been merged into this issue.

@adg
Contributor

adg commented Oct 19, 2010

Comment 4:

Issue #1210 has been merged into this issue.

@davecheney
Contributor

Comment 5:

Hi Peter,
I ran your sample code on the current tip, a11092559c78, and it completed as expected on
8g. Is the issue still reproducible for you?
Cheers
Dave

@rsc
Contributor

rsc commented Apr 29, 2011

Comment 6:

We've done some things to ameliorate the problem but it's not fixed.
Russ

@rsc
Contributor

rsc commented Jun 7, 2011

Comment 7:

Issue #1920 has been merged into this issue.

@peterGo
Contributor Author

peterGo commented Jun 7, 2011

Comment 8:

Should issue #1925, rather than issue #1920, have been merged into this issue?

@rsc
Contributor

rsc commented Jun 7, 2011

Comment 9:

It doesn't really matter.  1920 may be a dup of 1925 or it may be a dup of 909.
1925 is really a separate issue; such a trivial program should not hit 909.

@FlorianUekermann
Contributor

Comment 10:

package main
import "fmt"
func main() {
  for {
    vec := make([]float64, 1000000)
    fmt.Println(vec[0])
  }
}
This crashes on 386 Linux with 8g from the last weekly on a machine with 1.6GB free memory:
runtime: memory allocated by OS not in usable range
runtime: out of memory: cannot allocate 8060928-byte block (533069824 in use)
throw: out of memory
Is this a duplicate of 909 or something else? It looks like 909 to me, but the test case is a
bit different, so I'm not sure. If it is, can we expect it to get fixed anytime soon? It
renders Go on 386 unusable for lots of very common use cases (at least for me).

@adg
Contributor

adg commented Jul 7, 2011

Comment 11:

This is the same issue. But I'm confused why this is such a problem. Why would you want
to do so much allocation when (in this simple example, at least) it seems easily avoided?
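
A minimal sketch of the kind of avoidance this comment seems to have in mind: allocate the vector once and overwrite it on each pass instead of calling make inside the loop. The names fill and sumSteps are hypothetical, and this only applies when the buffer's lifetime actually allows reuse, which is not true of every program.

```go
package main

import "fmt"

// fill overwrites vec in place with a value derived from step.
// It is a hypothetical stand-in for whatever work produces the data.
func fill(vec []float64, step int) {
	for i := range vec {
		vec[i] = float64(step)
	}
}

// sumSteps runs steps passes over a single reused vector of n elements
// and returns the sum of the first element seen on each pass.
func sumSteps(steps, n int) float64 {
	vec := make([]float64, n) // allocated once, reused on every pass
	total := 0.0
	for s := 0; s < steps; s++ {
		fill(vec, s)
		total += vec[0]
	}
	return total
}

func main() {
	// One 1,000,000-element vector serves all passes.
	fmt.Println(sumSteps(5, 1000000))
}
```

With this shape the collector never has to reclaim a large block between passes, so a falsely retained block cannot accumulate.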

@FlorianUekermann
Contributor

Comment 12:

@adg: You make it sound like a corner case, but it's not. Allocating a vector with around
1,000,000 floats is not that uncommon if you process scientific data. One vector in this
example should be about 8 MB, and the program crashed at 530 MB; that's only about 70 vectors. I am
quite certain "much" is a bit more than 70 vectors in some very common cases.
I really don't want to complain. I like Go and the compilers, they are great on 64-bit,
but this is a serious bug. I'm a bit surprised that you think that it isn't.
I'm still not convinced that it is the same bug though, since my problem is not only the
GC, but that memory allocation fails when more than 1 GB of free memory is available.
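
The arithmetic in this comment can be checked with a short sketch. The two constants are copied from the crash message in comment 10; the helper names vectorBytes and blocksInUse are hypothetical, and the block count is only an estimate that assumes every in-use byte belongs to one of these vectors.

```go
package main

import "fmt"

// vectorBytes returns the payload size in bytes of a []float64 with n elements.
func vectorBytes(n int) int64 {
	const float64Size = 8 // bytes per float64
	return int64(n) * float64Size
}

// blocksInUse estimates how many whole blocks of blockBytes fit in inUse bytes.
func blocksInUse(inUse, blockBytes int64) int64 {
	return inUse / blockBytes
}

func main() {
	fmt.Println("payload bytes per vector:", vectorBytes(1000000))

	// Figures taken from the runtime's out-of-memory message in comment 10.
	const blockBytes = 8060928
	const inUse = 533069824
	fmt.Println("blocks held at crash:", blocksInUse(inUse, blockBytes))
}
```

Under that assumption the estimate comes out slightly below the "about 70" quoted above, which does not change the argument.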

@adg
Contributor

adg commented Jul 11, 2011

Comment 13:

Sorry, I missed the part where you had more available physical memory than you were
trying to allocate. That seems like a bug that has nothing to do with the GC. File a
separate issue please.

@rsc
Contributor

rsc commented Dec 9, 2011

Comment 14:

Labels changed: added priority-later.

@rsc
Contributor

rsc commented Dec 12, 2011

Comment 15:

Labels changed: added priority-go1.

@robpike
Contributor

robpike commented Jan 13, 2012

Comment 16:

Owner changed to builder@golang.org.

@rsc
Contributor

rsc commented Feb 19, 2012

Comment 18:

Issue #1925 has been merged into this issue.

@rsc
Contributor

rsc commented Feb 19, 2012

Comment 19:

Nice test program from issue #1925:
package main
import (
    "fmt"
    "runtime"
    "net/http"
    _ "net/http/pprof"
)
var st runtime.MemStats
func main() {
    runtime.MemProfileRate = 1
    for i := 0; i < 10; i++ {
        a := make([]byte, 5000000)
        if a == nil {
        }
        a = nil
        runtime.GC()
        runtime.ReadMemStats(&st)
        fmt.Println(i, st.Alloc, st.Sys, st.HeapObjects)
    }
    fmt.Println()
    for i := 0; i < 10; i++ {
        a := make([]byte, 20000000)
        if a == nil {
        }
        a = nil
        runtime.GC()
        runtime.ReadMemStats(&st)
        fmt.Println(i, st.Alloc, st.Sys, st.HeapObjects)
    }
    http.ListenAndServe(":8000", nil)
}
I have a few changes to ameliorate the effect of large static tables for Go 1, but the
general problem remains.

@rsc
Contributor

rsc commented Feb 19, 2012

Comment 20:

This CL should fix the 'static tables cause confusion' problem, but once you remove the
static tables you quickly find that dynamic data can cause confusion too, sadly.
changeset:   12574:24411588e821
user:        Russ Cox <rsc@golang.org>
date:        Sun Feb 19 03:19:52 2012 -0500
summary:     gc, ld: tag data as no-pointers and allocate in separate section

@rsc
Contributor

rsc commented Feb 19, 2012

Comment 21:

The rest of the issue will have to wait until after Go 1.
Or maybe all the 32-bit systems will be replaced by 64-bit ones.

Labels changed: added priority-later, removed priority-go1.

Owner changed to ---.

@gopherbot

Comment 22 by mstplbrg:

I stumbled upon this bug because a Go program of mine runs out of memory on my 386
machine :-(.
I tried the latest version of Go (go version says "go version weekly.2012-02-22
+ca9790d6a51a"), but the problem is still present. The output of the test program from
comment #19 is:
./test909 
0 5482240 9018492 684
1 10521560 14695548 705
2 15559640 20372604 707
3 20597720 26049660 709
4 25635800 31726716 711
5 30673880 37403772 713
6 35711960 43080828 715
7 40750040 48757884 717
8 45788120 54434940 719
9 50826200 60111996 721
0 65862696 82672764 723
1 85900592 105233532 731
2 105938224 127794300 733
3 125975856 150355068 735
4 146013488 172915836 737
5 166051120 195476604 739
6 186088752 218037372 741
7 206126384 240598140 743
8 226164016 263158908 745
9 246201648 285719676 747
Am I misunderstanding your message / the test program or does your fix not actually work?
Thanks.

@rsc
Contributor

rsc commented Feb 29, 2012

Comment 23:

Comment #19 does say "the general problem remains".
My changes helped, but they are not a complete fix.

@gopherbot

Comment 24 by raumzeitlabor:

I hate to ask this, but do you have a rough time frame for when this bug will be
addressed? Order of weeks, months, two-digit months?
I’m just asking because I really need a solution for this. I could live without it for
a few more weeks, but if it’s going to take longer than that, I’ll have to
reimplement my program in a different language (not meant as pressure, just a fact).

@rsc
Contributor

rsc commented Feb 29, 2012

Comment 25:

My guess would be 2-digit months.  My suggestion would be to
try to arrange to run on a 64-bit machine instead of reimplementing
your entire program.  Sorry.

@gopherbot

Comment 26 by raumzeitlabor:

Alright, thanks for your answer. I’ll have to reimplement because my budget doesn’t
allow switching to a new machine :).

@gopherbot

Comment 27 by pcrosby:

We're having the same problem as in comment #10...We've had to develop several
workarounds (including restarting a long-running app when approaching the 512MB
allocation limit, which is far from ideal).  I'd love it if this was fixed as moving all
our servers to 64 bit isn't an affordable option at the moment.

@gopherbot

Comment 28 by mstplbrg:

FYI: I was able to solve this problem (in my case) by using "coffer": I put up a fixed
version at http://github.com/mstap/coffer, forked from http://github.com/mcgoo/coffer.
I just swapped all my buffers which used buffer := bytes.NewBuffer(make([]byte, 0,
readBufferBytes)) before with buffer, _ := coffer.NewMemCoffer(readBufferBytes+1). My
old buffer.Reset() becomes buffer.Seek(0, 0), but apart from that, coffer was a drop-in
replacement in my case. Note that buffer.Close() free()s the memory and *needs* to be
called.
This is not a beautiful solution, but it gets the job done until this problem is
properly fixed.

@gopherbot

Comment 29 by zhigangc:

I spent some time digging into this issue and have found the following:
1. GC is ineffectual because some memory blocks in Go's heap can be mistakenly
marked as referenced when in fact they are not.
2. The "fake" references come from the static variables defined in various Go packages.
These variables are not pointers. However, as the GC scans the data section for potential
references to the heap, they are treated as "pointers", and therefore the entire heap
blocks which these "pointers" happen to "reference" can never be reclaimed even when
they should be.
3. The attached test programs (mem.go) can easily illustrate how these fake pointers
prevent the GC from freeing used memory on both the tip (12661:426b1101b166) and r60.3 on
32-bit Linux.
  To run the test programs, please unzip the attachment.
  To run it on Go 1, go to "go0" and run "go run mem.go". Make sure your tip hash is 12661:426b1101b166, the most recent as of now.
  To run it on Go release, go to "go0" and run "make" and "./mem"
  In the unicode package, there are many static variables which end up being put in the DATA section.  As the GC scans the data section at runtime/mgc0.c:648, it treats the variables as pointers, and some happen to "point" to memory blocks in the heap.
  If we comment out the unicode package, GC works and the program runs fine.
4. The issue is more likely to crash applications which allocate memory in large chunks,
because one "fake" pointer can hold a large piece of memory and it does not take many
fake pointers to make the app run out of RAM. If memory is allocated in small chunks, the
problem still persists, though it is much less severe and often goes unnoticed.
5. I suspect the issue potentially exists on 64-bit as well.
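
The mechanism in points 1, 2 and 4 can be sketched as a toy model. This is not the runtime's code: the block type, the address ranges, and the function conservativelyLive are all made up for illustration, and the "data section" is just a slice of integers.

```go
package main

import "fmt"

// block is a hypothetical heap block covering the address range [start, end).
type block struct {
	start, end uint32
}

// conservativelyLive reports, for each block, whether a conservative scan of
// data would keep it alive. Every word is treated as a possible pointer, so a
// block counts as live if any word happens to fall inside its range.
func conservativelyLive(data []uint32, heap []block) []bool {
	live := make([]bool, len(heap))
	for _, w := range data {
		for i, b := range heap {
			if w >= b.start && w < b.end {
				live[i] = true
			}
		}
	}
	return live
}

func main() {
	// Neither block is referenced by any real pointer in this model.
	heap := []block{
		{start: 0x10000000, end: 0x14000000}, // 64 MB block
		{start: 0x20000000, end: 0x20000100}, // 256-byte block
	}
	// Plain integers standing in for static table data; none are pointers.
	data := []uint32{0x00000041, 0x12345678, 0x0001F600}

	fmt.Println(conservativelyLive(data, heap)) // prints [true false]
}
```

The 64 MB block covers far more of a 32-bit address space than the 256-byte block, so an arbitrary integer is much more likely to land inside it, which is consistent with large allocations being hit hardest.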

Attachments:

  1. issue_909.zip (1645 bytes)

@gopherbot

Comment 30 by zhigangc:

To make it more convenient, I am pasting the code here:
package main
import (
        "runtime"
        //comment the following line and the program runs fine.
        "unicode"
)
func fs() []byte {
        //allocate 64 MB chunks
        r := make([]byte, 64*1024*1024)
        return r
}
func main() {
        //comment the following line and the program runs fine.
        println("addr:", &unicode.Scripts)
        var s []byte
        for i := 0; i < 100; i++ {
                println("")
                println(i, "---------------")
                s = fs()
                runtime.GC()
                var m runtime.MemStats
                runtime.ReadMemStats(&m)
                println(i, "MemStats.Alloc:", m.Alloc)
        }
        _ = s
}

@gopherbot

Comment 31 by matthewrsiegel@comcast.net:

excellent investigation!
if there were a hack to prevent gc from bothering with static data...

@dvyukov
Member

dvyukov commented Nov 14, 2013

Comment 76:

We need some story for testing GC precision (both progress and regressions). Good test
cases are invaluable, because usually people do not notice moderate leaks and do not
dig. We need to look through the merged issues and commit them as tests.
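
A rough sketch of what such a regression check might look like, using only the public runtime API. The helper heapAfterChurn, the package-level sink, and the 4x threshold are assumptions chosen for illustration, not anything taken from the Go test suite.

```go
package main

import (
	"fmt"
	"runtime"
)

// sink holds the most recent block so the allocation is not optimized away.
var sink []byte

// heapAfterChurn allocates rounds blocks of size bytes one at a time, dropping
// each before the next, then forces a collection and returns HeapAlloc.
func heapAfterChurn(rounds, size int) uint64 {
	for i := 0; i < rounds; i++ {
		sink = make([]byte, size)
		sink[0] = byte(i) // touch the block
	}
	sink = nil // drop the last reference before collecting
	runtime.GC()

	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapAlloc
}

func main() {
	const rounds, size = 50, 8 << 20 // fifty 8 MB blocks, only one live at a time

	got := heapAfterChurn(rounds, size)
	// Arbitrary threshold: if the live heap exceeds a few blocks after a
	// forced GC, dead blocks are probably being retained.
	limit := uint64(4 * size)
	fmt.Println("HeapAlloc within limit:", got <= limit)
}
```

A check like this only catches gross retention; the reproductions from the merged issues would need their own thresholds.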

@rsc
Contributor

rsc commented Nov 27, 2013

Comment 77:

Labels changed: added go1.3maybe.

@rsc
Contributor

rsc commented Nov 27, 2013

Comment 78:

Labels changed: removed feature.

@rsc
Contributor

rsc commented Dec 4, 2013

Comment 79:

Labels changed: added release-none, removed go1.3maybe.

@rsc
Contributor

rsc commented Dec 4, 2013

Comment 80:

Labels changed: added repo-main.

@randall77
Contributor

Comment 81:

Status update.  We've done a lot to improve the precision of the garbage collector for
1.3.  The major change is that scanning of Go stack frames is now completely precise. 
There are lots of minor changes, including scanning interfaces precisely, correct
context pointer scanning, and modifying reflect.Value to do its magic in a precise way.
We're not completely precise yet, but we're 99% there.  The major remaining piece is the
scanning of C stack frames.  A few minor pieces also remain, including scanning of some
runtime internal data structures.

@rsc
Contributor

rsc commented Sep 18, 2014

Comment 82:

We've done a lot over the last few months. C stack frames are gone, and the internal
data structures are described correctly now. The only remaining piece I am aware of
is a few C-declared data structures that the linker instructs the garbage collector to
scan conservatively. I think we can eliminate those for 1.4 and finally close this bug.
There is one other piece that I am not counting: if you use SWIG to allocate Go memory
from C++, that Go memory is scanned conservatively. That's a different problem (issue
6461) and not a concern for most Go programmers (since most don't use SWIG).

Labels changed: added release-go1.4, removed release-none.

@gopherbot

Comment 83:

CL https://golang.org/cl/149770043 mentions this issue.

@rsc
Contributor

rsc commented Sep 24, 2014

Comment 84:

This issue was closed by revision 193daab.

Status changed to Fixed.

@adg adg mentioned this issue Dec 8, 2014
@golang golang locked and limited conversation to collaborators Dec 8, 2014
@rsc rsc added this to the Go1.4 milestone Apr 14, 2015
@rsc rsc removed the release-go1.4 label Apr 14, 2015
wheatman pushed a commit to wheatman/go-akaros that referenced this issue Jun 25, 2018
In linker, refuse to write conservative (array of pointers) as the
garbage collection type for any variable in the data/bss GC program.

In the linker, attach the Go type to an already-read C declaration
during dedup. This gives us Go types for C globals for free as long
as the cmd/dist-generated Go code contains the declaration.
(Most runtime C declarations have a corresponding Go declaration.
Both are bss declarations and so the linker dedups them.)

In cmd/dist, add a few more C files to the auto-Go-declaration list
in order to get Go type information for the C declarations into the linker.

In C compiler, mark all non-pointer-containing global declarations
and all string data as NOPTR. This allows them to exist in C files
without any corresponding Go declaration. Count C function pointers
as "non-pointer-containing", since we have no heap-allocated C functions.

In runtime, add NOPTR to the remaining pointer-containing declarations,
none of which refer to Go heap objects.

In runtime, also move os.Args and syscall.envs data into runtime-owned
variables. Otherwise, in programs that do not import os or syscall, the
runtime variables named os.Args and syscall.envs will be missing type
information.

I believe that this CL eliminates the final source of conservative GC scanning
in non-SWIG Go programs, and therefore...

Fixes golang#909.

LGTM=iant
R=iant
CC=golang-codereviews
https://golang.org/cl/149770043
wheatman pushed a commit to wheatman/go-akaros that referenced this issue Jun 26, 2018
wheatman pushed a commit to wheatman/go-akaros that referenced this issue Jul 9, 2018
wheatman pushed a commit to wheatman/go-akaros that referenced this issue Jul 30, 2018
This issue was closed.