
encoding/gob: decode map[string]int from 130MB file using 1GB ram #2539

Closed
gopherbot opened this issue Dec 7, 2011 · 13 comments
Comments

@gopherbot

by qinn.trinh:

Before filing a bug, please check whether it has been fixed since the latest release: run "hg pull", "hg update default", rebuild, and retry what you did to reproduce the problem. Thanks.

What steps will reproduce the problem?
package main

import "gob"
import "os"

func savePats(file string, pats map[string]int) {
    f, err := os.Create(file)
    if err != nil {
        panic("cant open file")
    }
    defer f.Close()
    enc := gob.NewEncoder(f)
    if err := enc.Encode(pats); err != nil {
        panic("cant encode")
    }
}

func loadPats(file string) (pats map[string]int) {
    f, err := os.Open(file)
    if err != nil {
        panic("cant open file")
    }
    defer f.Close()
    enc := gob.NewDecoder(f)
    if err := enc.Decode(&pats); err != nil {
        panic("cant decode")
    }
    return pats
}

func main() {
    p := loadPats("asm.txt")
    savePats("asm.txt", p)

}

What is the expected output?
I expected it not to use 1 GB of RAM, and that after the peak the garbage collector would sweep and free the RAM back to the OS.

What do you see instead?


Which compiler are you using (5g, 6g, 8g, gccgo)?
on windows7: 6g version weekly.2011-10-18 10125 
on linux ubuntu 10: 6g version release.r60.3 9516 

Which operating system are you using?
windows 7
ubuntu 10.04 

Which revision are you using?  (hg identify)


Please provide any additional information below.

I have a link to the source and the map file I used:
http://dl.dropbox.com/u/2629943/bug.zip

(On a side note, I used json Marshal and Unmarshal first and it exhibited almost the same memory usage.)

-Tai
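(Not part of the original report: a minimal sketch of how the peak and post-collection memory could be checked while reproducing this on a current Go release, using runtime.ReadMemStats and runtime/debug.FreeOSMemory. The file name "asm.txt" is taken from the example above.)

package main

import (
    "encoding/gob"
    "fmt"
    "os"
    "runtime"
    "runtime/debug"
)

// printMem reports heap usage and how much memory the runtime has
// obtained from, and returned to, the operating system.
func printMem(label string) {
    var m runtime.MemStats
    runtime.ReadMemStats(&m)
    fmt.Printf("%s: alloc=%dMB sys=%dMB released=%dMB\n",
        label, m.Alloc>>20, m.Sys>>20, m.HeapReleased>>20)
}

func main() {
    printMem("baseline")

    // Decode the map exactly as loadPats above does.
    f, err := os.Open("asm.txt")
    if err != nil {
        panic(err)
    }
    var pats map[string]int
    if err := gob.NewDecoder(f).Decode(&pats); err != nil {
        panic(err)
    }
    f.Close()
    fmt.Println("entries:", len(pats))
    printMem("after decode")

    pats = nil // drop the only reference so the decoded data becomes garbage
    runtime.GC()
    debug.FreeOSMemory() // forces a GC and asks the runtime to return idle memory to the OS
    printMem("after GC and FreeOSMemory")
}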
@rsc
Contributor

rsc commented Dec 9, 2011

Comment 1:

Labels changed: added priority-later, removed priority-medium.

@gopherbot
Author

Comment 2 by Kalthorn:

The high memory usage occurs during both encoding and decoding. Also, and not to derail the original issue, the real problem (for me anyway) isn't the high memory usage but the fact that it doesn't get garbage collected. Spiking to 1GB of RAM just to decode my 100MB file is annoying (and could be entirely necessary) but mostly harmless. The fact that the memory isn't reclaimed is the more troublesome part.

Some informal testing with my own data: decoding my 120MB file and trashing the result (so as not to have it count against memory usage) jumped me to ~600MB from a baseline of 14 or so. Decoding again jumped up to about 1GB of RAM... and it stayed there. I could decode that or other files multiple times after that and usage didn't increase again. Tested with runtime.GC() and without it, no difference.

Data was ~900,000 float64/float32 pairs and an equal number of uint32s. No strings.

@gopherbot
Author

Comment 3 by qinn.trinh:

Yeah, I tried runtime.GC() too, but it didn't help me either...

@robpike
Contributor

robpike commented Dec 12, 2011

Comment 4:

Owner changed to @robpike.

Status changed to Accepted.

@robpike
Contributor

robpike commented Dec 15, 2011

Comment 5:

If I just do the decode, I see an RSS of 600MB. This is expected: adding up the size of the strings, plus map cells, plus breakage and gob overheads, that is large but not unreasonable.
If I do an encode and decode, I do see about 1GB, but in that case there is much more
breakage because gob must keep the entire message in memory before pushing it out, and
grows it incrementally, which is hard on the allocator.
You can try this to see the usage at various points:
    runtime.UpdateMemStats()
    fmt.Printf("alloc totalalloc sys %d %d %d\n", runtime.MemStats.Alloc, runtime.MemStats.TotalAlloc, runtime.MemStats.Sys)

Labels changed: added priority-someday, removed priority-later.
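(Note: the snippet above uses the pre-Go 1 API; runtime.UpdateMemStats and the package-level runtime.MemStats no longer exist. A rough modern equivalent, assuming "fmt" and "runtime" are imported, would be:)

    var m runtime.MemStats
    runtime.ReadMemStats(&m) // replaces runtime.UpdateMemStats() plus the global runtime.MemStats
    fmt.Printf("alloc totalalloc sys %d %d %d\n", m.Alloc, m.TotalAlloc, m.Sys)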

@robpike
Contributor

robpike commented Dec 15, 2011

Comment 6:

If I do
  runtime.GC()
before printing the stats, I see
alloc totalalloc sys 252186160 1192862688 1048336632
which is perfectly reasonable (252MB). It's breakage. Sorry.

@gopherbot
Author

Comment 7 by Kalthorn:

What does breakage mean in this case?  I am unfamiliar with the term.

@gopherbot
Author

Comment 8 by Kalthorn:

Ahh NM, looked it up.  I take it you mean that the overall memory for the process has
increased and, while it has been garbage collected, it hasn't been released back to the
system, which is an issue that others have brought up elsewhere I believe.
Thanks for checking.
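(Not a suggestion from the thread, just a workaround sketch: comment 5 notes that gob keeps the entire message in memory and grows the buffer incrementally, so splitting the map across several smaller messages keeps any single message, and its encoder buffer, small. The package name, function names, and chunk size below are arbitrary; this assumes it sits alongside the savePats/loadPats program above.)

package main

import (
    "encoding/gob"
    "io"
)

// encodeChunked writes pats as a stream of smaller maps so that no single
// gob message has to hold the entire data set in the encoder's buffer.
func encodeChunked(w io.Writer, pats map[string]int, chunkSize int) error {
    enc := gob.NewEncoder(w)
    chunk := make(map[string]int, chunkSize)
    for k, v := range pats {
        chunk[k] = v
        if len(chunk) == chunkSize {
            if err := enc.Encode(chunk); err != nil {
                return err
            }
            chunk = make(map[string]int, chunkSize)
        }
    }
    if len(chunk) > 0 {
        return enc.Encode(chunk)
    }
    return nil
}

// decodeChunked reads the stream written by encodeChunked and merges the
// chunks back into one map.
func decodeChunked(r io.Reader) (map[string]int, error) {
    dec := gob.NewDecoder(r)
    pats := make(map[string]int)
    for {
        var chunk map[string]int
        err := dec.Decode(&chunk)
        if err == io.EOF {
            return pats, nil
        }
        if err != nil {
            return nil, err
        }
        for k, v := range chunk {
            pats[k] = v
        }
    }
}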

@robpike
Contributor

robpike commented Jul 13, 2012

Comment 10:

This issue was updated by revision 37519d9.

RSS drops to 775MB from 1GB.
Active memory drops to 858317048 from 1027878136,
essentially the size of one copy of the input file.
R=dsymonds, nigeltao
CC=golang-dev
http://golang.org/cl/6392057

@davecheney
Contributor

Comment 11:

Hello,
As of Go 1.1.1, the following is observed.
lucky(/tmp/bug) % GOGCTRACE=1 ./test
gc1(1): 0+0+0 ms, 0 -> 0 MB 17 -> 18 (20-2) objects, 0(0) handoff, 0(0) steal, 0/0/0 yields
gc2(1): 0+0+0 ms, 134 -> 134 MB 678 -> 472 (725-253) objects, 0(0) handoff, 0(0) steal, 0/0/0 yields
gc3(1): 64+21+0 ms, 269 -> 218 MB 2253311 -> 466996 (2274026-1807030) objects, 0(0) handoff, 0(0) steal, 0/0/0 yields
gc4(1): 127+45+1 ms, 436 -> 324 MB 5297625 -> 1455500 (7138846-5683346) objects, 0(0) handoff, 0(0) steal, 0/0/0 yields
gc5(1): 301+33+1 ms, 725 -> 631 MB 4275189 -> 1662413 (9958536-8296123) objects, 0(0) handoff, 0(0) steal, 0/0/0 yields
scvg0: inuse: 651, idle: 120, sys: 772, released: 0, consumed: 772 (MB)
gc6(1): 154+22+4 ms, 648 -> 209 MB 3114175 -> 1656001 (11410298-9754297) objects, 0(0) handoff, 0(0) steal, 0/0/0 yields
scvg1: GC forced
scvg1: inuse: 211, idle: 560, sys: 772, released: 0, consumed: 772 (MB)
scvg2: inuse: 211, idle: 560, sys: 772, released: 0, consumed: 772 (MB)
gc7(1): 158+10+3 ms, 209 -> 208 MB 1656001 -> 1653166 (11410298-9757132) objects, 0(0) handoff, 0(0) steal, 0/0/0 yields
scvg3: GC forced
scvg3: inuse: 211, idle: 561, sys: 772, released: 0, consumed: 772 (MB)
scvg4: 0x4a MB released
scvg4: inuse: 211, idle: 561, sys: 772, released: 74, consumed: 697 (MB)
gc8(1): 157+10+3 ms, 208 -> 208 MB 1653166 -> 1653166 (11410298-9757132) objects, 0(0) handoff, 0(0) steal, 0/0/0 yields
scvg5: GC forced
scvg5: inuse: 211, idle: 561, sys: 772, released: 74, consumed: 697 (MB)
scvg6: 0x1e5 MB released
scvg6: inuse: 211, idle: 561, sys: 772, released: 560, consumed: 211 (MB)
gc9(1): 162+10+1 ms, 208 -> 208 MB 1653166 -> 1653166 (11410298-9757132) objects, 0(0) handoff, 0(0) steal, 0/0/0 yields
scvg7: GC forced
scvg7: inuse: 211, idle: 561, sys: 772, released: 560, consumed: 211 (MB)
scvg8: 0x0 MB released
scvg8: inuse: 211, idle: 561, sys: 772, released: 561, consumed: 211 (MB)
gc10(1): 162+10+3 ms, 208 -> 208 MB 1653166 -> 1653166 (11410298-9757132) objects, 0(0) handoff, 0(0) steal, 0/0/0 yields
scvg9: GC forced
scvg9: inuse: 211, idle: 561, sys: 772, released: 561, consumed: 211 (MB)
scvg10: inuse: 211, idle: 561, sys: 772, released: 561, consumed: 211 (MB)
gc11(1): 155+10+3 ms, 208 -> 208 MB 1653166 -> 1653166 (11410298-9757132) objects, 0(0) handoff, 0(0) steal, 0/0/0 yields
scvg11: GC forced
scvg11: inuse: 211, idle: 561, sys: 772, released: 561, consumed: 211 (MB)
scvg12: inuse: 211, idle: 561, sys: 772, released: 561, consumed: 211 (MB)
gc12(1): 167+10+3 ms, 208 -> 208 MB 1653166 -> 1653166 (11410298-9757132) objects, 0(0) handoff, 0(0) steal, 0/0/0 yields
The scavenger will shrink RSS down to ~211MB and change. This appears to address the concerns of the OP, so I would petition that this issue be closed.

@gopherbot
Author

Comment 12 by Kalthorn:

It does indeed, thanks for following up.

@gopherbot
Author

Comment 13 by Kalthorn:

Whoops, the issue was so memorable to me that I forgot that I'm not the OP.

@robpike
Contributor

robpike commented Jun 28, 2013

Comment 14:

Status changed to Fixed.

golang locked and limited conversation to collaborators Jun 24, 2016
rsc unassigned robpike Jun 22, 2022
This issue was closed.