runtime: large hashmaps confuse the garbage collector when GOMAXPROCS>=2 #6119
Labels changed: added priority-soon, go1.2; removed priority-triage, go1.2maybe. Owner changed to @randall77. Status changed to Accepted.
Some initial data. I instrumented the runtime to print out the big MSpans after each GC.

With GOMAXPROCS=1, there are only 3: one of 73728 pages (the main bucket table for the map) and two of 19532 pages (the size of the big slice). (I don't know why there are 2 of the latter - shouldn't there be only one?)

With GOMAXPROCS=2, there are more. There's still one 73728-page span, but the number of 19532-page spans keeps growing. I get 3-4, then 7ish, then probably something bigger, because it crashes. I don't yet understand why maps are involved, or why GOMAXPROCS matters.
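(The MSpan dump above required patching the runtime. As a rough user-level stand-in, one can watch whether the heap keeps growing across collections with runtime.ReadMemStats and compare runs under GOMAXPROCS=1 and GOMAXPROCS=2. A sketch, not the instrumentation actually used:)

```go
// Not the MSpan instrumentation described above; a user-level
// approximation that reports heap size after a collection.
package main

import (
	"fmt"
	"runtime"
)

func dumpHeap(tag string) {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	fmt.Printf("%s: HeapAlloc=%dMB HeapSys=%dMB NumGC=%d\n",
		tag, ms.HeapAlloc>>20, ms.HeapSys>>20, ms.NumGC)
}

func main() {
	runtime.GC()
	dumpHeap("after GC")
}
```

If the leak is reproducing, calling dumpHeap after each big allocation in the test program should show HeapAlloc staying roughly flat under GOMAXPROCS=1 and climbing under GOMAXPROCS=2 until the process dies.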
I think I see the problem. To iterate through the map, the compiler allocates a hash_iter object on the stack. This object contains, among other things, a pointer to the big main bucket table. When the stack is scanned, this pointer is a root that keeps the bucket table live. On one processor this isn't a problem, because the hash table header object gets scanned first and, as a side effect, marks the main bucket table as scanned. On a multiprocessor, however, while one thread is still busy with the hash table header (marking overflow buckets and the like), the other thread gets this reference to the main bucket table and, seeing it hasn't been marked yet, proceeds to conservatively scan the whole thing. It's got lots of things in it that look like pointers, and these end up keeping the big slice arrays alive.

The fix is to prevent the GC from following pointers directly into the internals of the hash map. All GC access should be done through the Hmap via gc_iter/gc_next. One way is to make the pointer in the hash_iter structure a uintptr instead of a byte*. That should hide it from the GC, at least when we have precise stack scanning on. Probably better is to have the hashtable implementation allocate all of its internals as FlagNoPointers, so that even if the GC did get hold of an internal pointer, it won't scan any internals.

Status changed to Started.
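A minimal Go sketch of the uintptr trick (the type names here are hypothetical illustrations; the real hash_iter lived in the runtime's C code at the time):

```go
package main

import (
	"fmt"
	"unsafe"
)

// With a pointer-typed field, a stack scan treats the field as a root:
type iterVisible struct {
	buckets *byte // the collector follows this into the bucket table
}

// With a uintptr field, the same address is just an integer to the GC:
type iterHidden struct {
	buckets uintptr // invisible to the GC; the table stays live only
	// because the map header still references it
}

func main() {
	b := new(byte)
	vis := iterVisible{buckets: b}
	hid := iterHidden{buckets: uintptr(unsafe.Pointer(b))}
	// Converting back is only safe because something else (here, vis;
	// in the map's case, the Hmap header) keeps the object alive.
	fmt.Println(vis.buckets == (*byte)(unsafe.Pointer(hid.buckets)))
}
```

The FlagNoPointers alternative is the more robust of the two, since it defends against any stray internal pointer reaching the GC, not just the one stored in hash_iter.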
Try patching in http://golang.org/cl/12840043 and see if that helps.
---------- Forwarded message ----------
From: Dmitry Vyukov <dvyukov@google.com>
Date: Mon, Aug 12, 2013 at 9:16 PM
Subject: GC vs hashmaps
To: Carl Shapiro <cshapiro@google.com>, Keith Randall <khr@google.com>
Cc: Carlos Castillo <cookieo9@gmail.com>, golang-nuts <golang-nuts@googlegroups.com>, Jakob Borg <jakob@nym.se>

On Mon, Aug 12, 2013 at 6:06 PM, Jakob Borg <jakob@nym.se> wrote:

> Right, indeed. It seems the map access is necessary to cause the issue; I slightly reduced the problem program to http://play.golang.org/p/Tkdqs0lWoc
> Removing the "_ = m[0]" removes the issue.

Here is what happens here: _ = m[0] is turned into a runtime.mapaccess1_fast64 call, which returns a pointer into the huge hashmap buffer. This pointer is left on the stack at 0x18(rsp). p := make([]int, N_ELEMS) is turned into a makeslice1 call, and 0x18(rsp) refers to the makeslice1 return value, which is a pointer but not initialized until the function returns. GC is triggered inside of makeslice1, and it scans 0x18(rsp) completely conservatively. As a result, the 160MB hashmap buffer with all the hashes is scanned conservatively.

Questions:
1. We are going to zero function return values and locals in C code as well, right?
2. We will need to do the same for assembly functions manually, right?
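In case the playground link rots, a hedged reconstruction of the reduced program from the description above. The constants are guesses: 19532 pages at 4KB is roughly 80MB, which is a 10M-element []int on 64-bit, and the map values are scrambled here on the assumption that pointer-shaped words in the bucket table are needed to retain the dead slices.

```go
// Hedged reconstruction of http://play.golang.org/p/Tkdqs0lWoc, based
// only on the thread above; sizes and values are guesses.
package main

import (
	"fmt"
	"runtime"
)

const nElems = 10000000 // ~80MB per []int, matching the 19532-page spans

var sink []int // keeps each slice live for exactly one iteration

func main() {
	runtime.GOMAXPROCS(2) // the bug only manifested with GOMAXPROCS >= 2

	m := make(map[int64]int64)
	for i := int64(0); i < 5000000; i++ {
		// Large scrambled values: some words in the bucket table can
		// masquerade as heap pointers under conservative scanning.
		m[i] = i * 0x9e3779b9
	}

	for i := 0; ; i++ {
		_ = m[0] // leaves a pointer into the bucket array in a stack slot

		// make can trigger a GC while the stale slot above is still a
		// conservative root; the whole bucket table is then scanned
		// word by word, and pointer-shaped words keep dead slices alive.
		sink = make([]int, nElems)

		fmt.Println(i)
	}
}
```

On a current runtime this should run flat, since stacks are now scanned precisely and the map implementation has changed; the reconstruction is only useful for understanding the 2013 report.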
I do not have definitive answers to your questions. It is possible to do both 1 and 2, though neither has been discussed. As an aside, my longer-term preference is to not have 1 or 2, and instead have an explicit way for non-Go code to communicate roots, if needed, to the garbage collector.
Having the compiler zero memory is something that might be viable in the short term. Unfortunately, it has a runtime cost. One way to avoid that cost is to have only those C and assembly routines that store pointers across a call explicitly save any live pointer values in a location known to the GC across the call. After the call completes, the routine can reload the saved pointer values and make use of them. I have a write-up of this in progress that I hope to finish this week.
This issue was closed by revision 0df438c. Status changed to Fixed.
This issue was closed by revision 74e78df.
This issue was closed by revision fb37602. Status changed to Fixed.
This issue was closed.