
runtime: MemStats.HeapReleased vs MemStats.HeapReacquired #32284

Closed
sylr opened this issue May 28, 2019 · 19 comments
Labels
Documentation FrozenDueToAge NeedsInvestigation
Milestone: Unreleased
Comments

@sylr

sylr commented May 28, 2019

Hi,

I don't understand what the value of MemStats.HeapReleased represents.

From what I read in the doc, it's the amount of memory that has been released to the system (if you stopped there you would assume this value is an ever-growing counter) ... and has not yet been reacquired for the heap (this is where things start to get weird).

So, if I'm not mistaken, this value is affected by two things:

  • idle memory returned to the system (the value of HeapReleased grows)
  • idle memory reacquired for the heap (the value of HeapReleased decreases?)

I don't understand why this has not been split into two values, e.g.:

  • MemStats.HeapReleased: total cumulative idle memory returned to the system.
  • MemStats.HeapReacquired: total cumulative idle memory reacquired for the heap.

I think this would make it easier to understand what happens to the idle memory.

Regards.
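(For reference, a minimal sketch of how these fields are read today; it uses only the standard runtime package, and the values are whatever snapshot runtime.ReadMemStats happens to capture.)

    package main

    import (
        "fmt"
        "runtime"
    )

    func main() {
        var m runtime.MemStats
        runtime.ReadMemStats(&m) // stop-the-world snapshot of the allocator's statistics
        fmt.Printf("HeapIdle:     %d bytes\n", m.HeapIdle)
        fmt.Printf("HeapReleased: %d bytes\n", m.HeapReleased)
    }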

@julieqiu julieqiu changed the title from "MemStats.HeapReleased" to "runtime: MemStats.HeapReleased vs MemStats.HeapReacquired" May 28, 2019
@julieqiu
Member

/cc @aclements @randall77

@julieqiu julieqiu added the NeedsInvestigation and Question labels May 28, 2019
@julieqiu julieqiu added this to the Unreleased milestone May 28, 2019
@randall77
Contributor

The current accounting is a bit weird and the docs aren't the best, I agree.

HeapReleased is the portion of HeapIdle that has been given back to the OS. It can go up and down as the runtime allocates and frees objects.

So HeapReleased is not a "total over the lifetime of the process" thing like TotalAlloc.

It will always be the case that HeapReleased <= HeapIdle. Hopefully both should go towards zero when the application is in the midst of an allocation spree.

Maybe we should replace the first sentence with this:

HeapReleased is bytes of idle spans whose physical memory has been returned to the OS.

And maybe also (maybe higher up, when describing idle spans)

When a span is reused by the Go runtime, it is removed from HeapIdle (and HeapReleased, if applicable) and added to HeapInuse or StackInuse.
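A small self-contained sketch of that behaviour (illustrative only; exact numbers depend on Go version and GC timing): allocate a large buffer, drop it and force a release with debug.FreeOSMemory, then allocate again and watch spans move out of HeapIdle/HeapReleased back into HeapInuse.

    package main

    import (
        "fmt"
        "runtime"
        "runtime/debug"
    )

    func report(label string) {
        var m runtime.MemStats
        runtime.ReadMemStats(&m)
        fmt.Printf("%-14s HeapInuse=%d HeapIdle=%d HeapReleased=%d\n",
            label, m.HeapInuse, m.HeapIdle, m.HeapReleased)
    }

    func main() {
        buf := make([]byte, 64<<20) // 64 MiB
        for i := range buf {
            buf[i] = 1 // touch the pages so they are really backed
        }
        report("after alloc")

        buf = nil
        runtime.GC()
        debug.FreeOSMemory() // force idle spans to be returned to the OS
        report("after release")

        buf = make([]byte, 64<<20) // reuse: spans leave HeapIdle (and HeapReleased)
        for i := range buf {
            buf[i] = 1
        }
        report("after realloc")
    }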

@sylr
Author

sylr commented May 28, 2019

@randall77 what about making HeapReleased a counter and adding another counter, HeapReacquired, so that we can accurately trace what comes out of HeapIdle and where it goes?

@randall77
Contributor

We don't want to redefine what existing fields mean. That would break existing users who have figured out what they mean.

Is there particular information the current setup prevents you from getting?

@sylr
Author

sylr commented May 28, 2019

I'm currently unable to tell what becomes of the idle memory.

Unless I'm missing something, we can't tell apart the amount of memory going back to the heap from the amount being returned to the system.

I understand that would be a breaking change; however, I think the name is misleading and the value can't really be used as it is.

If you don't want to change HeapReleased, could we then add two new counters, TotalHeapReleased & TotalHeapReacquired?
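A rough workaround in the meantime (a sketch only; it polls runtime.ReadMemStats, so anything that happens between samples is lost and the first sample is counted wholesale) is to accumulate the positive and negative deltas of HeapReleased separately:

    package main

    import (
        "fmt"
        "runtime"
        "time"
    )

    func main() {
        var prev, approxReleased, approxReacquired uint64

        for range time.Tick(time.Second) {
            var m runtime.MemStats
            runtime.ReadMemStats(&m)

            if m.HeapReleased >= prev {
                approxReleased += m.HeapReleased - prev // grew: more idle memory handed to the OS
            } else {
                approxReacquired += prev - m.HeapReleased // shrank: released spans were reused
            }
            prev = m.HeapReleased

            fmt.Printf("released ~%d bytes, reacquired ~%d bytes\n",
                approxReleased, approxReacquired)
        }
    }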

@randall77
Contributor

I'm currently unable to tell what becomes of the idle memory.

Unless I'm missing something, we can't tell apart the amount of memory going back to the heap from the amount being returned to the system.

Just so I understand, you're wondering what fraction of bytes that become HeapIdle go through the state HeapReleased before transitioning back to a used state (HeapInuse or StackInuse) vs. go from HeapIdle to a used state directly?

Or maybe it's something else?

And I guess, as a follow-on: why do you need this data? What decisions are you making based on the answers you get?

If you don't want to change HeapReleased, could we then add two new counters, TotalHeapReleased & TotalHeapReacquired

Yes, we could do that if it turns out to be necessary.
If you want to know the data I speculated about above, you'd probably also need TotalHeapIdle.

@sylr
Author

sylr commented May 28, 2019

While transitioning some apps from Go 1.11 to Go 1.12 I had trouble with Docker containers being OOM-killed by Kubernetes (not by the kernel) because, according to cAdvisor, their memory usage was only growing, at some point reaching the memory limits set in Kubernetes.

It turns out this was due to Go 1.12 using MADV_FREE to release memory to the system, instead of MADV_DONTNEED as in 1.11. The kernel only reclaims memory freed with MADV_FREE under memory pressure, which never occurred on my node, so cAdvisor kept reporting the container's memory usage as everything that had been allocated, ultimately reaching the limit.

To investigate this I took a look at the metrics exported by Prometheus's client_golang library, which exposes the values of runtime.MemStats, but I did not succeed in proving that Go was really releasing memory back to the system, as the variation of HeapReleased did not make sense to me.

So I'm proposing counters tracking the total amount of memory that transitions:

  • from HeapIdle to HeapInuse/StackInuse -> TotalHeapReacquired
  • from HeapIdle to released to the system -> TotalHeapReleased

With such counters I would have been able to quickly determine that my apps were indeed releasing memory to the system and that the problem lay elsewhere.

If TotalHeapIdle is the amount of memory that has transitioned from HeapInuse to HeapIdle, I guess that would also be a great addition.

@randall77
Contributor

the variation of HeapReleased did not make sense to me.

Could you describe this in more detail?

Why wasn't it obvious from just a large value of HeapReleased? That is, if you're getting killed at 1GB, and HeapInuse is 100MB and both HeapIdle and HeapReleased are 900MB, then it would be pretty clear that the issue was with the release mechanism not working. (If on the other hand HeapIdle were 900MB and HeapReleased was 0, then the issue would be that the scavenger isn't releasing pages promptly.)

@sylr
Author

sylr commented May 29, 2019

Here is a graph of a pod that consumes a lot of memory, running go1.12.4 with GODEBUG=madvdontneed=1 (metrics are not stacked).

[screenshot: graph of the pod's heap metrics (HeapSys, HeapInuse, HeapIdle, HeapReleased) over time]

At one point most of the in-use memory became idle. Later you can see that Released grew almost to Idle, but Idle didn't move. How can Released increase while Idle stays the same? If the memory was really released to the system, both Idle and HeapSys should decrease, and if it was reacquired then HeapInuse should grow and HeapIdle should decrease.

@sylr
Author

sylr commented May 29, 2019

Here is another graph showing the same pod running go1.12.5 with GODEBUG=madvdontneed=1 (metrics are not stacked).

I've added a purple dashed line which shows how the system sees the memory consumption of the pod (metric coming from cAdvisor).

[screenshot: same graph with the pod's memory usage as reported by cAdvisor overlaid as a purple dashed line]

The cAdvisor metric proves that the memory is actually being returned to the system, but I don't understand why HeapSys and HeapIdle are not decreasing.

@randall77
Contributor

That's the way it is supposed to work. When you free memory, it becomes HeapIdle. It will stay in HeapIdle indefinitely unless your application does more allocation that needs that memory.
HeapReleased is the portion of HeapIdle that was given back to the OS. In this case, all of HeapIdle was given back to the OS.

Another way to look at it is that HeapIdle is a measure of virtual memory space. That virtual memory is only backed by physical pages for HeapIdle-HeapReleased.
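Put differently, a sketch of the accounting just described (an approximation: with MADV_FREE the kernel may not have reclaimed released pages yet, so the OS-reported usage can stay higher):

    package main

    import (
        "fmt"
        "runtime"
    )

    func main() {
        var m runtime.MemStats
        runtime.ReadMemStats(&m)

        virtualHeap := m.HeapSys                    // address space mapped for the heap
        physicalHeap := m.HeapSys - m.HeapReleased  // rough physical footprint of the heap
        retainedIdle := m.HeapIdle - m.HeapReleased // idle memory still backed by pages, kept for reuse

        fmt.Println("virtual heap:        ", virtualHeap)
        fmt.Println("physical heap (est.):", physicalHeap)
        fmt.Println("retained idle:       ", retainedIdle)
    }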

@aclements
Member

One unfortunate property of MemStats is that some fields are exclusive with each other, while others are subsets. I tried to spell this out in prose when I wrote the current documentation, but maybe we should document it more explicitly? It would look something like

Sys
    HeapSys
        HeapInUse
        HeapIdle
            HeapReleased
            ...
    StackSys
    MSpanSys
    ...

I'm not quite sure how to clearly document which fields cover their parent and which don't. For example, HeapSys = HeapInUse + HeapIdle, but HeapIdle = HeapReleased + X, and there's no field for X. Maybe we should add a field for X? (I wish it had been HeapSys = HeapInUse + HeapIdle + HeapReleased, but we can't change that now.)
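A quick sketch that checks the identities stated above against a live snapshot (ReadMemStats stops the world, so the numbers are mutually consistent); the unnamed X is HeapIdle - HeapReleased:

    package main

    import (
        "fmt"
        "runtime"
    )

    func main() {
        var m runtime.MemStats
        runtime.ReadMemStats(&m)

        fmt.Println("HeapSys == HeapInuse + HeapIdle:", m.HeapSys == m.HeapInuse+m.HeapIdle)
        fmt.Println("X = HeapIdle - HeapReleased:    ", m.HeapIdle-m.HeapReleased)
    }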

@sylr
Author

sylr commented May 29, 2019

When you free memory, it becomes HeapIdle.

You mean when a span has no more objects in it, it's considered HeapIdle?

That virtual memory is only backed by physical pages for HeapIdle-HeapReleased.

Does that mean that HeapIdle memory is still allocated to the process, whether it is backed by physical pages or not?

@sylr
Author

sylr commented May 29, 2019

@aclements I made this diagram to try to understand how memory changes state; would you say it's accurate?

[diagram: proposed go_memstats memory state-transition diagram]

@randall77
Contributor

I don't think that is quite right. The viewcore tool provides a nested space usage report which is byte-accurate, which might be instructive:

 all                  2711552 100.00% 
   text                331776  12.24% 
   readonly            368640  13.60% 
   data                135168   4.98% 
   bss                 790528  29.15% (grab bag, includes OS thread stacks, ...)
   heap               1048576  38.67% 
     in use spans      286720  10.57% 
       alloc           128048   4.72% 
         live          127912   4.72% 
         garbage          136   0.01% 
       free            156656   5.78% 
       round             2016   0.07% 
     manual spans      294912  10.88% (Go stacks)
       alloc           245760   9.06% 
       free             49152   1.81% 
     free spans        466944  17.22% 
       retained        466944  17.22% (kept for reuse by Go)
       released             0   0.00% (given back to the OS)
   ptr bitmap           32768   1.21% 
   span table            4096   0.15% 

HeapSys == all->heap
HeapAlloc == all->heap->in use spans->alloc
HeapIdle = all->heap->free
HeapReleased = all->heap->free->released
StackSys = all->heap->manual spans, + a portion of all->bss (for OS thread stacks)
StackInuse == all->heap->manual spans

TotalAlloc is all the bytes going from heap->in use spans->free to heap->in use spans->alloc.
In particular, TotalAlloc can increase forever without ever getting any additional memory from the system.
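A small sketch illustrating that last point: allocating and discarding in a loop drives TotalAlloc up indefinitely while HeapSys stays roughly flat once the heap has warmed up.

    package main

    import (
        "fmt"
        "runtime"
    )

    var sink []byte // global so the compiler can't optimize the allocations away

    func main() {
        for i := 0; i < 5; i++ {
            for j := 0; j < 1000; j++ {
                sink = make([]byte, 1<<20) // 1 MiB that immediately becomes garbage
            }
            var m runtime.MemStats
            runtime.ReadMemStats(&m)
            fmt.Printf("TotalAlloc=%d MiB  HeapSys=%d MiB\n", m.TotalAlloc>>20, m.HeapSys>>20)
        }
    }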

@bwplotka

bwplotka commented Jun 4, 2019

Sorry for hijacking this nice thread, and thanks for the responses so far! Super useful.

A question about heap alloc/inuse: am I correct that HeapInuse = HeapAlloc + X, where X is

        memory that has been dedicated to particular size classes, but is
        not currently being used. This is an upper bound on fragmentation,
        but in general this memory can be reused efficiently.

In my test case I could see 300MB in the heap_inuse metric and ~80MB in heap_alloc, so this "preallocation" (is that what this X is?) is quite large (:

@randall77
Contributor

It's more batch allocation than preallocation, but yes. You can also think about it as recently freed objects which live on the same page as a live object. Just from the stats I don't think there's an easy way to distinguish the two. It all depends on when the GC last finished.
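A sketch of how to watch that gap directly; per the doc text quoted above, HeapInuse - HeapAlloc is an upper bound on this kind of fragmentation/slack:

    package main

    import (
        "fmt"
        "runtime"
    )

    func main() {
        var m runtime.MemStats
        runtime.ReadMemStats(&m)

        slack := m.HeapInuse - m.HeapAlloc // space in in-use spans not holding objects right now
        fmt.Printf("HeapAlloc=%d HeapInuse=%d slack=%d (%.1f%% of HeapInuse)\n",
            m.HeapAlloc, m.HeapInuse, slack, 100*float64(slack)/float64(m.HeapInuse))
    }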

@sylr
Author

sylr commented Jun 27, 2019

I am wondering, with Go 2 coming up, could a revamp of the MemStats structure be considered? If not, could we consider adding a new one?

@andybons
Member

@sylr please file a proposal for what you'd like to see in the revamp.

We have decided that our experiment to allow questions on the issue tracker has not had the outcome we desired, so I am closing this issue. I'm sorry that we can't answer your question here.

There are many other ways to get help if you're still looking for answers.

Thanks

@golang golang locked and limited conversation to collaborators Aug 11, 2020