Skip to content

runtime: GC does not happen for unknown reasons #63622

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
zhangbo1882 opened this issue Oct 19, 2023 · 11 comments
Closed

runtime: GC does not happen for unknown reasons #63622

zhangbo1882 opened this issue Oct 19, 2023 · 11 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. FrozenDueToAge WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.

Comments

@zhangbo1882
Copy link

zhangbo1882 commented Oct 19, 2023

What version of Go are you using (go version)?

$ go version
go1.19.0

I run a pod in kubernetes. When the pod started, everything was ok and GC can happen as expected. But after sometime, the memory usage was growing larger and larger and never reduced. I use gops tool to check the memory usage and found that the next-gc is become a int64 max number which should NOT be a reasonable number.

What did you expect to see?

It can do GC automatically.

What did you see instead?

GC never happens because next-gc: when heap-alloc >= 8192.00PB (9223372035383695295 bytes)

host:/# gops memstats 23633
alloc: 1.98GB (2125624808 bytes)
total-alloc: 1.47TB (1618010461264 bytes)
sys: 25.66GB (27556167208 bytes)
lookups: 0
mallocs: 14464057887
frees: 14443706511
heap-alloc: 1.98GB (2125624808 bytes)
heap-sys: 24.33GB (26122715136 bytes)
heap-idle: 22.31GB (23960510464 bytes)
heap-in-use: 2.01GB (2162204672 bytes)
heap-released: 1.21MB (1269760 bytes)
heap-objects: 20351376
stack-in-use: 7.44MB (7798784 bytes)
stack-sys: 7.44MB (7798784 bytes)
stack-mspan-inuse: 29.18MB (30602592 bytes)
stack-mspan-sys: 349.47MB (366445440 bytes)
stack-mcache-inuse: 93.75KB (96000 bytes)
stack-mcache-sys: 106.64KB (109200 bytes)
other-sys: 54.55MB (57198392 bytes)
gc-sys: 952.58MB (998849856 bytes)
next-gc: when heap-alloc >= 8192.00PB (9223372035383695295 bytes)
last-gc: 2023-10-17 03:15:28.72404389 +0000 UTC
gc-pause-total: 2.470350864s
gc-pause: 366092
gc-pause-end: 1697512528724043890
num-gc: 15002
num-forced-gc: 3
gc-cpu-fraction: 6.057903035448553e-05
enable-gc: true
debug-gc: false

@zhangbo1882 zhangbo1882 changed the title affected/package: runtime GC does not happen for unknown reasons Oct 19, 2023
@zhangbo1882 zhangbo1882 changed the title GC does not happen for unknown reasons runtime/: GC does not happen for unknown reasons Oct 19, 2023
@ianlancetaylor
Copy link
Member

Why do you think that GC does not happen? In the output I see "num-gc: 15002".

@seankhliao seankhliao changed the title runtime/: GC does not happen for unknown reasons runtime: GC does not happen for unknown reasons Oct 19, 2023
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Oct 19, 2023
@seankhliao seankhliao added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Oct 19, 2023
@zhangbo1882
Copy link
Author

@ianlancetaylor After 15002 GC, it never happen again. You can see the last gc timestamp:
last-gc: 2023-10-17 03:15:28.72404389 +0000 UTC

@qiulaidongfeng
Copy link
Member

Each major Go release is supported until there are two newer major releases.
I noticed that you mentioned using go1.19.0, will this be reproduced in the latest version?

@zhangbo1882
Copy link
Author

@qiulaidongfeng , the issue happened in our production environment. I can not reproduce it in local environment. Now we do not have the plan to upgrade the golang version. It seems that the next-gc value is calucated by the following function.

func (c *gcControllerState) heapGoalInternal() (goal, minTrigger uint64)

@zhangbo1882
Copy link
Author

zhangbo1882 commented Oct 21, 2023

Screenshot 2023-10-21 at 18 32 53

From the metrics go_memstats_next_gc_bytes – shows the target heap size of the next GC cycle. We can see that the value spike from 300Mi to int64 max value

@zhangbo1882
Copy link
Author

zhangbo1882 commented Oct 22, 2023

Even if I run gc manually, the allocated heap memory was reduced but it still was not returned to OS. It seems that when issue happens, runtime scavenge still can not work as well.

@WangLeonard
Copy link
Contributor

It is possible that debug.SetGCPercent(-1) is called somewhere.

@zhangbo1882
Copy link
Author

@WangLeonard , in our code, we do not call SetGCPercent(-1)

@zhangbo1882
Copy link
Author

Can anyone have any comments? Thanks a lot.

@randall77
Copy link
Contributor

It's going to be hard to fix this without having a way to reproduce it.
If you can, it would help to pepper (*gcControllerState) heapGoalInternal with checks of all the values it is using to compute the bad value. For instance,

goal = c.gcPercentHeapGoal.Load()
if goal > 1<<60 {
    print("c.gcPercentHeapGoal is too large", goal)
    throw("exiting")
}

Similarly for all the other pieces of info it accesses. That might give us a clue about where a bad value is coming from.

@zhangbo1882
Copy link
Author

I think I find the root cause for this issue in our environment. Actually it has nothing with golang GC code.
We use a tool gops (https://github.com/google/gops) for debug purpose. And we open a tcp port for the gops agent.
There is a potential security flaw. Anyone can send tcp packet to the port to trigger the gops command. If by change, the first byte of the data is 0x10 (defined by gops), it will trigger SetGCPercent() unexpectedly.
Thanks everyone.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. FrozenDueToAge WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
Projects
None yet
Development

No branches or pull requests

7 participants