runtime: RSS seems to have increased in Go 1.24 while the runtime accounting has not #72991
Note that @felixge might of course have been observing some other phenomena, but this seemed worth reporting given it seems to be reproducible. Also, note that the RSS difference does not seem to be limited to channels:
The naming conventions there are similar to those explained above. I have more data, but maybe that's enough of a thread to tug on. Finally, this does seem to be reproducible (e.g., see the …).
One thing to be careful of is that faster allocation can naturally lead to more memory use, if you're explicitly trying to allocate as fast as possible in a benchmark. The GC did not get faster in these changes, so more memory gets allocated black, and you end up with more floating garbage. There's no increase in memory and it's a pure resource win if memory allocation is not your bottleneck, which is the case in most programs. In general, I'm skeptical of memory increases that also result in a larger heap size (indicating more floating garbage). I'll try to reproduce and check these numbers with …
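To make the "allocating as fast as possible" caveat concrete, here is a minimal sketch (not from heapbench or the original report) of a purely allocation-bound loop; with a faster allocator, such a loop produces more floating garbage per GC cycle and therefore a larger observed heap, even though nothing leaks:

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// sink is package-level so the allocations below escape to the heap
// instead of being optimized onto the stack.
var sink []byte

func main() {
	// Allocation-bound loop: all it does is produce short-lived garbage.
	// Objects allocated during a GC cycle are allocated black (retained
	// until the next cycle), so a faster allocator means more garbage
	// piles up per cycle and the observed heap can be larger.
	start := time.Now()
	for time.Since(start) < 2*time.Second {
		sink = make([]byte, 4096)
	}

	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	fmt.Printf("HeapAlloc=%d MiB HeapSys=%d MiB NumGC=%d\n",
		m.HeapAlloc>>20, m.HeapSys>>20, m.NumGC)
}
```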
Hm, I can't seem to reproduce. Can you run with …?
One other bit of data maybe worth posting: the heap goals reported by gctrace are a bit different in the "bad" vs. "good" cases, but at least at first blush they seem reasonably close.

Commit 60ee99c

For example, if we look at the gctrace from running against commit 60ee99c, a "good" commit from 2024-10-21 (20241021-155537-60ee99cf5d.out):
and the heapbench self-reported stats from the same time interval:
Commit 8730fcf

We can compare that to commit 8730fcf, the first "bad" commit indicated by the bisect and also from 2024-10-21 (20241021-155625-8730fcf885.out):
and the corresponding heapbench self-reported stats:
Hi @mknyszek, I'm curious what distro and version?
I have that as well, including data recorded during the bisect. Just posted an example above.
Thanks @thepudds. I think it's safe to say you reproduced the issue. I think I found the culprit behind @felixge's original issue and the issue you're seeing here. Try out https://go.dev/cl/659956.
Change https://go.dev/cl/659956 mentions this issue:
That CL hopefully addresses it! 🎉 Re-running nodechannel-large just now for go1.23, go1.24, and your fix:
The small remaining difference in the mean might just be noise. I'll re-run for longer.
FWIW, the heapbench utility isn't trying to allocate as fast as possible. It lightly tries to look like a real application, with "jobs" coming in that include some CPU work (calculating hashes or decoding varints or similar) and some allocation work, with a few dials to allow different scenarios, like a different baseline live heap, a slowly increasing live heap, different rates of garbage generation, etc. It has some variability built in (including because one of the early targets was some runtime scheduler-related glitches), but the variability averages out such that you can still look at averages across ~minutes and get reasonably consistent results. It's not super sophisticated, but it tries to keep putting the requested load on the allocator and GC while doing its work, or else wait for more work. (Even if it's averaging ~2 cores of CPU usage as reported every 1 or 2 seconds, if you were to hypothetically zoom in on a small enough time scale, there are moments where it isn't doing anything, assuming you haven't set the dials to overload the machine it's running on). In any event, that's probably more than you wanted to know! Regarding possibly having more floating garbage, if you look at the GC traces from the tests, do they seem to hint at close to the same amount of floating garbage between the "good" and "bad" cases?
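For readers who haven't seen heapbench, here is a heavily simplified sketch of the kind of job loop described above. It is not the real heapbench code, and the knob names and values are made up:

```go
package main

import (
	"crypto/sha256"
	"time"
)

// Package-level sinks so the allocations below actually hit the heap.
var (
	garbageSink []byte
	liveHeap    [][]byte // a slowly growing "live" portion of the heap
)

// job mixes a bit of CPU work (hashing) with some allocation work,
// most of which immediately becomes garbage.
func job(garbageAllocs, retainedAllocs int) {
	buf := make([]byte, 1024)
	for i := 0; i < 100; i++ {
		sum := sha256.Sum256(buf) // CPU work
		buf[0] = sum[0]
	}
	for i := 0; i < garbageAllocs; i++ {
		garbageSink = make([]byte, 4096) // short-lived garbage
	}
	for i := 0; i < retainedAllocs; i++ {
		liveHeap = append(liveHeap, make([]byte, 4096)) // retained
	}
}

func main() {
	for {
		job(100, 1)
		time.Sleep(10 * time.Millisecond) // idle gap between jobs
	}
}
```

The sleep between jobs is what keeps the load roughly steady rather than allocation-bound, which is why the averages over minutes stay consistent.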
@thepudds Thanks! That's all good to know. I intended to get this across in an earlier message, but yes, your program definitely reproduces the issue and there aren't any subtleties around floating garbage happening. The heap goal was the same before and after.
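For anyone who wants to cross-check the heap goal outside of GODEBUG=gctrace=1, here is a minimal sketch using the runtime/metrics package (not part of heapbench; it reads the heap size the GC is currently targeting, the same "goal" gctrace prints):

```go
package main

import (
	"fmt"
	"runtime/metrics"
)

func main() {
	// /gc/heap/goal:bytes is the heap size the GC is currently aiming
	// for at the end of the cycle (the "goal" that gctrace also prints).
	samples := []metrics.Sample{{Name: "/gc/heap/goal:bytes"}}
	metrics.Read(samples)
	fmt.Printf("heap goal: %d MiB\n", samples[0].Value.Uint64()>>20)
}
```

Sampling this periodically in both a "good" and a "bad" run is another way to confirm there isn't extra floating garbage in play.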
👋 We tested https://go.dev/cl/659956 with the workload that had increased memory usage in 1.24, and we can confirm the CL addresses the issue! 🎉 Thank you 🙏!
Wow, fantastic work everybody! Thank you @thepudds for figuring out how to reproduce this issue. Thanks @mknyszek for finding the issue and fixing it. And thanks @Gandem for doing all the work on our end to analyze the situation and try out the various mitigation ideas as well as confirming that Michael's fix works for our prod issue. 🙇
Go version
go version go1.24.1 linux/amd64
Output of go env in your module/workspace:

What did you do?
I saw @felixge report in Gopher Slack:
Interestingly, the Go-reported accounting did not seem to change.
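As a rough, Linux-only illustration of the comparison being made here (OS-reported RSS next to the runtime's own accounting), and not the tooling actually used for the report above:

```go
package main

import (
	"fmt"
	"os"
	"runtime/metrics"
	"strings"
)

func main() {
	// What the Go runtime believes it has mapped (heap, stacks,
	// metadata, etc.), via the runtime/metrics package.
	s := []metrics.Sample{{Name: "/memory/classes/total:bytes"}}
	metrics.Read(s)
	fmt.Printf("runtime-mapped memory: %d MiB\n", s[0].Value.Uint64()>>20)

	// What the OS believes is resident, via /proc/self/status (Linux).
	status, err := os.ReadFile("/proc/self/status")
	if err != nil {
		return
	}
	for _, line := range strings.Split(string(status), "\n") {
		if strings.HasPrefix(line, "VmRSS:") {
			fmt.Println("OS view:", line) // e.g. "VmRSS:  123456 kB"
		}
	}
}
```

The interesting signal in this issue is that the OS-side number grew between Go 1.23 and Go 1.24 while the Go-side accounting stayed roughly the same.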
I have a simple GC benchmarking utility that has been useful in the past to illustrate some Go runtime problems (or to illustrate something is just expected behavior and rule out bad behavior by the runtime).
I did a quick-and-dirty edit to it to have it run through various permutations of "interesting" memory properties that might be related to the issue @felixge reported, and then tested it via a few nested bash loops that ran through different flag combinations.
Sample run:
Note this is on an Ubuntu 22.04.5 LTS VM. (Given this seemingly hasn't been reported yet, I wonder if it might depend on the distro or an OS config).
What did you see happen?
Here, we can see nodechannel-large has a much higher RSS in go1.24 compared to go1.23, whereas the other experiments above have basically the same RSS between go1.24 and go1.23. The nodechannel experiments use a chan node, where the node type contains pointers, while the other experiments use a chan int64, where int64 of course does not have pointers.
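The actual node type definition isn't reproduced here, but as an illustrative sketch (the field names below are hypothetical, not the real heapbench type), a pointer-bearing channel element might look like:

```go
package main

// node is a hypothetical example of a channel element type that
// contains pointers, so the GC must scan the channel's buffer.
type node struct {
	next    *node  // pointer field
	payload []byte // a slice header contains a pointer too
	id      int64
}

func main() {
	// A large buffered chan node: its backing buffer is one big
	// allocation that stays live for the lifetime of the channel,
	// and it has pointer bits because node contains pointers.
	ch := make(chan node, 100_000)
	ch <- node{id: 1}
	<-ch
}
```

By contrast, the buffer of a chan int64 contains no pointers, so the GC never needs to scan it.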
I ran a git bisect, which pointed to https://go.dev/cl/614257. That's part of the big mallocgc refactor and at least in the neighborhood of plausible:
The measured RSS results from the bisect:
What did you expect to see?
Ideally, similar RSS values.
CC @mknyszek, @prattmic