runtime: general slowdown relative from 1.22 to tip (1.23 candidate) #67822
Comments
Moving this to okay-after-rc1 so we can gather more data from RC1 while we investigate these regressions. (I also don't think the regressions, which are primarily in microbenchmarks, are severe enough to block an RC. But we should still get to the bottom of this and either fix it or make an explicit decision before the release.)
Hi all, I spent some time looking at this issue, specifically at one of the benchmarks that has regressed. TL;DR: some portion of this slowdown appears to be due to alignment/layout artifacts, since a good chunk of it goes away when I randomize the order of functions in the text section. Here's what I did as an experiment:
This is on a 2-socket 96-core cloudtop VM. Here's the benchstat output for the base comparison (not showing "B/op" or "allocs/op" since there is no difference there):
Here's what I get with randomized layout:
The full script I am using is below. When I get a chance later today I'll try a couple of the other benchmarks.
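Separately from that script, the reasoning behind the experiment can be sketched in a few lines of Go: measure the old-vs-new delta under the default layout, measure it again under a randomized layout, and see how much of the delta survives. This is only an illustration. The timings are made up (they are not the numbers from this issue), and `median` and `pctChange` are hypothetical helper names; benchstat does the real comparison with proper statistics.

```go
package main

import (
	"fmt"
	"sort"
)

// median returns the median of xs without modifying the caller's slice.
// It returns 0 for an empty input.
func median(xs []float64) float64 {
	s := append([]float64(nil), xs...)
	sort.Float64s(s)
	n := len(s)
	if n == 0 {
		return 0
	}
	if n%2 == 1 {
		return s[n/2]
	}
	return (s[n/2-1] + s[n/2]) / 2
}

// pctChange reports how much slower (positive) or faster (negative) the
// tip samples are than the base samples, as a percentage of the base median.
func pctChange(base, tip []float64) float64 {
	b := median(base)
	return (median(tip) - b) * 100 / b
}

func main() {
	// Hypothetical ns/op samples, NOT measurements from this issue.
	baseDefault := []float64{1000, 1010, 990}
	tipDefault := []float64{1100, 1090, 1110}
	baseRandom := []float64{1020, 1000, 1010}
	tipRandom := []float64{1050, 1060, 1040}

	fmt.Printf("default layout:    %+.1f%%\n", pctChange(baseDefault, tipDefault))
	fmt.Printf("randomized layout: %+.1f%%\n", pctChange(baseRandom, tipRandom))
}
```

If the delta shrinks noticeably under randomized layout, that part of the slowdown is plausibly a layout artifact and not a real code regression; whatever remains still needs an explanation.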
I repeated my experiment with Without randomization:
With randomization:
so still a 5% slowdown for |
For |
I think I understand the root cause of the |
For Pi/foo=shopspring/prec=100-16, de3a3c9 jumps out as a possible culprit, accounting for around 2pp of the regression. The only thing that CL may have made a little slower is zeroing large pointerful objects; perhaps the benchmark allocates large pointerful objects frequently? I can take a closer look.
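One way to probe that hypothesis in isolation is a small microbenchmark that allocates a large pointerful slice and a pointer-free slice of the same length, run under both toolchains. The following is a minimal sketch and not the investigation actually performed here: the size `n` and the function names are assumptions chosen for illustration, and it uses `testing.Benchmark` so that it runs as a plain program.

```go
package main

import (
	"fmt"
	"testing"
)

// n is a hypothetical element count, chosen only so the allocation is "large".
const n = 1 << 16

// Package-level sinks keep the compiler from optimizing the allocations away.
var (
	sinkPtr []*int
	sinkInt []int
)

// allocPointerful allocates a large slice whose elements are pointers,
// so the allocator must hand back memory the GC will scan.
func allocPointerful(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		sinkPtr = make([]*int, n)
	}
}

// allocPointerFree allocates a slice of the same length with no pointers,
// as a baseline for comparison.
func allocPointerFree(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		sinkInt = make([]int, n)
	}
}

func main() {
	p := testing.Benchmark(allocPointerful)
	q := testing.Benchmark(allocPointerFree)
	fmt.Printf("pointerful:   %d ns/op\n", p.NsPerOp())
	fmt.Printf("pointer-free: %d ns/op\n", q.NsPerOp())
}
```

Building this with 1.22 and with tip and comparing the pointerful numbers would show whether large pointerful allocation itself got slower; it says nothing about whether the Pi benchmark actually allocates this way.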
I spent some time today investigating Pi/foo=shopspring/prec=100-16 at de3a3c9 but nothing is jumping out at me. After eliminating GC from the equation ( |
Discussed in the weekly release check-in meeting: the consensus is that we are in OK shape, so we are just leaving this open for now.
Closing, before the first build for 1.23 final, based on the discussion above.
The performance dashboard shows a general, across-the-board regression from the last release (1.22) to current tip. The worst regressions are on the 88-core machine. For example:

Unfortunately, the 88-core history doesn't go back far enough to identify the original source. There are a few clear regressions on the 16-core machine, where we can go back further: for example, Pi/foo=shopspring/prec=100-16 and HistogramAllocation-16. It's hard to pin these on an obvious CL, but they may give us a narrower window to reproduce on the 88-core machine.
cc @mknyszek @golang/runtime