New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmd/compile: autogenerated wrapper is not inlined #25338
Comments
In genwrapper (cmd/compile/internal/gc/subr.go) I see:
Should we at least change false to flagiexport, since benchmarks are done? |
@TocarIP I see you are systematically checking benchmarks and running down significant regressions. Just wanted to say a hearty thank you for doing this! |
Additional regressions caused by that CL:
There are more regressions, but these are the most significant ones (highest delta). |
Probably we should automate this through a nightly job so that offfending CLs are caught quicker. |
Change https://golang.org/cl/135697 mentions this issue: |
Currently all inlining of autogenerated wrappers is disabled, because it causes build failures, when indexed export format is enabled. Turns out we can reenable it for common case of (*T).M wrappers. This fixes most performance degradation of 1.11 vs 1.10. encoding/binary: name old time/op new time/op delta ReadSlice1000Int32s-6 14.8µs ± 2% 11.5µs ± 2% -22.01% (p=0.000 n=10+10) WriteSlice1000Int32s-6 14.8µs ± 2% 11.7µs ± 2% -20.95% (p=0.000 n=10+10) bufio: name old time/op new time/op delta WriterFlush-6 32.4ns ± 1% 28.8ns ± 0% -11.17% (p=0.000 n=9+10) sort: SearchWrappers-6 231ns ± 1% 231ns ± 0% ~ (p=0.129 n=9+10) SortString1K-6 365µs ± 1% 298µs ± 1% -18.43% (p=0.000 n=9+10) SortString1K_Slice-6 274µs ± 2% 276µs ± 1% ~ (p=0.105 n=10+10) StableString1K-6 490µs ± 1% 373µs ± 1% -23.73% (p=0.000 n=10+10) SortInt1K-6 210µs ± 1% 142µs ± 1% -32.69% (p=0.000 n=10+10) StableInt1K-6 243µs ± 0% 151µs ± 1% -37.75% (p=0.000 n=10+10) StableInt1K_Slice-6 130µs ± 1% 130µs ± 0% ~ (p=0.237 n=10+8) SortInt64K-6 19.9ms ± 1% 13.5ms ± 1% -32.32% (p=0.000 n=10+10) SortInt64K_Slice-6 11.5ms ± 1% 11.5ms ± 1% ~ (p=0.912 n=10+10) StableInt64K-6 21.5ms ± 0% 13.5ms ± 1% -37.30% (p=0.000 n=9+10) Sort1e2-6 108µs ± 2% 83µs ± 3% -23.26% (p=0.000 n=10+10) Stable1e2-6 218µs ± 0% 161µs ± 1% -25.99% (p=0.000 n=8+9) Sort1e4-6 22.6ms ± 1% 16.8ms ± 0% -25.45% (p=0.000 n=10+7) Stable1e4-6 67.6ms ± 1% 49.7ms ± 0% -26.48% (p=0.000 n=10+10) Sort1e6-6 3.44s ± 0% 2.55s ± 1% -26.05% (p=0.000 n=8+9) Stable1e6-6 13.7s ± 0% 9.9s ± 1% -27.68% (p=0.000 n=8+10) Fixes #27621 Updates #25338 Change-Id: I6fe633202f63fa829a6ab849c44d7e45f8835dff Reviewed-on: https://go-review.googlesource.com/c/135697 Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
I believe this is fixed now. The CL above, and possibly the slightly more aggressive mid-stack inlining makes the wrapper inline successfully. |
What version of Go are you using (
go version
)?Master:
Running encoding/binary benchmarks I see significant slowdown (compared to 1.10):
This is caused by compiler not inlining autogenerated binary.(*bigEndian).PutUint32, which does nothing but shuffles arguments and calls binary.bigEndian.PutUint32.
Bisect points to ca2f85f, which is IMHO strange, because it doesn't enable new export format by default.
@mdempsky any ideas?
The text was updated successfully, but these errors were encountered: