-
Notifications
You must be signed in to change notification settings - Fork 18k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmd/compile: benchmark performance regression in 1.9beta1 #20711
Comments
Any chance of a bisection? |
Yeah, I'll do one in the next few days. |
CL https://golang.org/cl/46134 mentions this issue. |
Looks like there is an additional bounds check at tip. Here's a simple repro:
For 1.8.3, it generates an inner loop of
For tip, it generates
@dr2chase , I think this happens because of your change to do the additional len>0 comparison wrapping the loop. Instead of the loop exit check dominating the bounds check (as it does in 1.8), there are two predecessors of the bounds check block, each which does a comparison (one is the len>0 block, the other is the loop exit check). Only by combining the information from the two predecessors can we deduce that the bounds check is unnecessary. Merging such information would be hard. Maybe there's another way to achieve the same result? And what would be the downside of backing out your change? |
Would my doing a bisect still be helpful? Meant to do it earlier but it has been an extremely busy week for me. |
Backing out the change fully creates a problem for GOEXPERIMENT=preemptibleloops; one possibility is to instead make the loop change conditional on whether the experiment is turned on (but we expect to turn it on for 1.10). Is including both copies of the loop-translating code and switching based on goexperiment considered adequately low-risk? |
CL https://golang.org/cl/46331 mentions this issue. |
@dr2chase : I have no problem including two versions of the loop-translating code and switching on goexperiment. It all depends on how big the CL is and what confidence we have in it. |
@randall77 Exactly that. And given how tricky it was to update the SSA graph to include the preemption check, I wasn't willing to put it anywhere other than the back edge, though perhaps for 1.10 we should get braver to reduce the cost of actually turning preempable loops on all the time. |
If you haven't started on it, I just did, should have a CL up as soon as it passes tests locally. |
Thanks very much @dr2chase and @randall77. |
CL https://golang.org/cl/46410 mentions this issue. |
Currently we only use 1 and 4 as a scale for indexed 4-byte load. In code generated in #20711 we can use indexed load with scale=8, to improve performance: name old time/op new time/op delta GM-6 108µs ± 0% 95µs ± 0% -12.06% (p=0.000 n=10+10) So add new ops and combine loadidx1(shift 3..).. into loadidx8, same for stores. Change-Id: I5ed1c250ac40960e20606580cf9de221e75b72f1 Reviewed-on: https://go-review.googlesource.com/46134 Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
Sorry for the poor title -- I haven't done enough digging to know how to be more specific about the problem.
I'm comparing the following go versions:
go version go1.8 linux/amd64
go version go1.9beta1 linux/amd64
The following benchmark exhibits a significant slowdown:
This is extracted from a big suite of dozens of similar benchmarks in an internal codebase. Most of these benchmarks stayed the same or got faster in Go 1.9beta1, but a few got slower. I know that low-level codegen optimizations often make many benchmarks speed up while causing a few to slow down, but I thought it would be good to bring an example slowdown here in case it's not one of those. (Overall we're very happy with performance changes in 1.8->1.9beta1.)
The text was updated successfully, but these errors were encountered: