cmd/compile: unrolled loop results in extra loads/stores (suboptimal schedule?) #33251
Comments
Yes, it looks like the scheduler is too aggressive about moving loads earlier. With too many early loads, it has to spill them immediately and then restore them.
@jeremyfaller can you outline the exact steps you took to generate the ssa.html file? Thanks :)
It's an environment variable (GOSSAFUNC). You give it the name of the function you want the SSA for: https://dave.cheney.net/2020/06/19/how-to-dump-the-gossafunc-graph-for-a-method So for this instance, I think it was

When I filed this, I did some looking into it, and it looked relatively simple to me. This is from memory, and likely to be wrong, but I think there's a heap in the schedule pass that just needs different priorities for the loads/stores. (Or the heap isn't there, and that's what I was going to add.) From there it was just expanding the testing, etc. Seemed like a pretty simple fix.
Yeah, looking at
My original proposal of using the source line number will not work. Actually, it's already implemented in the heap in the schedule pass. I did a little bit of digging into this issue over the holidays, and it looks slightly more complicated than I originally laid out. The SSA load instructions have dependencies between them, so sequential loads are all marked as dependent on one another. As such, the scheduler has a hard time moving them around. When I get spare cycles, I'll likely take a further look at this.
This is fixed at tip, probably from the new scheduler (CL 270940) plus CL 509856.
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Reproduces with a slightly old sync of HEAD.
What operating system and processor architecture are you using (go env)?
What did you do?
While fooling around with compiler optimization passes, I found that unrolling a dot-product yields code with lots of extraneous loads/stores. I did some digging and believe the schedule pass in the compiler generates a suboptimal schedule.
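For readers without the attachment, a hand-unrolled dot product might look like the sketch below. This is an assumption about the general shape of the code, not the contents of prod.tar.gz; the function names `dot` and `dotUnrolled4` and the unroll factor of four are chosen here for illustration only.

```go
package main

import "fmt"

// dot is the straightforward reference loop.
func dot(a, b []float64) float64 {
	var sum float64
	for i := range a {
		sum += a[i] * b[i]
	}
	return sum
}

// dotUnrolled4 processes four elements per iteration, using
// independent accumulators, then handles the remaining tail.
func dotUnrolled4(a, b []float64) float64 {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	a, b = a[:n], b[:n] // equal lengths help bounds-check elimination
	var s0, s1, s2, s3 float64
	i := 0
	for ; i+4 <= n; i += 4 {
		s0 += a[i] * b[i]
		s1 += a[i+1] * b[i+1]
		s2 += a[i+2] * b[i+2]
		s3 += a[i+3] * b[i+3]
	}
	sum := (s0 + s1) + (s2 + s3)
	for ; i < n; i++ {
		sum += a[i] * b[i]
	}
	return sum
}

func main() {
	a := []float64{1, 2, 3, 4, 5, 6}
	b := []float64{6, 5, 4, 3, 2, 1}
	fmt.Println(dot(a, b), dotUnrolled4(a, b))
}
```

Each line in the unrolled body needs two loads, a multiply, and an add, which is the LOAD/LOAD/MUL/ADD pattern one would hope to see in the generated code.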
What did you expect to see?
I expected relatively straight-line LOAD/LOAD/MUL/ADD code.
What did you see instead?
The compiler generates a large number of extraneous loads/stores.
prod.tar.gz