This looks kind of unfortunate, but not sure there's anything to fix. The inner loop in this code is right on the edge of fitting in registers and tweaks from this CL cause regalloc to not find as good a solution as before.
Previously we kept 3-operand LEAs around until the final assembly code generation step, then broke each one into two 2-operand LEAs. For instance, what was originally a LEAQ 2(AX)(R13*1), R13 was emitted as a pair of 2-operand LEAs.
Because we do the breaking very late (and after regalloc), the two parts are always adjacent. The intermediate value isn't seen by the register allocator.
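To make the split concrete, here is a minimal Go sketch of the arithmetic only. It models LEAQ 2(AX)(R13*1), R13 as base + index + displacement and checks that two 2-operand steps give the same result. The function names lea3 and lea2Pair are hypothetical, and the order of the two steps is an assumption for illustration; the exact pair the compiler emits is not shown above.

```go
package main

import "fmt"

// lea3 models a 3-operand LEA such as LEAQ disp(base)(index*1), dst:
// it computes base + index*1 + disp in a single instruction.
func lea3(base, index, disp int64) int64 {
	return base + index*1 + disp
}

// lea2Pair models the same computation as two 2-operand LEAs.
// Which operand is folded in first is an assumption of this sketch.
func lea2Pair(base, index, disp int64) int64 {
	tmp := index + disp // first LEA: intermediate value
	return base + tmp   // second LEA: final address
}

func main() {
	ax, r13 := int64(100), int64(7)
	// Both forms compute the same address.
	fmt.Println(lea3(ax, r13, 2), lea2Pair(ax, r13, 2)) // prints "109 109"
}
```

The point of the sketch is the intermediate tmp: when the two steps are always adjacent, it never exists as a value the register allocator has to place.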
After the CL, we break these LEAs up earlier. As a consequence, they can get scheduled farther away from each other. That's generally a good thing, but in this case it tweaks the register allocator just enough that a value that was in a register throughout the inner loop (the thing in AX) now gets spilled outside the loop and restored at a few places inside the loop.
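The effect on register pressure can be illustrated with a toy model. This is not how the Go register allocator works; it is a sketch that counts how many values are live at once in a straight-line schedule. The instr type, the value names, and both schedules are hypothetical. Here ax stands for a value that stays live throughout, and tmp is the intermediate result of the first LEA half.

```go
package main

import "fmt"

// instr is a toy instruction: it defines one value and uses others.
type instr struct {
	def  string
	uses []string
}

// maxLive returns the peak number of simultaneously live values.
// A value is live from its definition until its last use.
func maxLive(sched []instr) int {
	lastUse := map[string]int{}
	for i, in := range sched {
		for _, u := range in.uses {
			lastUse[u] = i
		}
	}
	live := map[string]bool{}
	peak := 0
	for i, in := range sched {
		// Operands that die here free their register for the result.
		for _, u := range in.uses {
			if lastUse[u] == i {
				delete(live, u)
			}
		}
		live[in.def] = true
		if len(live) > peak {
			peak = len(live)
		}
		// A value that is never used dies right after its definition.
		if _, used := lastUse[in.def]; !used {
			delete(live, in.def)
		}
	}
	return peak
}

// adjacentSchedule keeps the two LEA halves (tmp, res) next to each other.
func adjacentSchedule() []instr {
	return []instr{
		{def: "ax"}, {def: "a"}, {def: "b"},
		{def: "c", uses: []string{"a", "b"}},
		{def: "r13", uses: []string{"c"}},
		{def: "tmp", uses: []string{"ax"}},
		{def: "res", uses: []string{"tmp", "r13"}},
		{def: "keep", uses: []string{"ax", "res"}},
	}
}

// splitApartSchedule hoists the first half (tmp) far above the second.
func splitApartSchedule() []instr {
	return []instr{
		{def: "ax"},
		{def: "tmp", uses: []string{"ax"}},
		{def: "a"}, {def: "b"},
		{def: "c", uses: []string{"a", "b"}},
		{def: "r13", uses: []string{"c"}},
		{def: "res", uses: []string{"tmp", "r13"}},
		{def: "keep", uses: []string{"ax", "res"}},
	}
}

func main() {
	fmt.Println(maxLive(adjacentSchedule()), maxLive(splitApartSchedule())) // prints "3 4"
}
```

In this model, a machine with three registers fits the adjacent schedule but must spill something in the other one, which is the same kind of tipping point described above for the value in AX.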
In unrelated news, I did notice that in the following CL (440036) the addressingmodes pass is now not quite right after this CL, as it matches against the X versions of shifts. Maybe something to fix, but it should only affect v3 builds, so it isn't this issue. @wdvxdr1123
I take that back about the addressingmodes issue. We don't generate the X versions in the regular rules any more, but we do generate the Xload versions, which is what addressingmodes cares about.
Regression range: https://go.googlesource.com/go/+log/3d92205ef5ed42147376d929e0f59c765974e345..da6042e82eb7fe5ba40a5c17959a31e19a44c3e8
Next step is to bisect.