cmd/compile: rematerializable ops must not clobber flags #21080
Comments
Ah, looking through the flagalloc code I see this is the expected behavior: the flags are regenerated as necessary, but the values aren't hooked up. My actual problem is the regalloc pass inserting MOVDaddr ops after the flagalloc pass; I guess it assumes the op doesn't clobber flags. Probably the right fix is to make MOVDaddr not clobber flags on s390x somehow (currently it can insert an ADD with a 32-bit immediate in shared mode, hence the clobber-flags mark). I suspect that can wait until 1.10. It's probably also worth trying to get rid of the redundant instructions; I'm not sure whether there is already an issue for that.
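To make the invariant in the issue title concrete, here is a minimal sketch of a table check that rejects any op marked both rematerializable and flag-clobbering. Everything in it is hypothetical: `opInfo`, `checkOps`, and the example table are a toy model for illustration, not the compiler's actual data structures or its op list.

```go
package main

import "fmt"

// opInfo is a toy stand-in for per-op metadata (hypothetical, not the
// real compiler type).
type opInfo struct {
	name             string
	rematerializable bool // regalloc may recompute the value instead of spilling it
	clobbersFlags    bool // executing the op may overwrite the flags register
}

// checkOps returns the names of ops that are marked rematerializable
// but also clobber flags. Such ops are unsafe to re-insert after flag
// allocation has already run.
func checkOps(ops []opInfo) []string {
	var bad []string
	for _, op := range ops {
		if op.rematerializable && op.clobbersFlags {
			bad = append(bad, op.name)
		}
	}
	return bad
}

func main() {
	// Illustrative table only; the MOVDaddr entry mirrors the s390x
	// situation described in this thread.
	ops := []opInfo{
		{name: "MOVDconst", rematerializable: true},
		{name: "MOVDaddr", rematerializable: true, clobbersFlags: true},
		{name: "ADD", clobbersFlags: true},
	}
	fmt.Println("invalid ops:", checkOps(ops))
}
```

A check like this would catch the problem when the op table is defined, rather than when a misplaced conditional jump shows up in generated code.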
Yeah, the register allocator may insert MOVDaddr because it is "rematerializable". But if it may clobber flags, it is not really rematerializable. Marking MOVDaddr not rematerializable probably makes liveness unhappy, though. Is there an instruction that does the address calculation without clobbering flags, like LEA on 386? I tried to change |
Yeah, we have LAY, but it has a couple of limitations, so it's not quite a straight swap. Firstly, it can't take R0 as an input, but we could easily work around that by removing R0 from the list of valid MOVDaddr targets. The second issue is that the offset is limited to a 20-bit signed integer, which means we need to use a temp register when the offset is larger.
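The two LAY constraints mentioned above can be sketched as a small predicate. This is an illustration only: `fitsDisp20` and `canUseSingleLAY` are hypothetical helper names, registers are modelled as plain integers, and the real assembler logic may be organised quite differently.

```go
package main

import "fmt"

// Range of a 20-bit signed displacement: -2^19 .. 2^19-1.
const (
	minDisp20 = -(1 << 19)
	maxDisp20 = 1<<19 - 1
)

// fitsDisp20 reports whether off is representable as a 20-bit signed
// integer.
func fitsDisp20(off int64) bool {
	return off >= minDisp20 && off <= maxDisp20
}

// canUseSingleLAY reports whether base+off could be computed with one
// LAY-style instruction under the two constraints from the comment
// above: the base register must not be R0 (modelled here as register
// number 0), and the offset must fit in 20 signed bits. Otherwise a
// temp register or a different sequence would be needed.
func canUseSingleLAY(baseReg int, off int64) bool {
	return baseReg != 0 && fitsDisp20(off)
}

func main() {
	fmt.Println(canUseSingleLAY(1, 4096))    // small offset, non-R0 base
	fmt.Println(canUseSingleLAY(1, 1<<19))   // one past the largest displacement
	fmt.Println(canUseSingleLAY(0, 4096))    // R0 as base
	fmt.Println(canUseSingleLAY(15, -(1<<19))) // smallest displacement
}
```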
Change https://golang.org/cl/63030 mentions this issue: |
While debugging #21048 I noticed that there was a conditional jump immediately after a MOVDaddr instruction. This shouldn't happen because MOVDaddr clobbers flags on s390x.
Digging deeper it looks to me like flagalloc is silently failing in runtime.gcDrain. The problem is reproducible on darwin/amd64.
If you do:
And look at the generated ssa.html after the regalloc phase you'll see this:
v18 is clobbered immediately. v356 and v234 are generated but immediately clobbered. All these BTL instructions make it into the final assembly.
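The pattern described here, flag values that are generated and then overwritten before anything reads them, can be detected with a simple linear scan over a block. The sketch below is a toy model under stated assumptions: `inst`, `deadFlagWrites`, and the example block are hypothetical and do not reproduce the actual ssa.html output or the compiler's representation.

```go
package main

import "fmt"

// inst is a toy instruction model (hypothetical).
type inst struct {
	name      string
	genFlags  bool // exists to produce a flags value (e.g. a compare or bit test)
	clobbers  bool // overwrites flags as a side effect
	usesFlags bool // reads flags (e.g. a conditional jump)
}

// deadFlagWrites returns the indices of flag-generating instructions
// whose result is overwritten before any instruction reads it. A value
// still pending at the end of the block is not reported, since a
// successor block might consume it.
func deadFlagWrites(block []inst) []int {
	var dead []int
	pending := -1 // index of the most recent unread flag generator
	for i, in := range block {
		if in.usesFlags {
			pending = -1 // the pending value has been consumed
		}
		if in.genFlags || in.clobbers {
			if pending >= 0 {
				dead = append(dead, pending)
			}
			pending = -1
			if in.genFlags {
				pending = i
			}
		}
	}
	return dead
}

func main() {
	// Illustrative block: two bit tests are overwritten before use.
	block := []inst{
		{name: "BTL (a)", genFlags: true},
		{name: "ADDQ", clobbers: true},
		{name: "BTL (b)", genFlags: true},
		{name: "BTL (c)", genFlags: true},
		{name: "JCC", usesFlags: true},
	}
	for _, i := range deadFlagWrites(block) {
		fmt.Printf("redundant flag generator at %d: %s\n", i, block[i].name)
	}
}
```

The same scan could be extended to report a flags reader that follows a clobbering instruction with no generator in between, which is the situation first noticed with MOVDaddr on s390x.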