Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/internal/obj/x86: add zero-extending and truncating LEAx #24463

Closed
josharian opened this issue Mar 20, 2018 · 9 comments
Closed

cmd/internal/obj/x86: add zero-extending and truncating LEAx #24463

josharian opened this issue Mar 20, 2018 · 9 comments

Comments

@josharian
Copy link
Contributor

LEA instructions support some zero-extension and truncation. It would be pretty straightforward to add compiler rewrite rules to utilize them, but it doesn't appear that we have obj support for them. They would probably need new names(?).

Related: https://go-review.googlesource.com/c/go/+/101276

cc @TocarIP @rasky

@josharian josharian added this to the Go1.11 milestone Mar 20, 2018
@TocarIP
Copy link
Contributor

TocarIP commented Mar 20, 2018

So they can't be used even via asm? This looks like something that should fixed regardless of optimization rules.
cc @quasilyte

@josharian
Copy link
Contributor Author

Maybe they can, but I couldn’t figure out how.

@quasilyte
Copy link
Contributor

quasilyte commented Mar 21, 2018

Which form of LEA is required or appears to be missing?

TEXT lea(SB), 0, $0
        LEAQ 0x0(AX), AX       // lea rax, [rax]
        LEAQ 0x7(AX), AX       // lea rax, [rax+0x7]
        LEAQ 0x0(AX)(DX*1), AX // lea rax, [rax+rdx]
        LEAQ 0x7(AX)(DX*1), AX // lea rax, [rax+rdx+0x7]
        RET

  goasm.s:2		0xa5			488d00			LEAQ 0(AX), AX		
  goasm.s:3		0xa8			488d4007		LEAQ 0x7(AX), AX	
  goasm.s:4		0xac			488d0410		LEAQ 0(AX)(DX*1), AX	
  goasm.s:5		0xb0			488d441007		LEAQ 0x7(AX)(DX*1), AX	
  goasm.s:7		0xb5			c3			RET

Same forms are available for LEAW and LEAL.

The only caveats I remember is that you can't omit scaling (not an issue with compiler-generated asm) and there is no riz/eiz pseudo registers.

If there is no problem on obj level, maybe it's cmd/compile/internal/amd64 that causes bad code generation? Also, I think it's worthwhile to insert related TODO comment here:

gen/AMD64.rules:
// Merge ADDQconst into atomic adds.
// TODO: merging LEAQ doesn't work, assembler doesn't like the resulting instructions.

@josharian
Copy link
Contributor Author

The forms I believe to be missing are those in which the bit widths of the address calculated and the destination register are different. For example, calculate a 16 bit address and store in a 32 bit register (zero-extended), or calculate a 64 bit address and store in a 32 bit register (truncated).

In other assemblers, you would accomplish by using (say) AL or AH or the like instead of AX. In Go world, we always use AX, and use instruction names (e.g. MOVLQZX) to indicate the bit widths of the registers/memory locations involved.

Seems worthwhile to add the TODO you mentioned, perhaps with some additional detail.

@quasilyte
Copy link
Contributor

@josharian, thank you for explanation.

| Instruction | Operand size | Address size | Effect             |
|-------------+--------------+--------------+--------------------|
| LEAW        |           16 |           64 | Truncate 64->16    |
| LEAL        |           32 |           64 | Truncate 64->32    |
| LEAQ        |           64 |           64 | Move as is         |
| ? LEAW32    |           16 |           32 | Truncate 32->16    |
| ? LEAL32    |           32 |           32 | Move as is         |
| ? LEAQ32    |           64 |           32 | Zero-extend 32->64 |

There are other ways to call these (like LEAWL, etc), but I can't figure the best alternative.
Will send CL in a moment that implements 3 new instructions for amd64.

@gopherbot
Copy link

Change https://golang.org/cl/102376 mentions this issue: cmd/internal/obj/x86: add 32bit addr LEA

@quasilyte
Copy link
Contributor

@josharian, I'm unable to find inputs that will lead to different results between 32 and 64 bit address generating LEA. Additional sources of information also do not reveal any additional details that may help. (I haven't tried any performance tests though.)

Could you try to find these differences, if there are any, or if you agree that those instructions are useless, we can close this issue and abandon related CL.

There is an option to mark this with [Help Wanted], but I'm not sure there will be many volunteers to investigate on this.

@josharian
Copy link
Contributor Author

I had in mind using them for code like:

var x uint64
// set x
y := uint32(x * 7)

In that case, we could in theory use a truncating 64 -> 32 LEA, instead of calculating all 64 bits of x * 7 using a regular LEA, then truncating using a MOV.

But this does overall seem like a low priority, and I'm happy to just decide it's not worth it.

@randall77
Copy link
Contributor

y := uint32(x * 7)

We already have an instruction for that, LEAL. It zeros the upper 32 bits.

I'm going to close this, I don't think there is anything to do here.
Read my comments on Iskander's CL for more details.

@golang golang locked and limited conversation to collaborators Apr 12, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants