
proposal: cmd/asm: untyped imm8 operand class for x86 asm #21528

Closed
quasilyte opened this issue Aug 18, 2017 · 12 comments

quasilyte commented Aug 18, 2017

Many amd64 instructions take an immediate operand without automatic sign extension.
When only the binary representation matters, "$128" and "$-128" encode the same byte.

In the Intel manual, VPBLENDD has an imm8 operand class, meaning "signed 8-bit value".
But the semantics only care about the bits, so it makes perfect sense to pass 0xF0 as an argument.
In fact, src/crypto/sha512/sha512block_amd64.s:524 does exactly this:
VPBLENDD $0xF0, Y2, Y4, Y4

GAS also happily accepts both positive and negative immediate values.

VPSHUFD $-128, (%R11), %YMM15
VPSHUFD $128, (%R11), %YMM15

// Note: same encoding
   6:	c4 41 7d 70 3b 80    	vpshufd $0x80,(%r11),%ymm15
   c:	c4 41 7d 70 3b 80    	vpshufd $0x80,(%r11),%ymm15

I propose a backwards-compatible change that introduces a new opclass, Yimmb, which is the union of the {Yi0, Yi1, Yi8, Yu8, Yu7} opclasses via ycover.
From the user's point of view, this makes it possible to express an immediate argument in either signed or unsigned form.
As a first step, it reduces yvex_xyi3 to 2 ytab entries.

A further step, which may be proposed later, is allowing more instructions to accept the Yimmb opclass instead of Yi8/Yu8.

Pros:

  • Better compatibility with existing assemblers
  • Easier to express "imm8" operands that are not numerical values by nature (e.g. masks)
  • Easier to automatically generate instructions that work without artificial restrictions
  • If we really need both Yi8 and Yu8, we can encode them in 1 ytab entry instead of duplicating (see yvex_xyi3)

Even if asm6.go is going to be generated automatically at some point in the future,
the exact timeline is unknown, so this simple change makes sense for the current code.

Implementation:

A possible patch is provided below.

Yimmb.patch
diff --git a/src/cmd/asm/internal/asm/testdata/amd64enc.s b/src/cmd/asm/internal/asm/testdata/amd64enc.s
index ec888bc..02ccebb 100644
--- a/src/cmd/asm/internal/asm/testdata/amd64enc.s
+++ b/src/cmd/asm/internal/asm/testdata/amd64enc.s
@@ -10681,4 +10681,17 @@ TEXT asmtest(SB),DUPOK|NOSPLIT,$0
 	//TODO: XSAVES64 (R11)                  // 490fc72b
 	//TODO: XSETBV                          // 0f01d1
 	XTEST                                   // 0f01d6
+	// Note: code below is written by hand,
+	// so it seems like a good idea to move
+	// them into separate file.
+	// Maybe test runner should permit include
+	// directives (right now it will not
+	// work as intended).
+	//
+	// * Check that immb opclass works as expected.
+	VPSHUFD $-7, (R11), Y15                 // c4417d703bf9
+	VPSHUFD $-128, (R11), Y15               // c4417d703b80
+	VPSHUFD $128, (R11), Y15                // c4417d703b80
+	VPSHUFD $255, (R11), Y15                // c4417d703bff
+	VPSHUFD $0, (R11), Y1                   // c4c17d700b00
 	RET
diff --git a/src/cmd/internal/obj/x86/asm6.go b/src/cmd/internal/obj/x86/asm6.go
index bcf9318..33e3d8a 100644
--- a/src/cmd/internal/obj/x86/asm6.go
+++ b/src/cmd/internal/obj/x86/asm6.go
@@ -92,11 +92,12 @@ type Movtab struct {
 const (
 	Yxxx = iota
 	Ynone
-	Yi0 // $0
-	Yi1 // $1
-	Yi8 // $x, x fits in int8
-	Yu8 // $x, x fits in uint8
-	Yu7 // $x, x in 0..127 (fits in both int8 and uint8)
+	Yi0   // $0
+	Yi1   // $1
+	Yi8   // $x, x fits in int8
+	Yu8   // $x, x fits in uint8
+	Yu7   // $x, x in 0..127 (fits in both int8 and uint8)
+	Yimmb // $x, x <- {Yi0, Yi1, Yi8, Yu8, Yu7}
 	Ys32
 	Yi32
 	Yi64
@@ -811,10 +812,8 @@ var yvex_ri3 = []ytab{
 }
 
 var yvex_xyi3 = []ytab{
-	{Yu8, Yxm, Yxr, Zvex_i_rm_r, 2},
-	{Yu8, Yym, Yyr, Zvex_i_rm_r, 2},
-	{Yi8, Yxm, Yxr, Zvex_i_rm_r, 2},
-	{Yi8, Yym, Yyr, Zvex_i_rm_r, 2},
+	{Yimmb, Yxm, Yxr, Zvex_i_rm_r, 2},
+	{Yimmb, Yym, Yyr, Zvex_i_rm_r, 2},
 }
 
 var yvex_yyi4 = []ytab{ //TODO don't hide 4 op, some version have xmm version
@@ -1672,7 +1671,7 @@ var optab =
 	{AVPBROADCASTB, yvex_vpbroadcast, Pvex, [23]uint8{VEX_128_66_0F38_W0, 0x78, VEX_256_66_0F38_W0, 0x78}},
 	{AVPTEST, yvex_xy2, Pvex, [23]uint8{VEX_128_66_0F38_WIG, 0x17, VEX_256_66_0F38_WIG, 0x17}},
 	{AVPSHUFB, yvex_xy3, Pvex, [23]uint8{VEX_128_66_0F38_WIG, 0x00, VEX_256_66_0F38_WIG, 0x00}},
-	{AVPSHUFD, yvex_xyi3, Pvex, [23]uint8{VEX_128_66_0F_WIG, 0x70, VEX_256_66_0F_WIG, 0x70, VEX_128_66_0F_WIG, 0x70, VEX_256_66_0F_WIG, 0x70}},
+	{AVPSHUFD, yvex_xyi3, Pvex, [23]uint8{VEX_128_66_0F_WIG, 0x70, VEX_256_66_0F_WIG, 0x70}},
 	{AVPOR, yvex_xy3, Pvex, [23]uint8{VEX_128_66_0F_WIG, 0xeb, VEX_256_66_0F_WIG, 0xeb}},
 	{AVPADDQ, yvex_xy3, Pvex, [23]uint8{VEX_128_66_0F_WIG, 0xd4, VEX_256_66_0F_WIG, 0xd4}},
 	{AVPADDD, yvex_xy3, Pvex, [23]uint8{VEX_128_66_0F_WIG, 0xfe, VEX_256_66_0F_WIG, 0xfe}},
@@ -2003,17 +2002,25 @@ func instinit(ctxt *obj.Link) {
 	ycover[Yi1*Ymax+Yu8] = 1
 	ycover[Yu7*Ymax+Yu8] = 1
 
+	ycover[Yi0*Ymax+Yimmb] = 1
+	ycover[Yi1*Ymax+Yimmb] = 1
+	ycover[Yu7*Ymax+Yimmb] = 1
+	ycover[Yu8*Ymax+Yimmb] = 1
+	ycover[Yi8*Ymax+Yimmb] = 1
+
 	ycover[Yi0*Ymax+Ys32] = 1
 	ycover[Yi1*Ymax+Ys32] = 1
 	ycover[Yu7*Ymax+Ys32] = 1
 	ycover[Yu8*Ymax+Ys32] = 1
 	ycover[Yi8*Ymax+Ys32] = 1
+	ycover[Yimmb*Ymax+Ys32] = 1
 
 	ycover[Yi0*Ymax+Yi32] = 1
 	ycover[Yi1*Ymax+Yi32] = 1
 	ycover[Yu7*Ymax+Yi32] = 1
 	ycover[Yu8*Ymax+Yi32] = 1
 	ycover[Yi8*Ymax+Yi32] = 1
+	ycover[Yimmb*Ymax+Yi32] = 1
 	ycover[Ys32*Ymax+Yi32] = 1
 
 	ycover[Yi0*Ymax+Yi64] = 1
@@ -2021,6 +2028,7 @@ func instinit(ctxt *obj.Link) {
 	ycover[Yu7*Ymax+Yi64] = 1
 	ycover[Yu8*Ymax+Yi64] = 1
 	ycover[Yi8*Ymax+Yi64] = 1
+	ycover[Yimmb*Ymax+Yi64] = 1
 	ycover[Ys32*Ymax+Yi64] = 1
 	ycover[Yi32*Ymax+Yi64] = 1
@gopherbot gopherbot added this to the Proposal milestone Aug 18, 2017
@quasilyte quasilyte changed the title proposal: cmd/asm: untyped imm8 operand class for x86 asm backend proposal: cmd/asm: untyped imm8 operand class for x86 asm Aug 18, 2017
@quasilyte
Contributor Author

I want to emphasize that this does not break any existing assembly code;
it only relaxes the immediate-argument encoding restrictions (accepting both the two's complement/negative and the positive form instead of just one of them).

@gopherbot

Change https://golang.org/cl/57570 mentions this issue: cmd/internal/obj/x86: new operand class for x86 asm backend

@rsc
Contributor

rsc commented Aug 21, 2017

I tried pretty hard in the instruction tables to distinguish the ones that interpret the argument as signed vs unsigned. If one is wrong I'd like to fix that. Do you have an example of one that's wrong?

If VPBLENDD only cares about bits (and 0x80 is a bit and not the sign), then it should be defined to take uimm8.

Your comment:

Well, src/crypto/sha512/sha512block_amd64.s:524 does exactly this:
VPBLENDD $0xF0, Y2, Y4, Y4

makes me think that VPBLENDD is already correct, or else it wouldn't assemble.

@quasilyte
Contributor Author

@rsc, VPBLENDD is a bit special due to the current amd64 asm handling of 4-operand instructions.
4-operand instructions do not check the immediate against any particular oclass (Yi8/Yu8, ...).

When instructions are generated from, for example, x86.csv (in x86spec format),
we need to decide between Yi8 and Yu8.
From the spec entry, we only know that the operand has kind imm8, which can map to either Yi8 or Yu8.
It is easier to mark all such arguments with the proposed Yimmb, which permits both Yi8 and Yu8.

As a side effect, the resulting behavior is also more compatible with widely used assemblers.

@quasilyte
Contributor Author

quasilyte commented Aug 21, 2017

To be clear, the current semantics of 4-operand instructions for the immediate argument are already the same as the Yimmb oclass; it's just less explicit and feels more like a bug than a feature.

@rsc
Contributor

rsc commented Aug 21, 2017

I don't want to be sloppy here. Assembly is hard enough to get right. I want to be precise. Either it's a signed value or an unsigned value, and it's OK for the assembler to insist that you know which one.

I understand that other assemblers take a different approach, and that's fine, but it's not our approach. Our approach is to be picky, to try to help avoid bugs.

@quasilyte
Contributor Author

@rsc, I see.

Maybe we can make the x86spec format express whether immediate arguments are signed or unsigned?
If x86.csv carried this information, code generation based on it could be simplified.

@quasilyte
Contributor Author

quasilyte commented Aug 22, 2017

I tried pretty hard in the instruction tables to distinguish the ones that interpret the argument as signed vs unsigned. If one is wrong I'd like to fix that. Do you have an example of one that's wrong?

@rsc, I may be wrong, but it seems like yshl should have the Yu8 oclass instead of Yi32.
https://github.com/golang/go/blob/master/src/cmd/internal/obj/x86/asm6.go#L401

var yshl = []ytab{
	{Yi1, Ynone, Yml, Zo_m, 2},
	{Yi32, Ynone, Yml, Zibo_m, 2}, // <- here
	{Ycl, Ynone, Yml, Zo_m, 2},
	{Ycx, Ynone, Yml, Zo_m, 2},
}

If Yi32 is the right thing there,
the amount of my confusion will be enormous. :)

I can check other instructions one-by-one later.

@rsc
Contributor

rsc commented Aug 28, 2017

x86.csv does have this information: it says imm8 for signed or imm8u for unsigned. (Sorry for the late reply.)

@rsc
Contributor

rsc commented Aug 28, 2017

The yshl table can be corrected to Yu8, I believe. That code is much older than Go and is in some places sloppier than I would prefer.

@gopherbot

Change https://golang.org/cl/62190 mentions this issue: cmd/asm: restrict x86 shift ops to 8bit args

gopherbot pushed a commit that referenced this issue Sep 7, 2017
Change "yshl" and "yshb" immediate oclass from Yi32 to Yu8.
This forbids:
- negative shift counts
- shift counts that do not fit into 8 bits

Affects:
  RCL{B,L,Q,W}
  RCR{B,L,Q,W}
  ROL{B,L,Q,W}
  ROR{B,L,Q,W}
  SAL{B,L,Q,W}
  SAR{B,L,Q,W}
  SHL{B,L,Q,W}
  SHR{B,L,Q,W}

Issue #21528 has some additional context about this change.

Change-Id: I60884cb2b41a860820889fcd878ca6f564006b4a
Reviewed-on: https://go-review.googlesource.com/62190
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
@quasilyte
Contributor Author

Going to close this.
All immediate operands will remain explicitly signed/unsigned.

@golang golang locked and limited conversation to collaborators Sep 8, 2018