x/arch/x86/x86asm: x86 64-bit immediate argument integer overflow #45052

stephen-fox · 2021-03-16T05:25:27Z

What version of Go are you using (`go version`)?

$ go version
go version go1.16 darwin/amd64

Note: My go module imported arch versioned at v0.0.0-20210308155006-05f8f0431f72.

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (`go env`)?

go env Output

$ go env
GOHOSTARCH="amd64"
GOHOSTOS="darwin"

What did you do?

I am writing a "shellcode" parser that takes raw binary, or a C-style hex array and converts it into human-readable disassembly like the Unix objdump utility. While experimenting with some sample shellcode, I found that this x86_64 sample produced an instruction with an argument that differed from the comments and the output of objdump.

The instruction in question:

"\xbf\xad\xde\xe1\xfe"  // mov    $0xfee1dead,%edi

objdump's interpretation:

$ objdump -D -M att,x86-64 -b binary -m i386 foo

foo:     file format binary

Disassembly of section .data:

00000000 <.data>:
   0:	bf ad de e1 fe       	mov    $0xfee1dead,%edi

A sample Go program, and its interpretation:

package main

import (
	"golang.org/x/arch/x86/x86asm"
	"log"
)

func main() {
	log.SetFlags(0)

	inst, err := x86asm.Decode([]byte{0xbf, 0xad, 0xde, 0xe1, 0xfe}, 64)
	if err != nil {
		log.Fatalln(err)
	}

	log.Println(inst.Args)

	log.Println(x86asm.GNUSyntax(inst, 0, nil))

	// Output:
	// [EDI -0x11e2153 <nil> <nil>]
	// mov $-0x11e2153,%edi
}

What did you expect to see?

I expected the immediate argument to the mov to be 0xfee1dead.

What did you see instead?

The immediate argument to the mov is -0x11e2153 (0xFFFFFFFFFEE1DEAD).

Debugging and thoughts

I should start by stating that my knowledge of x86 is... limited. So, I apologize in advance for misusing any terminology. I am going off what I already know, some Googling, and code comments.

I did attempt to understand what exactly is going on in the x86asm library. My first thought is maybe this is not even a bug? The value in question overflows a signed 32-bit integer... But this is 64-bit assembly.

Following the x86asm.Decode function through a debugger, we end up in the decode1 function in decode.go. Its documentation mentions that it mimic[s] bugs (or at least unique features) of GNU libopcodes as used by objdump. So, is this data even being interpreted correctly in the first place?

That said, @rsc's comment regarding 64-bit support on line 242 states:

// TODO(rsc): 64-bit mode not tested, probably not working.

... so, perhaps I am already setting my expectations a bit high :) The comment dates back to ~2014. Maybe it no longer applies?

In decode1, we end up in a loop on line 468. Its job appears to be decoding op code bytes. On the sixth iteration, we end up in a switch statement case on line 1046. Apparently, the op code is interpreted as xArgImm32. The code in this case converts the local imm variable (an int64) to an int32, and then converts this new value to an Imm, which is a type alias for int64. It then assigns it to index 1 of the Args field of the Inst object being constructed:

case xArgImm32:
    inst.Args[narg] = Imm(int32(imm)) // 'imm' = 4276215469 (0xFEE1DEAD).
    narg++

The resulting value is -18751827 (0xFFFFFFFFFEE1DEAD).

I wonder if the op code was interpreted incorrectly? The subsequent xArgImm64 case does the right thing (not converting imm to an int32), and this is 64-bit code.

Thank you for reading!

The text was updated successfully, but these errors were encountered:

zephyrtronium · 2021-03-16T15:37:40Z

Even though the code is for 64-bit mode, edi is a 32-bit register, so the immediate operand is not sign-extended to 0xFFFFFFFFFEE1DEAD. Both 0xfee1dead and -0x11e2153 are the same here.

stephen-fox · 2021-03-18T23:08:39Z

Thanks for the explanation, @zephyrtronium. I guess I'm stuck on the human readable form of the instruction being represented that way. I do not think a debugger would represent a register's value as -0x11e2153. It just feels maclunky, especially if you have several values in a single binary being represented that way. Thinking back to that comment about emulating objdump bugs, is this the way objdump should represent the instruction?

gopherbot added this to the Unreleased milestone Mar 16, 2021

cherrymui added NeedsFix NeedsInvestigation and removed NeedsFix labels Mar 16, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

x/arch/x86/x86asm: x86 64-bit immediate argument integer overflow #45052

x/arch/x86/x86asm: x86 64-bit immediate argument integer overflow #45052

stephen-fox commented Mar 16, 2021

zephyrtronium commented Mar 16, 2021

stephen-fox commented Mar 18, 2021

x/arch/x86/x86asm: x86 64-bit immediate argument integer overflow #45052

x/arch/x86/x86asm: x86 64-bit immediate argument integer overflow #45052

Comments

stephen-fox commented Mar 16, 2021

What version of Go are you using (go version)?

Does this issue reproduce with the latest release?

What operating system and processor architecture are you using (go env)?

What did you do?

What did you expect to see?

What did you see instead?

Debugging and thoughts

zephyrtronium commented Mar 16, 2021

stephen-fox commented Mar 18, 2021

What version of Go are you using (`go version`)?

What operating system and processor architecture are you using (`go env`)?