Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: unnecessary full complex multiplication for x*c where x is complex and c a real constant #38780

Closed
griesemer opened this issue Apr 30, 2020 · 4 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.

Comments

@griesemer
Copy link
Contributor

These two functions

const c = 2.71828

func mul1(z complex128) complex128 {
	return z*c
}

func mul2(z complex128) complex128 {
	return complex(real(z)*c, imag(z)*c)
}

both multiply a complex variable z with a scalar (real) constant c.

Since c is an untyped real constant, mul1 can be implemented as mul2. However, the compiler appears to generate code for a full complex multiplication x * complex(c, 0) for mul1, leading both to slower multiplication and possibly a less accurate result.

Assembly code for the two functions:

	0x0000 00000 (c.go:5)	TEXT	"".mul1(SB), NOSPLIT|ABIInternal, $0-32
	0x0000 00000 (c.go:5)	FUNCDATA	$0, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
	0x0000 00000 (c.go:5)	FUNCDATA	$1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
	0x0000 00000 (c.go:6)	MOVSD	$f64.4005bf0995aaf790(SB), X0
	0x0008 00008 (c.go:6)	MOVSD	"".z+8(SP), X1
	0x000e 00014 (c.go:6)	MULSD	X1, X0
	0x0012 00018 (c.go:6)	XORPS	X2, X2
	0x0015 00021 (c.go:6)	MOVSD	"".z+16(SP), X3
	0x001b 00027 (c.go:6)	MULSD	X3, X2
	0x001f 00031 (c.go:6)	SUBSD	X2, X0
	0x0023 00035 (c.go:6)	MOVSD	X0, "".~r1+24(SP)
	0x0029 00041 (c.go:6)	XORPS	X0, X0
	0x002c 00044 (c.go:6)	MULSD	X0, X1
	0x0030 00048 (c.go:6)	MOVSD	$f64.4005bf0995aaf790(SB), X0
	0x0038 00056 (c.go:6)	MULSD	X3, X0
	0x003c 00060 (c.go:6)	ADDSD	X0, X1
	0x0040 00064 (c.go:6)	MOVSD	X1, "".~r1+32(SP)
	0x0046 00070 (c.go:6)	RET

and

	0x0000 00000 (c.go:9)	TEXT	"".mul2(SB), NOSPLIT|ABIInternal, $0-32
	0x0000 00000 (c.go:9)	FUNCDATA	$0, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
	0x0000 00000 (c.go:9)	FUNCDATA	$1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
	0x0000 00000 (c.go:10)	MOVSD	$f64.4005bf0995aaf790(SB), X0
	0x0008 00008 (c.go:10)	MOVSD	"".z+8(SP), X1
	0x000e 00014 (c.go:10)	MULSD	X0, X1
	0x0012 00018 (c.go:10)	MOVSD	X1, "".~r1+24(SP)
	0x0018 00024 (c.go:10)	MOVSD	"".z+16(SP), X1
	0x001e 00030 (c.go:10)	MULSD	X1, X0
	0x0022 00034 (c.go:10)	MOVSD	X0, "".~r1+32(SP)
	0x0028 00040 (c.go:10)	RET

The compiler can easily statically determine that c has a 0 imaginary component and simplify the code generation.

@griesemer griesemer added the NeedsFix The path to resolution is known, but the work has not been done. label Apr 30, 2020
@griesemer
Copy link
Contributor Author

cc: @randall77, @josharian

@griesemer
Copy link
Contributor Author

griesemer commented May 1, 2020

See src/math/cmplx/log.go, TODO in func Log10 and simplify that code once this is fixed.
(This is in a CL pending submission.)

@randall77
Copy link
Contributor

I'm not sure if this is a correct optimization. I think it's the equivalent of the rule x * 0 == 0, which isn't actually true for float64.

(x+yi)*(c+0i) = (x*c-y*0) + (x*0+y*c) i

Say x == +inf, y == c == 1. Then the complex result is +inf + NaN i. Just multiplying both parts by 1 gives +inf + i.

Having said that, I think our story on inf/nan complex numbers is haphazard at best.

@griesemer
Copy link
Contributor Author

NaN's really are a mistake. Posits wouldn't have this problem... :-(

I suppose we could specify (in the spec) that a real scalar multiplication of a complex number really is translated into element-wise multiplication, etc. But that would probably lead to other troubles.

Sigh. Closing.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

3 participants