cmd/compile: generate same code regardless of whether return values are named #20859

bradfitz · 2017-06-30T04:11:37Z

There's an article at the top of Hacker News right now,

https://blog.minio.io/golang-internals-part-2-nice-benefits-of-named-return-values-1e95305c8687
https://news.ycombinator.com/item?id=14668323

... which advocates for naming return values for the benefit of better generated code.

It concludes:

So we will be gradually adopting named return values more and more, both for new code as well as for existing code.

In fact we are also investigating if we can develop a little utility to help or automate this process.

This is totally counter to:

https://golang.org/s/style#named-result-parameters

... which says that naming result parameters should only be for godoc cleanliness.

We should fix the compiler to generate the same code regardless.

/cc @randall77 @mdempsky @josharian @brtzsnr @cherrymui

robpike · 2017-06-30T04:43:25Z

I like named return values. I wish I had permission to use them.

-rob

bradfitz · 2017-06-30T04:50:11Z

I like named return values. I wish I had permission to use them.

I interpret this as sadness about the "style" URL above, another term I know you don't like. I guess you value anti-stutter in godoc differently than some of us do. I agree there are cases where it'd be nice to use named return values (especially to reference them from defer) without cluttering godoc. Or to eliminate a few lines in other cases.

It'd be neat if we could influence godoc rendering separate from whether we chose to use named return values in code. But those proposals never go anywhere quickly (#16666, #18342, #7873), so instead we control documentation formatting by how we write the code.

dmitshur · 2017-06-30T05:31:31Z

Is this issue about the following two styles generating same code?

func foo() (ch credentialHeader) {
    // ...
    if cond {
        return ch
    }
    // ...
    return ch
}

func foo() credentialHeader {
    var ch credentialHeader
    // ...
    if cond {
        return ch
    }
    // ...
    return ch
}

Or is it about handling these cases too?

func foo() credentialHeader {
    // ...
    if cond {
        return credentialHeader{/* ... */}
    }
    // ...
    return credentialHeader{/* ... */}
}

fwessels · 2017-06-30T07:11:47Z

It would be great if the compiler can take care of this and always generated the most optimal code.

Even so, for all of the examples as posted above by @shurcooL, you can still argue that the first is the most readable (or at the very least it is the briefest version). But of course this also has to do with personal style.

robpike · 2017-06-30T08:30:41Z

Sometimes, such as when there are several return values , code can be cleaner and simpler using named returns. One instance is when there are multiple return values followed by an error: you can use Go's handling of zero values on function entry to avoid messy return statements. But the style guys win, they always win, and much as I like consistency in the code base I get sad when following the style guide makes the code uglier.

The style tyrants made illegal the single most interesting language property of Sawzall because to use it required judgment. The language was made much clumsier in practice as a result, and it depressed me. Style tyrants want to eliminate judgment, plus it seems sometimes that the result of style guides is to steer programmers away from all the interesting properties of the language, towards a mainstream but dull uniformity free of joy.

I follow the style guide anyway, just to be clear.

fwessels · 2017-06-30T09:00:42Z

Maybe, if someone has poor judgment, it is better to stay away from writing code altogether (or doing other stuff for that matter...)

And besides, hey, where would be the world be without some rebellions... 😄

hanwen · 2017-06-30T14:30:34Z

@robpike - pray tell, which feature was that?

cherrymui · 2017-06-30T15:26:00Z

It seems there are a few things for the binary difference. In the first example in the blog,

type objectInfo struct {
	arg1 int64
	arg2 uint64
	arg3 string
	arg4 []int
}
func NoNamedReturnParams(i int) (objectInfo) {

	if i == 1 {
		// Do one thing
		return objectInfo{}
	}

	if i == 2 {
		// Do another thing
		return objectInfo{}
	}

	if i == 3 {
		// Do one more thing still
		return objectInfo{}
	}

	// Normal return
	return objectInfo{}
}

"".NoNamedReturnParams t=1 size=243 args=0x40 locals=0x0
0x0000 	TEXT	"".NoNamedReturnParams(SB), $0-64
0x0000 	MOVQ	$0, "".~r1+16(FP)
0x0009 	LEAQ	"".~r1+24(FP), DI
0x000e 	XORPS	X0, X0
0x0011 	ADDQ	$-16, DI
0x0015 	DUFFZERO	$288
0x0028 	MOVQ	"".i+8(FP), AX
0x002d 	CMPQ	AX, $1
0x0031 	JEQ	$0, 199
0x0037 	CMPQ	AX, $2
0x003b 	JEQ	$0, 155
0x003d 	CMPQ	AX, $3
0x0041 	JNE	111
0x0043 	MOVQ	"".statictmp_2(SB), AX
0x004a 	MOVQ	AX, "".~r1+16(FP)
0x004f 	LEAQ	"".~r1+24(FP), DI
0x0054 	LEAQ	"".statictmp_2+8(SB), SI
0x005b 	DUFFCOPY	$854
0x006e 	RET
0x006f 	MOVQ	"".statictmp_3(SB), AX
0x0076 	MOVQ	AX, "".~r1+16(FP)
0x007b 	LEAQ	"".~r1+24(FP), DI
0x0080 	LEAQ	"".statictmp_3+8(SB), SI
0x0087 	DUFFCOPY	$854
0x009a 	RET
0x009b 	MOVQ	"".statictmp_1(SB), AX
0x00a2 	MOVQ	AX, "".~r1+16(FP)
0x00a7 	LEAQ	"".~r1+24(FP), DI
0x00ac 	LEAQ	"".statictmp_1+8(SB), SI
0x00b3 	DUFFCOPY	$854
0x00c6 	RET
0x00c7 	MOVQ	"".statictmp_0(SB), AX
0x00ce 	MOVQ	AX, "".~r1+16(FP)
0x00d3 	LEAQ	"".~r1+24(FP), DI
0x00d8 	LEAQ	"".statictmp_0+8(SB), SI
0x00df 	DUFFCOPY	$854
0x00f2 	RET

The compiler generates four separate statictmps for objectInfo{}. It could realize that they are the same value so just generate one statictmp and reuse it. Better, it could realize it is zero value therefore no need to generate statictmp.

If I rewrite it to something like (like @shurcooL's example)

func NoNamedReturnParams2(i int) (objectInfo) {
        var o objectInfo
        if i == 1 {
                // Do one thing
                return o
        }

        if i == 2 {
                // Do another thing
                return o
        }

        if i == 3 {
                // Do one more thing still
                return o
        }

        // Normal return
        return o
}

It generates

"".NoNamedReturnParams2 STEXT nosplit size=309 args=0x40 locals=0x40
	0x0000 00000 (r.go:29)	TEXT	"".NoNamedReturnParams2(SB), NOSPLIT, $64-64
	0x0000 00000 (r.go:29)	SUBQ	$64, SP
	0x0004 00004 (r.go:29)	MOVQ	BP, 56(SP)
	0x0009 00009 (r.go:29)	LEAQ	56(SP), BP
	0x000e 00014 (r.go:29)	FUNCDATA	$0, gclocals·1d0ed49f611d7e40a62328b5976a2ede(SB)
	0x000e 00014 (r.go:29)	FUNCDATA	$1, gclocals·98abd00825523461856b7cae0585bf34(SB)
	0x000e 00014 (r.go:29)	MOVQ	$0, "".~r1+80(SP)
	0x0017 00023 (r.go:29)	LEAQ	"".~r1+88(SP), DI
	0x001c 00028 (r.go:29)	XORPS	X0, X0
	0x001f 00031 (r.go:29)	ADDQ	$-16, DI
	0x0023 00035 (r.go:29)	DUFFZERO	$288
	0x0036 00054 (r.go:30)	MOVQ	$0, "".o(SP)
	0x003e 00062 (r.go:30)	LEAQ	"".o+8(SP), DI
	0x0043 00067 (r.go:30)	ADDQ	$-16, DI
	0x0047 00071 (r.go:30)	DUFFZERO	$288
	0x005a 00090 (r.go:29)	MOVQ	"".i+72(SP), AX
	0x005f 00095 (r.go:31)	CMPQ	AX, $1
	0x0063 00099 (r.go:31)	JEQ	261
	0x0069 00105 (r.go:36)	CMPQ	AX, $2
	0x006d 00109 (r.go:36)	JEQ	213
	0x006f 00111 (r.go:41)	CMPQ	AX, $3
	0x0073 00115 (r.go:41)	JNE	165
	0x0075 00117 (r.go:43)	MOVQ	"".o(SP), AX
	0x0079 00121 (r.go:43)	MOVQ	AX, "".~r1+80(SP)
	0x007e 00126 (r.go:43)	LEAQ	"".~r1+88(SP), DI
	0x0083 00131 (r.go:43)	LEAQ	"".o+8(SP), SI
	0x0088 00136 (r.go:43)	DUFFCOPY	$854
	0x009b 00155 (r.go:43)	MOVQ	56(SP), BP
	0x00a0 00160 (r.go:43)	ADDQ	$64, SP
	0x00a4 00164 (r.go:43)	RET
	0x00a5 00165 (r.go:47)	MOVQ	"".o(SP), AX
	0x00a9 00169 (r.go:47)	MOVQ	AX, "".~r1+80(SP)
	0x00ae 00174 (r.go:47)	LEAQ	"".~r1+88(SP), DI
	0x00b3 00179 (r.go:47)	LEAQ	"".o+8(SP), SI
	0x00b8 00184 (r.go:47)	DUFFCOPY	$854
	0x00cb 00203 (r.go:47)	MOVQ	56(SP), BP
	0x00d0 00208 (r.go:47)	ADDQ	$64, SP
	0x00d4 00212 (r.go:47)	RET
	0x00d5 00213 (r.go:38)	MOVQ	"".o(SP), AX
	0x00d9 00217 (r.go:38)	MOVQ	AX, "".~r1+80(SP)
	0x00de 00222 (r.go:38)	LEAQ	"".~r1+88(SP), DI
	0x00e3 00227 (r.go:38)	LEAQ	"".o+8(SP), SI
	0x00e8 00232 (r.go:38)	DUFFCOPY	$854
	0x00fb 00251 (r.go:38)	MOVQ	56(SP), BP
	0x0100 00256 (r.go:38)	ADDQ	$64, SP
	0x0104 00260 (r.go:38)	RET
	0x0105 00261 (r.go:33)	MOVQ	"".o(SP), AX
	0x0109 00265 (r.go:33)	MOVQ	AX, "".~r1+80(SP)
	0x010e 00270 (r.go:33)	LEAQ	"".~r1+88(SP), DI
	0x0113 00275 (r.go:33)	LEAQ	"".o+8(SP), SI
	0x0118 00280 (r.go:33)	DUFFCOPY	$854
	0x012b 00299 (r.go:33)	MOVQ	56(SP), BP
	0x0130 00304 (r.go:33)	ADDQ	$64, SP
	0x0134 00308 (r.go:33)	RET

So it makes a zero-valued local variable o, and copied to the return value in each branch. It could lift the copy up out of the branches. Better if it could use the return value slot for o. I guess this issue focuses on this problem, not the statictmp ones.

4ad · 2017-06-30T17:41:27Z

@robpike I use named return values, even though I lack permission.

josharian · 2017-07-01T00:03:00Z

Some minor observations.

If the returned types are small (in particular, small enough to be registerized), the generated code appears to already be identical.
If the returned types are large, and naked returns are only used in unlikely cases, then it is worse performance-wise to use a named return value, since it will add unnecessary zeroing in the common case.
Zeroing the return early (as triggered by named returns) could be worse for performance in some cases due to the cache impact of the zeroing.

So while the compiler doesn't do well on the example given, and should be improved, it's not obvious to me that named and unnamed returns should always generate the same code.

And in fact, I'd think using unnamed returns should always be a pareto performance improvement on named returns, since using named returns ties the compiler's hands in a way that unnamed returns do not. But, with @robpike, the primary consideration should be a judgment call about the readability of the code anyway. Except where critical to performance, there is significant danger of overfitting the compiler.

It could realize that they are the same value so just generate one statictmp and reuse it. Better, it could realize it is zero value therefore no need to generate statictmp.

Agreed on both counts. I'm surprised that there is a statictmp for this; I thought sinit was pretty good about recognizing zero values.

copied to the return value in each branch. It could lift the copy up out of the branches.

Yes, although why should it copy instead of just zeroing?

Better if it could use the return value slot for o.

Seems to me like that is the crux of this issue, although I'm not sure how feasible that is to implement.

fwessels · 2017-07-07T00:18:00Z

Just a few remarks:

Zeroing the return early (as triggered by named returns) could be worse for performance in some cases due to the cache impact of the zeroing.

Looking at NoNamedReturnParams it is calling DUFFZERO

"".NoNamedReturnParams t=1 size=243 args=0x40 locals=0x0
0x0000 	TEXT	"".NoNamedReturnParams(SB), $0-64
0x0000 	MOVQ	$0, "".~r1+16(FP)
0x0009 	LEAQ	"".~r1+24(FP), DI
0x000e 	XORPS	X0, X0
0x0011 	ADDQ	$-16, DI
0x0015 	DUFFZERO	$288

just like in NamedReturnParams, so it seems to be zeroing out the return value either way.:

"".NamedReturnParams t=1 size=67 args=0x40 locals=0x0
0x0000 	TEXT	"".NamedReturnParams(SB), $0-64
0x0000 	MOVQ	$0, "".oi+16(FP)
0x0009 	LEAQ	"".oi+24(FP), DI
0x000e 	XORPS	X0, X0
0x0011 	ADDQ	$-16, DI
0x0015 	DUFFZERO	$288

(and looking at NoNamedReturnParams2 as listed above, it is calling DUFFZERO twice:)

"".NoNamedReturnParams2 STEXT nosplit size=309 args=0x40 locals=0x40
0x000e 00014 (r.go:29)	MOVQ	$0, "".~r1+80(SP)
0x0017 00023 (r.go:29)	LEAQ	"".~r1+88(SP), DI
0x001c 00028 (r.go:29)	XORPS	X0, X0
0x001f 00031 (r.go:29)	ADDQ	$-16, DI
0x0023 00035 (r.go:29)	DUFFZERO	$288
0x0036 00054 (r.go:30)	MOVQ	$0, "".o(SP)
0x003e 00062 (r.go:30)	LEAQ	"".o+8(SP), DI
0x0043 00067 (r.go:30)	ADDQ	$-16, DI
0x0047 00071 (r.go:30)	DUFFZERO	$288

So while the compiler doesn't do well on the example given, and should be improved, it's not obvious to me that named and unnamed returns should always generate the same code.

Agreed, when using a named return value its content can be modified, so it is not possible to rely on the fact that it is not modified, so for a return objectInfo{} you may need to copy/zero anyway.

... since using named returns ties the compiler's hands in a way that unnamed returns do not.

Would think that this actually makes it easier, especially in combination with naked returns where, by definition, the (named) return values are already setup, so you can just RET out.

dr2chase · 2018-06-26T14:37:54Z

Not likely for 1.11, moved to 1.12.

odeke-em · 2019-02-04T00:59:38Z

Nor is it likely for Go1.12, I am moving this to Go1.13 /cc @randall77

bradfitz · 2021-05-26T18:35:17Z

I just hit this again when I sent https://go-review.googlesource.com/c/go/+/322329 to clean up some inconsistent godoc and it instead increased the size of stacks.

mdempsky · 2021-05-26T19:14:47Z

@bradfitz I think CL 322329 is a slightly different issue? I understand this issue to be that if a function has a named result parameter, but the name isn't used, then it should compile identically to if it was anonymous or blank instead. However, that CL not only removed the result parameter names, but also added new local variables to replace them.

In that particular case, we could optimize the code to directly allocate the ret variable in the result parameter slot on the stack. But in general with panic/defer, there can be differences. So I think the details of addressing this would be more complex.

josharian · 2021-05-26T19:19:00Z

I wonder whether we could use space reserved for unnamed return parameters as scratch space (assuming compatible types). That'd probably fix this mismatch and be a bonus optimization in other cases. cc @dr2chase

mdempsky · 2021-05-26T19:21:36Z

@josharian I think as long as panic and defer are handled correctly, that seems reasonable. I'd suggest as a starting point to not attempt that optimization in any function that contains a defer statement.

gopherbot · 2021-11-05T17:25:45Z

Change https://golang.org/cl/361674 mentions this issue: net/netip: optimize As4 and As16

name old time/op new time/op delta As16-8 2.88ns ± 3% 2.16ns ± 3% -25.19% (p=0.000 n=15+15) Fixes #49379 Updates #20859 Change-Id: If4cf58d19ed0e2ac0f179da5c132ed37061e4cb7 Reviewed-on: https://go-review.googlesource.com/c/go/+/361674 Trust: Josh Bleecher Snyder <josharian@gmail.com> Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

bradfitz added the NeedsFix The path to resolution is known, but the work has not been done. label Jun 30, 2017

bradfitz added this to the Go1.10 milestone Jun 30, 2017

aronatkins mentioned this issue Jul 17, 2017

cmd/compile: named return values produce functionally different code #21049

Closed

cherrymui modified the milestones: Go1.10, Go1.11 Nov 29, 2017

64kramsystem mentioned this issue Jan 2, 2018

Naming return values (?) goby-lang/goby#558

Open

TocarIP mentioned this issue Mar 21, 2018

cmd/compile: optimize large structs #24416

Open

cristaloleg mentioned this issue Jun 19, 2018

lint: unnamedResult should not warn on different named types go-critic/go-critic#259

Closed

dr2chase modified the milestones: Go1.11, Go1.12 Jun 26, 2018

samvantran mentioned this issue Aug 29, 2018

[DCOS-38138] Update Spark CLI for shell-escape fix mesosphere/spark-build#388

Merged

lysu mentioned this issue Sep 6, 2018

make tidb adaptor in errors. pingcap/errors#2

Merged

odeke-em modified the milestones: Go1.12, Go1.13 Feb 4, 2019

andybons modified the milestones: Go1.13, Go1.14 Jul 8, 2019

rsc removed this from the Go1.14 milestone Oct 9, 2019

rsc added this to the Backlog milestone Oct 9, 2019

dgryski mentioned this issue Aug 7, 2020

cmd/compile: named returns are faster than inline returns #40638

Closed

mdempsky mentioned this issue Jun 2, 2021

cmd/compile: explore optimizations for fixed-length array copies #46529

Open

randall77 mentioned this issue Sep 21, 2021

cmd/compile: inline-able function with named results performs worse #48507

Open

gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 13, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cmd/compile: generate same code regardless of whether return values are named #20859

cmd/compile: generate same code regardless of whether return values are named #20859

bradfitz commented Jun 30, 2017

robpike commented Jun 30, 2017

bradfitz commented Jun 30, 2017

dmitshur commented Jun 30, 2017 •

edited

fwessels commented Jun 30, 2017

robpike commented Jun 30, 2017

fwessels commented Jun 30, 2017

hanwen commented Jun 30, 2017

cherrymui commented Jun 30, 2017

4ad commented Jun 30, 2017

josharian commented Jul 1, 2017

fwessels commented Jul 7, 2017

dr2chase commented Jun 26, 2018

odeke-em commented Feb 4, 2019 •

edited

bradfitz commented May 26, 2021

mdempsky commented May 26, 2021

josharian commented May 26, 2021

mdempsky commented May 26, 2021

gopherbot commented Nov 5, 2021

cmd/compile: generate same code regardless of whether return values are named #20859

cmd/compile: generate same code regardless of whether return values are named #20859

Comments

bradfitz commented Jun 30, 2017

robpike commented Jun 30, 2017

bradfitz commented Jun 30, 2017

dmitshur commented Jun 30, 2017 • edited

fwessels commented Jun 30, 2017

robpike commented Jun 30, 2017

fwessels commented Jun 30, 2017

hanwen commented Jun 30, 2017

cherrymui commented Jun 30, 2017

4ad commented Jun 30, 2017

josharian commented Jul 1, 2017

fwessels commented Jul 7, 2017

dr2chase commented Jun 26, 2018

odeke-em commented Feb 4, 2019 • edited

bradfitz commented May 26, 2021

mdempsky commented May 26, 2021

josharian commented May 26, 2021

mdempsky commented May 26, 2021

gopherbot commented Nov 5, 2021

dmitshur commented Jun 30, 2017 •

edited

odeke-em commented Feb 4, 2019 •

edited