Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: unnecessary zeroing #15914

Closed
randall77 opened this issue May 31, 2016 · 8 comments
Closed

cmd/compile: unnecessary zeroing #15914

randall77 opened this issue May 31, 2016 · 8 comments

Comments

@randall77
Copy link
Contributor

var sink *[10]byte
func f() {
    var x [10]byte
    sink = &x
}

This function used to compile to something like:

  p = mallocgc(...)
  sink = p

since the liveness fix (https://go-review.googlesource.com/c/23393/), it now compiles to:

  p = mallocgc(...)
  ...code to zero *p...
  sink = p

Figure out why the redundant zeroing has been added and then get rid of it.
See #15902 .

@randall77 randall77 added this to the Go1.8 milestone May 31, 2016
@aclements
Copy link
Member

How often does this trigger? If it can happen on large allocations, this could nearly double the cost of allocation.

@randall77
Copy link
Contributor Author

I'm not sure when it happens, exactly. Maybe only escaping local declarations. Something to investigate.

@dr2chase
Copy link
Contributor

dr2chase commented Jun 1, 2016

FWIW, input to SSA is very different.
Good:

.   DCL l(6) tc(1)
.   .   NAME-foo.x u(2) g(1) l(6) x(0+0) class(PAUTO,heap) f(1) esc(h) tc(1) addrtaken assigned used(true) ARRAY-[10]byte

.   AS u(2) l(6) tc(1)
.   .   NAME-foo.x u(2) g(1) l(6) x(0+0) class(PAUTO,heap) f(1) esc(h) tc(1) addrtaken assigned used(true) ARRAY-[10]byte

.   ASWB u(100) l(7) tc(1)
.   .   NAME-foo.sink u(1) a(true) l(3) x(0+0) class(PEXTERN) tc(1) assigned PTR64-*[10]byte
.   .   ADDR u(2) l(7) esc(h) tc(1) PTR64-*[10]byte
.   .   .   NAME-foo.x u(2) g(1) l(6) x(0+0) class(PAUTO,heap) f(1) esc(h) tc(1) addrtaken assigned used(true) ARRAY-[10]byte

Bad:

.   AS u(100) l(6) colas(true) tc(1)
.   .   NAME-foo.&x u(1) a(true) l(5) x(0+0) class(PAUTO) esc(N) tc(1) assigned used(true) PTR64-*[10]byte
.   .   CALLFUNC u(100) l(6) tc(1) nonnil PTR64-*[10]byte
.   .   .   NAME-runtime.newobject u(1) a(true) x(0+0) class(PFUNC) tc(1) used(true) FUNC-func(*byte) *[10]byte
.   .   CALLFUNC-list
.   .   .   AS u(2) l(6) tc(1)
.   .   .   .   INDREG-SP a(true) l(6) x(0+0) tc(1) addrtaken runtime.typ·2 PTR64-*byte
.   .   .   .   ADDR u(2) a(true) l(6) tc(1) PTR64-*uint8
.   .   .   .   .   NAME-type.[10]uint8 u(1) a(true) l(6) x(0+0) class(PEXTERN) tc(1) uint8

.   AS u(2) l(6) tc(1)
.   .   IND u(2) l(6) tc(1) ARRAY-[10]byte
.   .   .   NAME-foo.&x u(1) a(true) l(5) x(0+0) class(PAUTO) esc(N) tc(1) assigned used(true) PTR64-*[10]byte

.   ASWB u(100) l(7) tc(1)
.   .   NAME-foo.sink u(1) a(true) l(3) x(0+0) class(PEXTERN) tc(1) assigned PTR64-*[10]byte
.   .   ADDR u(2) l(7) esc(h) tc(1) PTR64-*[10]byte
.   .   .   IND u(2) l(7) tc(1) ARRAY-[10]byte
.   .   .   .   NAME-foo.&x u(1) a(true) l(5) x(0+0) class(PAUTO) esc(N) tc(1) assigned used(true) PTR64-*[10]byte

@dr2chase
Copy link
Contributor

dr2chase commented Jun 1, 2016

Before walk, also before order:
Good:

before f
.   DCL l(6) tc(1)
.   .   NAME-foo.x u(2) g(1) l(6) x(0+0) class(PAUTO,heap) f(1) esc(h) tc(1) addrtaken assigned used(true) ARRAY-[10]byte

.   AS l(6) tc(1)
.   .   NAME-foo.x u(2) g(1) l(6) x(0+0) class(PAUTO,heap) f(1) esc(h) tc(1) addrtaken assigned used(true) ARRAY-[10]byte

.   AS l(7) tc(1)
.   .   NAME-foo.sink u(1) a(true) l(3) x(0+0) class(PEXTERN) tc(1) assigned PTR64-*[10]byte
.   .   ADDR l(7) esc(h) tc(1) PTR64-*[10]byte
.   .   .   NAME-foo.x u(2) g(1) l(6) x(0+0) class(PAUTO,heap) f(1) esc(h) tc(1) addrtaken assigned used(true) ARRAY-[10]byte

Bad:

before f
.   DCL l(6) tc(1)
.   .   NAME-foo.x u(2) a(true) g(1) l(6) x(0+0) class(PAUTOHEAP) f(1) esc(h) tc(1) addrtaken assigned used(true) ARRAY-[10]byte

.   AS l(6) tc(1)
.   .   NAME-foo.x u(2) a(true) g(1) l(6) x(0+0) class(PAUTOHEAP) f(1) esc(h) tc(1) addrtaken assigned used(true) ARRAY-[10]byte

.   AS l(7) tc(1)
.   .   NAME-foo.sink u(1) a(true) l(3) x(0+0) class(PEXTERN) tc(1) assigned PTR64-*[10]byte
.   .   ADDR l(7) esc(h) tc(1) PTR64-*[10]byte
.   .   .   NAME-foo.x u(2) a(true) g(1) l(6) x(0+0) class(PAUTOHEAP) f(1) esc(h) tc(1) addrtaken assigned used(true) ARRAY-[10]byte

"Bad" is "Addable".

@dr2chase
Copy link
Contributor

dr2chase commented Jun 1, 2016

And inputs to "escfunc" in escape analysis are identical.

@dr2chase
Copy link
Contributor

dr2chase commented Jun 1, 2016

Apparent difference appears during escape anaysis:
Good:

go build -gcflags '-m -m -m -m -m -W -W' z.go
# command-line-arguments
./z.go:5: can inline @"".f as: func() { var @"".x·1 [10]byte; ; @"".sink = &@"".x·1 }

before escfunc f
.   DCL l(6) tc(1)
.   .   NAME-foo.x u(1) a(true) g(1) l(6) x(0+0) class(PAUTO) f(1) tc(1) addrtaken assigned used(true) ARRAY-[10]byte

.   AS l(6) tc(1)
.   .   NAME-foo.x u(1) a(true) g(1) l(6) x(0+0) class(PAUTO) f(1) tc(1) addrtaken assigned used(true) ARRAY-[10]byte

.   AS l(7) tc(1)
.   .   NAME-foo.sink u(1) a(true) l(3) x(0+0) class(PEXTERN) tc(1) assigned PTR64-*[10]byte
.   .   ADDR l(7) tc(1) PTR64-*[10]byte
.   .   .   NAME-foo.x u(1) a(true) g(1) l(6) x(0+0) class(PAUTO) f(1) tc(1) addrtaken assigned used(true) ARRAY-[10]byte
./z.go:6:[1] f esc: x
./z.go:6:[1] f esc: var x [10]byte
./z.go:6:[1] f esc: x
./z.go:6:[1] f esc: x = <N>
./z.go:7:[1] f esc: sink
./z.go:7:[1] f esc: x
./z.go:7:[1] f esc: &x
./z.go:7:[1] f esc: sink = &x
./z.go:7:[1] f escassign: sink( l(3) class(PEXTERN) assigned)[NAME] = &x( l(7) esc(no) ld(1))[&]
./z.go:7::flows:: .sink <- &x

escflood:0: dst .sink scope:<S>[-1]
escwalk: level:{0 0} depth:0  op=& &x( l(7) esc(no) ld(1)) scope:f[1] extraloopdepth=-1
./z.go:6: moved to heap: x
./z.go:7: &x escapes to heap, level={0 0}, dst.eld=-1, src.eld=1
escwalk: level:{-1 -1} depth:1   op=NAME x( l(6) class(PAUTO,heap) f(1) esc(h) ld(1) addrtaken assigned) scope:f[1] extraloopdepth=1

before esctag f
.   DCL l(6) tc(1)
.   .   NAME-foo.x u(2) g(1) l(6) x(0+0) class(PAUTO,heap) f(1) esc(h) ld(1) tc(1) addrtaken assigned used(true) ARRAY-[10]byte

.   AS l(6) tc(1)
.   .   NAME-foo.x u(2) g(1) l(6) x(0+0) class(PAUTO,heap) f(1) esc(h) ld(1) tc(1) addrtaken assigned used(true) ARRAY-[10]byte

.   AS l(7) tc(1)
.   .   NAME-foo.sink u(1) a(true) l(3) x(0+0) class(PEXTERN) tc(1) assigned PTR64-*[10]byte
.   .   ADDR l(7) esc(h) ld(1) tc(1) PTR64-*[10]byte
.   .   .   NAME-foo.x u(2) g(1) l(6) x(0+0) class(PAUTO,heap) f(1) esc(h) ld(1) tc(1) addrtaken assigned used(true) ARRAY-[10]byte

Bad:

go build -gcflags '-m -m -m -m -m -W -W' z.go
# command-line-arguments
./z.go:5: can inline @"".f as: func() { var @"".x·1 [10]byte; ; @"".sink = &@"".x·1 }

before escfunc f
.   DCL l(6) tc(1)
.   .   NAME-foo.x u(1) a(true) g(1) l(6) x(0+0) class(PAUTO) f(1) tc(1) addrtaken assigned used(true) ARRAY-[10]byte

.   AS l(6) tc(1)
.   .   NAME-foo.x u(1) a(true) g(1) l(6) x(0+0) class(PAUTO) f(1) tc(1) addrtaken assigned used(true) ARRAY-[10]byte

.   AS l(7) tc(1)
.   .   NAME-foo.sink u(1) a(true) l(3) x(0+0) class(PEXTERN) tc(1) assigned PTR64-*[10]byte
.   .   ADDR l(7) tc(1) PTR64-*[10]byte
.   .   .   NAME-foo.x u(1) a(true) g(1) l(6) x(0+0) class(PAUTO) f(1) tc(1) addrtaken assigned used(true) ARRAY-[10]byte
./z.go:6:[1] f esc: x
./z.go:6:[1] f esc: var x [10]byte
./z.go:6:[1] f esc: x
./z.go:6:[1] f esc: x = <N>
./z.go:7:[1] f esc: sink
./z.go:7:[1] f esc: x
./z.go:7:[1] f esc: &x
./z.go:7:[1] f esc: sink = &x
./z.go:7:[1] f escassign: sink( l(3) class(PEXTERN) assigned)[NAME] = &x( l(7) esc(no) ld(1))[&]
./z.go:7::flows:: .sink <- &x

escflood:0: dst .sink scope:<S>[-1]
escwalk: level:{0 0} depth:0  op=& &x( l(7) esc(no) ld(1)) scope:f[1] extraloopdepth=-1
./z.go:7: &x escapes to heap, level={0 0}, dst=.sink dst.eld=-1, src.eld=1
./z.go:6: moved to heap: x
escwalk: level:{-1 -1} depth:1   op=NAME x( l(6) class(PAUTOHEAP) f(1) esc(h) ld(1) addrtaken assigned) scope:f[1] extraloopdepth=1

before esctag f
.   DCL l(6) tc(1)
.   .   NAME-foo.x u(2) a(true) g(1) l(6) x(0+0) class(PAUTOHEAP) f(1) esc(h) ld(1) tc(1) addrtaken assigned used(true) ARRAY-[10]byte

.   AS l(6) tc(1)
.   .   NAME-foo.x u(2) a(true) g(1) l(6) x(0+0) class(PAUTOHEAP) f(1) esc(h) ld(1) tc(1) addrtaken assigned used(true) ARRAY-[10]byte

.   AS l(7) tc(1)
.   .   NAME-foo.sink u(1) a(true) l(3) x(0+0) class(PEXTERN) tc(1) assigned PTR64-*[10]byte
.   .   ADDR l(7) esc(h) ld(1) tc(1) PTR64-*[10]byte
.   .   .   NAME-foo.x u(2) a(true) g(1) l(6) x(0+0) class(PAUTOHEAP) f(1) esc(h) ld(1) tc(1) addrtaken assigned used(true) ARRAY-[10]byte

@dr2chase
Copy link
Contributor

dr2chase commented Jun 1, 2016

Backing up and looking carefully, studying this on amd64 where it is easy, with "good" at 99d29d5 and "bad" at e29e0ba (plus a minor tweak, described below):

* e29e0ba (HEAD -> master, origin/master, origin/HEAD) cmd/compile: fix TestAssembly on Plan 9
...
* b6dc3e6 cmd/compile: fix liveness computation for heap-escaped parameters
* 99d29d5 path/filepath: fix globbing of c:\*dir\... pattern

With SSA, the extra zeroing always occurs, before and after the problematic CL (https://go-review.googlesource.com/c/23393/).
With SSA disabled (GOSSAHASH=n), the zeroing only occurs after the problematic CL, same as on ARM.
The extra zeroing is NOT dependent on difference in the Addable bit, which can be "fixed" here:

--- a/src/cmd/compile/internal/gc/gen.go
+++ b/src/cmd/compile/internal/gc/gen.go
@@ -186,6 +186,7 @@ func moveToHeap(n *Node) {

        // Modify n in place so that uses of n now mean indirection of the heapaddr.
        n.Class = PAUTOHEAP
+       n.Addable = false
        n.Ullman = 2
        n.Xoffset = 0
        n.Name.Heapaddr = heapaddr

Here are the differences in generated assembly, with "bad" including the Addable "fix":

diff good.s bad.s
2c2
< "".f t=1 size=112 args=0x0 locals=0x18
---
> "".f t=1 size=128 args=0x0 locals=0x18
6c6
<   0x000d 00013 (/Users/drchase/GoogleDrive/work/tmp/z.go:5)   JLS 99
---
>   0x000d 00013 (/Users/drchase/GoogleDrive/work/tmp/z.go:5)   JLS 114
16,32c16,36
<   0x002d 00045 (/Users/drchase/GoogleDrive/work/tmp/z.go:6)   MOVQ    8(SP), BX
<   0x0032 00050 (/Users/drchase/GoogleDrive/work/tmp/z.go:7)   CMPB    runtime.writeBarrier(SB), $0
<   0x0039 00057 (/Users/drchase/GoogleDrive/work/tmp/z.go:7)   JNE $0, 76
<   0x003b 00059 (/Users/drchase/GoogleDrive/work/tmp/z.go:7)   MOVQ    BX, "".sink(SB)
<   0x0042 00066 (/Users/drchase/GoogleDrive/work/tmp/z.go:8)   MOVQ    16(SP), BP
<   0x0047 00071 (/Users/drchase/GoogleDrive/work/tmp/z.go:8)   ADDQ    $24, SP
<   0x004b 00075 (/Users/drchase/GoogleDrive/work/tmp/z.go:8)   RET
<   0x004c 00076 (/Users/drchase/GoogleDrive/work/tmp/z.go:7)   LEAQ    "".sink(SB), R8
<   0x0053 00083 (/Users/drchase/GoogleDrive/work/tmp/z.go:7)   MOVQ    R8, (SP)
<   0x0057 00087 (/Users/drchase/GoogleDrive/work/tmp/z.go:7)   MOVQ    BX, 8(SP)
<   0x005c 00092 (/Users/drchase/GoogleDrive/work/tmp/z.go:7)   PCDATA  $0, $0
<   0x005c 00092 (/Users/drchase/GoogleDrive/work/tmp/z.go:7)   CALL    runtime.writebarrierptr(SB)
<   0x0061 00097 (/Users/drchase/GoogleDrive/work/tmp/z.go:8)   JMP 66
<   0x0063 00099 (/Users/drchase/GoogleDrive/work/tmp/z.go:8)   NOP
<   0x0063 00099 (/Users/drchase/GoogleDrive/work/tmp/z.go:5)   CALL    runtime.morestack_noctxt(SB)
<   0x0068 00104 (/Users/drchase/GoogleDrive/work/tmp/z.go:5)   JMP 0
<   0x0000 65 48 8b 0c 25 00 00 00 00 48 3b 61 10 76 54 48  eH..%....H;a.vTH
---
>   0x002d 00045 (/Users/drchase/GoogleDrive/work/tmp/z.go:6)   MOVQ    8(SP), AX
>   0x0032 00050 (/Users/drchase/GoogleDrive/work/tmp/z.go:6)   NOP
>   0x0032 00050 (/Users/drchase/GoogleDrive/work/tmp/z.go:6)   MOVQ    $0, (AX)
>   0x0039 00057 (/Users/drchase/GoogleDrive/work/tmp/z.go:6)   MOVQ    $0, 2(AX)
>   0x0041 00065 (/Users/drchase/GoogleDrive/work/tmp/z.go:7)   NOP
>   0x0041 00065 (/Users/drchase/GoogleDrive/work/tmp/z.go:7)   CMPB    runtime.writeBarrier(SB), $0
>   0x0048 00072 (/Users/drchase/GoogleDrive/work/tmp/z.go:7)   JNE $0, 91
>   0x004a 00074 (/Users/drchase/GoogleDrive/work/tmp/z.go:7)   MOVQ    AX, "".sink(SB)
>   0x0051 00081 (/Users/drchase/GoogleDrive/work/tmp/z.go:8)   MOVQ    16(SP), BP
>   0x0056 00086 (/Users/drchase/GoogleDrive/work/tmp/z.go:8)   ADDQ    $24, SP
>   0x005a 00090 (/Users/drchase/GoogleDrive/work/tmp/z.go:8)   RET
>   0x005b 00091 (/Users/drchase/GoogleDrive/work/tmp/z.go:7)   LEAQ    "".sink(SB), R8
>   0x0062 00098 (/Users/drchase/GoogleDrive/work/tmp/z.go:7)   MOVQ    R8, (SP)
>   0x0066 00102 (/Users/drchase/GoogleDrive/work/tmp/z.go:7)   MOVQ    AX, 8(SP)
>   0x006b 00107 (/Users/drchase/GoogleDrive/work/tmp/z.go:7)   PCDATA  $0, $0
>   0x006b 00107 (/Users/drchase/GoogleDrive/work/tmp/z.go:7)   CALL    runtime.writebarrierptr(SB)
>   0x0070 00112 (/Users/drchase/GoogleDrive/work/tmp/z.go:8)   JMP 81
>   0x0072 00114 (/Users/drchase/GoogleDrive/work/tmp/z.go:8)   NOP
>   0x0072 00114 (/Users/drchase/GoogleDrive/work/tmp/z.go:5)   CALL    runtime.morestack_noctxt(SB)
>   0x0077 00119 (/Users/drchase/GoogleDrive/work/tmp/z.go:5)   JMP 0
(etc)

With SSA backend, the generated assembly language is nearly identical, differing only in line numbers:

diff good.s bad.s
12,16c12,16
<   0x001d 00029 (/Users/drchase/GoogleDrive/work/tmp/z.go:7)   LEAQ    type.[10]uint8(SB), AX
<   0x0024 00036 (/Users/drchase/GoogleDrive/work/tmp/z.go:7)   MOVQ    AX, (SP)
<   0x0028 00040 (/Users/drchase/GoogleDrive/work/tmp/z.go:7)   PCDATA  $0, $0
<   0x0028 00040 (/Users/drchase/GoogleDrive/work/tmp/z.go:7)   CALL    runtime.newobject(SB)
<   0x002d 00045 (/Users/drchase/GoogleDrive/work/tmp/z.go:7)   MOVQ    8(SP), AX
---
>   0x001d 00029 (/Users/drchase/GoogleDrive/work/tmp/z.go:6)   LEAQ    type.[10]uint8(SB), AX
>   0x0024 00036 (/Users/drchase/GoogleDrive/work/tmp/z.go:6)   MOVQ    AX, (SP)
>   0x0028 00040 (/Users/drchase/GoogleDrive/work/tmp/z.go:6)   PCDATA  $0, $0
>   0x0028 00040 (/Users/drchase/GoogleDrive/work/tmp/z.go:6)   CALL    runtime.newobject(SB)
>   0x002d 00045 (/Users/drchase/GoogleDrive/work/tmp/z.go:6)   MOVQ    8(SP), AX

@gopherbot
Copy link

CL https://golang.org/cl/27930 mentions this issue.

@golang golang locked and limited conversation to collaborators Aug 30, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants