Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: incorrect liveness map during init of 1-word composite literal #8036

Closed
rsc opened this issue May 20, 2014 · 10 comments
Closed

runtime: incorrect liveness map during init of 1-word composite literal #8036

rsc opened this issue May 20, 2014 · 10 comments
Milestone

Comments

@rsc
Copy link
Contributor

rsc commented May 20, 2014

g% cat x.go
package main

type T struct {
    P *int
}

func F() (t T) {
    t = T{P: new(int)}
    F()
    return
}
g% go tool 6g -live x.go
x.go:8: live at call to new: t
x.go:9: live at call to F: t
g% 

't' has not been initialized at the call to new and therefore should not be marked live.
This is causing crashes in a test case reported offline.
@DanielMorsing
Copy link
Contributor

Comment 1:

Don't we set the memory of a target for a complit to zero and then fill in the fields?
We need to consider it live, since there can be multiple calls in initializing and it
can be partially filled in. If we don't consider it live, then those references could be
collected.

@rsc
Copy link
Contributor Author

rsc commented May 28, 2014

Comment 3:

Issue #8100 has been merged into this issue.

@randall77
Copy link
Contributor

Comment 4:

It looks like the memory for t is initialized before the call to new:
...
    0x002b 00043 (issue8036.go:8)   MOVQ    "".statictmp_0000+0(SB),BX
    0x0033 00051 (issue8036.go:8)   MOVQ    BX,"".t+24(FP)
    0x0038 00056 (issue8036.go:8)   MOVQ    $type.int+0(SB),(SP)
    0x0040 00064 (issue8036.go:8)   PCDATA  $0,$16
    0x0040 00064 (issue8036.go:8)   PCDATA  $1,$1
    0x0040 00064 (issue8036.go:8)   CALL    ,runtime.new(SB)
...
So although it is maybe inefficient to do it this way, GC shouldn't fail - t is properly
initialized at the call to new.

@rsc
Copy link
Contributor Author

rsc commented May 28, 2014

Comment 5:

Re #4, are you using -N? 
What I see with -N:
    000000 00000 (/tmp/x.go:7)  TEXT    "".F+0(SB),$16-8
    000000 00000 (/tmp/x.go:7)  MOVQ    (TLS),CX
    0x0009 00009 (/tmp/x.go:7)  CMPQ    SP,(CX)
    0x000c 00012 (/tmp/x.go:7)  JHI ,21
    0x000e 00014 (/tmp/x.go:7)  CALL    ,runtime.morestack8_noctxt(SB)
    0x0013 00019 (/tmp/x.go:7)  JMP ,0
    0x0015 00021 (/tmp/x.go:7)  SUBQ    $16,SP
    0x0019 00025 (/tmp/x.go:7)  FUNCDATA    $2,gclocals·e1ae6533a9e39048ba0735a2264ce16a+0(SB)
    0x0019 00025 (/tmp/x.go:7)  FUNCDATA    $3,gclocals·0115f8d53b75c1696444f08ad03251d9+0(SB)
    0x0019 00025 (/tmp/x.go:7)  MOVQ    $0,"".t+24(FP)
    0x0022 00034 (/tmp/x.go:7)  MOVQ    $0,"".t+24(FP)
    0x002b 00043 (/tmp/x.go:8)  MOVQ    "".statictmp_0000+0(SB),BX
    0x0033 00051 (/tmp/x.go:8)  MOVQ    BX,"".t+24(FP)
    0x0038 00056 (/tmp/x.go:8)  MOVQ    $type.int+0(SB),(SP)
    0x0040 00064 (/tmp/x.go:8)  PCDATA  $0,$16
    0x0040 00064 (/tmp/x.go:8)  PCDATA  $1,$1
    0x0040 00064 (/tmp/x.go:8)  CALL    ,runtime.new(SB)
What I see normally:
    000000 00000 (/tmp/x.go:7)  TEXT    "".F+0(SB),$16-8
    000000 00000 (/tmp/x.go:7)  MOVQ    (TLS),CX
    0x0009 00009 (/tmp/x.go:7)  CMPQ    SP,(CX)
    0x000c 00012 (/tmp/x.go:7)  JHI ,21
    0x000e 00014 (/tmp/x.go:7)  CALL    ,runtime.morestack8_noctxt(SB)
    0x0013 00019 (/tmp/x.go:7)  JMP ,0
    0x0015 00021 (/tmp/x.go:7)  SUBQ    $16,SP
    0x0019 00025 (/tmp/x.go:7)  FUNCDATA    $2,gclocals·e1ae6533a9e39048ba0735a2264ce16a+0(SB)
    0x0019 00025 (/tmp/x.go:7)  FUNCDATA    $3,gclocals·0115f8d53b75c1696444f08ad03251d9+0(SB)
    0x0019 00025 (/tmp/x.go:8)  MOVQ    "".statictmp_0000+0(SB),BX
    0x0021 00033 (/tmp/x.go:8)  MOVQ    $type.int+0(SB),(SP)
    0x0029 00041 (/tmp/x.go:8)  PCDATA  $0,$16
    0x0029 00041 (/tmp/x.go:8)  PCDATA  $1,$1
    0x0029 00041 (/tmp/x.go:8)  CALL    ,runtime.new(SB)
The single-word value t+24(FP) is being registerized and the optimizer can see it's dead
even though the liveness cannot. We could change the optimizer to match the liveness or
we could change the liveness to match the optimizer.
I've done some experiments to understand what other cases are broken. It's more than
just composite literals - not surprising since they just compile into simpler code.
Any aggregate (struct, array) that can be fully registerized due to use of componentgen
for assignment has this problem. For example, 4-word structs registerize on 6g, so this
program demonstrates the same issue:
package main
type T struct {
    P *int
    Q *int
    R *int
    S *int
}
func F() (t T) {
    t.P = new(int)
    t.Q = t.P
    t.R = t.P
    t.S = t.P
    F()
    return
}
t is live at the call to new but all the fields are set afterward so the optimizer
eliminates all the earlier stores.
If you drop any of the assignments, so that some field in t is live across the call,
then all fields will be preserved by the registerizer - there is already code to expand
'one' to 'all', more for multiword values like slices or interfaces than for this case.
But if none are live, that code doesn't trigger.
Arrays are the same:
package main
type T [4]*int
func F() (t T) {
    t[0] = new(int)
    t[1] = t[0]
    t[2] = t[0]
    t[3] = t[0]
    F()
    return
}
Given this, it seems like we must change the optimizer to match the liveness, except
that I think it is hard for the optimizer to guess what the liveness is going to decide,
and the liveness runs last (because it must record actual stack locations, which are
influenced by the optimizer's passes).
Perhaps the optimizer can push a VARDEF for a variable forward to the next mention of
the variable.
Still thinking.

@rsc
Copy link
Contributor Author

rsc commented May 29, 2014

Comment 6:

CL https://golang.org/cl/99660044 fixes this but I think it's too slow to
commit (anywhere from 6% to 30% on BenchmarkJSONDecode depending on the run). Still
thinking.

@gopherbot
Copy link

Comment 7:

CL https://golang.org/cl/101980043 mentions this issue.

@randall77
Copy link
Contributor

Comment 8:

RE #4, perhaps you are right, I can't now reconstruct what I did to get that proper
initialization.
Take a look at CL 101980043, I think it fixes both this and 8057.  It finds all the
assignments that "implement" the vardef and prevents regopt from removing them.

@gopherbot
Copy link

Comment 9:

CL https://golang.org/cl/99660044 mentions this issue.

@rsc
Copy link
Contributor Author

rsc commented May 30, 2014

Comment 10:

Issue #8057 has been merged into this issue.

@rsc
Copy link
Contributor Author

rsc commented May 30, 2014

Comment 11:

This issue was closed by revision 1afbceb.

Status changed to Fixed.

@rsc rsc added fixed labels May 30, 2014
@rsc rsc added this to the Go1.3 milestone Apr 14, 2015
@rsc rsc removed the release-go1.3 label Apr 14, 2015
@golang golang locked and limited conversation to collaborators Jun 25, 2016
wheatman pushed a commit to wheatman/go-akaros that referenced this issue Jun 25, 2018
This CL forces the optimizer to preserve some memory stores
that would be redundant except that a stack scan due to garbage
collection or stack copying might look at them during a function call.
As such, it forces additional memory writes and therefore slows
down the execution of some programs, especially garbage-heavy
programs that are already limited by memory bandwidth.

The slowdown can be as much as 7% for end-to-end benchmarks.

These numbers are from running go1.test -test.benchtime=5s three times,
taking the best (lowest) ns/op for each benchmark. I am excluding
benchmarks with time/op < 10us to focus on macro effects.
All benchmarks are on amd64.

Comparing tip (a27f34c771cb) against this CL on an Intel Core i5 MacBook Pro:

benchmark                          old ns/op      new ns/op      delta
BenchmarkBinaryTree17              3876500413     3856337341     -0.52%
BenchmarkFannkuch11                2965104777     2991182127     +0.88%
BenchmarkGobDecode                 8563026        8788340        +2.63%
BenchmarkGobEncode                 5050608        5267394        +4.29%
BenchmarkGzip                      431191816      434168065      +0.69%
BenchmarkGunzip                    107873523      110563792      +2.49%
BenchmarkHTTPClientServer          85036          86131          +1.29%
BenchmarkJSONEncode                22143764       22501647       +1.62%
BenchmarkJSONDecode                79646916       85658808       +7.55%
BenchmarkMandelbrot200             4720421        4700108        -0.43%
BenchmarkGoParse                   4651575        4712247        +1.30%
BenchmarkRegexpMatchMedium_1K      71986          73490          +2.09%
BenchmarkRegexpMatchHard_1K        111018         117495         +5.83%
BenchmarkRevcomp                   648798723      659352759      +1.63%
BenchmarkTemplate                  112673009      112819078      +0.13%

Comparing tip (a27f34c771cb) against this CL on an Intel Xeon E5520:

BenchmarkBinaryTree17              5461110720     5393104469     -1.25%
BenchmarkFannkuch11                4314677151     4327177615     +0.29%
BenchmarkGobDecode                 11065853       11235272       +1.53%
BenchmarkGobEncode                 6500065        6959837        +7.07%
BenchmarkGzip                      647478596      671769097      +3.75%
BenchmarkGunzip                    139348579      141096376      +1.25%
BenchmarkHTTPClientServer          69376          73610          +6.10%
BenchmarkJSONEncode                30172320       31796106       +5.38%
BenchmarkJSONDecode                113704905      114239137      +0.47%
BenchmarkMandelbrot200             6032730        6003077        -0.49%
BenchmarkGoParse                   6775251        6405995        -5.45%
BenchmarkRegexpMatchMedium_1K      111832         113895         +1.84%
BenchmarkRegexpMatchHard_1K        161112         168420         +4.54%
BenchmarkRevcomp                   876363406      892319935      +1.82%
BenchmarkTemplate                  146273096      148998339      +1.86%

Just to get a sense of where we are compared to the previous release,
here are the same benchmarks comparing Go 1.2 to this CL.

Comparing Go 1.2 against this CL on an Intel Core i5 MacBook Pro:

BenchmarkBinaryTree17              4370077662     3856337341     -11.76%
BenchmarkFannkuch11                3347052657     2991182127     -10.63%
BenchmarkGobDecode                 8791384        8788340        -0.03%
BenchmarkGobEncode                 4968759        5267394        +6.01%
BenchmarkGzip                      437815669      434168065      -0.83%
BenchmarkGunzip                    94604099       110563792      +16.87%
BenchmarkHTTPClientServer          87798          86131          -1.90%
BenchmarkJSONEncode                22818243       22501647       -1.39%
BenchmarkJSONDecode                97182444       85658808       -11.86%
BenchmarkMandelbrot200             4733516        4700108        -0.71%
BenchmarkGoParse                   5054384        4712247        -6.77%
BenchmarkRegexpMatchMedium_1K      67612          73490          +8.69%
BenchmarkRegexpMatchHard_1K        107321         117495         +9.48%
BenchmarkRevcomp                   733270055      659352759      -10.08%
BenchmarkTemplate                  109304977      112819078      +3.21%

Comparing Go 1.2 against this CL on an Intel Xeon E5520:

BenchmarkBinaryTree17              5986953594     5393104469     -9.92%
BenchmarkFannkuch11                4861139174     4327177615     -10.98%
BenchmarkGobDecode                 11830997       11235272       -5.04%
BenchmarkGobEncode                 6608722        6959837        +5.31%
BenchmarkGzip                      661875826      671769097      +1.49%
BenchmarkGunzip                    138630019      141096376      +1.78%
BenchmarkHTTPClientServer          71534          73610          +2.90%
BenchmarkJSONEncode                30393609       31796106       +4.61%
BenchmarkJSONDecode                139645860      114239137      -18.19%
BenchmarkMandelbrot200             5988660        6003077        +0.24%
BenchmarkGoParse                   6974092        6405995        -8.15%
BenchmarkRegexpMatchMedium_1K      111331         113895         +2.30%
BenchmarkRegexpMatchHard_1K        165961         168420         +1.48%
BenchmarkRevcomp                   995049292      892319935      -10.32%
BenchmarkTemplate                  145623363      148998339      +2.32%

Fixes golang#8036.

LGTM=khr
R=golang-codereviews, josharian, khr
CC=golang-codereviews, iant, r
https://golang.org/cl/99660044
This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants