-
Notifications
You must be signed in to change notification settings - Fork 18k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runtime: incorrect liveness map during init of 1-word composite literal #8036
Labels
Milestone
Comments
Issue #8100 has been merged into this issue. |
It looks like the memory for t is initialized before the call to new: ... 0x002b 00043 (issue8036.go:8) MOVQ "".statictmp_0000+0(SB),BX 0x0033 00051 (issue8036.go:8) MOVQ BX,"".t+24(FP) 0x0038 00056 (issue8036.go:8) MOVQ $type.int+0(SB),(SP) 0x0040 00064 (issue8036.go:8) PCDATA $0,$16 0x0040 00064 (issue8036.go:8) PCDATA $1,$1 0x0040 00064 (issue8036.go:8) CALL ,runtime.new(SB) ... So although it is maybe inefficient to do it this way, GC shouldn't fail - t is properly initialized at the call to new. |
Re #4, are you using -N? What I see with -N: 000000 00000 (/tmp/x.go:7) TEXT "".F+0(SB),$16-8 000000 00000 (/tmp/x.go:7) MOVQ (TLS),CX 0x0009 00009 (/tmp/x.go:7) CMPQ SP,(CX) 0x000c 00012 (/tmp/x.go:7) JHI ,21 0x000e 00014 (/tmp/x.go:7) CALL ,runtime.morestack8_noctxt(SB) 0x0013 00019 (/tmp/x.go:7) JMP ,0 0x0015 00021 (/tmp/x.go:7) SUBQ $16,SP 0x0019 00025 (/tmp/x.go:7) FUNCDATA $2,gclocals·e1ae6533a9e39048ba0735a2264ce16a+0(SB) 0x0019 00025 (/tmp/x.go:7) FUNCDATA $3,gclocals·0115f8d53b75c1696444f08ad03251d9+0(SB) 0x0019 00025 (/tmp/x.go:7) MOVQ $0,"".t+24(FP) 0x0022 00034 (/tmp/x.go:7) MOVQ $0,"".t+24(FP) 0x002b 00043 (/tmp/x.go:8) MOVQ "".statictmp_0000+0(SB),BX 0x0033 00051 (/tmp/x.go:8) MOVQ BX,"".t+24(FP) 0x0038 00056 (/tmp/x.go:8) MOVQ $type.int+0(SB),(SP) 0x0040 00064 (/tmp/x.go:8) PCDATA $0,$16 0x0040 00064 (/tmp/x.go:8) PCDATA $1,$1 0x0040 00064 (/tmp/x.go:8) CALL ,runtime.new(SB) What I see normally: 000000 00000 (/tmp/x.go:7) TEXT "".F+0(SB),$16-8 000000 00000 (/tmp/x.go:7) MOVQ (TLS),CX 0x0009 00009 (/tmp/x.go:7) CMPQ SP,(CX) 0x000c 00012 (/tmp/x.go:7) JHI ,21 0x000e 00014 (/tmp/x.go:7) CALL ,runtime.morestack8_noctxt(SB) 0x0013 00019 (/tmp/x.go:7) JMP ,0 0x0015 00021 (/tmp/x.go:7) SUBQ $16,SP 0x0019 00025 (/tmp/x.go:7) FUNCDATA $2,gclocals·e1ae6533a9e39048ba0735a2264ce16a+0(SB) 0x0019 00025 (/tmp/x.go:7) FUNCDATA $3,gclocals·0115f8d53b75c1696444f08ad03251d9+0(SB) 0x0019 00025 (/tmp/x.go:8) MOVQ "".statictmp_0000+0(SB),BX 0x0021 00033 (/tmp/x.go:8) MOVQ $type.int+0(SB),(SP) 0x0029 00041 (/tmp/x.go:8) PCDATA $0,$16 0x0029 00041 (/tmp/x.go:8) PCDATA $1,$1 0x0029 00041 (/tmp/x.go:8) CALL ,runtime.new(SB) The single-word value t+24(FP) is being registerized and the optimizer can see it's dead even though the liveness cannot. We could change the optimizer to match the liveness or we could change the liveness to match the optimizer. I've done some experiments to understand what other cases are broken. It's more than just composite literals - not surprising since they just compile into simpler code. Any aggregate (struct, array) that can be fully registerized due to use of componentgen for assignment has this problem. For example, 4-word structs registerize on 6g, so this program demonstrates the same issue: package main type T struct { P *int Q *int R *int S *int } func F() (t T) { t.P = new(int) t.Q = t.P t.R = t.P t.S = t.P F() return } t is live at the call to new but all the fields are set afterward so the optimizer eliminates all the earlier stores. If you drop any of the assignments, so that some field in t is live across the call, then all fields will be preserved by the registerizer - there is already code to expand 'one' to 'all', more for multiword values like slices or interfaces than for this case. But if none are live, that code doesn't trigger. Arrays are the same: package main type T [4]*int func F() (t T) { t[0] = new(int) t[1] = t[0] t[2] = t[0] t[3] = t[0] F() return } Given this, it seems like we must change the optimizer to match the liveness, except that I think it is hard for the optimizer to guess what the liveness is going to decide, and the liveness runs last (because it must record actual stack locations, which are influenced by the optimizer's passes). Perhaps the optimizer can push a VARDEF for a variable forward to the next mention of the variable. Still thinking. |
CL https://golang.org/cl/99660044 fixes this but I think it's too slow to commit (anywhere from 6% to 30% on BenchmarkJSONDecode depending on the run). Still thinking. |
CL https://golang.org/cl/101980043 mentions this issue. |
CL https://golang.org/cl/99660044 mentions this issue. |
Issue #8057 has been merged into this issue. |
This issue was closed by revision 1afbceb. Status changed to Fixed. |
This was referenced Dec 8, 2014
wheatman
pushed a commit
to wheatman/go-akaros
that referenced
this issue
Jun 25, 2018
This CL forces the optimizer to preserve some memory stores that would be redundant except that a stack scan due to garbage collection or stack copying might look at them during a function call. As such, it forces additional memory writes and therefore slows down the execution of some programs, especially garbage-heavy programs that are already limited by memory bandwidth. The slowdown can be as much as 7% for end-to-end benchmarks. These numbers are from running go1.test -test.benchtime=5s three times, taking the best (lowest) ns/op for each benchmark. I am excluding benchmarks with time/op < 10us to focus on macro effects. All benchmarks are on amd64. Comparing tip (a27f34c771cb) against this CL on an Intel Core i5 MacBook Pro: benchmark old ns/op new ns/op delta BenchmarkBinaryTree17 3876500413 3856337341 -0.52% BenchmarkFannkuch11 2965104777 2991182127 +0.88% BenchmarkGobDecode 8563026 8788340 +2.63% BenchmarkGobEncode 5050608 5267394 +4.29% BenchmarkGzip 431191816 434168065 +0.69% BenchmarkGunzip 107873523 110563792 +2.49% BenchmarkHTTPClientServer 85036 86131 +1.29% BenchmarkJSONEncode 22143764 22501647 +1.62% BenchmarkJSONDecode 79646916 85658808 +7.55% BenchmarkMandelbrot200 4720421 4700108 -0.43% BenchmarkGoParse 4651575 4712247 +1.30% BenchmarkRegexpMatchMedium_1K 71986 73490 +2.09% BenchmarkRegexpMatchHard_1K 111018 117495 +5.83% BenchmarkRevcomp 648798723 659352759 +1.63% BenchmarkTemplate 112673009 112819078 +0.13% Comparing tip (a27f34c771cb) against this CL on an Intel Xeon E5520: BenchmarkBinaryTree17 5461110720 5393104469 -1.25% BenchmarkFannkuch11 4314677151 4327177615 +0.29% BenchmarkGobDecode 11065853 11235272 +1.53% BenchmarkGobEncode 6500065 6959837 +7.07% BenchmarkGzip 647478596 671769097 +3.75% BenchmarkGunzip 139348579 141096376 +1.25% BenchmarkHTTPClientServer 69376 73610 +6.10% BenchmarkJSONEncode 30172320 31796106 +5.38% BenchmarkJSONDecode 113704905 114239137 +0.47% BenchmarkMandelbrot200 6032730 6003077 -0.49% BenchmarkGoParse 6775251 6405995 -5.45% BenchmarkRegexpMatchMedium_1K 111832 113895 +1.84% BenchmarkRegexpMatchHard_1K 161112 168420 +4.54% BenchmarkRevcomp 876363406 892319935 +1.82% BenchmarkTemplate 146273096 148998339 +1.86% Just to get a sense of where we are compared to the previous release, here are the same benchmarks comparing Go 1.2 to this CL. Comparing Go 1.2 against this CL on an Intel Core i5 MacBook Pro: BenchmarkBinaryTree17 4370077662 3856337341 -11.76% BenchmarkFannkuch11 3347052657 2991182127 -10.63% BenchmarkGobDecode 8791384 8788340 -0.03% BenchmarkGobEncode 4968759 5267394 +6.01% BenchmarkGzip 437815669 434168065 -0.83% BenchmarkGunzip 94604099 110563792 +16.87% BenchmarkHTTPClientServer 87798 86131 -1.90% BenchmarkJSONEncode 22818243 22501647 -1.39% BenchmarkJSONDecode 97182444 85658808 -11.86% BenchmarkMandelbrot200 4733516 4700108 -0.71% BenchmarkGoParse 5054384 4712247 -6.77% BenchmarkRegexpMatchMedium_1K 67612 73490 +8.69% BenchmarkRegexpMatchHard_1K 107321 117495 +9.48% BenchmarkRevcomp 733270055 659352759 -10.08% BenchmarkTemplate 109304977 112819078 +3.21% Comparing Go 1.2 against this CL on an Intel Xeon E5520: BenchmarkBinaryTree17 5986953594 5393104469 -9.92% BenchmarkFannkuch11 4861139174 4327177615 -10.98% BenchmarkGobDecode 11830997 11235272 -5.04% BenchmarkGobEncode 6608722 6959837 +5.31% BenchmarkGzip 661875826 671769097 +1.49% BenchmarkGunzip 138630019 141096376 +1.78% BenchmarkHTTPClientServer 71534 73610 +2.90% BenchmarkJSONEncode 30393609 31796106 +4.61% BenchmarkJSONDecode 139645860 114239137 -18.19% BenchmarkMandelbrot200 5988660 6003077 +0.24% BenchmarkGoParse 6974092 6405995 -8.15% BenchmarkRegexpMatchMedium_1K 111331 113895 +2.30% BenchmarkRegexpMatchHard_1K 165961 168420 +1.48% BenchmarkRevcomp 995049292 892319935 -10.32% BenchmarkTemplate 145623363 148998339 +2.32% Fixes golang#8036. LGTM=khr R=golang-codereviews, josharian, khr CC=golang-codereviews, iant, r https://golang.org/cl/99660044
This issue was closed.
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
The text was updated successfully, but these errors were encountered: