Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: split stack overflow and broken stack traces #5723

Closed
rsc opened this issue Jun 17, 2013 · 15 comments
Closed

runtime: split stack overflow and broken stack traces #5723

rsc opened this issue Jun 17, 2013 · 15 comments

Comments

@rsc
Copy link
Contributor

rsc commented Jun 17, 2013

Rob reports a split stack overflow in the mail below.

Notice that the first goroutine in the trace says it is in semacquire. The first stack
should always be the running goroutine, so I'd like to know, in addition to the split
stack problem, how it is that a goroutine stuck in semacquire can possibly be splitting
its stack. 

That is, in addition to the split stack issue, stack traces are not working correctly.

I have been seeing similar issues recently in my own programming, where index checks
fail to report the stack trace of the goroutine with the out-of-bounds index.  I was in
the middle of other things and just fixed the indexes (by guessing where the bug was!)
and now I can't reproduce it.

I am also staring right now at a call to runtime.Stack in an internal Google program
using Go 1.1 (NOT tip) that is reporting its stack as being in a system call. At the
point where runtime.Stack is called, of course, Go code is running, not a system call.
SWIG is involved in this particular program, but the previous two examples (Rob's and my
index bugs) are pure Go.

These all seem related.

---

can't reproduce this yet, but the go command gave me a split stack
overflow. it was during go test -cover=set std (with my patches) so
it's a little confusing. i can't figure out which package it crashed
on. it's just like there's a stack trace in the middle of otherwise
complete output.

runtime: split stack overflow: 0x6c2fc8 < 0x6c3000
fatal error: runtime: split stack overflow

goroutine 1 [semacquire]:
sync.runtime_Semacquire(0xc2103ba5d8)
/Users/r/go/src/pkg/runtime/zsema_darwin_amd64.c:165 +0x2e
sync.(*WaitGroup).Wait(0xc21030b980)
/Users/r/go/src/pkg/sync/waitgroup.go:127 +0xf6
main.(*builder).do(0xc21008b2a0, 0xc2105c4410)
/Users/r/go/src/cmd/go/build.go:727 +0x3e4
main.runTest(0x5a4b40, 0xc21000a020, 0x2, 0x2)
/Users/r/go/src/cmd/go/test.go:442 +0x1dda
main.main()
/Users/r/go/src/cmd/go/main.go:161 +0x4f9
goroutine 3 [syscall]:
os/signal.loop()
/Users/r/go/src/pkg/os/signal/signal_unix.go:21 +0x1c
created by os/signal.init·1
/Users/r/go/src/pkg/os/signal/signal_unix.go:27 +0x2f

goroutine 5 [syscall]:
syscall.Syscall6()
/Users/r/go/src/pkg/syscall/asm_darwin_amd64.s:39 +0x5
syscall.wait4(0xd942, 0xc210109fb8, 0x0, 0xc21096e090, 0x100000001, ...)
/Users/r/go/src/pkg/syscall/zsyscall_darwin_amd64.go:32 +0x7b
syscall.Wait4(0xd942, 0x6cdc44, 0x0, 0xc21096e090, 0x8, ...)
/Users/r/go/src/pkg/syscall/syscall_bsd.go:137 +0x6c
os.(*Process).wait(0xc210176c20, 0x6cdcc8, 0x0, 0x0)
/Users/r/go/src/pkg/os/exec_unix.go:22 +0xea
os.(*Process).Wait(0xc210176c20, 0x2, 0x6cdcd8, 0x40dea8)
/Users/r/go/src/pkg/os/doc.go:43 +0x25
os/exec.(*Cmd).Wait(0xc210967a00, 0x0, 0x0)
/Users/r/go/src/pkg/os/exec/exec.go:310 +0x197
os/exec.(*Cmd).Run(0xc210967a00, 0x0, 0x0)
/Users/r/go/src/pkg/os/exec/exec.go:232 +0x52
main.(*builder).runOut(0xc21008b2a0, 0x35e540, 0x1, 0x38b3b0, 0x8, ...)
/Users/r/go/src/cmd/go/build.go:1307 +0x221
main.(*builder).run(0xc21008b2a0, 0x35e540, 0x1, 0x38b3b0, 0x8, ...)
/Users/r/go/src/cmd/go/build.go:1258 +0xb7
main.gcToolchain.ld(0xc21008b2a0, 0xc210486800, 0xc210107070, 0x6b,
0xc21096b100, ...)
/Users/r/go/src/cmd/go/build.go:1564 +0x57e
main.gcToolchain.ld·i(0x1, 0xc21008b2a0, 0xc210486800, 0xc210107070, 0x6b, ...)
/Users/r/go/src/cmd/go/build.go:0 +0xc5
main.(*builder).build(0xc21008b2a0, 0xc210557dd0, 0x0, 0x0)
/Users/r/go/src/cmd/go/build.go:955 +0x30ce
main.func·005(0xc210557dd0)
/Users/r/go/src/cmd/go/build.go:661 +0x2e8
main.func·006()
/Users/r/go/src/cmd/go/build.go:718 +0x109
created by main.(*builder).do
/Users/r/go/src/cmd/go/build.go:724 +0x3b7

goroutine 6 [select]:
main.(*builder).runTest(0xc21008b2a0, 0xc2101dc410, 0x6aff38, 0x27)
/Users/r/go/src/cmd/go/test.go:760 +0x143d
main.func·005(0xc2101dc410)
/Users/r/go/src/cmd/go/build.go:661 +0x2e8
main.func·006()
/Users/r/go/src/cmd/go/build.go:718 +0x109
created by main.(*builder).do
/Users/r/go/src/cmd/go/build.go:724 +0x3b7

goroutine 7 [select]:
main.(*builder).runTest(0xc21008b2a0, 0xc21049b0d0, 0x6aef38, 0x17)
/Users/r/go/src/cmd/go/test.go:760 +0x143d
main.func·005(0xc21049b0d0)
/Users/r/go/src/cmd/go/build.go:661 +0x2e8
main.func·006()
/Users/r/go/src/cmd/go/build.go:718 +0x109
created by main.(*builder).do
/Users/r/go/src/cmd/go/build.go:724 +0x3b7

goroutine 8 [select]:
main.(*builder).runTest(0xc21008b2a0, 0xc21026fea0, 0x6b4f38, 0x2b)
/Users/r/go/src/cmd/go/test.go:760 +0x143d
main.func·005(0xc21026fea0)
/Users/r/go/src/cmd/go/build.go:661 +0x2e8
main.func·006()
/Users/r/go/src/cmd/go/build.go:718 +0x109
created by main.(*builder).do
/Users/r/go/src/cmd/go/build.go:724 +0x3b7

goroutine 9 [running]:
[fp=0x6c3558] syscall.Syscall()
/Users/r/go/src/pkg/syscall/asm_darwin_amd64.s:15 +0x5
[fp=0x6c35d8] syscall.forkAndExecInChild(0xc210968720, 0xc210176d80,
0x3, 0x3, 0xc21096d160, ...)
/Users/r/go/src/pkg/syscall/exec_bsd.go:59 +0xf4
[fp=0x6c36f8] syscall.forkExec(0xc2104790c0, 0x5b, 0xc210176ca0, 0x2, 0x2, ...)
/Users/r/go/src/pkg/syscall/exec_unix.go:191 +0x4ac
[fp=0x6c3748] syscall.StartProcess(0xc2104790c0, 0x5b, 0xc210176ca0,
0x2, 0x2, ...)
/Users/r/go/src/pkg/syscall/exec_unix.go:238 +0x74
[fp=0x6c3848] os.startProcess(0xc2104790c0, 0x5b, 0xc210176ca0, 0x2, 0x2, ...)
/Users/r/go/src/pkg/os/exec_posix.go:45 +0x33e
[fp=0x6c3898] os.StartProcess(0xc2104790c0, 0x5b, 0xc210176ca0, 0x2, 0x2, ...)
/Users/r/go/src/pkg/os/doc.go:24 +0x5c
[fp=0x6c3a20] os/exec.(*Cmd).Start(0xc210967b40, 0xc212c82276, 0x5abf00)
/Users/r/go/src/pkg/os/exec/exec.go:264 +0x587
[fp=0x6c3eb0] main.(*builder).runTest(0xc21008b2a0, 0xc2104390d0,
0x6c3f38, 0x11)
/Users/r/go/src/cmd/go/test.go:748 +0x809
[fp=0x6c3f48] main.func·005(0xc2104390d0)
/Users/r/go/src/cmd/go/build.go:661 +0x2e8
[fp=0x6c3fa0] main.func·006()
/Users/r/go/src/cmd/go/build.go:718 +0x109
[fp=0x6c3fa8] runtime.goexit()
/Users/r/go/src/pkg/runtime/proc.c:1243
created by main.(*builder).do
/Users/r/go/src/cmd/go/build.go:724 +0x3b7

goroutine 10 [select]:
main.(*builder).runTest(0xc21008b2a0, 0xc2102e0ea0, 0x6c2f38, 0x2b)
/Users/r/go/src/cmd/go/test.go:760 +0x143d
main.func·005(0xc2102e0ea0)
/Users/r/go/src/cmd/go/build.go:661 +0x2e8
main.func·006()
/Users/r/go/src/cmd/go/build.go:718 +0x109
created by main.(*builder).do
/Users/r/go/src/cmd/go/build.go:724 +0x3b7

goroutine 11 [syscall]:
syscall.Syscall6()
/Users/r/go/src/pkg/syscall/asm_darwin_amd64.s:39 +0x5
syscall.wait4(0xd93e, 0xc210109c70, 0x0, 0xc210964750, 0x100000001, ...)
/Users/r/go/src/pkg/syscall/zsyscall_darwin_amd64.go:32 +0x7b
syscall.Wait4(0xd93e, 0x6f4c44, 0x0, 0xc210964750, 0x8, ...)
/Users/r/go/src/pkg/syscall/syscall_bsd.go:137 +0x6c
os.(*Process).wait(0xc21017df60, 0x6f4cc8, 0x0, 0x0)
/Users/r/go/src/pkg/os/exec_unix.go:22 +0xea
os.(*Process).Wait(0xc21017df60, 0x2, 0x6f4cd8, 0x40dea8)
/Users/r/go/src/pkg/os/doc.go:43 +0x25
os/exec.(*Cmd).Wait(0xc210967280, 0x0, 0x0)
/Users/r/go/src/pkg/os/exec/exec.go:310 +0x197
os/exec.(*Cmd).Run(0xc210967280, 0x0, 0x0)
/Users/r/go/src/pkg/os/exec/exec.go:232 +0x52
main.(*builder).runOut(0xc21008b2a0, 0x35e540, 0x1, 0x38b3b0, 0x8, ...)
/Users/r/go/src/cmd/go/build.go:1307 +0x221
main.(*builder).run(0xc21008b2a0, 0x35e540, 0x1, 0x38b3b0, 0x8, ...)
/Users/r/go/src/cmd/go/build.go:1258 +0xb7
main.gcToolchain.ld(0xc21008b2a0, 0xc2104f2000, 0xc21017bcc0, 0x5d,
0xc210966600, ...)
/Users/r/go/src/cmd/go/build.go:1564 +0x57e
main.gcToolchain.ld·i(0x1, 0xc21008b2a0, 0xc2104f2000, 0xc21017bcc0, 0x5d, ...)
/Users/r/go/src/cmd/go/build.go:0 +0xc5
main.(*builder).build(0xc21008b2a0, 0xc2104ec750, 0x0, 0x0)
/Users/r/go/src/cmd/go/build.go:955 +0x30ce
main.func·005(0xc2104ec750)
/Users/r/go/src/cmd/go/build.go:661 +0x2e8
main.func·006()
/Users/r/go/src/cmd/go/build.go:718 +0x109
created by main.(*builder).do
/Users/r/go/src/cmd/go/build.go:724 +0x3b7

goroutine 12 [select]:
main.(*builder).runTest(0xc21008b2a0, 0xc21039fdd0, 0x6c0f38, 0x2a)
/Users/r/go/src/cmd/go/test.go:760 +0x143d
main.func·005(0xc21039fdd0)
/Users/r/go/src/cmd/go/build.go:661 +0x2e8
main.func·006()
/Users/r/go/src/cmd/go/build.go:718 +0x109
created by main.(*builder).do
/Users/r/go/src/cmd/go/build.go:724 +0x3b7

goroutine 230 [finalizer wait]:

goroutine 116 [syscall]:
syscall.Syscall6()
/Users/r/go/src/pkg/syscall/asm_darwin_amd64.s:39 +0x5
syscall.wait4(0xd6f7, 0xc210058dc8, 0x0, 0xc210669480, 0x100000001, ...)
/Users/r/go/src/pkg/syscall/zsyscall_darwin_amd64.go:32 +0x7b
syscall.Wait4(0xd6f7, 0x6ace6c, 0x0, 0xc210669480, 0x0, ...)
/Users/r/go/src/pkg/syscall/syscall_bsd.go:137 +0x6c
os.(*Process).wait(0xc210233a20, 0x0, 0x0, 0x0)
/Users/r/go/src/pkg/os/exec_unix.go:22 +0xea
os.(*Process).Wait(0xc210233a20, 0x6acf78, 0x0, 0xc2102efa20)
/Users/r/go/src/pkg/os/doc.go:43 +0x25
os/exec.(*Cmd).Wait(0xc2106db780, 0x0, 0x0)
/Users/r/go/src/pkg/os/exec/exec.go:310 +0x197
main.func·027()
/Users/r/go/src/cmd/go/test.go:758 +0x30
created by main.(*builder).runTest
/Users/r/go/src/cmd/go/test.go:759 +0x91c

goroutine 571 [syscall]:
syscall.Syscall6()
/Users/r/go/src/pkg/syscall/asm_darwin_amd64.s:39 +0x5
syscall.wait4(0xd8ae, 0xc2103223c8, 0x0, 0xc21088ed80, 0x100000001, ...)
/Users/r/go/src/pkg/syscall/zsyscall_darwin_amd64.go:32 +0x7b
syscall.Wait4(0xd8ae, 0x6d6e6c, 0x0, 0xc21088ed80, 0x0, ...)
/Users/r/go/src/pkg/syscall/syscall_bsd.go:137 +0x6c
os.(*Process).wait(0xc21061ab40, 0x0, 0x0, 0x0)
/Users/r/go/src/pkg/os/exec_unix.go:22 +0xea
os.(*Process).Wait(0xc21061ab40, 0x6d6f78, 0x0, 0x0)
/Users/r/go/src/pkg/os/doc.go:43 +0x25
os/exec.(*Cmd).Wait(0xc21088bdc0, 0x0, 0x0)
/Users/r/go/src/pkg/os/exec/exec.go:310 +0x197
main.func·027()
/Users/r/go/src/cmd/go/test.go:758 +0x30
created by main.(*builder).runTest
/Users/r/go/src/cmd/go/test.go:759 +0x91c

goroutine 730 [syscall]:
syscall.Syscall()
/Users/r/go/src/pkg/syscall/asm_darwin_amd64.s:15 +0x5
syscall.read(0xa, 0xc210966800, 0x200, 0x200, 0x8, ...)
/Users/r/go/src/pkg/syscall/zsyscall_darwin_amd64.go:900 +0x70
syscall.Read(0xa, 0xc210966800, 0x200, 0x200, 0x5dfe4, ...)
/Users/r/go/src/pkg/syscall/syscall_unix.go:132 +0x5a
os.(*File).read(0xc210109be8, 0xc210966800, 0x200, 0x200, 0x26f200, ...)
/Users/r/go/src/pkg/os/file_unix.go:174 +0x60
os.(*File).Read(0xc210109be8, 0xc210966800, 0x200, 0x200, 0x0, ...)
/Users/r/go/src/pkg/os/file.go:95 +0x96
bytes.(*Buffer).ReadFrom(0xc210965460, 0x66f270, 0xc210109be8, 0x0, 0x0, ...)
/Users/r/go/src/pkg/bytes/buffer.go:169 +0x203
io.Copy(0x132de70, 0xc210965460, 0x66f270, 0xc210109be8, 0x0, ...)
/Users/r/go/src/pkg/io/io.go:344 +0x122
os/exec.func·003(0x270f80, 0xc2109653f0)
/Users/r/go/src/pkg/os/exec/exec.go:207 +0x60
os/exec.func·004(0xc21017dd40)
/Users/r/go/src/pkg/os/exec/exec.go:276 +0x2a
created by os/exec.(*Cmd).Start
/Users/r/go/src/pkg/os/exec/exec.go:277 +0x71d

goroutine 125 [syscall]:
syscall.Syscall6()
/Users/r/go/src/pkg/syscall/asm_darwin_amd64.s:39 +0x5
syscall.wait4(0xd6fc, 0xc2103191b8, 0x0, 0xc2104d3120, 0x100000001, ...)
/Users/r/go/src/pkg/syscall/zsyscall_darwin_amd64.go:32 +0x7b
syscall.Wait4(0xd6fc, 0x6bce6c, 0x0, 0xc2104d3120, 0x0, ...)
/Users/r/go/src/pkg/syscall/syscall_bsd.go:137 +0x6c
os.(*Process).wait(0xc21022abc0, 0x0, 0x0, 0x0)
/Users/r/go/src/pkg/os/exec_unix.go:22 +0xea
os.(*Process).Wait(0xc21022abc0, 0x6bcf78, 0x0, 0xc2102ef000)
/Users/r/go/src/pkg/os/doc.go:43 +0x25
os/exec.(*Cmd).Wait(0xc2106dbdc0, 0x0, 0x0)
/Users/r/go/src/pkg/os/exec/exec.go:310 +0x197
main.func·027()
/Users/r/go/src/cmd/go/test.go:758 +0x30
created by main.(*builder).runTest
/Users/r/go/src/cmd/go/test.go:759 +0x91c

goroutine 124 [syscall]:
syscall.Syscall()
/Users/r/go/src/pkg/syscall/asm_darwin_amd64.s:15 +0x5
syscall.read(0xd, 0xc210652c00, 0x200, 0x200, 0x8, ...)
/Users/r/go/src/pkg/syscall/zsyscall_darwin_amd64.go:900 +0x70
syscall.Read(0xd, 0xc210652c00, 0x200, 0x200, 0x5dfe4, ...)
/Users/r/go/src/pkg/syscall/syscall_unix.go:132 +0x5a
os.(*File).read(0xc210319088, 0xc210652c00, 0x200, 0x200, 0x26f200, ...)
/Users/r/go/src/pkg/os/file_unix.go:174 +0x60
os.(*File).Read(0xc210319088, 0xc210652c00, 0x200, 0x200, 0x0, ...)
/Users/r/go/src/pkg/os/file.go:95 +0x96
bytes.(*Buffer).ReadFrom(0xc21047cc40, 0x66f270, 0xc210319088, 0x0, 0x0, ...)
/Users/r/go/src/pkg/bytes/buffer.go:169 +0x203
io.Copy(0x132de70, 0xc21047cc40, 0x66f270, 0xc210319088, 0x0, ...)
/Users/r/go/src/pkg/io/io.go:344 +0x122
os/exec.func·003(0x4551e, 0x270f80)
/Users/r/go/src/pkg/os/exec/exec.go:207 +0x60
os/exec.func·004(0xc21022a5e0)
/Users/r/go/src/pkg/os/exec/exec.go:276 +0x2a
created by os/exec.(*Cmd).Start
/Users/r/go/src/pkg/os/exec/exec.go:277 +0x71d

goroutine 217 [syscall]:
syscall.Syscall()
/Users/r/go/src/pkg/syscall/asm_darwin_amd64.s:15 +0x5
syscall.read(0x5, 0xc210829a00, 0x200, 0x200, 0x8, ...)
/Users/r/go/src/pkg/syscall/zsyscall_darwin_amd64.go:900 +0x70
syscall.Read(0x5, 0xc210829a00, 0x200, 0x200, 0x5dfe4, ...)
/Users/r/go/src/pkg/syscall/syscall_unix.go:132 +0x5a
os.(*File).read(0xc210338278, 0xc210829a00, 0x200, 0x200, 0x26f200, ...)
/Users/r/go/src/pkg/os/file_unix.go:174 +0x60
os.(*File).Read(0xc210338278, 0xc210829a00, 0x200, 0x200, 0x0, ...)
/Users/r/go/src/pkg/os/file.go:95 +0x96
bytes.(*Buffer).ReadFrom(0xc21081cf50, 0x66f270, 0xc210338278, 0x0, 0x0, ...)
/Users/r/go/src/pkg/bytes/buffer.go:169 +0x203
io.Copy(0x132de70, 0xc21081cf50, 0x66f270, 0xc210338278, 0x0, ...)
/Users/r/go/src/pkg/io/io.go:344 +0x122
os/exec.func·003(0x270f80, 0xc210812d20)
/Users/r/go/src/pkg/os/exec/exec.go:207 +0x60
os/exec.func·004(0xc210101e20)
/Users/r/go/src/pkg/os/exec/exec.go:276 +0x2a
created by os/exec.(*Cmd).Start
/Users/r/go/src/pkg/os/exec/exec.go:277 +0x71d

goroutine 115 [syscall]:
syscall.Syscall()
/Users/r/go/src/pkg/syscall/asm_darwin_amd64.s:15 +0x5
syscall.read(0x8, 0xc210652000, 0x200, 0x200, 0x8, ...)
/Users/r/go/src/pkg/syscall/zsyscall_darwin_amd64.go:900 +0x70
syscall.Read(0x8, 0xc210652000, 0x200, 0x200, 0x5dfe4, ...)
/Users/r/go/src/pkg/syscall/syscall_unix.go:132 +0x5a
os.(*File).read(0xc210058cc8, 0xc210652000, 0x200, 0x200, 0x26f200, ...)
/Users/r/go/src/pkg/os/file_unix.go:174 +0x60
os.(*File).Read(0xc210058cc8, 0xc210652000, 0x200, 0x200, 0x0, ...)
/Users/r/go/src/pkg/os/file.go:95 +0x96
bytes.(*Buffer).ReadFrom(0xc21045cd90, 0x66f270, 0xc210058cc8, 0x0, 0x0, ...)
/Users/r/go/src/pkg/bytes/buffer.go:169 +0x203
io.Copy(0x132de70, 0xc21045cd90, 0x66f270, 0xc210058cc8, 0x0, ...)
/Users/r/go/src/pkg/io/io.go:344 +0x122
os/exec.func·003(0x270f80, 0xc21023f310)
/Users/r/go/src/pkg/os/exec/exec.go:207 +0x60
os/exec.func·004(0xc210233380)
/Users/r/go/src/pkg/os/exec/exec.go:276 +0x2a
created by os/exec.(*Cmd).Start
/Users/r/go/src/pkg/os/exec/exec.go:277 +0x71d

goroutine 105 [chan receive]:
main.func·026()
/Users/r/go/src/cmd/go/signal.go:21 +0x30
created by main.processSignals
/Users/r/go/src/cmd/go/signal.go:23 +0xb2

goroutine 120 [syscall]:
syscall.Syscall()
/Users/r/go/src/pkg/syscall/asm_darwin_amd64.s:15 +0x5
syscall.read(0x6, 0xc210652600, 0x200, 0x200, 0x800, ...)
/Users/r/go/src/pkg/syscall/zsyscall_darwin_amd64.go:900 +0x70
syscall.Read(0x6, 0xc210652600, 0x200, 0x200, 0x5dfe4, ...)
/Users/r/go/src/pkg/syscall/syscall_unix.go:132 +0x5a
os.(*File).read(0xc210058e00, 0xc210652600, 0x200, 0x200, 0x26f200, ...)
/Users/r/go/src/pkg/os/file_unix.go:174 +0x60
os.(*File).Read(0xc210058e00, 0xc210652600, 0x200, 0x200, 0x0, ...)
/Users/r/go/src/pkg/os/file.go:95 +0x96
bytes.(*Buffer).ReadFrom(0xc21047c070, 0x66f270, 0xc210058e00, 0x0, 0x0, ...)
/Users/r/go/src/pkg/bytes/buffer.go:169 +0x203
io.Copy(0x132de70, 0xc21047c070, 0x66f270, 0xc210058e00, 0x0, ...)
/Users/r/go/src/pkg/io/io.go:344 +0x122
os/exec.func·003(0x0, 0x0)
/Users/r/go/src/pkg/os/exec/exec.go:207 +0x60
os/exec.func·004(0xc210233b40)
/Users/r/go/src/pkg/os/exec/exec.go:276 +0x2a
created by os/exec.(*Cmd).Start
/Users/r/go/src/pkg/os/exec/exec.go:277 +0x71d

goroutine 121 [syscall]:
syscall.Syscall6()
/Users/r/go/src/pkg/syscall/asm_darwin_amd64.s:39 +0x5
syscall.wait4(0xd6f9, 0xc210058f10, 0x0, 0xc210669900, 0x100000001, ...)
/Users/r/go/src/pkg/syscall/zsyscall_darwin_amd64.go:32 +0x7b
syscall.Wait4(0xd6f9, 0x138ae6c, 0x0, 0xc210669900, 0x0, ...)
/Users/r/go/src/pkg/syscall/syscall_bsd.go:137 +0x6c
os.(*Process).wait(0xc21022a1e0, 0x0, 0x0, 0x0)
/Users/r/go/src/pkg/os/exec_unix.go:22 +0xea
os.(*Process).Wait(0xc21022a1e0, 0x0, 0x0, 0x0)
/Users/r/go/src/pkg/os/doc.go:43 +0x25
os/exec.(*Cmd).Wait(0xc2106dba00, 0x0, 0x0)
/Users/r/go/src/pkg/os/exec/exec.go:310 +0x197
main.func·027()
/Users/r/go/src/cmd/go/test.go:758 +0x30
created by main.(*builder).runTest
/Users/r/go/src/cmd/go/test.go:759 +0x91c

goroutine 218 [syscall]:
syscall.Syscall6()
/Users/r/go/src/pkg/syscall/asm_darwin_amd64.s:39 +0x5
syscall.wait4(0xd750, 0xc2103382d8, 0x0, 0xc21082a6c0, 0x100000001, ...)
/Users/r/go/src/pkg/syscall/zsyscall_darwin_amd64.go:32 +0x7b
syscall.Wait4(0xd750, 0x1389e6c, 0x0, 0xc21082a6c0, 0x0, ...)
/Users/r/go/src/pkg/syscall/syscall_bsd.go:137 +0x6c
os.(*Process).wait(0xc2100e42e0, 0x0, 0x0, 0x0)
/Users/r/go/src/pkg/os/exec_unix.go:22 +0xea
os.(*Process).Wait(0xc2100e42e0, 0x1389f78, 0x0, 0xc2102ef000)
/Users/r/go/src/pkg/os/doc.go:43 +0x25
os/exec.(*Cmd).Wait(0xc21082e140, 0x0, 0x0)
/Users/r/go/src/pkg/os/exec/exec.go:310 +0x197
main.func·027()
/Users/r/go/src/cmd/go/test.go:758 +0x30
created by main.(*builder).runTest
/Users/r/go/src/cmd/go/test.go:759 +0x91c

goroutine 570 [syscall]:
syscall.Syscall()
/Users/r/go/src/pkg/syscall/asm_darwin_amd64.s:15 +0x5
syscall.read(0x7, 0xc210891200, 0x200, 0x200, 0x8, ...)
/Users/r/go/src/pkg/syscall/zsyscall_darwin_amd64.go:900 +0x70
syscall.Read(0x7, 0xc210891200, 0x200, 0x200, 0x5dfe4, ...)
/Users/r/go/src/pkg/syscall/syscall_unix.go:132 +0x5a
os.(*File).read(0xc210322368, 0xc210891200, 0x200, 0x200, 0x26f200, ...)
/Users/r/go/src/pkg/os/file_unix.go:174 +0x60
os.(*File).Read(0xc210322368, 0xc210891200, 0x200, 0x200, 0x0, ...)
/Users/r/go/src/pkg/os/file.go:95 +0x96
bytes.(*Buffer).ReadFrom(0xc210886f50, 0x66f270, 0xc210322368, 0x0, 0x0, ...)
/Users/r/go/src/pkg/bytes/buffer.go:169 +0x203
io.Copy(0x132de70, 0xc210886f50, 0x66f270, 0xc210322368, 0x0, ...)
/Users/r/go/src/pkg/io/io.go:344 +0x122
os/exec.func·003(0x270f80, 0xc21086cb60)
/Users/r/go/src/pkg/os/exec/exec.go:207 +0x60
os/exec.func·004(0xc21061a8a0)
/Users/r/go/src/pkg/os/exec/exec.go:276 +0x2a
created by os/exec.(*Cmd).Start
/Users/r/go/src/pkg/os/exec/exec.go:277 +0x71d

goroutine 734 [syscall]:
syscall.Syscall()
/Users/r/go/src/pkg/syscall/asm_darwin_amd64.s:15 +0x5
syscall.read(0x4, 0xc21096f000, 0x200, 0x200, 0x8, ...)
/Users/r/go/src/pkg/syscall/zsyscall_darwin_amd64.go:900 +0x70
syscall.Read(0x4, 0xc21096f000, 0x200, 0x200, 0x5dfe4, ...)
/Users/r/go/src/pkg/syscall/syscall_unix.go:132 +0x5a
os.(*File).read(0xc210109f20, 0xc21096f000, 0x200, 0x200, 0x26f200, ...)
/Users/r/go/src/pkg/os/file_unix.go:174 +0x60
os.(*File).Read(0xc210109f20, 0xc21096f000, 0x200, 0x200, 0x0, ...)
/Users/r/go/src/pkg/os/file.go:95 +0x96
bytes.(*Buffer).ReadFrom(0xc210965d20, 0x66f270, 0xc210109f20, 0x0, 0x0, ...)
/Users/r/go/src/pkg/bytes/buffer.go:169 +0x203
io.Copy(0x132de70, 0xc210965d20, 0x66f270, 0xc210109f20, 0x0, ...)
/Users/r/go/src/pkg/io/io.go:344 +0x122
os/exec.func·003(0x270f80, 0xc210965bd0)
/Users/r/go/src/pkg/os/exec/exec.go:207 +0x60
os/exec.func·004(0xc2101769e0)
/Users/r/go/src/pkg/os/exec/exec.go:276 +0x2a
created by os/exec.(*Cmd).Start
/Users/r/go/src/pkg/os/exec/exec.go:277 +0x71d
@rsc
Copy link
Contributor Author

rsc commented Jun 17, 2013

Comment 1:

I propose that we stop submitting runtime CLs until we understand all three of these
problems.

@rsc
Copy link
Contributor Author

rsc commented Jun 17, 2013

Comment 2:

---------- Forwarded message ----------
From: Russ Cox <rsc@golang.org>
Date: Mon, Jun 17, 2013 at 8:27 AM
Subject: runtime CL moratorium
To: golang-dev <golang-dev@googlegroups.com>
The runtime is too flaky to get work done. See golang.org/issue/5723.
Until that issue is closed, please stop writing, sending for review, discussing, and
submitting CLs making changes to the runtime.
Thank you.
Russ

@dvyukov
Copy link
Member

dvyukov commented Jun 17, 2013

Comment 3:

>i can't figure out which package it crashed
on. it's just like there's a stack trace in the middle of otherwise
complete output.
I frequently add unwind of current stack to runtime.startpanic (before crashing)
locally. It helps when crash happens on g0. I suspect this is the case here.

@dvyukov
Copy link
Member

dvyukov commented Jun 17, 2013

Comment 4:

Rob, haven't you changed pkg/syscall/exec_bsd.go:59 to use Syscall instead of RawSyscall
locally?
Otherwise I do not understand how this is possible:
goroutine 9 [running]:
[fp=0x6c3558] syscall.Syscall()
/Users/r/go/src/pkg/syscall/asm_darwin_amd64.s:15 +0x5
[fp=0x6c35d8] syscall.forkAndExecInChild(0xc210968720, 0xc210176d80,
0x3, 0x3, 0xc21096d160, ...)
/Users/r/go/src/pkg/syscall/exec_bsd.go:59 +0xf4
Such change is not safe, it can have bad consequences for child process.

@robpike
Copy link
Contributor

robpike commented Jun 17, 2013

Comment 5:

Dmitriy: No. My "patches" I refer to are just in cmd/go itself.

@dvyukov
Copy link
Member

dvyukov commented Jun 17, 2013

Comment 6:

Rob, please run 'go test runtime' w/o rebuilding anything.
I've seen a similar crash few days ago, it was caused by the hg sync w/o running
make.bash, so I ended up with inconsistent compiler/linker/runtime. Rebuilding
everything helped.
Runtime contains a good test for split stacks.

@robpike
Copy link
Contributor

robpike commented Jun 17, 2013

Comment 7:

I have not been able to reproduce the crash despite many attempts, and the one I
reported was in a fully consistent repo except for my changes to cmd/go.

@dvyukov
Copy link
Member

dvyukov commented Jun 17, 2013

Comment 8:

>Notice that the first goroutine in the trace says it is in semacquire. The first stack
should always be the running goroutine, so I'd like to know, in addition to the split
stack problem, how it is that a goroutine stuck in semacquire can possibly be splitting
its stack. 
The crash happened on g0, we do not print g0 stacks.
The goroutine that was running at the time of crash (m->curg) is:
goroutine 9 [running]:
[fp=0x6c3558] syscall.Syscall()
/Users/r/go/src/pkg/syscall/asm_darwin_amd64.s:15 +0x5
[fp=0x6c35d8] syscall.forkAndExecInChild(0xc210968720, 0xc210176d80,
0x3, 0x3, 0xc21096d160, ...)
/Users/r/go/src/pkg/syscall/exec_bsd.go:59 +0xf4
[fp=0x6c36f8] syscall.forkExec(0xc2104790c0, 0x5b, 0xc210176ca0, 0x2, 0x2, ...)
/Users/r/go/src/pkg/syscall/exec_unix.go:191 +0x4ac
[fp=0x6c3748] syscall.StartProcess(0xc2104790c0, 0x5b, 0xc210176ca0,
0x2, 0x2, ...)
/Users/r/go/src/pkg/syscall/exec_unix.go:238 +0x74
[fp=0x6c3848] os.startProcess(0xc2104790c0, 0x5b, 0xc210176ca0, 0x2, 0x2, ...)
/Users/r/go/src/pkg/os/exec_posix.go:45 +0x33e
[fp=0x6c3898] os.StartProcess(0xc2104790c0, 0x5b, 0xc210176ca0, 0x2, 0x2, ...)
/Users/r/go/src/pkg/os/doc.go:24 +0x5c
[fp=0x6c3a20] os/exec.(*Cmd).Start(0xc210967b40, 0xc212c82276, 0x5abf00)
/Users/r/go/src/pkg/os/exec/exec.go:264 +0x587
[fp=0x6c3eb0] main.(*builder).runTest(0xc21008b2a0, 0xc2104390d0,
0x6c3f38, 0x11)
/Users/r/go/src/cmd/go/test.go:748 +0x809
[fp=0x6c3f48] main.func·005(0xc2104390d0)
/Users/r/go/src/cmd/go/build.go:661 +0x2e8
[fp=0x6c3fa0] main.func·006()
/Users/r/go/src/cmd/go/build.go:718 +0x109
[fp=0x6c3fa8] runtime.goexit()
/Users/r/go/src/pkg/runtime/proc.c:1243
created by main.(*builder).do
/Users/r/go/src/cmd/go/build.go:724 +0x3b7

@rsc
Copy link
Contributor Author

rsc commented Jun 17, 2013

Comment 9:

We should print g0 stacks when the fault happens on g0.

@rsc
Copy link
Contributor Author

rsc commented Jun 17, 2013

Comment 10:

We should also print the running goroutine first.

@rsc
Copy link
Contributor Author

rsc commented Jun 18, 2013

Comment 11:

Today's scorecard:
1) Rob's split stack overflow.
Dmitriy points out that it was goroutine 9 that had the overflow. That's great. However,
the stack trace does not make sense. First, Rob did not have any pending changes to
package syscall or package runtime. Second, the runtime did recognize the goroutine as
part of the problem, so it included [fp=xxx] annotations to dump the frame pointers. It
does this if m->throwing && gp == m->curg. That's the same test showframe uses to decide
to include internal runtime call frames. But they are missing from that goroutine stack.
A real program would have called runtime.entersyscall from syscall.Syscall, yet that
frame is missing. Also, entersyscall would not have been enough to eat up the 0x500 or
so bytes remaining on the stack at that point.
No, the stack being shown for goroutine 9 is completely bogus. It is being dumped by
tracebackothers, which uses gp->sched as the initial frame information. But in the case
of a running goroutine, gp->sched is stale. It records the pc and sp of the most recent
saving of gp->sched, which happened during a call to syscall.Syscall some time ago, and
that call is over. The rest of the stack trace may be accurate, since it was read
directly from the stack memory, but we don't know what's below, what used up the 0x590
bytes between the fp we have information for and the bottom of the stack. What we'd
really like to know are morebuf.sp and morepc, and they are not printed.
At the least we need to make the printing here actually print useful information. But I
still think we should be able to puzzle through what might have been happening on the
stack to use up the remaining space. Just don't get misled by the syscall.Syscall frame.
2) My index bounds checks not appearing.
No progress has been made on this. I have not gotten any real programming done today. It
is possible that this is the same problem as (3) - the last record I have of rebuilding
that machine's Go tree was June 13 at 12:20 or so California time. Before that it had
been not since before Go 1.1 (April 29). What I don't know is whether I observed the
missing stacks before or after 12:20.
3) My bad behavior from runtime.Stack in a Go 1.1 program
This was caused by a bug fixed in CL 9307045. Because there was no issue filed about it,
we did not consider it for Go 1.1.1, so Go 1.1.1 still suffers from the bug. I created
issue #5724 so that we put it into Go 1.1.2.

@alberts
Copy link
Contributor

alberts commented Jun 18, 2013

Comment 12:

We're also seeing a crash that looks like it might be related to:
https://code.google.com/p/go/source/detail?r=bb553699c7e6
Yesterday was a holiday, so I'll check again whether we're still seeing this later today.
My previous message:
We're seeing a crash that dvyukov suggested might be related to this CL.
runtime: unknown argument frame size for runtime.copy
fatal error: invalid stack
goroutine 0 [idle]:
runtime: unknown argument frame size for runtime.sigreturn
panic during panic
No stack trace beyond that.
I suspect it's happening in code with quite a deep stack that is using the copy builtin
on some memory mapped with syscall.Mmap.
We've seen crashes with:
linux/amd64 devel +e274daf75c26 Thu Jun 13 17:04:47 2013 -0700
linux/amd64 devel +52d53d0e177e Fri Jun 14 11:14:45 2013 +0200
We also saw one segfault in the same test a few days ago while running:
linux/amd64 devel +ae79f385177d Wed Jun 12 14:05:13 2013 +1000
It didn't produce a stack trace in that case either.
It's quite a large chunk of code, but it hasn't seen any changes in a while, so I think
it's probably the runtime, not us.
If it would help we can try and reduce it to something we can submit to the issue
tracker. It's going to take us a while, so if anyone has any ideas before we start, that
would be nice.

@davecheney
Copy link
Contributor

Comment 13:

Printing g0 would be very useful.

@rsc
Copy link
Contributor Author

rsc commented Jun 19, 2013

Comment 14:

An update:
1) Rob's split stack overflow. All the signs point to the stack trace being correct
except for s/Syscall/RawSyscall/ (itself a long story). That is, the goroutine stack was
definitely in the call to fork, or it had returned from fork but not yet executed any
other calls. That fact, combined with Rob's comment that the go command (which this is
clearly a trace for) continued executing instead of dying, very strongly suggests that
the split stack overflow happened in the child of fork. I cannot explain how. I have
multiple theories, all fairly unlikely, and all involving signals arriving at the child
at just the wrong time. The most likely theory combines the signal with an OS X kernel
bug. None are worth recounting in detail. CL 10360048, out for review, cleans up the
stack trace that a split stack overflow shows. If the bug happens again (and it's not
the OS X kernel bug) then the stack trace should have enough information next time.
2) I still haven't written any real code, so I haven't run into the problem again. But I
am willing to assume for now that I only observed the problem before rebuilding on June
13 around 12:00 PDT and that the fix is the same as 3. (Until then my Go tree on that
machine had last been built on April 29.)
3) Fixed (see previous update, in comment #11).
Once CL 10360048 is submitted, I will close this bug, and then we can review the
preemption stuff. After that, I hope there will be no changes to anything involving
stacks again for a while, so that things can settle.

@rsc
Copy link
Contributor Author

rsc commented Jun 27, 2013

Comment 15:

This issue was closed by revision 6fa3c89.

Status changed to Fixed.

@rsc rsc added fixed labels Jun 27, 2013
@rsc rsc self-assigned this Jun 27, 2013
@golang golang locked and limited conversation to collaborators Jun 24, 2016
@rsc rsc removed their assignment Jun 22, 2022
This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

6 participants