-
Notifications
You must be signed in to change notification settings - Fork 18k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runtime: segfault with large heap #4981
Labels
Milestone
Comments
Can you also post stack traces? Are there any similarities? Here is the one from your email: fatal error: fault [signal 0xb code=0x1 addr=0xc614c20000 pc=0x421f2f] goroutine 353 [running]: [fp=0x7fb6a6acdd80] runtime.throw(0xf221b2) /build/go/go/src/pkg/runtime/panic.c:465 +0x67 [fp=0x7fb6a6acdd98] runtime.sigpanic() /build/go/go/src/pkg/runtime/thread_linux.c:217 +0xe7 [fp=0x7fb6a6acdda0] runtime.memmove() /build/go/go/src/pkg/runtime/memmove_amd64.s:46 +0x1f [fp=0x7fb6a6acddd8] runtime.copy() /build/go/go/src/pkg/runtime/slice.c:259 +0x13d [fp=0x7fb6a6acdea8] bytes.(*Buffer).ReadFrom(0xc2004df850, 0xc2001813c0, 0xc200000280, 0xfffffc36, 0x0, ...) /build/go/go/src/pkg/bytes/buffer.go:162 +0x157 [fp=0x7fb6a6acdf30] io.Copy(0xc200448e10, 0xc2004df850, 0xc2001813c0, 0xc200000280, 0x0, ...) /build/go/go/src/pkg/io/io.go:328 +0xa3 [fp=0x7fb6a6acdf80] os/exec.func·003(0x8c1a68, 0xc20046a2a0) /build/go/go/src/pkg/os/exec/exec.go:207 +0x60 [fp=0x7fb6a6acdfb0] os/exec.func·004(0xc200384160) /build/go/go/src/pkg/os/exec/exec.go:274 +0x2a [fp=0x7fb6a6acdfb8] runtime.goexit() /build/go/go/src/pkg/runtime/proc.c:1149 created by os/exec.(*Cmd).Start /build/go/go/src/pkg/os/exec/exec.go:275 +0x6ac |
1b4b6a694b1c is fine. this means one of two things: 1. 231af8ac63aa changes the behaviour of the inside program significantly, causing it log a lot than it ever has before, exposing an old bug. 2. 231af8ac63aa changed the outside program. I'll do some more tests to try to eliminate the first option. it seem significant that every single time it crashed, it was preceded by a GC that didn't manage to collect *anything* here's two more: gc32(8): 3+0+1123 ms, 15069 -> 15069 MB 10143 -> 10143 (68405-58262) objects, 49(2387) handoff, 45(662) steal, 595/374/105 yields scvg32: GC forced scvg32: inuse: 15070, idle: 1631, sys: 16702, released: 597, consumed: 16104 (MB) unexpected fault address 0xc613e90000 scvg30: inuse: 6402, idle: 2187, sys: 8590, released: 203, consumed: 8386 (MB) gc35(8): 3+0+0 ms, 14593 -> 14593 MB 9640 -> 9640 (68401-58761) objects, 108(5489) handoff, 39(684) steal, 808/379/91 yields unexpected fault address 0xc618e70000 I've tried to build a program to simulate the use of os/exec to run a bunch of programs that produce a lot of output that is buffered, but I haven't been able to trigger this "unable to GC anything whatsoever" condition yet. |
The GC collects ~1000 objects, but they seem to be small. I think the GC happens after allocation of a huge object (few GBs), soon after the previous GC. Looking at these records: gc37(8): 3+0+0 ms, 6523 -> 6523 MB 9490 -> 9489 (68285-58796) objects, 65(2826) handoff, 38(541) steal, 667/411/122 yields gc38(8): 2+0+0 ms, 14715 -> 14715 MB 9490 -> 9490 (68287-58797) objects, 100(5268) handoff, 39(743) steal, 809/426/86 yields it seems that the GC happens after allocation of 8192MB object. Can it be exactly the bytes.Buffer that we've seen in the stack trace? |
I have caught the crash inside gdb: #0 runtime.memmove (to=void, fr=void, n=void) at /build/next/go/src/pkg/runtime/memmove_amd64.s:46 #1 0x000000000041b5a2 in runtime.copy (to=void, fm=void, width=void, ret=void) at /build/next/go/src/pkg/runtime/slice.c:230 #2 0x0000000000535e9c in bytes.(*Buffer).ReadFrom (fp=void, b=0xc20051bbd0, r=..., n=4294966338, err=0) at /build/next/go/src/pkg/bytes/buffer.go:162 #3 0x00000000004778a4 in io.Copy (fp=void, dst=..., src=..., written=0, err=0) at /build/next/go/src/pkg/io/io.go:328 #4 0x00000000005068a4 in os/exec.func·003 (fp=void, ~anon0="<bad dynamic type>") at /build/next/go/src/pkg/os/exec/exec.go:207 #5 0x0000000000506928 in os/exec.func·004 (fp=void, fn={void (error *)} 0x7ff2040b2fb0) at /build/next/go/src/pkg/os/exec/exec.go:274 #6 0x0000000000416760 in ?? () at /build/next/go/src/pkg/runtime/proc.c:1126 #7 0x000000c2006e8e40 in ?? () #8 0x0000000000000000 in ?? () void stuff looks a bit suspect. n=4294966338 looks exciting... |
with GOGC=off at atom's suggestion <some output snipped> scvg22: GC forced scvg22: inuse: 1390, idle: 0, sys: 1390, released: 0, consumed: 1390 (MB) scvg23: GC forced scvg23: inuse: 2926, idle: 0, sys: 2926, released: 0, consumed: 2926 (MB) scvg24: GC forced scvg24: inuse: 4974, idle: 0, sys: 4974, released: 0, consumed: 4974 (MB) scvg25: GC forced scvg25: inuse: 4974, idle: 0, sys: 4974, released: 0, consumed: 4974 (MB) scvg26: GC forced scvg26: inuse: 4974, idle: 0, sys: 4974, released: 0, consumed: 4974 (MB) scvg27: GC forced scvg27: inuse: 9070, idle: 0, sys: 9070, released: 0, consumed: 9070 (MB) scvg28: GC forced scvg28: inuse: 9070, idle: 0, sys: 9070, released: 0, consumed: 9070 (MB) scvg29: GC forced scvg29: inuse: 9070, idle: 0, sys: 9070, released: 0, consumed: 9070 (MB) scvg30: GC forced scvg30: inuse: 9070, idle: 0, sys: 9070, released: 0, consumed: 9070 (MB) scvg31: GC forced scvg31: inuse: 9070, idle: 0, sys: 9070, released: 0, consumed: 9070 (MB) scvg32: GC forced scvg32: 0x0 MB released scvg32: inuse: 9070, idle: 0, sys: 9070, released: 0, consumed: 9070 (MB) unexpected fault address 0xc52efd0000 fatal error: fault [signal 0xb code=0x1 addr=0xc52efd0000 pc=0x42349f] goroutine 1 [running]: [fp=0x7f366a568ae8] runtime.throw(0x11f326a) /build/next/go/src/pkg/runtime/panic.c:465 +0x67 [fp=0x7f366a568b00] runtime.sigpanic() /build/next/go/src/pkg/runtime/thread_linux.c:217 +0xe7 [fp=0x7f366a568b08] runtime.memmove() /build/next/go/src/pkg/runtime/memmove_amd64.s:46 +0x1f [fp=0x7f366a568b40] runtime.slicebytetostring(0xc336e9b000, 0xf810c6a9, 0xfffffe00, 0xc436e9b000, 0xf810c6a9, ...) /build/next/go/src/pkg/runtime/zstring_linux_amd64.c:300 +0x96 [fp=0x7f366a568ba0] foo/testing/runner.RunCommandCombined(0xc200aa45e0, 0x3f00000001, 0xc236dc4bd0, 0x200076f6db, 0x408470, ...) this function does: out, err := command.Exec.CombinedOutput() return string(out), err Exec is a *os/exec.Cmd [fp=0x7f366a568c88] foo/testing/runner.RunCommandTestCase(0xc200aa45e0, 0xc236e90a20, 0x0, 0x0, 0x0, ...) [fp=0x7f366a568d38] foo/testing/runner.RunCommandTestSuite(0xc200aa45e0, 0xcb3eb0, 0xa, 0xc236e6bab0, 0x1, ...) [fp=0x7f366a568e08] foo/testing/unit.(*GotestRunner).Run(0xc200371630, 0xc200aa45e0, 0x1, 0x0, 0x0, ...) [fp=0x7f366a568ec8] foo/testing/runner.Run(0xc200357c60, 0xc200371630, 0xc2003ab5a0, 0x1a, 0xc85af0, ...) [fp=0x7f366a568f90] main.main() /build/foo/cmd/testing/footest/footest.go:45 +0x3aa [fp=0x7f366a568fb8] runtime.main() /build/go/src/pkg/runtime/proc.c:180 +0xab [fp=0x7f366a568fc0] runtime.goexit() /build/go/src/pkg/runtime/proc.c:1131 |
I think I got it. http://play.golang.org/p/kB0-5J8c_9 $ go build bork.go && GOGC=off GOGCTRACE=1 /usr/bin/time -v ./bork 4883073352 5302471482 5893650854 5076787217 415263038 854870012 89900999 6255975042 unexpected fault address 0xd45cd40000 fatal error: fault [signal 0xb code=0x1 addr=0xd45cd40000 pc=0x4181bf] goroutine 3 [running]: [fp=0x7f2d9bae3ee8] runtime.throw(0x448222) /build/go/go/src/pkg/runtime/panic.c:465 +0x67 [fp=0x7f2d9bae3f00] runtime.sigpanic() /build/go/go/src/pkg/runtime/thread_linux.c:217 +0xe7 [fp=0x7f2d9bae3f08] runtime.memmove() /build/go/go/src/pkg/runtime/memmove_amd64.s:46 +0x1f [fp=0x7f2d9bae3f40] runtime.copy() /build/go/go/src/pkg/runtime/slice.c:259 +0x13d [fp=0x7f2d9bae3fb0] main.foo() /home/alberts/bork.go:9 +0xb8 [fp=0x7f2d9bae3fb8] main.func·001() /home/alberts/bork.go:22 +0x18 [fp=0x7f2d9bae3fc0] runtime.goexit() /build/go/go/src/pkg/runtime/proc.c:1151 created by main.main /home/alberts/bork.go:24 +0x5f the process used about 80 GB of RAM before falling over. |
Do you really need the goroutines to reproduce? I expect a single call to foo() to fail. Can you try http://play.golang.org/p/9yl3-f74pA ? Can you s/MOVLQSX/MOVQ at src/pkg/runtime/memmove_amd64.s:30 and see if it fixes the issue? You might need to also change the memmove signature in runtime.h to use a uintptr as argument. |
your one didn't do it, but here is one that does: http://play.golang.org/p/Oq4dHz0qLX $ go build k.go && GOGC=off GOGCTRACE=1 ./k unexpected fault address 0xc2e15b0000 fatal error: fault [signal 0xb code=0x1 addr=0xc2e15b0000 pc=0x417d7f] goroutine 1 [running]: [fp=0x7fe698e12ef0] runtime.throw(0x43dd2a) /build/go/go/src/pkg/runtime/panic.c:465 +0x67 [fp=0x7fe698e12f08] runtime.sigpanic() /build/go/go/src/pkg/runtime/thread_linux.c:217 +0xe7 [fp=0x7fe698e12f10] runtime.memmove() /build/go/go/src/pkg/runtime/memmove_amd64.s:46 +0x1f [fp=0x7fe698e12f48] runtime.copy() /build/go/go/src/pkg/runtime/slice.c:259 +0x13d [fp=0x7fe698e12f90] main.main() /home/alberts/k.go:7 +0x92 [fp=0x7fe698e12fb8] runtime.main() /build/go/go/src/pkg/runtime/proc.c:182 +0xa1 [fp=0x7fe698e12fc0] runtime.goexit() /build/go/go/src/pkg/runtime/proc.c:1151 your change seems to fix the problem. I had to edit runtime.h too. |
This issue was closed by revision 1b8f51c. Status changed to Fixed. |
This issue was closed.
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
The text was updated successfully, but these errors were encountered: