runtime: crash with "leftover defer" #11023

rhysh · 2015-06-01T16:38:59Z

$ go version
go version devel +8cd191b Sat May 30 12:21:56 2015 +0000 linux/amd64
$ uname -a | awk '$2="host"'
Linux host 3.13.0-52-generic #86~precise1-Ubuntu SMP Tue May 5 18:08:21 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

I have a process that receives data over a few hundred concurrent TCP connections and writes them to files. It's been crashing on recent versions of tip (it was stable on 1.4.1) with messages like "runtime: g422: leftover defer sp=0xdeaddeaddeaddead pc=0x48d60e" and "runtime: g883: leftover defer sp=0x6576452e67736d62 pc=0x69203a746553746e". This may be related to #9610 and #10941, but I've seen it (once) with sp != 0xdeaddeaddeaddead .

runtime: g422: leftover defer sp=0xdeaddeaddeaddead pc=0x48d60e
    defer 0xc2087e7680 sp=0xdeaddeaddeaddead pc=0x48d60e
fatal error: traceback has leftover defers

runtime stack:
runtime.throw(0x8ce070, 0x1d)
    /usr/local/go/src/runtime/panic.go:527 +0x96
runtime.gentraceback(0x4435a0, 0xc208780a78, 0x0, 0xc208848a00, 0x0, 0x0, 0x7fffffff, 0x9298a0, 0xc208045e38, 0x0, ...)
    /usr/local/go/src/runtime/traceback.go:450 +0x7b9
runtime.copystack(0xc208848a00, 0x1000)
    /usr/local/go/src/runtime/stack1.go:569 +0x180
runtime.newstack()
    /usr/local/go/src/runtime/stack1.go:757 +0xb8a
runtime.morestack()
    /usr/local/go/src/runtime/asm_amd64.s:330 +0x82

goroutine 422 [copystack]:
runtime.cansemacquire(0xc208ed80e8, 0xc208848a80)
    /usr/local/go/src/runtime/sema.go:151 fp=0xc208780a80 sp=0xc208780a78
runtime.semacquire(0xc208ed80e8, 0xc208848a01)
    /usr/local/go/src/runtime/sema.go:69 +0x5d fp=0xc208780ae0 sp=0xc208780a80
net.runtime_Semacquire(0xc208ed80e8)
    /usr/local/go/src/runtime/sema.go:48 +0x2d fp=0xc208780af8 sp=0xc208780ae0
net.(*fdMutex).RWLock(0xc208ed80e0, 0x1, 0x0)
    /usr/local/go/src/net/fd_mutex.go:143 +0x16f fp=0xc208780b60 sp=0xc208780af8
net.(*netFD).readLock(0xc208ed80e0, 0x0, 0x0)
    /usr/local/go/src/net/fd_unix.go:161 +0x40 fp=0xc208780b80 sp=0xc208780b60
net.(*netFD).Read(0xc208ed80e0, 0xc20896e00e, 0xff2, 0xff2, 0x0, 0x0, 0x0)
    /usr/local/go/src/net/fd_unix.go:221 +0x6f fp=0xc208780c50 sp=0xc208780b80
net.(*conn).Read(0xc2089ba010, 0xc20896e00e, 0xff2, 0xff2, 0x0, 0x0, 0x0)
    /usr/local/go/src/net/net.go:124 +0xe7 fp=0xc208780cb8 sp=0xc208780c50
bufio.(*Scanner).Scan(0xc20882b280, 0xc2080b4240)
    /usr/local/go/src/bufio/scan.go:180 +0x87a fp=0xc208780e78 sp=0xc208780cb8
redacted(0xc20812a0a0, 0x7f375ce70a30, 0xc2089ba010, 0xc2080b4240)
    /redacted.go:99 +0x93 fp=0xc208780fc0 sp=0xc208780e78
runtime.goexit()
    /usr/local/go/src/runtime/asm_amd64.s:1670 +0x1 fp=0xc208780fc8 sp=0xc208780fc0
created by redacted
    /redacted.go:81 +0x21e

goroutine 1 [chan receive]:
main.main()
    /redacted.go:230 +0xe7b

goroutine 5 [chan send]:
redacted(0xc20809e000)
    /redacted.go:199 +0x1f0
created by redacted.init.1
    /redacted.go:184 +0x6e

[snip]

The stack trace with non-0xdead address is a bit more interesting - it looks like two separate threads are dumping stacks.

runtime: g883: leftover defer sp=0x6576452e67736d62 pc=0x69203a746553746e
    defer 0xc20af93780 sp=0x6576452e67736d62 pc=0x69203a746553746e
fatal error: unexpected signal during runtime execution
[signal 0xb code=0x80 addr=0x0 pc=0x452ba4]

runtime stack:
runtime.throw(0x8ef170, 0x2a)
    /usr/local/go/src/runtime/panic.go:527 +0x96
runtime.sigpanic()
    /usr/local/go/src/runtime/sigpanic_unix.go:12 +0x5d
runtime.gentraceback(0x4435a0, 0xc2088e8a78, 0x0, 0xc2095de000, 0x0, 0x0, 0x7fffffff, 0x9298a0, 0xc2085f3e38, 0x0, ...)
    /usr/local/go/src/runtime/traceback.go:448 +0x6d4
runtime.copystack(0xc2095de000, 0x1000)
    /usr/local/go/src/runtime/stack1.go:569 +0x180
runtime.newstack()
    /usr/local/go/src/runtime/stack1.go:757 +0xb8a
runtime.morestack()
    /usr/local/go/src/runtime/asm_amd64.s:330 +0x82

goroutine 883 [copystack]:
runtime.cansemacquire(0xc208ab2c48, 0xc2095de080)
    /usr/local/go/src/runtime/sema.go:151 fp=0xc2088e8a80 sp=0xc2088e8a78
runtime.semacquire(0xc208ab2c48, 0xc2095de001)
    /usr/local/go/src/runtime/sema.go:69 +0x5d fp=0xc2088e8ae0 sp=0xc2088e8a80
net.runtime_Semacquire(0xc208ab2c48)
    /usr/local/go/src/runtime/sema.go:48 +0x2d fp=0xc2088e8af8 sp=0xc2088e8ae0
net.(*fdMutex).RWLock(0xc208ab2c40, 0x1, 0x0)
    /usr/local/go/src/net/fd_mutex.go:143 +0x16f fp=0xc2088e8b60 sp=0xc2088e8af8
net.(*netFD).readLock(0xc208ab2c40, 0x0, 0x0)
    /usr/local/go/src/net/fd_unix.go:161 +0x40 fp=0xc2088e8b80 sp=0xc2088e8b60
net.(*netFD).Read(0xc208ab2c40, 0xc20886b081, 0xf7f, 0xf7f, 0x0, 0x0, 0x0)
    /usr/local/go/src/net/fd_unix.go:221 +0x6f fp=0xc2088e8c50 sp=0xc2088e8b80
net.(*conn).Read(0xc209ab0508, 0xc20886b081, 0xf7f, 0xf7f, 0x0, 0x0, 0x0)
    /usr/local/go/src/net/net.go:124 +0xe7 fp=0xc2088e8cb8 sp=0xc2088e8c50
bufio.(*Scanner).Scan(0xc20894b280, 0xc2080c0240)
runtime: g382: leftover defer sp=0x6576452e67736d62 pc=0x69203a746553746e
    defer 0xc20dfabfc0 sp=0x6576452e67736d62 pc=0x69203a746553746e
fatal error: unexpected signal during runtime execution
    /usr/local/go/src/bufio/scan.go:180 +0x87a fp=0xc2088e8e78 sp=0xc2088e8cb8
redacted(0xc20812a0a0, 0x7f4989e2f8d8, 0xc209ab0508, 0xc2080c0240)
    /redacted.go:99 +0x93 fp=0xc2088e8fc0 sp=0xc2088e8e78
runtime.goexit()
    /usr/local/go/src/runtime/asm_amd64.s:1670 +0x1 fp=0xc2088e8fc8 sp=0xc2088e8fc0
created by redacted
    /redacted.go:81 +0x21e

goroutine 1 [chan receive, 6 minutes]:
main.main()
    /redacted.go:230 +0xe7b

goroutine 5 [chan send]:
redacted(0xc20809e000)
    /redacted.go:199 +0x1f0
created by redacted.init.1
    /redacted.go:184 +0x6e

[snip]

The text was updated successfully, but these errors were encountered:

gopherbot · 2015-06-04T23:00:43Z

CL https://golang.org/cl/10713 mentions this issue.

gopherbot · 2015-06-05T22:00:35Z

CL https://golang.org/cl/10791 mentions this issue.

Issues golang#10240, golang#10541, golang#10941, golang#11023, golang#11027 and possibly others are indicating memory corruption in the runtime. One of the easiest places to both get corruption and detect it is in the allocator's free lists since they appear throughout memory and follow strict invariants. This commit adds a check when sweeping a span that its free list is sane and, if not, it prints the corrupted free list and panics. Hopefully this will help us collect more information on these failures. Change-Id: I6d417bcaeedf654943a5e068bd76b58bb02d4a64

Stack barriers assume that writes through pointers to frames above the current frame will get write barriers, and hence these frames do not need to be re-scanned to pick up these changes. For normal writes, this is true. However, there are places in the runtime that use typedmemmove to potentially write through pointers to higher frames (such as mapassign1). Currently, typedmemmove does not execute write barriers if the destination is on the stack. If there's a stack barrier between the current frame and the frame being modified with typedmemmove, and the stack barrier is not otherwise hit, it's possible that the garbage collector will never see the updated pointer and incorrectly reclaim the object. Fix this by making heapBitsBulkBarrier (which lies behind typedmemmove and its variants) detect when the destination is in the stack and unwind stack barriers up to the point, forcing mark termination to later rescan the effected frame and collect these pointers. Fixes #11084. Might be related to #10240, #10541, #10941, #11023, #11027 and possibly others. Change-Id: I323d6cd0f1d29fa01f8fc946f4b90e04ef210efd Reviewed-on: https://go-review.googlesource.com/10791 Reviewed-by: Russ Cox <rsc@golang.org>

Issues #10240, #10541, #10941, #11023, #11027 and possibly others are indicating memory corruption in the runtime. One of the easiest places to both get corruption and detect it is in the allocator's free lists since they appear throughout memory and follow strict invariants. This commit adds a check when sweeping a span that its free list is sane and, if not, it prints the corrupted free list and panics. Hopefully this will help us collect more information on these failures. Change-Id: I6d417bcaeedf654943a5e068bd76b58bb02d4a64 Reviewed-on: https://go-review.googlesource.com/10713 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Russ Cox <rsc@golang.org> Run-TryBot: Austin Clements <austin@google.com>

aclements · 2015-06-22T18:32:37Z

Hi @rhysh. We've fixed several memory corruption and lost write barrier issues in the runtime over the past few weeks. Please try to reproduce the problem with current master and reopen this issue if it's still happening. Thanks!

rhysh · 2015-06-23T05:24:00Z

Hi @aclements - I ran my app with 8fa1a69 (from 17 Jun 2015) for a few days and did not observe any crashes. It looks like this is resolved. Thanks!

ianlancetaylor added this to the Go1.5 milestone Jun 1, 2015

rhysh mentioned this issue Jun 2, 2015

runtime: crash with signal 0xb near runtime.sweepone #11027

Closed

aclements closed this as completed Jun 22, 2015

golang locked and limited conversation to collaborators Jun 25, 2016

gopherbot added the FrozenDueToAge label Jun 25, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

runtime: crash with "leftover defer" #11023

runtime: crash with "leftover defer" #11023

rhysh commented Jun 1, 2015

gopherbot commented Jun 4, 2015

gopherbot commented Jun 5, 2015

aclements commented Jun 22, 2015

rhysh commented Jun 23, 2015

runtime: crash with "leftover defer" #11023

runtime: crash with "leftover defer" #11023

Comments

rhysh commented Jun 1, 2015

gopherbot commented Jun 4, 2015

gopherbot commented Jun 5, 2015

aclements commented Jun 22, 2015

rhysh commented Jun 23, 2015