Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: systemstack called from unexpected goroutine/traceback has leftover defers #31538

Closed
t57root opened this issue Apr 18, 2019 · 5 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. release-blocker WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
Milestone

Comments

@t57root
Copy link

t57root commented Apr 18, 2019

What version of Go are you using (go version)?

go version go1.12.4 linux/amd64

Does this issue reproduce with the latest release?

Yes, go version go1.12.4 linux/amd64

What operating system and processor architecture are you using (go env)?

go env Output
GOARCH="amd64"
GOBIN=""
GOCACHE="/root/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/build/src/zagen/../.."
GOPROXY=""
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build611199415=/tmp/go-build"

What did you do?

The program sometimes crashed with backtrace in stderr but sometimes crashed with nothing.

At first the code is golang mixed with some c code, which crashed with error traceback has leftover defers , and the backtrace is attached as BT#1.
Then I insert a dead-loop of sleep to the code before it running into the c code, and it crashed with error systemstack called from unexpected goroutineSIGTRAP: trace trap. Backtrace is attached as BT#2.

The crash only happens on two specific servers, and was never seen on other boxes before.

What did you expect to see?

Run without crash.

What did you see instead?

BT#1
runtime: g1: leftover defer sp=0x0 pc=0x0
        defer 0xc0000b4000 sp=0x0 pc=0x0
fatal error: traceback has leftover defers

runtime stack:
runtime.throw(0xaed3f5, 0x1d)
/usr/local/go/src/runtime/panic.go:617 +0x72
runtime.gentraceback(0xffffffffffffffff, 0xffffffffffffffff, 0x0, 0xc000000180, 0x0, 0x0, 0x7fffffff, 0x7f1ac0ff8b80, 0x0, 0x0, ...)
/usr/local/go/src/runtime/traceback.go:563 +0x1c8c
runtime.scanstack(0xc000000180, 0xc000035270)
/usr/local/go/src/runtime/mgcmark.go:711 +0x15f
runtime.scang(0xc000000180, 0xc000035270)
/usr/local/go/src/runtime/proc.go:888 +0x1e4
runtime.markroot.func1()
/usr/local/go/src/runtime/mgcmark.go:221 +0x6e
runtime.markroot(0xc000035270, 0x5)
/usr/local/go/src/runtime/mgcmark.go:202 +0x2e9
runtime.gcDrainN(0xc000035270, 0x10000, 0x7f1accc34108)
/usr/local/go/src/runtime/mgcmark.go:1019 +0x100
runtime.gcAssistAlloc1(0xc000216900, 0x10000)
/usr/local/go/src/runtime/mgcmark.go:512 +0xf5
runtime.gcAssistAlloc.func1()
/usr/local/go/src/runtime/mgcmark.go:423 +0x33
runtime.systemstack(0x0)
/usr/local/go/src/runtime/asm_amd64.s:351 +0x66
runtime.mstart()
/usr/local/go/src/runtime/proc.go:1153

goroutine 59 [GC assist marking]:
runtime.systemstack_switch()
/usr/local/go/src/runtime/asm_amd64.s:311 fp=0xc000276d40 sp=0xc000276d38 pc=0x45f500
runtime.gcAssistAlloc(0xc000216900)
/usr/local/go/src/runtime/mgcmark.go:422 +0x157 fp=0xc000276da0 sp=0xc000276d40 pc=0x426b57
runtime.mallocgc(0x6c, 0x0, 0x0, 0xc0003a2060)
/usr/local/go/src/runtime/malloc.go:843 +0x8e6 fp=0xc000276e40 sp=0xc000276da0 pc=0x414966
runtime.slicebytetostring(0x0, 0xc0004c0000, 0x6c, 0x600, 0x0, 0x0)
/usr/local/go/src/runtime/string.go:102 +0xa1 fp=0xc000276e70 sp=0xc000276e40 pc=0x44e461
zagen/monitor.getIo(0x2f0cb, 0xc0000b7f20)
/home/build/src/zagen/monitor/resource_reporter.go:128 +0xee fp=0xc000276f48 sp=0xc000276e70 pc=0x912ade
zagen/monitor.ResourceReport(0xadafa7, 0x6, 0xadb6af, 0x6, 0x2f0cb, 0x3c)
/home/build/src/zagen/monitor/resource_reporter.go:204 +0xba fp=0xc000276fb0 sp=0xc000276f48 pc=0x91359a
runtime.goexit()
/usr/local/go/src/runtime/asm_amd64.s:1337 +0x1 fp=0xc000276fb8 sp=0xc000276fb0 pc=0x4615d1
created by main.MonitorEntry
/home/build/src/zagen/monitor.go:90 +0x7f6

goroutine 1 [syscall (scan)]:
zagen/monitor._Cfunc_limit_start(0x2f0cb, 0x0)
_cgo_gotypes.go:47 +0x4d
zagen/monitor.SigLimit(...)
/home/build/src/zagen/monitor/siglimit.go:350
main.MonitorEntry(0x7ffff16f1726, 0x17, 0xc0001c0000, 0x3, 0x0, 0xc00014e030, 0x29)
/home/build/src/zagen/monitor.go:94 +0x869
main.main()
/home/build/src/zagen/main.go:112 +0xa8c

goroutine 18 [syscall]:
os/signal.signal_recv(0x0)
/usr/local/go/src/runtime/sigqueue.go:139 +0x9c
os/signal.loop()
/usr/local/go/src/os/signal/signal_unix.go:23 +0x22
created by os/signal.init.0
/usr/local/go/src/os/signal/signal_unix.go:29 +0x41

goroutine 10 [chan receive]:
zagen/rpc.(*Logger).Daemon(0xc00000ed00)
/home/build/src/zagen/rpc/log.go:138 +0x63
created by main.MonitorEntry
/home/build/src/zagen/monitor.go:38 +0x1ff

goroutine 57 [select]:
zagen/rpc.(*JsonRpc).Daemon(0xc00012a3c0)
/home/build/src/zagen/rpc/json.go:234 +0x135
created by main.MonitorEntry
/home/build/src/zagen/monitor.go:53 +0x407

goroutine 58 [select]:
zagen/rpc.(*Transporter).Work(0xc0002fa000)
/home/build/src/zagen/rpc/trans.go:84 +0xd5
created by main.MonitorEntry
/home/build/src/zagen/monitor.go:62 +0x540

goroutine 60 [runnable]:
runtime.SetFinalizer(0xa3afa0, 0xc0003df200, 0x0, 0x0)
/usr/local/go/src/runtime/mfinal.go:309 +0x812
os.(*file).close(0xc0003df200, 0xc0003df378, 0x0)
/usr/local/go/src/os/file_unix.go:256 +0x109
os.(*File).Close(0xc000232358, 0xc0003df320, 0xc00009fdd8)
/usr/local/go/src/os/file_unix.go:237 +0x33
os/exec.(*Cmd).closeDescriptors(0xc000304580, 0xc000246d10, 0x1, 0x1)
/usr/local/go/src/os/exec/exec.go:297 +0x5a
os/exec.(*Cmd).Wait(0xc000304580, 0x0, 0x0)
/usr/local/go/src/os/exec/exec.go:487 +0x182
os/exec.(*Cmd).Run(0xc000304580, 0xc, 0xc00009fef8)
/usr/local/go/src/os/exec/exec.go:318 +0x5c
zagen/monitor.getCpu(0xadb6af, 0x6, 0x2f0cc, 0xc0003df140)
/home/build/src/zagen/monitor/resource_reporter.go:47 +0x137
zagen/monitor.ResourceReport(0xadcb17, 0x7, 0xadb6af, 0x6, 0x2f0cc, 0x3c)
/home/build/src/zagen/monitor/resource_reporter.go:202 +0x84
created by main.MonitorEntry
/home/build/src/zagen/monitor.go:91 +0x84f

BT#2
fatal: systemstack called from unexpected goroutineSIGTRAP: trace trap
PC=0x45f5b3 m=5 sigcode=128

goroutine 0 [idle]:
invalid spdelta runtime.systemstack 0x45f510 0x45f5b3 0x68ff5 -1
runtime: unexpected return pc for runtime.systemstack called from 0x434b1d00
stack: frame={sp:0x7fa044993ce0, fp:0x7fa044993ce7} stack=[0x7fa0441942a8,0x7fa044993ea8)
00007fa044993be0: 0000000000000000 0000000000000000
00007fa044993bf0: 0000000000000000 0000000000000000
00007fa044993c00: 0000000000000000 0000000000000000
00007fa044993c10: 0000000000000000 0000000000000000
00007fa044993c20: 000000000042445e <runtime.gcBgMarkWorker+254> 000000020003f799
00007fa044993c30: 0000000000000000 0000000000000000
00007fa044993c40: 0000000000000000 0000000000000000
00007fa044993c50: 0000000000000000 0000000000000000
00007fa044993c60: 0000000000000000 0000000000000000
00007fa044993c70: 000000000043689e <runtime.gopark+238> 000000010004d085
00007fa044993c80: 0000000000000000 0000000000000000
00007fa044993c90: 0000000000000000 0000000000000000
00007fa044993ca0: 000000c0004b4000 000000000044f483 <runtime.badsystemstack+51>
00007fa044993cb0: 0000000000000002 0000000000af88f2
00007fa044993cc0: 0000000000000033 000000c000000033
00007fa044993cd0: 00007fa044993d08 000000000045f5b1 <runtime.systemstack+161>
00007fa044993ce0: <0000000000434b1d <runtime.throw+77> 00007fa044993cf0
00007fa044993cf0: 000000000045dd40 <runtime.throw.func1+0> 0000000000ae6066
00007fa044993d00: 0000000000000013 00007fa044993d40
00007fa044993d10: 0000000000412f87 <runtime.notesleep+231> 0000000000ae6066
00007fa044993d20: 0000000000000013 00007fa044993d40
00007fa044993d30: 000000c0000c0000 00007fa0491916d0
00007fa044993d40: 00007fa044993d68 000000000043a651 <runtime.stopm+193>
00007fa044993d50: 000000c0000c0148 000000c000036500
00007fa044993d60: 000000c0000c2180 00007fa044993e08
00007fa044993d70: 000000000043b7aa <runtime.findrunnable+1354> 000000c000085900
00007fa044993d80: 0000000000000001 0000000000000000
00007fa044993d90: 0000000000000000 0100000000000000
00007fa044993da0: 0000000000000000 000000000042ba9d <runtime.trygetfull+45>
00007fa044993db0: 0000000000000024 0000000000000023
00007fa044993dc0: 0000002400000024 0000001300000006
00007fa044993dd0: 000000c000036500 0000000000000000
00007fa044993de0: 000000000045d6f0 <runtime.wbBufFlush.func1+0>
runtime.systemstack(0x7fa044993cf000)
?:0 +0xa3

goroutine 1 [sleep, 26 minutes]:
runtime.goparkunlock(...)
/usr/local/go/src/runtime/proc.go:307
time.Sleep(0x34630b8a000)
/usr/local/go/src/runtime/time.go:105 +0x159
zagen/monitor.SigLimit(0x1a6, 0x0, 0xadca57)
/home/build/src/zagen/monitor/siglimit.go:360 +0x72
main.MonitorEntry(0x7fff67ed572a, 0x17, 0xc0001c8000, 0x3, 0x0, 0xc0001d8000, 0x29)
/home/build/src/zagen/monitor.go:94 +0x86a
main.main()
/home/build/src/zagen/main.go:112 +0xa8c

goroutine 18 [syscall, 26 minutes]:
os/signal.signal_recv(0x0)
/usr/local/go/src/runtime/sigqueue.go:139 +0x9c
os/signal.loop()
/usr/local/go/src/os/signal/signal_unix.go:23 +0x22
created by os/signal.init.0
/usr/local/go/src/os/signal/signal_unix.go:29 +0x41

goroutine 6 [chan receive, 2 minutes]:
zagen/rpc.(*Logger).Daemon(0xc00000ea60)
/home/build/src/zagen/rpc/log.go:138 +0x63
created by main.MonitorEntry
/home/build/src/zagen/monitor.go:38 +0x1ff

goroutine 51 [select, 6 minutes]:
zagen/rpc.(*JsonRpc).Daemon(0xc00012a280)
/home/build/src/zagen/rpc/json.go:234 +0x135
created by main.MonitorEntry
/home/build/src/zagen/monitor.go:53 +0x407

goroutine 52 [select]:
zagen/rpc.(*Transporter).Work(0xc0002d83f0)
/home/build/src/zagen/rpc/trans.go:84 +0xd5
created by main.MonitorEntry
/home/build/src/zagen/monitor.go:62 +0x540

goroutine 53 [sleep]:
runtime.goparkunlock(...)
/usr/local/go/src/runtime/proc.go:307
time.Sleep(0xdf8475800)
/usr/local/go/src/runtime/time.go:105 +0x159
zagen/monitor.ResourceReport(0xadaee7, 0x6, 0xadb5ef, 0x6, 0x1a6, 0x3c)
/home/build/src/zagen/monitor/resource_reporter.go:208 +0x49
created by main.MonitorEntry
/home/build/src/zagen/monitor.go:90 +0x7f6

goroutine 54 [chan receive]:
os/exec.(*Cmd).Wait(0xc0000f6000, 0x0, 0x0)
/usr/local/go/src/os/exec/exec.go:482 +0x126
os/exec.(*Cmd).Run(0xc0000f6000, 0xc, 0xc000236ef8)
/usr/local/go/src/os/exec/exec.go:318 +0x5c
zagen/monitor.getCpu(0xadb5ef, 0x6, 0x1a7, 0xc00014a660)
/home/build/src/zagen/monitor/resource_reporter.go:47 +0x137
zagen/monitor.ResourceReport(0xadca57, 0x7, 0xadb5ef, 0x6, 0x1a7, 0x3c)
/home/build/src/zagen/monitor/resource_reporter.go:202 +0x84
created by main.MonitorEntry
/home/build/src/zagen/monitor.go:91 +0x84f

goroutine 261 [IO wait]:
internal/poll.runtime_pollWait(0x7fa0490f9d10, 0x72, 0xffffffffffffffff)
/usr/local/go/src/runtime/netpoll.go:182 +0x56
internal/poll.(*pollDesc).wait(0xc00031c018, 0x72, 0x501, 0x56e, 0xffffffffffffffff)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:87 +0x9b
internal/poll.(*pollDesc).waitRead(...)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Read(0xc00031c000, 0xc00042c692, 0x56e, 0x56e, 0x0, 0x0, 0x0)
/usr/local/go/src/internal/poll/fd_unix.go:169 +0x19b
os.(*File).read(...)
/usr/local/go/src/os/file_unix.go:263
os.(*File).Read(0xc0000100b8, 0xc00042c692, 0x56e, 0x56e, 0x92, 0x0, 0x0)
/usr/local/go/src/os/file.go:108 +0x70
bytes.(*Buffer).ReadFrom(0xc0002802a0, 0xba8480, 0xc0000100b8, 0x7fa0491350b8, 0xc0002802a0, 0x1)
/usr/local/go/src/bytes/buffer.go:207 +0xbd
io.copyBuffer(0xba7b00, 0xc0002802a0, 0xba8480, 0xc0000100b8, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
/usr/local/go/src/io/io.go:388 +0x2fc
io.Copy(...)
/usr/local/go/src/io/io.go:364
os/exec.(*Cmd).writerDescriptor.func1(0x0, 0x0)
/usr/local/go/src/os/exec/exec.go:288 +0x63
os/exec.(*Cmd).Start.func1(0xc0000f6000, 0xc00040c020)
/usr/local/go/src/os/exec/exec.go:409 +0x27
created by os/exec.(*Cmd).Start
/usr/local/go/src/os/exec/exec.go:408 +0x58f

rax 0x33
rbx 0xc0000c0000
rcx 0x462e05
rdx 0x33
rdi 0x2
rsi 0xaf88f2
rbp 0x7fa044993d08
rsp 0x7fa044993ce0
r8 0x0
r9 0x0
r10 0x0
r11 0x202
r12 0x56
r13 0x1192ec0
r14 0x0
r15 0x0
rip 0x45f5b3
rflags 0x206
cs 0x33
fs 0x0
gs 0x0

@t57root
Copy link
Author

t57root commented Apr 19, 2019

Also I could provide the coredump file which's generated when traceback has leftover defers occurred, if it helps.

@ianlancetaylor
Copy link
Contributor

Is there a way that we can reproduce the problem ourselves?

@ianlancetaylor ianlancetaylor added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. release-blocker labels Apr 19, 2019
@ianlancetaylor ianlancetaylor added this to the Go1.13 milestone Apr 19, 2019
@t57root
Copy link
Author

t57root commented May 7, 2019

unfortunately still haven't worked out a minimal code which could reproduce the problem. But will keep trying to figure it out.

@andybons andybons added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Jun 4, 2019
@aclements
Copy link
Member

Hi @t57root, were you able to create a reproducer?

@gopherbot
Copy link

Timed out in state WaitingForInfo. Closing.

(I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)

@golang golang locked and limited conversation to collaborators Jul 3, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. release-blocker WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
Projects
None yet
Development

No branches or pull requests

5 participants