Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: TestCallersDeferNilFuncPanic failed in generating traceback (linux/arm64) #36050

Closed
shawndx opened this issue Dec 9, 2019 · 7 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@shawndx
Copy link
Contributor

shawndx commented Dec 9, 2019

What version of Go are you using (go version)?

$ go version
go version devel +acf3ff2e8a Tue Dec 3 17:35:06 2019 +0000 linux/arm64

Does this issue reproduce with the latest release?

Reproducible with the tip.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="arm64"
GOBIN=""
GOCACHE="/home/xiaji01/.cache/go-build"
GOENV="/home/xiaji01/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="linux"
GOINSECURE=""
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/xiaji01/.go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/home/xiaji01/util/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/home/xiaji01/util/go/pkg/tool/linux_arm64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/xiaji01/util/go.dbg2/src/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build624397069=/tmp/go-build -gno-record-gcc-switches"

What did you do?

cd /src
GO_GCFLAGS='-N -l' ./all.bash

What did you expect to see?

building passes.

What did you see instead?

Testing of runtime package failed:

fatal error: traceback_arm: found jmpdefer when tracing with callback

It turns out TestCallersDeferNilFuncPanic triggers the error in function 'gentraceback', so far I could reproduce the issue with running './all.bash' only, both 'go test ...' and 'go tool dist test' works fine.

My questions:

  1. Does building Go with '-N -l' options make any sense?
  2. Any better way to debug such issues?

Thanks a lot.

@shawndx shawndx changed the title cmd/dist: TestCallersDeferNilFuncPanic failed in generating traceback cmd/dist: TestCallersDeferNilFuncPanic failed in generating traceback (linux/arm64) Dec 9, 2019
@randall77
Copy link
Contributor

@danscales

@randall77
Copy link
Contributor

Does building Go with '-N -l' options make any sense?

Yes, that is fine.

Any better way to debug such issues?

Posting the full stack trace that the program fails with would help.

It would be nice to find a way to get the test to fail by itself. If it fails during a GC interrupt (probably, but hard to tell without the full error log), using GOGC=1 will help tickle that bug. Also running the test in multiple goroutines simultaneously might also help.

@toothrot toothrot added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Dec 9, 2019
@toothrot toothrot added this to the Backlog milestone Dec 9, 2019
@ianlancetaylor ianlancetaylor changed the title cmd/dist: TestCallersDeferNilFuncPanic failed in generating traceback (linux/arm64) runtime: TestCallersDeferNilFuncPanic failed in generating traceback (linux/arm64) Dec 10, 2019
@ianlancetaylor ianlancetaylor modified the milestones: Backlog, Go1.15 Dec 10, 2019
@shawndx
Copy link
Contributor Author

shawndx commented Dec 10, 2019

Thanks @randall77.
Tried 'GOGC=1' and, multiple-goroutine way and others but no lucky, stack trace is enclosed. Thanks.

==============================================================
fatal error: traceback_arm: found jmpdefer when tracing with callback

runtime stack:
runtime.throw(0x312f1c, 0x38)
/home/xiaji01/work/go.testdefernil/src/runtime/panic.go:1106 +0x54 fp=0xffff5dc21560 sp=0xffff5dc21530 pc=0x488d4
runtime.gentraceback(0x46bb0, 0x4000385d20, 0x0, 0x4000302780, 0x0, 0x0, 0x7fffffff, 0xffff5dc218f0, 0x0, 0x0, ...)
/home/xiaji01/work/go.testdefernil/src/runtime/traceback.go:238 +0x145c fp=0xffff5dc21890 sp=0xffff5dc21560 pc=0x7154c
runtime.addOneOpenDeferFrame.func1()
/home/xiaji01/work/go.testdefernil/src/runtime/panic.go:713 +0x74 fp=0xffff5dc21910 sp=0xffff5dc21890 pc=0x7c604
runtime.systemstack(0xffff5dc21940)
/home/xiaji01/work/go.testdefernil/src/runtime/asm_arm64.s:237 +0xa0 fp=0xffff5dc21920 sp=0xffff5dc21910 pc=0x7e760
runtime.mstart()
/home/xiaji01/work/go.testdefernil/src/runtime/proc.go:1077 fp=0xffff5dc21920 sp=0xffff5dc21920 pc=0x4d420

goroutine 35 [running]:
runtime.systemstack_switch()
/home/xiaji01/work/go.testdefernil/src/runtime/asm_arm64.s:182 +0x8 fp=0x4000385c30 sp=0x4000385c20 pc=0x7e6a8
runtime.addOneOpenDeferFrame(0x4000302780, 0x46bb0, 0x4000385d20)
/home/xiaji01/work/go.testdefernil/src/runtime/panic.go:712 +0x64 fp=0x4000385c70 sp=0x4000385c30 pc=0x47bd4
panic(0x2b82e0, 0x5045b0)
/home/xiaji01/work/go.testdefernil/src/runtime/panic.go:921 +0xcc fp=0x4000385d20 sp=0x4000385c70 pc=0x481ec
runtime.panicmem()
/home/xiaji01/work/go.testdefernil/src/runtime/panic.go:212 +0x60 fp=0x4000385d40 sp=0x4000385d20 pc=0x46bb0
runtime.sigpanic()
/home/xiaji01/work/go.testdefernil/src/runtime/signal_unix.go:677 +0xe8 fp=0x4000385d80 sp=0x4000385d40 pc=0x605c8
runtime.jmpdefer(0x400039a0a0, 0x4000385dd8)
/home/xiaji01/work/go.testdefernil/src/runtime/asm_arm64.s:869 +0x1c fp=0x4000385d98 sp=0x4000385d90 pc=0x80cec
created by testing.(*T).Run
/home/xiaji01/work/go.testdefernil/src/testing/testing.go:1005 +0x530

goroutine 1 [chan receive, locked to thread]:
runtime.gopark(0x315a28, 0x4000396178, 0x170e, 0x2)
/home/xiaji01/work/go.testdefernil/src/runtime/proc.go:304 +0xb8 fp=0x400031f6e0 sp=0x400031f6b0 pc=0x4b128
runtime.chanrecv(0x4000396120, 0x400031f800, 0x4000000101, 0x11b610)
/home/xiaji01/work/go.testdefernil/src/runtime/chan.go:563 +0x1f0 fp=0x400031f770 sp=0x400031f6e0 pc=0x179b0
runtime.chanrecv1(0x4000396120, 0x400031f800)
/home/xiaji01/work/go.testdefernil/src/runtime/chan.go:433 +0x28 fp=0x400031f7a0 sp=0x400031f770 pc=0x17768
testing.(*T).Run(0x4000398120, 0x30b761, 0x1c, 0x316a80, 0x2a9a33c353000)
/home/xiaji01/work/go.testdefernil/src/testing/testing.go:1006 +0x558 fp=0x400031f910 sp=0x400031f7a0 pc=0x11b638
testing.runTests.func1(0x4000368000)
/home/xiaji01/work/go.testdefernil/src/testing/testing.go:1247 +0xac fp=0x400031f9b0 sp=0x400031f910 pc=0x12092c
testing.tRunner(0x4000368000, 0x400031fb98)
/home/xiaji01/work/go.testdefernil/src/testing/testing.go:954 +0x158 fp=0x400031fab0 sp=0x400031f9b0 pc=0x11b068
testing.runTests(0x400032e0a0, 0x50de20, 0x148, 0x148, 0x4000310100)
/home/xiaji01/work/go.testdefernil/src/testing/testing.go:1245 +0x344 fp=0x400031fbc0 sp=0x400031fab0 pc=0x11d1f4
testing.(*M).Run(0x4000364000, 0x0)
/home/xiaji01/work/go.testdefernil/src/testing/testing.go:1162 +0x378 fp=0x400031fe60 sp=0x400031fbc0 pc=0x11bf78
runtime_test.TestMain(0x4000364000)
/home/xiaji01/work/go.testdefernil/src/runtime/crash_test.go:28 +0x28 fp=0x400031fef0 sp=0x400031fe60 pc=0x1e2178
main.main()
_testmain.go:1172 +0xdc fp=0x400031ff70 sp=0x400031fef0 pc=0x27f8dc
runtime.main()
/home/xiaji01/work/go.testdefernil/src/runtime/proc.go:203 +0x1e4 fp=0x400031ffd0 sp=0x400031ff70 pc=0x4ad44
runtime.goexit()
/home/xiaji01/work/go.testdefernil/src/runtime/asm_arm64.s:1148 +0x4 fp=0x400031ffd0 sp=0x400031ffd0 pc=0x81034

goroutine 2 [force gc (idle)]:
runtime.gopark(0x315d28, 0x510aa0, 0x1411, 0x1)
/home/xiaji01/work/go.testdefernil/src/runtime/proc.go:304 +0xb8 fp=0x40002a8770 sp=0x40002a8740 pc=0x4b128
runtime.goparkunlock(0x510aa0, 0x4000001411, 0x1)
/home/xiaji01/work/go.testdefernil/src/runtime/proc.go:310 +0x4c fp=0x40002a87a0 sp=0x40002a8770 pc=0x4b1dc
runtime.forcegchelper()
/home/xiaji01/work/go.testdefernil/src/runtime/proc.go:253 +0xb0 fp=0x40002a87d0 sp=0x40002a87a0 pc=0x4afc0
runtime.goexit()
/home/xiaji01/work/go.testdefernil/src/runtime/asm_arm64.s:1148 +0x4 fp=0x40002a87d0 sp=0x40002a87d0 pc=0x81034
created by runtime.init.5
/home/xiaji01/work/go.testdefernil/src/runtime/proc.go:242 +0x30

goroutine 3 [GC sweep wait]:
runtime.gopark(0x315d28, 0x510ec0, 0x140c, 0x1)
/home/xiaji01/work/go.testdefernil/src/runtime/proc.go:304 +0xb8 fp=0x40002b6f70 sp=0x40002b6f40 pc=0x4b128
runtime.goparkunlock(0x510ec0, 0x35140c, 0x1)
/home/xiaji01/work/go.testdefernil/src/runtime/proc.go:310 +0x4c fp=0x40002b6fa0 sp=0x40002b6f70 pc=0x4b1dc
runtime.bgsweep(0x4000022380)
/home/xiaji01/work/go.testdefernil/src/runtime/mgcsweep.go:70 +0x98 fp=0x40002b6fd0 sp=0x40002b6fa0 pc=0x36ad8
runtime.goexit()
/home/xiaji01/work/go.testdefernil/src/runtime/asm_arm64.s:1148 +0x4 fp=0x40002b6fd0 sp=0x40002b6fd0 pc=0x81034
created by runtime.gcenable
/home/xiaji01/work/go.testdefernil/src/runtime/mgc.go:214 +0x54

goroutine 4 [GC scavenge wait]:
runtime.gopark(0x315d28, 0x510e80, 0x140d, 0x1)
/home/xiaji01/work/go.testdefernil/src/runtime/proc.go:304 +0xb8 fp=0x40002bcf20 sp=0x40002bcef0 pc=0x4b128
runtime.goparkunlock(0x510e80, 0x35140d, 0x1)
/home/xiaji01/work/go.testdefernil/src/runtime/proc.go:310 +0x4c fp=0x40002bcf50 sp=0x40002bcf20 pc=0x4b1dc
runtime.bgscavenge(0x4000022380)
/home/xiaji01/work/go.testdefernil/src/runtime/mgcscavenge.go:226 +0xd4 fp=0x40002bcfd0 sp=0x40002bcf50 pc=0x35474
runtime.goexit()
/home/xiaji01/work/go.testdefernil/src/runtime/asm_arm64.s:1148 +0x4 fp=0x40002bcfd0 sp=0x40002bcfd0 pc=0x81034
created by runtime.gcenable
/home/xiaji01/work/go.testdefernil/src/runtime/mgc.go:215 +0x74

goroutine 18 [finalizer wait]:
runtime.gopark(0x315d28, 0x53b288, 0x1410, 0x1)
/home/xiaji01/work/go.testdefernil/src/runtime/proc.go:304 +0xb8 fp=0x4000384f00 sp=0x4000384ed0 pc=0x4b128
runtime.goparkunlock(0x53b288, 0x4000021410, 0x1)
/home/xiaji01/work/go.testdefernil/src/runtime/proc.go:310 +0x4c fp=0x4000384f30 sp=0x4000384f00 pc=0x4b1dc
runtime.runfinq()
/home/xiaji01/work/go.testdefernil/src/runtime/mfinal.go:175 +0xac fp=0x4000384fd0 sp=0x4000384f30 pc=0x2ce9c
runtime.goexit()
/home/xiaji01/work/go.testdefernil/src/runtime/asm_arm64.s:1148 +0x4 fp=0x4000384fd0 sp=0x4000384fd0 pc=0x81034
created by runtime.createfing
/home/xiaji01/work/go.testdefernil/src/runtime/mfinal.go:156 +0x78
FAIL runtime 0.025s

@danscales danscales self-assigned this Dec 10, 2019
@danscales
Copy link
Contributor

@shawn-xdji thanks for the bug report and the stack trace.

I can reproduce this always with debug compilation of the individual runtime test on a linux-arm64 gomote:

cd /workdir/go/src; ../bin/go test -gcflags="-N -l" runtime -test.run TestCallersDeferNilFuncPanic

The issue is that we are calling gentraceback() with a callback in order to look for the next open-coded defer frame to process in the panic case. But calling gentraceback() with a callback on arm64 (lr architectures) throws an error if jmpdefer is on the stack (because jmpdefer register manipulation is non-atomic). This only happens in the one case where the defer pointer is null, since then there is a seg fault on the jmpdefer instruction that jumps though the defer pointer. Also, it only happens if stack-allocated defers are used (since open-coded defers don't use jmpdefer), hence why this only happens when debug flags are enabled.

The simplest fix is to force the seg fault to happen in deferreturn() just before the jmpdefer(). This will cause one extra load in cases when jmpdefer is used, but that is now rare, since stack-allocated defers are now rare.

@shawndx
Copy link
Contributor Author

shawndx commented Dec 11, 2019

Thanks a lot @danscales , I could reproduce the issue using your command line.

Just out of curiosity, why setting GO_GCFLAGS failed to reproduce the issue?

Both the following two work fine.

GO_GCFLAGS=all='-N -l' go test -run=TestCallersDeferNilFuncPanic runtime
GO_GCFLAGS=all='-N -l' go test -run=TestCallersDeferNilFuncPanic runtime

@cherrymui
Copy link
Member

GO_GCFLAGS is only used in make.bash/all.bash. After that it is not used. Use -gcflags instead.

@gopherbot
Copy link

Change https://golang.org/cl/210978 mentions this issue: runtime: force segv for nil defer function to be in deferreturn()

@golang golang locked and limited conversation to collaborators Dec 11, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

7 participants