Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: crashes on linux/riscv64 during runtime/pprof and os/signal tests #49709

Closed
rhysh opened this issue Nov 21, 2021 · 15 comments
Closed

runtime: crashes on linux/riscv64 during runtime/pprof and os/signal tests #49709

rhysh opened this issue Nov 21, 2021 · 15 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@rhysh
Copy link
Contributor

rhysh commented Nov 21, 2021

What version of Go are you using (go version)?

$ go version
go version go1.17.3 darwin/amd64

(Cross-compiling to linux/riscv64)

Does this issue reproduce with the latest release?

Yes, this problem appears with go1.17.3, the latest stable release.

What operating system and processor architecture are you using (go env)?

I only have a working Go installation on my darwin/amd64 machine (below). I'm cross-compiling from there to linux/riscv64, where I see the problem.

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/rhysh/Library/Caches/go-build"
GOENV="/Users/rhysh/Library/Application Support/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/rhysh/work/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/rhysh/work"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GOVCS=""
GOVERSION="go1.17.3"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/dev/null"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch x86_64 -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/pw/d_qmtcrd3vs0890gvmrq8qx80000gn/T/go-build338083123=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

I've got a small RISCV64 computer: Nezha with the Allwinner D1 SOC. I'm running the runtime/pprof tests at version go1.17.3 and seeing several types of "fatal error" failures.

$ GOOS=linux GOARCH=riscv64 go test -c -o /tmp/pprof.go1.17.3.test runtime/pprof
$ GOOS=linux GOARCH=riscv64 go build -o /tmp/stress golang.org/x/tools/cmd/stress
$ go version /tmp/pprof.go1.17.3.test 
/tmp/pprof.go1.17.3.test: go1.17.3
$ # scp and ssh to the linux/riscv64 host

sipeed@sipeed:~$ uname -a
Linux sipeed 5.4.61 #15 PREEMPT Thu Jun 10 06:50:03 UTC 2021 riscv64 GNU/Linux
sipeed@sipeed:~$ nohup ./stress ./pprof.go1.17.3.test -test.run=TestCPUProfileMultithreaded -test.short=true &

What did you expect to see?

Near-zero failure rate of runtime/pprof's TestCPUProfileMultithreaded test in short mode when using go1.17.3, and zero crashes.

What did you see instead?

Currently about 1.5% failure rate, all of which are crashes ("fatal error"). (On the plus side, I haven't seen any test failures for that test.)

In the last 42 minutes, the stress run has collected:

  • 22 crashes with "fatal error: unexpected signal during runtime execution" (Most with pc and addr of 0x0, some with pc and addr of 0x2, some with 0x80.)
  • 7 crashes with "fatal error: unknown caller pc"
  • 7 crashes with "fatal error: runtime: split stack overflow"

Here's an example of each:

"fatal error: unexpected signal during runtime execution" / "[signal SIGSEGV: segmentation violation code=0x1 addr=0x2 pc=0x2]"
sipeed@sipeed:~$ cat /tmp/go-stress-20211121T161735-4177513427
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x2 pc=0x2]

goroutine 0 [idle]:
runtime: unexpected return pc for runtime.sigpanic called from 0x2
stack: frame={sp:0x3fb005b730, fp:0x3fb005b760} stack=[0x3fb0054000,0x3fb005c000)
0x0000003fb005b630:  0x0000000000000000  0x000000000004cb6c <runtime.fatalthrow.func1+0x000000000000005c> 
0x0000003fb005b640:  0x000000000004c8c0 <runtime.throw+0x0000000000000060>  0x0000003fb005b708 
0x0000003fb005b650:  0x0000000000000000  0x0000003fb0000ea0 
0x0000003fb005b660:  0x010000000004cea4  0x0000000000000004 
0x0000003fb005b670:  0x000000000000001f  0x0000000000000002 
0x0000003fb005b680:  0x0000000000000002  0x0000000000000001 
0x0000003fb005b690:  0x00000000001ae451  0x000000000004cafc <runtime.fatalthrow+0x000000000000004c> 
0x0000003fb005b6a0:  0x0000003fb0000ea0  0x000000000004c8c0 <runtime.throw+0x0000000000000060> 
0x0000003fb005b6b0:  0x0000003fb005b708  0x000000000004e4d8 <runtime.printunlock+0x0000000000000058> 
0x0000003fb005b6c0:  0x0000003fb005b708  0x000000000004c8c0 <runtime.throw+0x0000000000000060> 
0x0000003fb005b6d0:  0x0000003fb0000ea0  0x000000000004c8c0 <runtime.throw+0x0000000000000060> 
0x0000003fb005b6e0:  0x0000003fb005b6e8  0x000000000004cb10 <runtime.fatalthrow.func1+0x0000000000000000> 
0x0000003fb005b6f0:  0x0000003fb0000ea0  0x000000000004c8c0 <runtime.throw+0x0000000000000060> 
0x0000003fb005b700:  0x0000003fb005b708  0x0000000000066a3c <runtime.sigpanic+0x000000000000029c> 
0x0000003fb005b710:  0x0000003fb005b718  0x000000000004c8d8 <runtime.throw.func1+0x0000000000000000> 
0x0000003fb005b720:  0x00000000001b170e  0x000000000000002a 
0x0000003fb005b730: <0x0000000000000002  0x00000000001b170e 
0x0000003fb005b740:  0x000000000000002a  0x0000000000000080 
0x0000003fb005b750:  0x0000000000000003  0x0000000000000000 
0x0000003fb005b760: >0x0000000000000003  0x00000000000658a4 <runtime.doSigPreempt+0x00000000000000bc> 
0x0000003fb005b770:  0x0000000000158554 <runtime/pprof.cpuHog1+0x0000000000000014>  0x00000000002a7328 
0x0000003fb005b780:  0x00000000002b6f80  0x0000000000000000 
0x0000003fb005b790:  0x0000000000000000  0x0000000000000000 
0x0000003fb005b7a0:  0x0000000000000000  0x0000000000000000 
0x0000003fb005b7b0:  0x0000000000000000  0x0000000000000000 
0x0000003fb005b7c0:  0x0000000000000000  0x0000000000000000 
0x0000003fb005b7d0:  0x0000000000000000  0x0000000000000000 
0x0000003fb005b7e0:  0x0000000000065ef8 <runtime.sighandler+0x0000000000000118>  0x0000003fb00ba820 
0x0000003fb005b7f0:  0x0000000000158554 <runtime/pprof.cpuHog1+0x0000000000000014>  0x0000003fb002ef58 
0x0000003fb005b800:  0x00000000001584d4 <runtime/pprof.cpuHogger+0x000000000000005c>  0x0000000000065f98 <runtime.sighandler+0x00000000000001b8> 
0x0000003fb005b810:  0x000000000006605c <runtime.sighandler+0x000000000000027c>  0x0000000000280638 
0x0000003fb005b820:  0x00000000002b6f80  0x0000000000065ad4 <runtime.sigtrampgo+0x000000000000019c> 
0x0000003fb005b830:  0x0000003fb00ba820  0x0000003fb005b8c8 
0x0000003fb005b840:  0x00000000001584d4 <runtime/pprof.cpuHogger+0x000000000000005c>  0x0000003fb00ba820 
0x0000003fb005b850:  0x0000003fb0030800  0x0000000000000000 
runtime.throw({0x1b170e, 0x2a})
	/usr/local/go/src/runtime/panic.go:1198 +0x60
runtime: unexpected return pc for runtime.sigpanic called from 0x2
stack: frame={sp:0x3fb005b730, fp:0x3fb005b760} stack=[0x3fb0054000,0x3fb005c000)
0x0000003fb005b630:  0x0000000000000000  0x000000000004cb6c <runtime.fatalthrow.func1+0x000000000000005c> 
0x0000003fb005b640:  0x000000000004c8c0 <runtime.throw+0x0000000000000060>  0x0000003fb005b708 
0x0000003fb005b650:  0x0000000000000000  0x0000003fb0000ea0 
0x0000003fb005b660:  0x010000000004cea4  0x0000000000000004 
0x0000003fb005b670:  0x000000000000001f  0x0000000000000002 
0x0000003fb005b680:  0x0000000000000002  0x0000000000000001 
0x0000003fb005b690:  0x00000000001ae451  0x000000000004cafc <runtime.fatalthrow+0x000000000000004c> 
0x0000003fb005b6a0:  0x0000003fb0000ea0  0x000000000004c8c0 <runtime.throw+0x0000000000000060> 
0x0000003fb005b6b0:  0x0000003fb005b708  0x000000000004e4d8 <runtime.printunlock+0x0000000000000058> 
0x0000003fb005b6c0:  0x0000003fb005b708  0x000000000004c8c0 <runtime.throw+0x0000000000000060> 
0x0000003fb005b6d0:  0x0000003fb0000ea0  0x000000000004c8c0 <runtime.throw+0x0000000000000060> 
0x0000003fb005b6e0:  0x0000003fb005b6e8  0x000000000004cb10 <runtime.fatalthrow.func1+0x0000000000000000> 
0x0000003fb005b6f0:  0x0000003fb0000ea0  0x000000000004c8c0 <runtime.throw+0x0000000000000060> 
0x0000003fb005b700:  0x0000003fb005b708  0x0000000000066a3c <runtime.sigpanic+0x000000000000029c> 
0x0000003fb005b710:  0x0000003fb005b718  0x000000000004c8d8 <runtime.throw.func1+0x0000000000000000> 
0x0000003fb005b720:  0x00000000001b170e  0x000000000000002a 
0x0000003fb005b730: <0x0000000000000002  0x00000000001b170e 
0x0000003fb005b740:  0x000000000000002a  0x0000000000000080 
0x0000003fb005b750:  0x0000000000000003  0x0000000000000000 
0x0000003fb005b760: >0x0000000000000003  0x00000000000658a4 <runtime.doSigPreempt+0x00000000000000bc> 
0x0000003fb005b770:  0x0000000000158554 <runtime/pprof.cpuHog1+0x0000000000000014>  0x00000000002a7328 
0x0000003fb005b780:  0x00000000002b6f80  0x0000000000000000 
0x0000003fb005b790:  0x0000000000000000  0x0000000000000000 
0x0000003fb005b7a0:  0x0000000000000000  0x0000000000000000 
0x0000003fb005b7b0:  0x0000000000000000  0x0000000000000000 
0x0000003fb005b7c0:  0x0000000000000000  0x0000000000000000 
0x0000003fb005b7d0:  0x0000000000000000  0x0000000000000000 
0x0000003fb005b7e0:  0x0000000000065ef8 <runtime.sighandler+0x0000000000000118>  0x0000003fb00ba820 
0x0000003fb005b7f0:  0x0000000000158554 <runtime/pprof.cpuHog1+0x0000000000000014>  0x0000003fb002ef58 
0x0000003fb005b800:  0x00000000001584d4 <runtime/pprof.cpuHogger+0x000000000000005c>  0x0000000000065f98 <runtime.sighandler+0x00000000000001b8> 
0x0000003fb005b810:  0x000000000006605c <runtime.sighandler+0x000000000000027c>  0x0000000000280638 
0x0000003fb005b820:  0x00000000002b6f80  0x0000000000065ad4 <runtime.sigtrampgo+0x000000000000019c> 
0x0000003fb005b830:  0x0000003fb00ba820  0x0000003fb005b8c8 
0x0000003fb005b840:  0x00000000001584d4 <runtime/pprof.cpuHogger+0x000000000000005c>  0x0000003fb00ba820 
0x0000003fb005b850:  0x0000003fb0030800  0x0000000000000000 
runtime.sigpanic()
	/usr/local/go/src/runtime/signal_unix.go:719 +0x29c

goroutine 8 [running]:
fatal error: unexpected signal during runtime execution
panic during panic
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x767d0]

goroutine 0 [idle]:
runtime: unexpected return pc for runtime.sigpanic called from 0x2
stack: frame={sp:0x3fb005b730, fp:0x3fb005b760} stack=[0x3fb0054000,0x3fb005c000)
0x0000003fb005b630:  0x0000000000000000  0x000000000004cb6c <runtime.fatalthrow.func1+0x000000000000005c> 
0x0000003fb005b640:  0x0000003fb0000ea0  0x0000003fb005b708 
0x0000003fb005b650:  0x0000000000000000  0x0000003fb0000ea0 
0x0000003fb005b660:  0x010000000004cea4  0x0000000000000004 
0x0000003fb005b670:  0x000000000000001f  0x0000000000000002 
0x0000003fb005b680:  0x0000000000000002  0x0000000000000001 
0x0000003fb005b690:  0x00000000001ae451  0x000000000004cafc <runtime.fatalthrow+0x000000000000004c> 
0x0000003fb005b6a0:  0x0000003fb0000ea0  0x000000000004c8c0 <runtime.throw+0x0000000000000060> 
0x0000003fb005b6b0:  0x0000003fb005b708  0x000000000004e4d8 <runtime.printunlock+0x0000000000000058> 
0x0000003fb005b6c0:  0x0000003fb005b708  0x000000000004c8c0 <runtime.throw+0x0000000000000060> 
0x0000003fb005b6d0:  0x0000003fb0000ea0  0x000000000004c8c0 <runtime.throw+0x0000000000000060> 
0x0000003fb005b6e0:  0x0000003fb005b6e8  0x000000000004cb10 <runtime.fatalthrow.func1+0x0000000000000000> 
0x0000003fb005b6f0:  0x0000003fb0000ea0  0x000000000004c8c0 <runtime.throw+0x0000000000000060> 
0x0000003fb005b700:  0x0000003fb005b708  0x0000000000066a3c <runtime.sigpanic+0x000000000000029c> 
0x0000003fb005b710:  0x0000003fb005b718  0x000000000004c8d8 <runtime.throw.func1+0x0000000000000000> 
0x0000003fb005b720:  0x00000000001b170e  0x000000000000002a 
0x0000003fb005b730: <0x0000000000000002  0x00000000001b170e 
0x0000003fb005b740:  0x000000000000002a  0x0000000000000080 
0x0000003fb005b750:  0x0000000000000003  0x0000000000000000 
0x0000003fb005b760: >0x0000000000000003  0x00000000000658a4 <runtime.doSigPreempt+0x00000000000000bc> 
0x0000003fb005b770:  0x0000000000158554 <runtime/pprof.cpuHog1+0x0000000000000014>  0x00000000002a7328 
0x0000003fb005b780:  0x00000000002b6f80  0x0000000000000000 
0x0000003fb005b790:  0x0000000000000000  0x0000000000000000 
0x0000003fb005b7a0:  0x0000000000000000  0x0000000000000000 
0x0000003fb005b7b0:  0x0000000000000000  0x0000000000000000 
0x0000003fb005b7c0:  0x0000000000000000  0x0000000000000000 
0x0000003fb005b7d0:  0x0000000000000000  0x0000000000000000 
0x0000003fb005b7e0:  0x0000000000065ef8 <runtime.sighandler+0x0000000000000118>  0x0000003fb00ba820 
0x0000003fb005b7f0:  0x0000000000158554 <runtime/pprof.cpuHog1+0x0000000000000014>  0x0000003fb002ef58 
0x0000003fb005b800:  0x00000000001584d4 <runtime/pprof.cpuHogger+0x000000000000005c>  0x0000000000065f98 <runtime.sighandler+0x00000000000001b8> 
0x0000003fb005b810:  0x000000000006605c <runtime.sighandler+0x000000000000027c>  0x0000000000280638 
0x0000003fb005b820:  0x00000000002b6f80  0x0000000000065ad4 <runtime.sigtrampgo+0x000000000000019c> 
0x0000003fb005b830:  0x0000003fb00ba820  0x0000003fb005b8c8 
0x0000003fb005b840:  0x00000000001584d4 <runtime/pprof.cpuHogger+0x000000000000005c>  0x0000003fb00ba820 
0x0000003fb005b850:  0x0000003fb0030800  0x0000000000000000 
runtime.throw({0x1b170e, 0x2a})
	/usr/local/go/src/runtime/panic.go:1198 +0x60
runtime.sigpanic()
	/usr/local/go/src/runtime/signal_unix.go:719 +0x29c
runtime.gentraceback(0xffffffffffffffff, 0xffffffffffffffff, 0x0, 0x3fb00ba820, 0x0, 0x0, 0x64, 0x0, 0x0, 0x0)
	/usr/local/go/src/runtime/traceback.go:247 +0x4d8
runtime.traceback1(0xffffffffffffffff, 0xffffffffffffffff, 0x0, 0x3fb00ba820, 0x0)
	/usr/local/go/src/runtime/traceback.go:823 +0x14c
runtime.traceback(...)
	/usr/local/go/src/runtime/traceback.go:777
runtime.tracebackothers(0x3fb0000ea0)
	/usr/local/go/src/runtime/traceback.go:1015 +0xcc
runtime.dopanic_m(0x3fb0000ea0, 0x4c8c0, 0x3fb005b708)
	/usr/local/go/src/runtime/panic.go:1398 +0x350
runtime.fatalthrow.func1()
	/usr/local/go/src/runtime/panic.go:1253 +0x5c
runtime.fatalthrow()
	/usr/local/go/src/runtime/panic.go:1250 +0x4c
runtime.throw({0x1b170e, 0x2a})
	/usr/local/go/src/runtime/panic.go:1198 +0x60
runtime: unexpected return pc for runtime.sigpanic called from 0x2
stack: frame={sp:0x3fb005b730, fp:0x3fb005b760} stack=[0x3fb0054000,0x3fb005c000)
0x0000003fb005b630:  0x0000000000000000  0x000000000004cb6c <runtime.fatalthrow.func1+0x000000000000005c> 
0x0000003fb005b640:  0x0000003fb0000ea0  0x0000003fb005b708 
0x0000003fb005b650:  0x0000000000000000  0x0000003fb0000ea0 
0x0000003fb005b660:  0x010000000004cea4  0x0000000000000004 
0x0000003fb005b670:  0x000000000000001f  0x0000000000000002 
0x0000003fb005b680:  0x0000000000000002  0x0000000000000001 
0x0000003fb005b690:  0x00000000001ae451  0x000000000004cafc <runtime.fatalthrow+0x000000000000004c> 
0x0000003fb005b6a0:  0x0000003fb0000ea0  0x000000000004c8c0 <runtime.throw+0x0000000000000060> 
0x0000003fb005b6b0:  0x0000003fb005b708  0x000000000004e4d8 <runtime.printunlock+0x0000000000000058> 
0x0000003fb005b6c0:  0x0000003fb005b708  0x000000000004c8c0 <runtime.throw+0x0000000000000060> 
0x0000003fb005b6d0:  0x0000003fb0000ea0  0x000000000004c8c0 <runtime.throw+0x0000000000000060> 
0x0000003fb005b6e0:  0x0000003fb005b6e8  0x000000000004cb10 <runtime.fatalthrow.func1+0x0000000000000000> 
0x0000003fb005b6f0:  0x0000003fb0000ea0  0x000000000004c8c0 <runtime.throw+0x0000000000000060> 
0x0000003fb005b700:  0x0000003fb005b708  0x0000000000066a3c <runtime.sigpanic+0x000000000000029c> 
0x0000003fb005b710:  0x0000003fb005b718  0x000000000004c8d8 <runtime.throw.func1+0x0000000000000000> 
0x0000003fb005b720:  0x00000000001b170e  0x000000000000002a 
0x0000003fb005b730: <0x0000000000000002  0x00000000001b170e 
0x0000003fb005b740:  0x000000000000002a  0x0000000000000080 
0x0000003fb005b750:  0x0000000000000003  0x0000000000000000 
0x0000003fb005b760: >0x0000000000000003  0x00000000000658a4 <runtime.doSigPreempt+0x00000000000000bc> 
0x0000003fb005b770:  0x0000000000158554 <runtime/pprof.cpuHog1+0x0000000000000014>  0x00000000002a7328 
0x0000003fb005b780:  0x00000000002b6f80  0x0000000000000000 
0x0000003fb005b790:  0x0000000000000000  0x0000000000000000 
0x0000003fb005b7a0:  0x0000000000000000  0x0000000000000000 
0x0000003fb005b7b0:  0x0000000000000000  0x0000000000000000 
0x0000003fb005b7c0:  0x0000000000000000  0x0000000000000000 
0x0000003fb005b7d0:  0x0000000000000000  0x0000000000000000 
0x0000003fb005b7e0:  0x0000000000065ef8 <runtime.sighandler+0x0000000000000118>  0x0000003fb00ba820 
0x0000003fb005b7f0:  0x0000000000158554 <runtime/pprof.cpuHog1+0x0000000000000014>  0x0000003fb002ef58 
0x0000003fb005b800:  0x00000000001584d4 <runtime/pprof.cpuHogger+0x000000000000005c>  0x0000000000065f98 <runtime.sighandler+0x00000000000001b8> 
0x0000003fb005b810:  0x000000000006605c <runtime.sighandler+0x000000000000027c>  0x0000000000280638 
0x0000003fb005b820:  0x00000000002b6f80  0x0000000000065ad4 <runtime.sigtrampgo+0x000000000000019c> 
0x0000003fb005b830:  0x0000003fb00ba820  0x0000003fb005b8c8 
0x0000003fb005b840:  0x00000000001584d4 <runtime/pprof.cpuHogger+0x000000000000005c>  0x0000003fb00ba820 
0x0000003fb005b850:  0x0000003fb0030800  0x0000000000000000 
runtime.sigpanic()
	/usr/local/go/src/runtime/signal_unix.go:719 +0x29c


ERROR: exit status 2
"runtime: unexpected return pc for runtime.sigpanic called from 0x0" / "fatal error: unknown caller pc"
sipeed@sipeed:~$ cat /tmp/go-stress-20211121T161735-4026159646
runtime: unexpected return pc for runtime.sigpanic called from 0x0
stack: frame={sp:0x3fb405b910, fp:0x3fb405b940} stack=[0x3fb402e800,0x3fb402f000)

fatal error: unknown caller pc

runtime stack:
runtime.throw({0x1aa247, 0x11})
	/usr/local/go/src/runtime/panic.go:1198 +0x60
runtime.gentraceback(0x49820, 0x3fb405b8f8, 0x0, 0x3fb40ba820, 0x0, 0x0, 0x7fffffff, 0x3fb403bfc0, 0x0, 0x0)
	/usr/local/go/src/runtime/traceback.go:274 +0x17c0
runtime.addOneOpenDeferFrame.func1()
	/usr/local/go/src/runtime/panic.go:751 +0x88
runtime.systemstack()
	/usr/local/go/src/runtime/asm_riscv64.s:131 +0x58

goroutine 8 [running]:
runtime.systemstack_switch()
	/usr/local/go/src/runtime/asm_riscv64.s:94 +0x8 fp=0x3fb405b7f0 sp=0x3fb405b7e8 pc=0x80768
runtime.addOneOpenDeferFrame(0x3fb40ba820, 0x49820, 0x3fb405b8f8)
	/usr/local/go/src/runtime/panic.go:750 +0x80 fp=0x3fb405b828 sp=0x3fb405b7f0 pc=0x4b160
panic({0x1871a0, 0x2c5ab0})
	/usr/local/go/src/runtime/panic.go:998 +0xcc fp=0x3fb405b8f8 sp=0x3fb405b828 pc=0x4c09c
runtime.panicmem()
	/usr/local/go/src/runtime/panic.go:221 +0x60 fp=0x3fb405b910 sp=0x3fb405b8f8 pc=0x49820
runtime: unexpected return pc for runtime.sigpanic called from 0x0
stack: frame={sp:0x3fb405b910, fp:0x3fb405b940} stack=[0x3fb402e800,0x3fb402f000)

runtime.sigpanic()
	/usr/local/go/src/runtime/signal_unix.go:735 +0x264 fp=0x3fb405b940 sp=0x3fb405b910 pc=0x66a04
created by runtime/pprof.TestCPUProfileMultithreaded.func1
	/usr/local/go/src/runtime/pprof/pprof_test.go:100 +0x64

goroutine 1 [chan receive]:
testing.(*T).Run(0x3fb40ba1a0, {0x1acff7, 0x1b}, 0x1b7750)
	/usr/local/go/src/testing/testing.go:1307 +0x3a8
testing.runTests.func1(0x3fb40ba1a0)
	/usr/local/go/src/testing/testing.go:1598 +0x84
testing.tRunner(0x3fb40ba1a0, 0x3fb409dd50)
	/usr/local/go/src/testing/testing.go:1259 +0xf8
testing.runTests(0x3fb400a030, {0x2cb500, 0x1e, 0x1e}, {0x0, 0x0, 0x0})
	/usr/local/go/src/testing/testing.go:1596 +0x434
testing.(*M).Run(0x3fb40a4680)
	/usr/local/go/src/testing/testing.go:1504 +0x574
main.main()
	_testmain.go:101 +0x174

goroutine 6 [runnable]:
runtime/pprof.cpuHog2(0x0)
	/usr/local/go/src/runtime/pprof/pprof_test.go:71 +0x1c
runtime/pprof.cpuHogger(0x1b7940, 0x2fc268, 0x5f5e100)
	/usr/local/go/src/runtime/pprof/pprof_test.go:40 +0x5c
runtime/pprof.TestCPUProfileMultithreaded.func1(0x5f5e100)
	/usr/local/go/src/runtime/pprof/pprof_test.go:104 +0x8c
runtime/pprof.testCPUProfile(0x3fb40ba340, 0x1b79b8, {0x3fb4062f58, 0x2, 0x2}, {0x0, 0x0, 0x0}, 0x1b7748)
	/usr/local/go/src/runtime/pprof/pprof_test.go:324 +0x108
runtime/pprof.TestCPUProfileMultithreaded(0x3fb40ba340)
	/usr/local/go/src/runtime/pprof/pprof_test.go:98 +0xd0
testing.tRunner(0x3fb40ba340, 0x1b7750)
	/usr/local/go/src/testing/testing.go:1259 +0xf8
created by testing.(*T).Run
	/usr/local/go/src/testing/testing.go:1306 +0x388

goroutine 7 [sleep]:
time.Sleep(0x5f5e100)
	/usr/local/go/src/runtime/time.go:193 +0x16c
runtime/pprof.profileWriter({0x1e0400, 0x3fb400e8a0})
	/usr/local/go/src/runtime/pprof/pprof.go:799 +0x68
created by runtime/pprof.StartCPUProfile
	/usr/local/go/src/runtime/pprof/pprof.go:784 +0x190


ERROR: exit status 2
"fatal error: runtime: split stack overflow"
sipeed@sipeed:~$ cat /tmp/go-stress-20211121T161735-2006060682
runtime: newstack sp=0x3fa80098d0 stack=[0x3fa8062000, 0x3fa8063000]
	morebuf={pc:0x0 sp:0x3fa80098d0 lr:0x0}
	sched={pc:0x667b0 sp:0x3fa80098d0 lr:0x0 ctxt:0x0}
runtime: gp=0x3fa80bc340, goid=6, gp->status=0x2
 runtime: split stack overflow: 0x3fa80098d0 < 0x3fa8062000
fatal error: runtime: split stack overflow

runtime stack:
runtime.throw({0x1add89, 0x1d})
	/usr/local/go/src/runtime/panic.go:1198 +0x60
runtime.newstack()
	/usr/local/go/src/runtime/stack.go:1032 +0x850
runtime.morestack()
	/usr/local/go/src/runtime/asm_riscv64.s:202 +0x70

goroutine 6 [running]:
runtime: unexpected return pc for runtime.sigpanic called from 0x0
stack: frame={sp:0x3fa80098d0, fp:0x3fa80098d0} stack=[0x3fa8062000,0x3fa8063000)

runtime.sigpanic()
	/usr/local/go/src/runtime/signal_unix.go:716 +0x10 fp=0x3fa80098d0 sp=0x3fa80098d0 pc=0x667b0
created by testing.(*T).Run
	/usr/local/go/src/testing/testing.go:1306 +0x388

goroutine 1 [chan receive]:
testing.(*T).Run(0x3fa80bc000, {0x1acff7, 0x1b}, 0x1b7750)
	/usr/local/go/src/testing/testing.go:1307 +0x3a8
testing.runTests.func1(0x3fa80bc000)
	/usr/local/go/src/testing/testing.go:1598 +0x84
testing.tRunner(0x3fa80bc000, 0x3fa809dd50)
	/usr/local/go/src/testing/testing.go:1259 +0xf8
testing.runTests(0x3fa800a030, {0x2cb500, 0x1e, 0x1e}, {0x0, 0x0, 0x0})
	/usr/local/go/src/testing/testing.go:1596 +0x434
testing.(*M).Run(0x3fa80a4680)
	/usr/local/go/src/testing/testing.go:1504 +0x574
main.main()
	_testmain.go:101 +0x174

goroutine 7 [sleep]:
time.Sleep(0x5f5e100)
	/usr/local/go/src/runtime/time.go:193 +0x16c
runtime/pprof.profileWriter({0x1e0400, 0x3fa800e8a0})
	/usr/local/go/src/runtime/pprof/pprof.go:799 +0x68
created by runtime/pprof.StartCPUProfile
	/usr/local/go/src/runtime/pprof/pprof.go:784 +0x190

goroutine 8 [runnable]:
runtime/pprof.cpuHog0(...)
	/usr/local/go/src/runtime/pprof/pprof_test.go:59
runtime/pprof.cpuHog1(0x0)
	/usr/local/go/src/runtime/pprof/pprof_test.go:54 +0xc
runtime/pprof.cpuHogger(0x1b7938, 0x2fc260, 0x5f5e100)
	/usr/local/go/src/runtime/pprof/pprof_test.go:40 +0x5c
runtime/pprof.TestCPUProfileMultithreaded.func1.1(0x5f5e100, 0x3fa805c3c0)
	/usr/local/go/src/runtime/pprof/pprof_test.go:101 +0x44
created by runtime/pprof.TestCPUProfileMultithreaded.func1
	/usr/local/go/src/runtime/pprof/pprof_test.go:100 +0x64


ERROR: exit status 2
@rhysh
Copy link
Contributor Author

rhysh commented Nov 22, 2021

This looks like a problem with linux/riscv64 signal handling, not specific to the cpu profiler: the tests for the os/signal package see similar crashes (though because of the test structure, it's in a subprocess rather than the test runner itself).

$ GOOS=linux GOARCH=riscv64 go test -c -o /tmp/signal.go1.17.3.test os/signal
$ go version /tmp/signal.go1.17.3.test 
/tmp/signal.go1.17.3.test: go1.17.3
$ # scp and ssh to the linux/riscv64 host

sipeed@sipeed:~$ nohup ./stress ./signal.go1.17.3.test -test.short >> ./nohup.log &
sipeed@sipeed:~$ tail -n 1 ./nohup.log 
3h18m50s: 2809 runs so far, 3 failures (0.11%)

One of the failures is fatal error: unexpected signal during runtime execution. Two of them are fatal error: runtime: split stack overflow.

"fatal error: unexpected signal during runtime execution"
sipeed@sipeed:~$ cat /tmp/go-stress-20211121T203003-1328641936 
--- FAIL: TestNohup (1.75s)
    --- FAIL: TestNohup/nohup-2 (0.06s)
        signal_test.go:504: ran test with -send_uncaught_sighup=2 under nohup and it failed: expected success.
            Error: exit status 2
            Output:
            === RUN   TestStop
            === RUN   TestStop/window_changed
            === PAUSE TestStop/window_changed
            === RUN   TestStop/hangup
            === PAUSE TestStop/hangup
            === RUN   TestStop/user_defined_signal_1
            === PAUSE TestStop/user_defined_signal_1
            === CONT  TestStop/window_changed
            fatal error: unexpected signal during runtime execution
            [signal SIGSEGV: segmentation violation code=0x1 addr=0x2 pc=0x2]
            
            goroutine 0 [idle]:
            runtime: unexpected return pc for runtime.sigpanic called from 0x2
            stack: frame={sp:0x3fa80097d0, fp:0x3fa8009800} stack=[0x3fa8002000,0x3fa800a000)
            0x0000003fa80096d0:  0x0000000000000000  0x000000000004cb64 <runtime.fatalthrow.func1+0x000000000000005c> 
            0x0000003fa80096e0:  0x000000000004c8b8 <runtime.throw+0x0000000000000060>  0x0000003fa80097a8 
            0x0000003fa80096f0:  0x0000000000000000  0x0000003fa8000000 
            0x0000003fa8009700:  0x010000000004ce9c  0x0000000000000004 
            0x0000003fa8009710:  0x000000000000001f  0x0000000000000002 
            0x0000003fa8009720:  0x0000000000000002  0x0000000000000001 
            0x0000003fa8009730:  0x0000000000163171  0x000000000004caf4 <runtime.fatalthrow+0x000000000000004c> 
            0x0000003fa8009740:  0x0000003fa8000000  0x000000000004c8b8 <runtime.throw+0x0000000000000060> 
            0x0000003fa8009750:  0x0000003fa80097a8  0x000000000004e4d0 <runtime.printunlock+0x0000000000000058> 
            0x0000003fa8009760:  0x0000003fa80097a8  0x000000000004c8b8 <runtime.throw+0x0000000000000060> 
            0x0000003fa8009770:  0x0000003fa8000000  0x000000000004c8b8 <runtime.throw+0x0000000000000060> 
            0x0000003fa8009780:  0x0000003fa8009788  0x000000000004cb08 <runtime.fatalthrow.func1+0x0000000000000000> 
            0x0000003fa8009790:  0x0000003fa8000000  0x000000000004c8b8 <runtime.throw+0x0000000000000060> 
            0x0000003fa80097a0:  0x0000003fa80097a8  0x0000000000066f6c <runtime.sigpanic+0x000000000000029c> 
            0x0000003fa80097b0:  0x0000003fa80097b8  0x000000000004c8d0 <runtime.throw.func1+0x0000000000000000> 
            0x0000003fa80097c0:  0x00000000001659bf  0x000000000000002a 
            0x0000003fa80097d0: <0x0000000000000002  0x00000000001659bf 
            0x0000003fa80097e0:  0x000000000000002a  0x0000000000000080 
            0x0000003fa80097f0:  0x0000000000000003  0x0000000000000000 
            0x0000003fa8009800: >0x0000000000000003  0x00000000000664c8 <runtime.sighandler+0x00000000000001b8> 
            0x0000003fa8009810:  0x000000000008f2d4 <syscall.RawSyscall+0x0000000000000014>  0x0000000000214750 
            0x0000003fa8009820:  0x0000000000236c80  0x0000000000066004 <runtime.sigtrampgo+0x000000000000019c> 
            0x0000003fa8009830:  0x000000000008f2d4 <syscall.RawSyscall+0x0000000000000014>  0x0000000000000000 
            0x0000003fa8009840:  0x0000000000000000  0x0000000000000000 
            0x0000003fa8009850:  0x0000000000000000  0x0000000000000000 
            0x0000003fa8009860:  0x0000010100000000  0x0000000000000000 
            0x0000003fa8009870:  0x0000000000000000  0x0000000000065f98 <runtime.sigtrampgo+0x0000000000000130> 
            0x0000003fa8009880:  0x0000000000000000  0x0000000000000000 
            0x0000003fa8009890:  0x0000003fa80098e0  0x0000000000065e90 <runtime.sigtrampgo+0x0000000000000028> 
            0x0000003fa80098a0:  0x000000000000001c  0x0000003fa8000000 
            0x0000003fa80098b0:  0x0000000000000000  0x0000000000000000 
            0x0000003fa80098c0:  0x0000000000000000  0x0000003fa8009990 
            0x0000003fa80098d0:  0x0000003fa8009a10  0x0000000000084bcc <runtime.sigtramp+0x0000000000000034> 
            0x0000003fa80098e0:  0x0000003f0000001c  0x0000003fa8009990 
            0x0000003fa80098f0:  0x0000003fa8009a10  0x0000003fa808a9c0 
            runtime.throw({0x1659bf, 0x2a})
            	/usr/local/go/src/runtime/panic.go:1198 +0x60
            runtime: unexpected return pc for runtime.sigpanic called from 0x2
            stack: frame={sp:0x3fa80097d0, fp:0x3fa8009800} stack=[0x3fa8002000,0x3fa800a000)
            0x0000003fa80096d0:  0x0000000000000000  0x000000000004cb64 <runtime.fatalthrow.func1+0x000000000000005c> 
            0x0000003fa80096e0:  0x000000000004c8b8 <runtime.throw+0x0000000000000060>  0x0000003fa80097a8 
            0x0000003fa80096f0:  0x0000000000000000  0x0000003fa8000000 
            0x0000003fa8009700:  0x010000000004ce9c  0x0000000000000004 
            0x0000003fa8009710:  0x000000000000001f  0x0000000000000002 
            0x0000003fa8009720:  0x0000000000000002  0x0000000000000001 
            0x0000003fa8009730:  0x0000000000163171  0x000000000004caf4 <runtime.fatalthrow+0x000000000000004c> 
            0x0000003fa8009740:  0x0000003fa8000000  0x000000000004c8b8 <runtime.throw+0x0000000000000060> 
            0x0000003fa8009750:  0x0000003fa80097a8  0x000000000004e4d0 <runtime.printunlock+0x0000000000000058> 
            0x0000003fa8009760:  0x0000003fa80097a8  0x000000000004c8b8 <runtime.throw+0x0000000000000060> 
            0x0000003fa8009770:  0x0000003fa8000000  0x000000000004c8b8 <runtime.throw+0x0000000000000060> 
            0x0000003fa8009780:  0x0000003fa8009788  0x000000000004cb08 <runtime.fatalthrow.func1+0x0000000000000000> 
            0x0000003fa8009790:  0x0000003fa8000000  0x000000000004c8b8 <runtime.throw+0x0000000000000060> 
            0x0000003fa80097a0:  0x0000003fa80097a8  0x0000000000066f6c <runtime.sigpanic+0x000000000000029c> 
            0x0000003fa80097b0:  0x0000003fa80097b8  0x000000000004c8d0 <runtime.throw.func1+0x0000000000000000> 
            0x0000003fa80097c0:  0x00000000001659bf  0x000000000000002a 
            0x0000003fa80097d0: <0x0000000000000002  0x00000000001659bf 
            0x0000003fa80097e0:  0x000000000000002a  0x0000000000000080 
            0x0000003fa80097f0:  0x0000000000000003  0x0000000000000000 
            0x0000003fa8009800: >0x0000000000000003  0x00000000000664c8 <runtime.sighandler+0x00000000000001b8> 
            0x0000003fa8009810:  0x000000000008f2d4 <syscall.RawSyscall+0x0000000000000014>  0x0000000000214750 
            0x0000003fa8009820:  0x0000000000236c80  0x0000000000066004 <runtime.sigtrampgo+0x000000000000019c> 
            0x0000003fa8009830:  0x000000000008f2d4 <syscall.RawSyscall+0x0000000000000014>  0x0000000000000000 
            0x0000003fa8009840:  0x0000000000000000  0x0000000000000000 
            0x0000003fa8009850:  0x0000000000000000  0x0000000000000000 
            0x0000003fa8009860:  0x0000010100000000  0x0000000000000000 
            0x0000003fa8009870:  0x0000000000000000  0x0000000000065f98 <runtime.sigtrampgo+0x0000000000000130> 
            0x0000003fa8009880:  0x0000000000000000  0x0000000000000000 
            0x0000003fa8009890:  0x0000003fa80098e0  0x0000000000065e90 <runtime.sigtrampgo+0x0000000000000028> 
            0x0000003fa80098a0:  0x000000000000001c  0x0000003fa8000000 
            0x0000003fa80098b0:  0x0000000000000000  0x0000000000000000 
            0x0000003fa80098c0:  0x0000000000000000  0x0000003fa8009990 
            0x0000003fa80098d0:  0x0000003fa8009a10  0x0000000000084bcc <runtime.sigtramp+0x0000000000000034> 
            0x0000003fa80098e0:  0x0000003f0000001c  0x0000003fa8009990 
            0x0000003fa80098f0:  0x0000003fa8009a10  0x0000003fa808a9c0 
            runtime.sigpanic()
            	/usr/local/go/src/runtime/signal_unix.go:719 +0x29c
            
            goroutine 7 [running]:
            syscall.Syscall(0x40, 0xa205c, 0x3fa8018080, 0x22)
            	/usr/local/go/src/syscall/asm_linux_riscv64.s:13 +0x10 fp=0x3fa802e438 sp=0x3fa802e430 pc=0x8f1d0
            syscall.write(0x3fa80480c0, {0x3fa8018080, 0x22, 0x40})
            	/usr/local/go/src/syscall/zsyscall_linux_riscv64.go:915 +0x5c fp=0x3fa802e488 sp=0x3fa802e438 pc=0x8e80c
            syscall.Write(...)
            	/usr/local/go/src/syscall/syscall_unix.go:214
            internal/poll.ignoringEINTRIO(...)
            	/usr/local/go/src/internal/poll/fd_unix.go:582
            internal/poll.(*FD).Write(0x3fa80480c0, {0x3fa8018080, 0x22, 0x40})
            	/usr/local/go/src/internal/poll/fd_unix.go:275 +0x310 fp=0x3fa802e520 sp=0x3fa802e488 pc=0xa3078
            os.(*File).write(...)
            	/usr/local/go/src/os/file_posix.go:49
            os.(*File).Write(0x3fa8068f70, {0x3fa8018080, 0x22, 0x40})
            	/usr/local/go/src/os/file.go:176 +0x7c fp=0x3fa802e5a8 sp=0x3fa802e520 pc=0xa4f54
            fmt.Fprintf({0x3fa8066340, 0x8157c}, {0x2b4f5, 0x18fd8b9d}, {0x3fa802e6d8, 0x9779c, 0xa1578e28dd9d})
            	/usr/local/go/src/fmt/print.go:205 +0x8c fp=0x3fa802e608 sp=0x3fa802e5a8 pc=0xc132c
            testing.(*chattyPrinter).Updatef(0xc05ed10f12e576fa, {0xdac1ad, 0x24e1c0}, {0x15e8e7, 0x11bf20}, {0x81, 0x803ab, 0x1c})
            	/usr/local/go/src/testing/testing.go:367 +0xfc fp=0x3fa802e680 sp=0x3fa802e608 pc=0xd884c
            testing.(*T).Parallel(0x803ab)
            	/usr/local/go/src/testing/testing.go:1108 +0x290 fp=0x3fa802e6e8 sp=0x3fa802e680 pc=0xdb3e0
            os/signal.TestStop.func1(0x3fa808a820)
            	/usr/local/go/src/os/signal/signal_test.go:349 +0x38 fp=0x3fa802e780 sp=0x3fa802e6e8 pc=0x11be90
            testing.tRunner(0x3fa808a820, 0x3fa8012170)
            	/usr/local/go/src/testing/testing.go:1259 +0xf8 fp=0x3fa802e7c8 sp=0x3fa802e780 pc=0xdb568
            runtime.goexit()
            	/usr/local/go/src/runtime/asm_riscv64.s:507 +0x4 fp=0x3fa802e7c8 sp=0x3fa802e7c8 pc=0x83714
            created by testing.(*T).Run
            	/usr/local/go/src/testing/testing.go:1306 +0x388
            
            goroutine 1 [chan receive]:
            testing.(*T).Run(0x3fa808a340, {0x15d665, 0x8}, 0x168300)
            	/usr/local/go/src/testing/testing.go:1307 +0x3a8
            testing.runTests.func1(0x3fa808a340)
            	/usr/local/go/src/testing/testing.go:1598 +0x84
            testing.tRunner(0x3fa808a340, 0x3fa803dd50)
            	/usr/local/go/src/testing/testing.go:1259 +0xf8
            testing.runTests(0x3fa800a030, {0x24a980, 0x13, 0x13}, {0x0, 0x0, 0x0})
            	/usr/local/go/src/testing/testing.go:1596 +0x434
            testing.(*M).Run(0x3fa808c000)
            	/usr/local/go/src/testing/testing.go:1504 +0x574
            main.main()
            	_testmain.go:83 +0x174
            
            goroutine 6 [chan receive]:
            testing.tRunner.func1(0x3fa808a4e0)
            	/usr/local/go/src/testing/testing.go:1225 +0x420
            testing.tRunner(0x3fa808a4e0, 0x168300)
            	/usr/local/go/src/testing/testing.go:1265 +0x14c
            created by testing.(*T).Run
            	/usr/local/go/src/testing/testing.go:1306 +0x388
            
            goroutine 8 [runnable]:
            testing.(*T).Parallel(0x3fa808ab60)
            	/usr/local/go/src/testing/testing.go:1104 +0x1f0
            os/signal.TestStop.func1(0x3fa808ab60)
            	/usr/local/go/src/os/signal/signal_test.go:349 +0x38
            testing.tRunner(0x3fa808ab60, 0x3fa80121b0)
            	/usr/local/go/src/testing/testing.go:1259 +0xf8
            created by testing.(*T).Run
            	/usr/local/go/src/testing/testing.go:1306 +0x388
            
            goroutine 9 [runnable]:
            testing.(*T).Parallel(0x3fa808aea0)
            	/usr/local/go/src/testing/testing.go:1104 +0x1f0
            os/signal.TestStop.func1(0x3fa808aea0)
            	/usr/local/go/src/os/signal/signal_test.go:349 +0x38
            testing.tRunner(0x3fa808aea0, 0x3fa80121f0)
            	/usr/local/go/src/testing/testing.go:1259 +0xf8
            created by testing.(*T).Run
            	/usr/local/go/src/testing/testing.go:1306 +0x388
FAIL


ERROR: exit status 1
"fatal error: runtime: split stack overflow"
sipeed@sipeed:~$ cat /tmp/go-stress-20211121T203003-2319835814 
--- FAIL: TestNohup (1.95s)
    --- FAIL: TestNohup/nohup-1 (0.06s)
        signal_test.go:504: ran test with -send_uncaught_sighup=1 under nohup and it failed: expected success.
            Error: exit status 2
            Output:
            === RUN   TestStop
            === RUN   TestStop/window_changed
            === PAUSE TestStop/window_changed
            === RUN   TestStop/hangup
            === PAUSE TestStop/hangup
            === RUN   TestStop/user_defined_signal_1
            === PAUSE TestStop/user_defined_signal_1
            === CONT  TestStop/window_changed
            runtime: newstack sp=0x3fb0009940 stack=[0x3fb002e000, 0x3fb002e800]
            	morebuf={pc:0x0 sp:0x3fb0009940 lr:0x0}
            	sched={pc:0x66ce0 sp:0x3fb0009940 lr:0x0 ctxt:0x0}
            runtime: gp=0x3fb0098340, goid=7, gp->status=0x2
             runtime: split stack overflow: 0x3fb0009940 < 0x3fb002e000
            fatal error: runtime: split stack overflow
            
            runtime stack:
            runtime.throw({0x162bf4, 0x1d})
            	/usr/local/go/src/runtime/panic.go:1198 +0x60
            runtime.newstack()
            	/usr/local/go/src/runtime/stack.go:1032 +0x850
            runtime.morestack()
            	/usr/local/go/src/runtime/asm_riscv64.s:202 +0x70
            
            goroutine 7 [running]:
            runtime: unexpected return pc for runtime.sigpanic called from 0x0
            stack: frame={sp:0x3fb0009940, fp:0x3fb0009940} stack=[0x3fb002e000,0x3fb002e800)
            
            runtime.sigpanic()
            	/usr/local/go/src/runtime/signal_unix.go:716 +0x10 fp=0x3fb0009940 sp=0x3fb0009940 pc=0x66ce0
            created by testing.(*T).Run
            	/usr/local/go/src/testing/testing.go:1306 +0x388
            
            goroutine 1 [chan receive]:
            testing.(*T).Run(0x3fb0001ba0, {0x15d665, 0x8}, 0x168300)
            	/usr/local/go/src/testing/testing.go:1307 +0x3a8
            testing.runTests.func1(0x3fb0001ba0)
            	/usr/local/go/src/testing/testing.go:1598 +0x84
            testing.tRunner(0x3fb0001ba0, 0x3fb003bd50)
            	/usr/local/go/src/testing/testing.go:1259 +0xf8
            testing.runTests(0x3fb000a030, {0x24a980, 0x13, 0x13}, {0x0, 0x0, 0x0})
            	/usr/local/go/src/testing/testing.go:1596 +0x434
            testing.(*M).Run(0x3fb0082000)
            	/usr/local/go/src/testing/testing.go:1504 +0x574
            main.main()
            	_testmain.go:83 +0x174
            
            goroutine 6 [chan receive]:
            testing.tRunner.func1(0x3fb0001d40)
            	/usr/local/go/src/testing/testing.go:1225 +0x420
            testing.tRunner(0x3fb0001d40, 0x168300)
            	/usr/local/go/src/testing/testing.go:1265 +0x14c
            created by testing.(*T).Run
            	/usr/local/go/src/testing/testing.go:1306 +0x388
            
            goroutine 8 [runnable]:
            testing.(*T).Parallel(0x3fb00984e0)
            	/usr/local/go/src/testing/testing.go:1104 +0x1f0
            os/signal.TestStop.func1(0x3fb00984e0)
            	/usr/local/go/src/os/signal/signal_test.go:349 +0x38
            testing.tRunner(0x3fb00984e0, 0x3fb00121b0)
            	/usr/local/go/src/testing/testing.go:1259 +0xf8
            created by testing.(*T).Run
            	/usr/local/go/src/testing/testing.go:1306 +0x388
            
            goroutine 9 [runnable]:
            testing.(*T).Parallel(0x3fb0098820)
            	/usr/local/go/src/testing/testing.go:1104 +0x1f0
            os/signal.TestStop.func1(0x3fb0098820)
            	/usr/local/go/src/os/signal/signal_test.go:349 +0x38
            testing.tRunner(0x3fb0098820, 0x3fb00121f0)
            	/usr/local/go/src/testing/testing.go:1259 +0xf8
            created by testing.(*T).Run
            	/usr/local/go/src/testing/testing.go:1306 +0x388
FAIL


ERROR: exit status 1

@rhysh rhysh changed the title runtime: crashes on linux/riscv64 during runtime/pprof tests runtime: crashes on linux/riscv64 during runtime/pprof and os/signal tests Nov 22, 2021
@mengzhuo
Copy link
Contributor

kindly cc @cherrymui @4a6f656c @prattmic

@cherrymui
Copy link
Member

Yeah, it does look like signal related. Is it reproducible at tip? Thanks.

@rhysh
Copy link
Contributor Author

rhysh commented Nov 22, 2021

Yes, the runtime/pprof crashes reproduce at tip. It's very fast, I got 10 failures in under 4 minutes. The crashes in os/signal are slower, less than one per hour on go1.17.3. I'll leave it running.

$ ~/gotip/bin/go version
go version devel go1.18-cd0bf38966 Mon Nov 22 16:53:57 2021 +0000 darwin/amd64
$ GOOS=linux GOARCH=riscv64 ~/gotip/bin/go test -c -o /tmp/signal.go1.18-devel-cd0bf38966.test os/signal
$ GOOS=linux GOARCH=riscv64 ~/gotip/bin/go test -c -o /tmp/pprof.go1.18-devel-cd0bf38966.test runtime/pprof
$ go version /tmp/*devel*.test
/tmp/pprof.go1.18-devel-cd0bf38966.test: devel go1.18-cd0bf38966 Mon Nov 22 16:53:57 2021 +0000
/tmp/signal.go1.18-devel-cd0bf38966.test: devel go1.18-cd0bf38966 Mon Nov 22 16:53:57 2021 +0000

# scp, ssh

sipeed@sipeed:~$ nohup ./stress ./pprof.go1.18-devel-cd0bf38966.test -test.run=TestCPUProfileMultithreaded -test.short > ./nohup.pprof.cd0bf38966.log &
sipeed@sipeed:~$ tac ./nohup.pprof.cd0bf38966.log | grep failures | head -n 1
3m15s: 192 runs so far, 10 failures (5.21%)
sipeed@sipeed:~$ (for f in /tmp/go-stress-20211122T181250-* ; do grep fatal $f | head -n 1 ; done) | sort | uniq -c | head
      8 fatal error: unexpected signal during runtime execution
      2 fatal error: unknown caller pc

sipeed@sipeed:~$ nohup ./stress ./signal.go1.18-devel-cd0bf38966.test -test.short > ./nohup.signal.cd0bf38966.log &
sipeed@sipeed:~$ tac ./nohup.signal.cd0bf38966.log | grep failures | head -n 1
9m35s: 136 runs so far, 0 failures

@heschi heschi added the NeedsFix The path to resolution is known, but the work has not been done. label Nov 22, 2021
@heschi heschi added this to the Go1.18 milestone Nov 22, 2021
@rhysh
Copy link
Contributor Author

rhysh commented Nov 23, 2021

I haven't seen any os/signal failures yet at tip, and with the go1.17.3 failure rate I'd expect "about 6" by now:

sipeed@sipeed:~$ tac ./nohup.signal.cd0bf38966.log | grep failures | head -n 1
6h39m45s: 5692 runs so far, 0 failures

@rhysh
Copy link
Contributor Author

rhysh commented Nov 23, 2021

The os/signal failures reproduce at tip, but very slowly (10 hours):

sipeed@sipeed:~$ tac ./nohup.signal.cd0bf38966.log | grep failures | head -n 1
21h21m25s: 18251 runs so far, 2 failures (0.01%)
sipeed@sipeed:~$ (for f in /tmp/go-stress-20211122T181657-* ; do grep fatal $f | head -n 1 ; done) | sort | uniq -c | head
      1             fatal error: runtime: split stack overflow
      1             fatal error: unexpected signal during runtime execution

@4a6f656c
Copy link
Contributor

4a6f656c commented Nov 24, 2021

Yes, the runtime/pprof crashes reproduce at tip. It's very fast, I got 10 failures in under 4 minutes. The crashes in os/signal are slower, less than one per hour on go1.17.3. I'll leave it running.

I'm not able to reproduce this here on a SiFive HiFive Unleashed:

$ ../bin/go version
go version devel go1.18-00045b76e5 Tue Nov 23 17:30:32 2021 +0000 linux/riscv64
$ ../bin/go test -c -o /tmp/pprof.test runtime/pprof
$ while /tmp/pprof.test -test.run=TestCPUProfileMultithreaded -test.short ; do : ; done

That ran for several hours without failing - @mengzhuo can you test on your SiFive HiFive Unmatched?

@mengzhuo
Copy link
Contributor

mengzhuo commented Nov 26, 2021

@4a6f656c Joel, I ran this test for about 1 hour and no failure occurred.
I notice that the CPU of Nezha is only one core.
Should we disable the multi-thread tests for single core CPU?


some updates, It runs 18 hours and no failures.

Yes, the runtime/pprof crashes reproduce at tip. It's very fast, I got 10 failures in under 4 minutes. The crashes in os/signal are slower, less than one per hour on go1.17.3. I'll leave it running.

I'm not able to reproduce this here on a SiFive HiFive Unleashed:

$ ../bin/go version
go version devel go1.18-00045b76e5 Tue Nov 23 17:30:32 2021 +0000 linux/riscv64
$ ../bin/go test -c -o /tmp/pprof.test runtime/pprof
$ while /tmp/pprof.test -test.run=TestCPUProfileMultithreaded -test.short ; do : ; done

That ran for several hours without failing - @mengzhuo can you test on your SiFive HiFive Unmatched?

@rhysh
Copy link
Contributor Author

rhysh commented Nov 27, 2021

Should we disable the multi-thread tests for single core CPU?

It's a crash, not a test failure, so I don't think that disabling the tests is the right thing to do.

The TestCPUProfileMultithreaded case sets GOMAXPROCS to 2. On multi-core machines, does the crash reproduce when taskset or similar restricts all threads to run on a single core?


I ran the stress test for several hours with async preemption disabled and saw zero failures:

sipeed@sipeed:~$ nohup env GODEBUG=asyncpreemptoff=1 ./stress ./pprof.go1.18-devel-cd0bf38966.test -test.run='^TestCPUProfileMultithreaded$' -test.short > ./nohup.pprof.cd0bf38966.asyncpreemptoff.log &

# wait a while

sipeed@sipeed:~$ tac ./nohup.pprof.cd0bf38966.asyncpreemptoff.log | grep failures | head -n 1
5h23m0s: 15074 runs so far, 0 failures

Some of the failures I see, which complain of fatal error: unexpected signal during runtime execution, happen soon after a SIGPROF signal arrives on a thread that is already running on the gsignal stack to handle SIGURG, meaning that inside the sighandler function, gp == _g_.m.gsignal.

The other two failure modes, fatal error: runtime: split stack overflow and fatal error: unknown caller pc, seem like they could also be caused by stack-sharing: one execution using the stack from where the other left off leading to larger-than-expected stack use, or one clobbering the return values that the other had saved.

What, if anything, would make it safe for the signal handlers to be re-entrant, and/or to share the M's gsignal stack? I haven't seen these failures on linux/amd64 (though I don't typically use single-core machines).


I added some debuglog to the signal handler; this is what I see.

Atop go1.17.3:

diff --git a/src/runtime/signal_unix.go b/src/runtime/signal_unix.go
index 6096760b50..63f7b1f8ba 100644
--- a/src/runtime/signal_unix.go
+++ b/src/runtime/signal_unix.go
@@ -538,6 +538,17 @@ func sighandler(sig uint32, info *siginfo, ctxt unsafe.Pointer, gp *g) {
 	_g_ := getg()
 	c := &sigctxt{info, ctxt}
 
+	if sig == 11 || gp == _g_.m.gsignal {
+		log := dlog().s("sighandler").s("m").i64(_g_.m.id).s("sig").u32(sig).s("pc").hex(uint64(c.sigpc())).s("g").i64(_g_.goid)
+		if _g_.m.curg != nil {
+			log = log.s("curg").i64(_g_.m.curg.goid)
+		}
+		if gp != nil {
+			log = log.s("gp").i64(gp.goid)
+		}
+		log.end()
+	}
+
 	if sig == _SIGPROF {
 		sigprof(c.sigpc(), c.sigsp(), c.siglr(), gp, _g_.m)
 		return
$ GOOS=linux GOARCH=riscv64 ./bin/go test -c -o /tmp/pprof.go1.17-edit.test -tags 'debuglog' runtime/pprof

# scp, ssh

sipeed@sipeed:~$ ./stress ./pprof.go1.17-edit.test -test.run='^TestCPUProfileMultithreaded$' -test.short
sipeed@sipeed:~$ for f in /tmp/go-stress-20211126T234753-* ; do echo $f ; cat $f | egrep "^(fatal error|\[)" ; echo ; done
/tmp/go-stress-20211126T234753-1671273976
fatal error: unknown caller pc
[0.137608292 P 1] sighandler m 3 sig 11 pc 0x0 g 0 curg 8 gp 8

/tmp/go-stress-20211126T234753-2730224263
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x2 pc=0x2]
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x78ef8]
[0.729127202 P 0] sighandler m 0 sig 27 pc 0x70bcc g 0 curg 6 gp 0
[0.729284995 P 0] sighandler m 0 sig 11 pc 0x2 g 0 curg 6 gp 0
[0.736798263 P 0] sighandler m 0 sig 11 pc 0x78ef8 g 0 curg 6 gp 0
[0.739702328 P 0] sighandler m 0 sig 27 pc 0x8608c g 0 curg 6 gp 0

/tmp/go-stress-20211126T234753-32012639
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x0]
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x78ef8]
[0.118312684 P 1] sighandler m 3 sig 27 pc 0x71cc8 g 0 curg 8 gp 0
[0.118463435 P 1] sighandler m 3 sig 11 pc 0x0 g 0 curg 8 gp 0
[0.126182496 P 1] sighandler m 3 sig 11 pc 0x78ef8 g 0 curg 8 gp 0
[0.128827642 P 1] sighandler m 3 sig 27 pc 0x8608c g 0 curg 8 gp 0

/tmp/go-stress-20211126T234753-3518561130
fatal error: runtime: split stack overflow
[0.562301991 P 0] sighandler m 0 sig 11 pc 0x0 g 0 curg 6 gp 6

/tmp/go-stress-20211126T234753-3944482246
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x80 pc=0x80]
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x78ef8]
[0.540990792 P 1] sighandler m 2 sig 27 pc 0x4f86c g 0 curg 8 gp 0
[0.541141210 P 1] sighandler m 2 sig 11 pc 0x80 g 0 curg 8 gp 0
[0.548668404 P 1] sighandler m 2 sig 11 pc 0x78ef8 g 0 curg 8 gp 0
[0.551650807 P 1] sighandler m 2 sig 27 pc 0x8608c g 0 curg 8 gp 0

/tmp/go-stress-20211126T234753-3953261365
fatal error: runtime: split stack overflow
[0.207116710 P 0] sighandler m 0 sig 11 pc 0x0 g 0 curg 6 gp 6

/tmp/go-stress-20211126T234753-873267354
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x0]
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x78ef8]
[0.639896662 P 0] sighandler m 0 sig 27 pc 0x71cc8 g 0 curg 6 gp 0
[0.640053955 P 0] sighandler m 0 sig 11 pc 0x0 g 0 curg 6 gp 0
[0.647703639 P 0] sighandler m 0 sig 11 pc 0x78ef8 g 0 curg 6 gp 0
[0.650338701 P 0] sighandler m 0 sig 27 pc 0x71ed8 g 0 curg 6 gp 0

/tmp/go-stress-20211126T234753-946845684
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x0]
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x78ef8]
[0.023616104 P 0] sighandler m 0 sig 23 pc 0x78cb8 g 0 curg 6 gp 0
[0.023966732 P 0] sighandler m 0 sig 23 pc 0x61d3c g 0 curg 6 gp 0
[0.024075024 P 0] sighandler m 0 sig 23 pc 0x61d3c g 0 curg 6 gp 0
[0.583399559 P 0] sighandler m 0 sig 27 pc 0x71cc8 g 0 curg 6 gp 0
[0.583484268 P 0] sighandler m 0 sig 11 pc 0x0 g 0 curg 6 gp 0
[0.591338775 P 0] sighandler m 0 sig 11 pc 0x78ef8 g 0 curg 6 gp 0
[0.594175667 P 0] sighandler m 0 sig 27 pc 0x8608c g 0 curg 6 gp 0

/tmp/go-stress-20211126T234753-948494023
fatal error: runtime: split stack overflow
[0.128850956 P 0] sighandler m 0 sig 11 pc 0x0 g 0 curg 6 gp 6

/tmp/go-stress-20211126T234753-976614241
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x80 pc=0x80]
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x78ef8]
[0.184631256 P 0] sighandler m 0 sig 27 pc 0x4f86c g 0 curg 6 gp 0
[0.184781716 P 0] sighandler m 0 sig 11 pc 0x80 g 0 curg 6 gp 0
[0.192525895 P 0] sighandler m 0 sig 11 pc 0x78ef8 g 0 curg 6 gp 0
[0.195321040 P 0] sighandler m 0 sig 27 pc 0x8608c g 0 curg 6 gp 0

# and the addresses that gp was running when its M received signal 27 (SIGPROF)

$ echo 0x4f86c | go tool addr2line /tmp/pprof.go1.17-edit.test 
runtime.isAbortPC
/usr/local/go/src/runtime/panic.go:1492
$ echo 0x71cc8 | go tool addr2line /tmp/pprof.go1.17-edit.test 
runtime.pcdatavalue2
/usr/local/go/src/runtime/symtab.go:1005

@cherrymui
Copy link
Member

@rhysh thanks for the logs! If the signal handler is reentrant, or the signal stack is somehow shared, I think it is very wrong. Could you print the stack pointer at entry of signal handler to see if it is running on the same stack?

@cherrymui
Copy link
Member

gp == _g_.m.gsignal

I think that condition should never be true. This could either due to that we're getting nested signals or the G is somehow wrong. As we got signal PC in runtime.pcdatavalue2, which is only called from the signal handler, it indicates that we're actually getting nested signals, which should not happen as we block signals when running the signal handler.

What kernel version are you running on? @4a6f656c and @mengzhuo , what kernel version does the builder running?

@rhysh
Copy link
Contributor Author

rhysh commented Nov 29, 2021

Yes, the variation in the result of getcallersp makes it looks like nested signals. I'm running the OS installation that came with the machine, kernel 5.4.61.

sipeed@sipeed:~$ uname -a
Linux sipeed 5.4.61 #15 PREEMPT Thu Jun 10 06:50:03 UTC 2021 riscv64 GNU/Linux

In a crash from "fatal error: unexpected signal during runtime execution", it looks like M0 processes SIGURG and SIGPROF always with SP 0x3fa80098d8. On M3, it usually processes those with SP 0x3fa805b8d8. But then it gets a SIGPROF/27 on PC 0x16c34 (runtime.(*debugLogWriter).varint, hmm) and the SP for that call to runtime.sighandler is 0x3fa805b0c8, a 0x810 or 2064-byte delta from the usual place.

We don't see a log line from a competing SIGURG/23 delivery, but we do see that the SIGPROF/27 signal came when the PC was inside a function that only the signal handler would call. And we see a split in the debuglog output (">> begin log 0 <<"), which seems to mean that something else (the signal handler) was using the debuglog that's usually in the global pool (between calls to dlog and end) which meant the SIGPROF/27 handler needed to allocate another buffer.

sipeed@sipeed:~$ cat /tmp/go-stress-20211129T195813-1450030122 | grep 'sp 0x' | cut -d' ' -f 5-10 | sort | uniq -c
      9 m 0 sp 0x3fa80098d8 sig 23
      7 m 0 sp 0x3fa80098d8 sig 27
      1 m 3 sp 0x3fa805a378 sig 27
      1 m 3 sp 0x3fa805aa48 sig 11
      1 m 3 sp 0x3fa805b0c8 sig 11
      1 m 3 sp 0x3fa805b0c8 sig 27
      7 m 3 sp 0x3fa805b8d8 sig 23
      7 m 3 sp 0x3fa805b8d8 sig 27
Here's the current diff from go1.17.3
$ git diff
diff --git a/src/runtime/signal_unix.go b/src/runtime/signal_unix.go
index 6096760b50..b17a001ac6 100644
--- a/src/runtime/signal_unix.go
+++ b/src/runtime/signal_unix.go
@@ -538,6 +538,19 @@ func sighandler(sig uint32, info *siginfo, ctxt unsafe.Pointer, gp *g) {
        _g_ := getg()
        c := &sigctxt{info, ctxt}
 
+       if sig == 11 || gp == _g_.m.gsignal || true {
+               log := dlog().s("sighandler")
+               log = log.s("m").i64(_g_.m.id).s("sp").hex(uint64(getcallersp()))
+               log = log.s("sig").u32(sig).s("pc").hex(uint64(c.sigpc())).s("g").i64(_g_.goid)
+               if _g_.m.curg != nil {
+                       log = log.s("curg").i64(_g_.m.curg.goid)
+               }
+               if gp != nil {
+                       log = log.s("gp").i64(gp.goid)
+               }
+               log.end()
+       }
+
        if sig == _SIGPROF {
                sigprof(c.sigpc(), c.sigsp(), c.siglr(), gp, _g_.m)
                return
And the crash log from stress (including debuglog output)
sipeed@sipeed:~$ cat /tmp/go-stress-20211129T195813-1450030122
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x80 pc=0x80]

goroutine 0 [idle]:
runtime: unexpected return pc for runtime.sigpanic called from 0x80
stack: frame={sp:0x3fa805b7b8, fp:0x3fa805b7e8} stack=[0x3fa8054000,0x3fa805c000)
0x0000003fa805b6b8:  0x0000000000000000  0x000000000004f024 <runtime.fatalthrow.func1+0x000000000000005c> 
0x0000003fa805b6c8:  0x000000000004ed78 <runtime.throw+0x0000000000000060>  0x0000003fa805b790 
0x0000003fa805b6d8:  0x0000000000000000  0x0000003fa8000ea0 
0x0000003fa805b6e8:  0x010000000004f35c  0x0000000000000004 
0x0000003fa805b6f8:  0x000000000000001f  0x0000000000000080 
0x0000003fa805b708:  0x0000000000000080  0x0000000000000001 
0x0000003fa805b718:  0x00000000001aed2c  0x000000000004efb4 <runtime.fatalthrow+0x000000000000004c> 
0x0000003fa805b728:  0x0000003fa8000ea0  0x000000000004ed78 <runtime.throw+0x0000000000000060> 
0x0000003fa805b738:  0x0000003fa805b790  0x0000000000050998 <runtime.printunlock+0x0000000000000058> 
0x0000003fa805b748:  0x0000003fa805b790  0x000000000004ed78 <runtime.throw+0x0000000000000060> 
0x0000003fa805b758:  0x0000003fa8000ea0  0x000000000004ed78 <runtime.throw+0x0000000000000060> 
0x0000003fa805b768:  0x0000003fa805b770  0x000000000004efc8 <runtime.fatalthrow.func1+0x0000000000000000> 
0x0000003fa805b778:  0x0000003fa8000ea0  0x000000000004ed78 <runtime.throw+0x0000000000000060> 
0x0000003fa805b788:  0x0000003fa805b790  0x000000000006913c <runtime.sigpanic+0x000000000000029c> 
0x0000003fa805b798:  0x0000003fa805b7a0  0x000000000004ed90 <runtime.throw.func1+0x0000000000000000> 
0x0000003fa805b7a8:  0x00000000001b200b  0x000000000000002a 
0x0000003fa805b7b8: <0x0000000000000080  0x00000000001b200b 
0x0000003fa805b7c8:  0x000000000000002a  0x0000000000000000 
0x0000003fa805b7d8:  0x0000000000000080  0x0000000000000003 
0x0000003fa805b7e8: >0x0000000000000080  0x0000000000068304 <runtime.sighandler+0x0000000000000064> 
0x0000003fa805b7f8:  0x0000003fac01b000  0x0000000000000001 
0x0000003fa805b808:  0x00030cb3a06830c3  0x00030cb3a06830c3 
0x0000003fa805b818:  0x0000003fac01b000  0x000012bc998f4621 
0x0000003fa805b828:  0x0000000000067f94 <runtime.sigtrampgo+0x000000000000019c>  0x000000000015ae20 <runtime/pprof.cpuHog1+0x0000000000000010> 
0x0000003fa805b838:  0x0000003fa802ef58  0x000000000015ada4 <runtime/pprof.cpuHogger+0x000000000000005c> 
0x0000003fa805b848:  0x0000003fa80c0680  0x0000003fa8030900 
0x0000003fa805b858:  0x0000000000000000  0x0000010100000000 
0x0000003fa805b868:  0x0000000000000000  0x0000000000000000 
0x0000003fa805b878:  0x0000000000067f28 <runtime.sigtrampgo+0x0000000000000130>  0x0000000000000000 
0x0000003fa805b888:  0x0000000000000000  0x0000003fa805b8e0 
0x0000003fa805b898:  0x0000000000067e20 <runtime.sigtrampgo+0x0000000000000028>  0x0000000000000017 
0x0000003fa805b8a8:  0x0000003fa8000ea0  0x0000000000000000 
0x0000003fa805b8b8:  0x0000000000000000  0x0000000000000000 
0x0000003fa805b8c8:  0x0000003fa805b990  0x0000003fa805ba10 
0x0000003fa805b8d8:  0x00000000000862cc <runtime.sigtramp+0x0000000000000034>  0x0000003f00000017 
runtime.throw({0x1b200b, 0x2a})
	/usr/local/go/src/runtime/panic.go:1198 +0x60
runtime: unexpected return pc for runtime.sigpanic called from 0x80
stack: frame={sp:0x3fa805b7b8, fp:0x3fa805b7e8} stack=[0x3fa8054000,0x3fa805c000)
0x0000003fa805b6b8:  0x0000000000000000  0x000000000004f024 <runtime.fatalthrow.func1+0x000000000000005c> 
0x0000003fa805b6c8:  0x000000000004ed78 <runtime.throw+0x0000000000000060>  0x0000003fa805b790 
0x0000003fa805b6d8:  0x0000000000000000  0x0000003fa8000ea0 
0x0000003fa805b6e8:  0x010000000004f35c  0x0000000000000004 
0x0000003fa805b6f8:  0x000000000000001f  0x0000000000000080 
0x0000003fa805b708:  0x0000000000000080  0x0000000000000001 
0x0000003fa805b718:  0x00000000001aed2c  0x000000000004efb4 <runtime.fatalthrow+0x000000000000004c> 
0x0000003fa805b728:  0x0000003fa8000ea0  0x000000000004ed78 <runtime.throw+0x0000000000000060> 
0x0000003fa805b738:  0x0000003fa805b790  0x0000000000050998 <runtime.printunlock+0x0000000000000058> 
0x0000003fa805b748:  0x0000003fa805b790  0x000000000004ed78 <runtime.throw+0x0000000000000060> 
0x0000003fa805b758:  0x0000003fa8000ea0  0x000000000004ed78 <runtime.throw+0x0000000000000060> 
0x0000003fa805b768:  0x0000003fa805b770  0x000000000004efc8 <runtime.fatalthrow.func1+0x0000000000000000> 
0x0000003fa805b778:  0x0000003fa8000ea0  0x000000000004ed78 <runtime.throw+0x0000000000000060> 
0x0000003fa805b788:  0x0000003fa805b790  0x000000000006913c <runtime.sigpanic+0x000000000000029c> 
0x0000003fa805b798:  0x0000003fa805b7a0  0x000000000004ed90 <runtime.throw.func1+0x0000000000000000> 
0x0000003fa805b7a8:  0x00000000001b200b  0x000000000000002a 
0x0000003fa805b7b8: <0x0000000000000080  0x00000000001b200b 
0x0000003fa805b7c8:  0x000000000000002a  0x0000000000000000 
0x0000003fa805b7d8:  0x0000000000000080  0x0000000000000003 
0x0000003fa805b7e8: >0x0000000000000080  0x0000000000068304 <runtime.sighandler+0x0000000000000064> 
0x0000003fa805b7f8:  0x0000003fac01b000  0x0000000000000001 
0x0000003fa805b808:  0x00030cb3a06830c3  0x00030cb3a06830c3 
0x0000003fa805b818:  0x0000003fac01b000  0x000012bc998f4621 
0x0000003fa805b828:  0x0000000000067f94 <runtime.sigtrampgo+0x000000000000019c>  0x000000000015ae20 <runtime/pprof.cpuHog1+0x0000000000000010> 
0x0000003fa805b838:  0x0000003fa802ef58  0x000000000015ada4 <runtime/pprof.cpuHogger+0x000000000000005c> 
0x0000003fa805b848:  0x0000003fa80c0680  0x0000003fa8030900 
0x0000003fa805b858:  0x0000000000000000  0x0000010100000000 
0x0000003fa805b868:  0x0000000000000000  0x0000000000000000 
0x0000003fa805b878:  0x0000000000067f28 <runtime.sigtrampgo+0x0000000000000130>  0x0000000000000000 
0x0000003fa805b888:  0x0000000000000000  0x0000003fa805b8e0 
0x0000003fa805b898:  0x0000000000067e20 <runtime.sigtrampgo+0x0000000000000028>  0x0000000000000017 
0x0000003fa805b8a8:  0x0000003fa8000ea0  0x0000000000000000 
0x0000003fa805b8b8:  0x0000000000000000  0x0000000000000000 
0x0000003fa805b8c8:  0x0000003fa805b990  0x0000003fa805ba10 
0x0000003fa805b8d8:  0x00000000000862cc <runtime.sigtramp+0x0000000000000034>  0x0000003f00000017 
runtime.sigpanic()
	/usr/local/go/src/runtime/signal_unix.go:732 +0x29c

goroutine 8 [running]:
fatal error: unexpected signal during runtime execution
panic during panic
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x78ed0]

goroutine 0 [idle]:
runtime: unexpected return pc for runtime.sigpanic called from 0x80
stack: frame={sp:0x3fa805b7b8, fp:0x3fa805b7e8} stack=[0x3fa8054000,0x3fa805c000)
0x0000003fa805b6b8:  0x0000000000000000  0x000000000004f024 <runtime.fatalthrow.func1+0x000000000000005c> 
0x0000003fa805b6c8:  0x0000003fa8000ea0  0x0000003fa805b790 
0x0000003fa805b6d8:  0x0000000000000000  0x0000003fa8000ea0 
0x0000003fa805b6e8:  0x010000000004f35c  0x0000000000000004 
0x0000003fa805b6f8:  0x000000000000001f  0x0000000000000080 
0x0000003fa805b708:  0x0000000000000080  0x0000000000000001 
0x0000003fa805b718:  0x00000000001aed2c  0x000000000004efb4 <runtime.fatalthrow+0x000000000000004c> 
0x0000003fa805b728:  0x0000003fa8000ea0  0x000000000004ed78 <runtime.throw+0x0000000000000060> 
0x0000003fa805b738:  0x0000003fa805b790  0x0000000000050998 <runtime.printunlock+0x0000000000000058> 
0x0000003fa805b748:  0x0000003fa805b790  0x000000000004ed78 <runtime.throw+0x0000000000000060> 
0x0000003fa805b758:  0x0000003fa8000ea0  0x000000000004ed78 <runtime.throw+0x0000000000000060> 
0x0000003fa805b768:  0x0000003fa805b770  0x000000000004efc8 <runtime.fatalthrow.func1+0x0000000000000000> 
0x0000003fa805b778:  0x0000003fa8000ea0  0x000000000004ed78 <runtime.throw+0x0000000000000060> 
0x0000003fa805b788:  0x0000003fa805b790  0x000000000006913c <runtime.sigpanic+0x000000000000029c> 
0x0000003fa805b798:  0x0000003fa805b7a0  0x000000000004ed90 <runtime.throw.func1+0x0000000000000000> 
0x0000003fa805b7a8:  0x00000000001b200b  0x000000000000002a 
0x0000003fa805b7b8: <0x0000000000000080  0x00000000001b200b 
0x0000003fa805b7c8:  0x000000000000002a  0x0000000000000000 
0x0000003fa805b7d8:  0x0000000000000080  0x0000000000000003 
0x0000003fa805b7e8: >0x0000000000000080  0x0000000000068304 <runtime.sighandler+0x0000000000000064> 
0x0000003fa805b7f8:  0x0000003fac01b000  0x0000000000000001 
0x0000003fa805b808:  0x00030cb3a06830c3  0x00030cb3a06830c3 
0x0000003fa805b818:  0x0000003fac01b000  0x000012bc998f4621 
0x0000003fa805b828:  0x0000000000067f94 <runtime.sigtrampgo+0x000000000000019c>  0x000000000015ae20 <runtime/pprof.cpuHog1+0x0000000000000010> 
0x0000003fa805b838:  0x0000003fa802ef58  0x000000000015ada4 <runtime/pprof.cpuHogger+0x000000000000005c> 
0x0000003fa805b848:  0x0000003fa80c0680  0x0000003fa8030900 
0x0000003fa805b858:  0x0000000000000000  0x0000010100000000 
0x0000003fa805b868:  0x0000000000000000  0x0000000000000000 
0x0000003fa805b878:  0x0000000000067f28 <runtime.sigtrampgo+0x0000000000000130>  0x0000000000000000 
0x0000003fa805b888:  0x0000000000000000  0x0000003fa805b8e0 
0x0000003fa805b898:  0x0000000000067e20 <runtime.sigtrampgo+0x0000000000000028>  0x0000000000000017 
0x0000003fa805b8a8:  0x0000003fa8000ea0  0x0000000000000000 
0x0000003fa805b8b8:  0x0000000000000000  0x0000000000000000 
0x0000003fa805b8c8:  0x0000003fa805b990  0x0000003fa805ba10 
0x0000003fa805b8d8:  0x00000000000862cc <runtime.sigtramp+0x0000000000000034>  0x0000003f00000017 
runtime.throw({0x1b200b, 0x2a})
	/usr/local/go/src/runtime/panic.go:1198 +0x60
runtime.sigpanic()
	/usr/local/go/src/runtime/signal_unix.go:732 +0x29c
runtime.gentraceback(0xffffffffffffffff, 0xffffffffffffffff, 0x0, 0x3fa80c0680, 0x0, 0x0, 0x64, 0x0, 0x0, 0x0)
	/usr/local/go/src/runtime/traceback.go:247 +0x4d8
runtime.traceback1(0xffffffffffffffff, 0xffffffffffffffff, 0x0, 0x3fa80c0680, 0x0)
	/usr/local/go/src/runtime/traceback.go:823 +0x14c
runtime.traceback(...)
	/usr/local/go/src/runtime/traceback.go:777
runtime.tracebackothers(0x3fa8000ea0)
	/usr/local/go/src/runtime/traceback.go:1015 +0xcc
runtime.dopanic_m(0x3fa8000ea0, 0x4ed78, 0x3fa805b790)
	/usr/local/go/src/runtime/panic.go:1398 +0x350
runtime.fatalthrow.func1()
	/usr/local/go/src/runtime/panic.go:1253 +0x5c
runtime.fatalthrow()
	/usr/local/go/src/runtime/panic.go:1250 +0x4c
runtime.throw({0x1b200b, 0x2a})
	/usr/local/go/src/runtime/panic.go:1198 +0x60
runtime: unexpected return pc for runtime.sigpanic called from 0x80
stack: frame={sp:0x3fa805b7b8, fp:0x3fa805b7e8} stack=[0x3fa8054000,0x3fa805c000)
0x0000003fa805b6b8:  0x0000000000000000  0x000000000004f024 <runtime.fatalthrow.func1+0x000000000000005c> 
0x0000003fa805b6c8:  0x0000003fa8000ea0  0x0000003fa805b790 
0x0000003fa805b6d8:  0x0000000000000000  0x0000003fa8000ea0 
0x0000003fa805b6e8:  0x010000000004f35c  0x0000000000000004 
0x0000003fa805b6f8:  0x000000000000001f  0x0000000000000080 
0x0000003fa805b708:  0x0000000000000080  0x0000000000000001 
0x0000003fa805b718:  0x00000000001aed2c  0x000000000004efb4 <runtime.fatalthrow+0x000000000000004c> 
0x0000003fa805b728:  0x0000003fa8000ea0  0x000000000004ed78 <runtime.throw+0x0000000000000060> 
0x0000003fa805b738:  0x0000003fa805b790  0x0000000000050998 <runtime.printunlock+0x0000000000000058> 
0x0000003fa805b748:  0x0000003fa805b790  0x000000000004ed78 <runtime.throw+0x0000000000000060> 
0x0000003fa805b758:  0x0000003fa8000ea0  0x000000000004ed78 <runtime.throw+0x0000000000000060> 
0x0000003fa805b768:  0x0000003fa805b770  0x000000000004efc8 <runtime.fatalthrow.func1+0x0000000000000000> 
0x0000003fa805b778:  0x0000003fa8000ea0  0x000000000004ed78 <runtime.throw+0x0000000000000060> 
0x0000003fa805b788:  0x0000003fa805b790  0x000000000006913c <runtime.sigpanic+0x000000000000029c> 
0x0000003fa805b798:  0x0000003fa805b7a0  0x000000000004ed90 <runtime.throw.func1+0x0000000000000000> 
0x0000003fa805b7a8:  0x00000000001b200b  0x000000000000002a 
0x0000003fa805b7b8: <0x0000000000000080  0x00000000001b200b 
0x0000003fa805b7c8:  0x000000000000002a  0x0000000000000000 
0x0000003fa805b7d8:  0x0000000000000080  0x0000000000000003 
0x0000003fa805b7e8: >0x0000000000000080  0x0000000000068304 <runtime.sighandler+0x0000000000000064> 
0x0000003fa805b7f8:  0x0000003fac01b000  0x0000000000000001 
0x0000003fa805b808:  0x00030cb3a06830c3  0x00030cb3a06830c3 
0x0000003fa805b818:  0x0000003fac01b000  0x000012bc998f4621 
0x0000003fa805b828:  0x0000000000067f94 <runtime.sigtrampgo+0x000000000000019c>  0x000000000015ae20 <runtime/pprof.cpuHog1+0x0000000000000010> 
0x0000003fa805b838:  0x0000003fa802ef58  0x000000000015ada4 <runtime/pprof.cpuHogger+0x000000000000005c> 
0x0000003fa805b848:  0x0000003fa80c0680  0x0000003fa8030900 
0x0000003fa805b858:  0x0000000000000000  0x0000010100000000 
0x0000003fa805b868:  0x0000000000000000  0x0000000000000000 
0x0000003fa805b878:  0x0000000000067f28 <runtime.sigtrampgo+0x0000000000000130>  0x0000000000000000 
0x0000003fa805b888:  0x0000000000000000  0x0000003fa805b8e0 
0x0000003fa805b898:  0x0000000000067e20 <runtime.sigtrampgo+0x0000000000000028>  0x0000000000000017 
0x0000003fa805b8a8:  0x0000003fa8000ea0  0x0000000000000000 
0x0000003fa805b8b8:  0x0000000000000000  0x0000000000000000 
0x0000003fa805b8c8:  0x0000003fa805b990  0x0000003fa805ba10 
0x0000003fa805b8d8:  0x00000000000862cc <runtime.sigtramp+0x0000000000000034>  0x0000003f00000017 
runtime.sigpanic()
	/usr/local/go/src/runtime/signal_unix.go:732 +0x29c
>> begin log 1 <<
[0.026027891 P 0] sighandler m 0 sp 0x3fa80098d8 sig 23 pc 0x15ae54 g 0 curg 6 gp 6
[0.034801424 P 1] sighandler m 3 sp 0x3fa805b8d8 sig 23 pc 0x15ae28 g 0 curg 8 gp 8
[0.036000768 P 1] sighandler m 3 sp 0x3fa805b8d8 sig 27 pc 0x15ae44 g 0 curg 8 gp 8
[0.044301756 P 0] sighandler m 0 sp 0x3fa80098d8 sig 23 pc 0x15ae64 g 0 curg 6 gp 6
[0.046072521 P 0] sighandler m 0 sp 0x3fa80098d8 sig 27 pc 0x15ae84 g 0 curg 6 gp 6
[0.054425260 P 1] sighandler m 3 sp 0x3fa805b8d8 sig 23 pc 0x15ae40 g 0 curg 8 gp 8
[0.056001690 P 1] sighandler m 3 sp 0x3fa805b8d8 sig 27 pc 0x15ae3c g 0 curg 8 gp 8
[0.064551597 P 0] sighandler m 0 sp 0x3fa80098d8 sig 23 pc 0x15ae68 g 0 curg 6 gp 6
[0.065965193 P 0] sighandler m 0 sp 0x3fa80098d8 sig 27 pc 0x15ae6c g 0 curg 6 gp 6
[0.074677393 P 1] sighandler m 3 sp 0x3fa805b8d8 sig 23 pc 0x15ae30 g 0 curg 8 gp 8
[0.075943070 P 1] sighandler m 3 sp 0x3fa805b8d8 sig 27 pc 0x15ae44 g 0 curg 8 gp 8
[0.084789021 P 0] sighandler m 0 sp 0x3fa80098d8 sig 23 pc 0x15ae68 g 0 curg 6 gp 6
[0.085958823 P 0] sighandler m 0 sp 0x3fa80098d8 sig 27 pc 0x15ae5c g 0 curg 6 gp 6
[0.094915150 P 1] sighandler m 3 sp 0x3fa805b8d8 sig 23 pc 0x15ae30 g 0 curg 8 gp 8
[0.095942742 P 1] sighandler m 3 sp 0x3fa805b8d8 sig 27 pc 0x15ae20 g 0 curg 8 gp 8
[0.105028154 P 0] sighandler m 0 sp 0x3fa80098d8 sig 23 pc 0x15ae60 g 0 curg 6 gp 6
[0.105966245 P 0] sighandler m 0 sp 0x3fa80098d8 sig 27 pc 0x15ae84 g 0 curg 6 gp 6
[0.115148491 P 1] sighandler m 3 sp 0x3fa805b8d8 sig 23 pc 0x15ae30 g 0 curg 8 gp 8
[0.115941248 P 1] sighandler m 3 sp 0x3fa805b8d8 sig 27 pc 0x15ae1c g 0 curg 8 gp 8
[0.125269703 P 0] sighandler m 0 sp 0x3fa80098d8 sig 23 pc 0x15ae64 g 0 curg 6 gp 6
[0.125964126 P 0] sighandler m 0 sp 0x3fa80098d8 sig 27 pc 0x15ae6c g 0 curg 6 gp 6
[0.135386165 P 1] sighandler m 3 sp 0x3fa805b8d8 sig 23 pc 0x15ae40 g 0 curg 8 gp 8
[0.135951878 P 1] sighandler m 3 sp 0x3fa805b8d8 sig 27 pc 0x15ae1c g 0 curg 8 gp 8
[0.145511377 P 0] sighandler m 0 sp 0x3fa80098d8 sig 23 pc 0x15ae80 g 0 curg 6 gp 6
[0.145979965 P 0] sighandler m 0 sp 0x3fa80098d8 sig 27 pc 0x15ae68 g 0 curg 6 gp 6
[0.155635756 P 1] sighandler m 3 sp 0x3fa805b8d8 sig 23 pc 0x15ae40 g 0 curg 8 gp 8
[0.155944217 P 1] sighandler m 3 sp 0x3fa805b8d8 sig 27 pc 0x15ae20 g 0 curg 8 gp 8
[0.165747218 P 0] sighandler m 0 sp 0x3fa80098d8 sig 23 pc 0x15ae70 g 0 curg 6 gp 6
[0.165960137 P 0] sighandler m 0 sp 0x3fa80098d8 sig 27 pc 0x15ae7c g 0 curg 6 gp 6
>> begin log 0 <<
[0.175955556 P 1] sighandler m 3 sp 0x3fa805b0c8 sig 27 pc 0x16c34 g 0 curg 8 gp 0
[0.176103224 P 1] sighandler m 3 sp 0x3fa805b0c8 sig 11 pc 0x80 g 0 curg 8 gp 0
[0.176851689 P 0] sighandler m 0 sp 0x3fa80098d8 sig 23 pc 0x15ae6c g 0 curg 6 gp 6
[0.184034501 P 1] sighandler m 3 sp 0x3fa805aa48 sig 11 pc 0x78ed0 g 0 curg 8 gp 0
[0.186848275 P 1] sighandler m 3 sp 0x3fa805a378 sig 27 pc 0x86064 g 0 curg 8 gp 0


ERROR: exit status 2

@rhysh
Copy link
Contributor Author

rhysh commented Nov 29, 2021

On linux/amd64, adding strace -f -ttt -e rt_sigaction shows lines that set sa_mask=~[], which I take to mean "all bits set".

On linux/riscv64, the strace output shows sa_mask=[], which I take to mean "all bits cleared". When I modify runtime.setsig to set the sa_restorer field, I see it reflected in strace's view of the sa_mask field.

It looks from this like the definition of runtime.sigactiont is not correct on linux/riscv64, resulting in no signals at all being blocked when handling a signal.

type sigactiont struct {
sa_handler uintptr
sa_flags uint64
sa_restorer uintptr
sa_mask uint64
}

@cherrymui
Copy link
Member

cherrymui commented Nov 29, 2021

Good finding. From the kernel C header

struct sigaction {
        __sighandler_t sa_handler;
        unsigned long sa_flags;
#ifdef SA_RESTORER
        __sigrestore_t sa_restorer;
#endif
        sigset_t sa_mask;               /* mask last for extensibility */
};

So it does look wrong if SA_RESTORER is not defined. And it does look like not defined on riscv64.

@gopherbot
Copy link

Change https://golang.org/cl/367635 mentions this issue: runtime: fix riscv64 sigaction mask field offset

@golang golang locked and limited conversation to collaborators Dec 1, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

6 participants