runtime: process crash instead of panic on SIGBUS with SetPanicOnDefault(true) #41155

Open
florisch opened this issue Sep 1, 2020 · 11 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. help wanted NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@florisch

florisch commented Sep 1, 2020

What version of Go are you using (go version)?

$ go version
go version go1.15 windows/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
set GO111MODULE=
set GOARCH=arm
set GOBIN=
set GOCACHE=C:\Users\Florian\AppData\Local\go-build
set GOENV=C:\Users\Florian\AppData\Roaming\go\env
set GOEXE=
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOINSECURE=
set GOMODCACHE=C:\Users\Florian\go\pkg\mod
set GONOPROXY=
set GONOSUMDB=
set GOOS=linux
set GOPATH=C:\Users\Florian\go
set GOPRIVATE=
set GOPROXY=https://proxy.golang.org,direct
set GOROOT=c:\go
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLDIR=c:\go\pkg\tool\windows_amd64
set GCCGO=gccgo
set GOARM=7
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=0
set GOMOD=
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-fPIC -marm -fmessage-length=0 -fdebug-prefix-map=C:\Users\Florian\AppData\Local\Temp\go-build326602894=/tmp/go-build -gno-record-gcc-switches
GOROOT/bin/go version: go version go1.15 windows/amd64
GOROOT/bin/go tool compile -V: compile version go1.15
gdb --version: GNU gdb (GDB) 8.1

What did you do?

We use Go for embedded development (cross-compiled to 32-bit linux/arm). The Go process accesses various FPGA registers by mmapping /dev/mem at the address range of those registers.
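For readers unfamiliar with this kind of access, here is a minimal sketch of mapping a register window through /dev/mem. It is not the code from our project: `pageSplit`, `mapRegisters`, and the address `0x43c00010` are hypothetical names and values chosen for illustration. The only firm requirement it encodes is that the offset passed to mmap must be page-aligned, so the register address is split into a page base and a byte offset within that page.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// pageSplit splits a physical address into the page-aligned base that mmap
// requires as its offset, and the remaining byte offset inside that page.
// pageSize must be a power of two.
func pageSplit(phys, pageSize int64) (base, delta int64) {
	base = phys &^ (pageSize - 1)
	return base, phys - base
}

// mapRegisters (hypothetical helper) maps the single page of /dev/mem that
// contains phys and returns the mapping plus the index of phys within it.
func mapRegisters(phys int64) ([]byte, int, error) {
	f, err := os.OpenFile("/dev/mem", os.O_RDWR|os.O_SYNC, 0)
	if err != nil {
		return nil, 0, err
	}
	defer f.Close() // the mapping stays valid after the descriptor is closed

	page := int64(os.Getpagesize())
	base, delta := pageSplit(phys, page)
	mem, err := syscall.Mmap(int(f.Fd()), base, int(page),
		syscall.PROT_READ|syscall.PROT_WRITE, syscall.MAP_SHARED)
	if err != nil {
		return nil, 0, err
	}
	return mem, int(delta), nil
}

func main() {
	const regAddr = 0x43c00010 // hypothetical register address
	mem, off, err := mapRegisters(regAddr)
	if err != nil {
		// Expected on machines without /dev/mem access (for example, non-root).
		fmt.Println("cannot map /dev/mem:", err)
		return
	}
	defer syscall.Munmap(mem)
	fmt.Printf("register at byte %d of a %d-byte mapping\n", off, len(mem))
}
```

Opening /dev/mem normally requires root, so on a development machine the program is expected to print the error and exit cleanly.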

When we access registers that are not defined or not accessible in the FPGA, the process crashes with the error reported below.

We call defer debug.SetPanicOnFault(debug.SetPanicOnFault(true)) in the call stack that performs the register read, because we expect this to make the runtime panic instead of crash on this kind of memory fault.

What did you expect to see?

A panic where the bad access happened. This way, with a recover call, it would be possible to handle the case where some registers are not available.
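The pattern we rely on can be sketched as follows. This is an illustration rather than our production code: `guardedRead` is a hypothetical helper, and a nil pointer dereference stands in for the bad register access so the example runs anywhere. `debug.SetPanicOnFault` applies to the current goroutine only and returns the previous setting, which is why the nested call in the `defer` restores it on return.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// guardedRead (hypothetical helper) runs read with panic-on-fault enabled for
// the calling goroutine and converts a resulting panic into an error.
func guardedRead(read func() uint32) (v uint32, err error) {
	// The inner call enables the setting immediately and yields the old value;
	// the deferred outer call restores that old value when we return.
	defer debug.SetPanicOnFault(debug.SetPanicOnFault(true))
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("register read faulted: %v", r)
		}
	}()
	return read(), nil
}

func main() {
	// Stand-in for an inaccessible register: dereferencing a nil pointer.
	var missing *uint32
	_, err := guardedRead(func() uint32 { return *missing })
	fmt.Println("bad read:", err)

	present := uint32(42)
	v, err := guardedRead(func() uint32 { return present })
	fmt.Println("good read:", v, err)
}
```

With this in place, a caller can treat an unavailable register as an ordinary error instead of losing the whole process, which is the behavior we expected for the SIGBUS case as well.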

What did you see instead?

The process crashes unrecoverably with the following output:

Unhandled fault: external abort on non-linefetch (0x018) at 0x26b48010
pgd = 5e090000
[26b48010] *pgd=1e234831, *pte=40040703, *ppte=40040e33
runner.sh: SIGBUS: bus error
runner.sh: PC=0x2a8ff0 m=0 sigcode=0
runner.sh: goroutine 43 [running]:
runner.sh: gobv1/pkg/hw/pmem.Access.ReadUint32(...)
runner.sh:      C:/projects/ellisys/bv1go/pkg/hw/pmem/memAccess_linux.go:122
runner.sh: gobv1/pkg/hw/pmem.(*Access).ReadUint32(0x925200, 0x10, 0x28e594)
runner.sh:      <autogenerated>:1 +0x44 fp=0x8a6acc sp=0x8a6aa4 pc=0x2a8ff0
...
runner.sh: main.(*command).initializeDevice(0x9222c0, 0x922b80)
runner.sh:      C:/projects/ellisys/bv1go/cmd/gobv1/main.go:154 +0x94 fp=0x8a6fe4 sp=0x8a6fa0 pc=0x371320
runner.sh: runtime.goexit()
...
runner.sh: goroutine 20 [select]:
runner.sh: io.(*pipe).Read(0x922280, 0x84c000, 0x1000, 0x1000, 0x3b50e8, 0x1136b0, 0x84c000)
runner.sh:      C:/Go/src/io/pipe.go:57 +0xac
...
runner.sh: goroutine 42 [runnable]:
...
runner.sh: trap    0x0
runner.sh: error   0x18
runner.sh: oldmask 0x0
runner.sh: r0      0x26b48000
runner.sh: r1      0x3c
runner.sh: r2      0x8a6acc
runner.sh: r3      0x10
runner.sh: r4      0x1
runner.sh: r5      0x1
runner.sh: r6      0xf1
runner.sh: r7      0x26ccc521
runner.sh: r8      0x925200
runner.sh: r9      0x20
runner.sh: r10     0x883500
runner.sh: fp      0x7
runner.sh: ip      0x925203
runner.sh: sp      0x8a6aa4
runner.sh: lr      0x2a8fdc
runner.sh: pc      0x2a8ff0
runner.sh: cpsr    0x80000010
runner.sh: fault   0x0
runner.sh: Program instance execution terminated

Workaround

I built a custom runtime with this commit, which makes the call panic as expected.

@ianlancetaylor ianlancetaylor changed the title Process crash instead of panic on SIGBUS runtime: process crash instead of panic on SIGBUS with SetPanicOnDefault(true) Sep 2, 2020
@ianlancetaylor ianlancetaylor added the NeedsFix The path to resolution is known, but the work has not been done. label Sep 2, 2020
@ianlancetaylor ianlancetaylor added this to the Go1.16 milestone Sep 2, 2020
@tpaschalis
Contributor

I'm not sure how to replicate this failure, but I'd like to give this a shot.

Do we think that the posted workaround could also be a long-term solution?

@ianlancetaylor
Contributor

Please avoid looking at the workaround (and, everyone, please avoid posting patches through the issue tracker). We want patches to only come in as Gerrit code reviews or GitHub pull requests, because then we have automation that confirms that the copyright assignments are in order. Thanks.

To put it another way, I can't answer your question about the posted workaround because I'm not going to look at it. Sorry.

I think you might be able to write a test that gets a SIGBUS by using mmap to map memory as read-only and then trying to write to it. I'm not really sure, though.
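A minimal sketch of that idea, assuming a Unix-like system where `syscall.Mmap` and `syscall.MAP_ANON` are available. `writeReadOnly` is a hypothetical name, and which signal the kernel delivers for the write (SIGSEGV or SIGBUS) depends on the platform, so this may or may not exercise the SIGBUS path.

```go
package main

import (
	"fmt"
	"runtime/debug"
	"syscall"
)

// writeReadOnly maps one anonymous read-only page, writes to it, and reports
// whether the resulting fault surfaced as a recoverable panic.
func writeReadOnly() (recovered bool, err error) {
	mem, err := syscall.Mmap(-1, 0, syscall.Getpagesize(),
		syscall.PROT_READ, syscall.MAP_ANON|syscall.MAP_PRIVATE)
	if err != nil {
		return false, err
	}
	defer syscall.Munmap(mem)

	// Ask the runtime to panic on faults at non-nil addresses, and restore
	// the previous setting on return.
	defer debug.SetPanicOnFault(debug.SetPanicOnFault(true))
	defer func() {
		if r := recover(); r != nil {
			fmt.Println("recovered:", r)
			recovered = true
		}
	}()

	mem[0] = 1 // faults: the page is mapped PROT_READ
	return false, nil
}

func main() {
	ok, err := writeReadOnly()
	fmt.Println("recovered:", ok, "setup error:", err)
}
```

If the write is delivered with an ordinary fault code, the panic should be recoverable and the function returns true; a crash here would point at the same class of problem as the original report.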

@tpaschalis
Contributor

Thanks for the pointers, I'll try to get a repro done, and then see how the issue can be fixed!

@florisch
Author

florisch commented Sep 4, 2020

Thank you for looking into this. I thought I should open a ticket for discussion before creating a PR. Sorry if I broke the rules by linking my workaround commit in the ticket.

If desired, I would be happy to contribute a fix for this issue as a PR. For now, I am trying to find a way to write a test that could be integrated with the regular test suite to reproduce this issue without our embedded FPGA platform.

I created a test doing what @ianlancetaylor suggested, but it does not reproduce the issue. It results in the expected panic: runtime error: invalid memory address or nil pointer dereference (both on a Linux desktop and on our embedded platform).

@networkimprov

@tpaschalis
Contributor

tpaschalis commented Sep 5, 2020

For now, I am trying to find a way to write a test that could be integrated with the regular test suite to reproduce this issue without our embedded FPGA platform.

This would be a good first step; I hope I can assist in that as well. (And thanks for having a positive attitude to getting to the bottom of this!)

The following code uses CGO and triggers a SIGBUS. I tried it on darwin and linux, but could not get the same error. The output below is the same both with and without the debug.SetPanicOnFault(debug.SetPanicOnFault(true)) line. On the other hand, CGO is a different beast, and maybe that's why the same error does not appear.

Code : https://play.golang.org/p/vWdhf2mtuEq
Output :

fatal error: unexpected signal during runtime execution
[signal SIGBUS: bus error code=0x2 addr=0x7ff893c38000 pc=0x46ca3d]

runtime stack:
runtime.throw(0x48bc4c, 0x2a)
	/usr/local/go/src/runtime/panic.go:1116 +0x72
runtime.sigpanic()
	/usr/local/go/src/runtime/signal_unix.go:704 +0x4ac

EDIT: Here's the same using syscall.Mmap instead of CGO.
Code : https://play.golang.org/p/aYAsrUND0i_D
Output with debug.SetPanicOnFault

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGBUS: bus error code=0x2 addr=0x7f3ac1f2c000 pc=0x46ceac]

Output without debug.SetPanicOnFault

unexpected fault address 0x7fe33aac3000
fatal error: fault
[signal SIGBUS: bus error code=0x2 addr=0x7fe33aac3000 pc=0x494db6]
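For reference, one way to obtain a SIGBUS with code=0x2 (BUS_ADRERR on Linux) from pure Go is to map a page of an empty file and read from it. The sketch below is an assumption about how such a repro can look and is not necessarily what the playground links contain; `readPastEOF` is a hypothetical name, and the behavior described is the one expected on Linux.

```go
package main

import (
	"fmt"
	"os"
	"runtime/debug"
	"syscall"
)

// readPastEOF maps one page of an empty temporary file and reads from it.
// Touching a file-backed page that lies beyond the end of the file is
// expected to raise SIGBUS; with SetPanicOnFault enabled, the runtime should
// turn that into a panic, which is recovered and returned as an error.
func readPastEOF() (b byte, err error) {
	f, err := os.CreateTemp("", "sigbus-*")
	if err != nil {
		return 0, err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	// The file has length 0, but mapping a full page of it still succeeds.
	mem, err := syscall.Mmap(int(f.Fd()), 0, os.Getpagesize(),
		syscall.PROT_READ, syscall.MAP_SHARED)
	if err != nil {
		return 0, err
	}
	defer syscall.Munmap(mem)

	defer debug.SetPanicOnFault(debug.SetPanicOnFault(true))
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()

	return mem[0], nil // expected to fault: no file data backs this page
}

func main() {
	_, err := readPastEOF()
	fmt.Println(err)
}
```

Removing the SetPanicOnFault line should give the fatal "unexpected fault address" variant instead of a recoverable panic.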

@florisch
Author

florisch commented Sep 8, 2020

I tried the code using mmap on our embedded platform and saw the same behavior. Then I modified the runtime to print the flags and the sigcode when a SIGBUS is received.

Output of SIGBUS generated by sample from previous comment

SIGBUS flags=0x0x88 sigcode=0x2

Output with SIGBUS generated by a bad register access

SIGBUS flags=0x0x88 sigcode=0x0

Since sigcode 0 matches _SI_USER, the signal is not handled properly in the case of our bad register access, while it is handled properly when generated by the code from the previous comment.
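The difference between the two cases can be summarized with a small model. This is a deliberately simplified, hypothetical sketch of the behavior observed in this thread, not the runtime's actual signal-handling code: `classifySIGBUS` and the `outcome` type are invented for illustration, and the model ignores the fault address entirely. The constant values are the Linux ones (SI_USER is 0, BUS_ADRERR is 2).

```go
package main

import "fmt"

// Linux si_code values relevant to this discussion.
const (
	siUser    = 0 // SI_USER: signal sent explicitly, e.g. via kill(2)
	busAdrErr = 2 // BUS_ADRERR: nonexistent physical address
)

// outcome describes what happens to the process in this simplified model.
type outcome string

const (
	recoverablePanic outcome = "recoverable panic"
	fatalCrash       outcome = "fatal crash"
)

// classifySIGBUS models the observed behavior: a SIGBUS whose si_code equals
// SI_USER is treated as sent from outside and never becomes a panic; any
// other code becomes a panic only if SetPanicOnFault(true) is in effect.
func classifySIGBUS(sigcode int, panicOnFault bool) outcome {
	if sigcode == siUser {
		return fatalCrash
	}
	if panicOnFault {
		return recoverablePanic
	}
	return fatalCrash
}

func main() {
	fmt.Println("mmap sample, sigcode=0x2:", classifySIGBUS(busAdrErr, true))
	fmt.Println("bad register, sigcode=0x0:", classifySIGBUS(siUser, true))
}
```

In this model the bad register access ends in a crash regardless of the SetPanicOnFault setting, which matches the two outputs above.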

@florisch
Author

Here is minimal code that reproduces the issue on armv7. The same code on amd64 does not reproduce the issue, as mmap simply refuses to map bad addresses.

https://play.golang.org/p/Zbi9pBZ3rKu
Output:

SIGBUS flags=0x0x88 sigcode=0x 0x0
SIGBUS: bus error
PC=0xa2728 m=0 sigcode=0

goroutine 1 [running]:
main.main()
        gobv1/tools/crash/main.go:37 +0x240 fp=0x4227b8 sp=0x422740 pc=0xa2728
runtime.main()
        runtime/proc.go:205 +0x208 fp=0x4227e4 sp=0x4227b8 pc=0x427f8
runtime.goexit()
        runtime/asm_arm.s:857 +0x4 fp=0x4227e4 sp=0x4227e4 pc=0x6d8f0

trap    0x0
error   0x1818
oldmask 0x0
r0      0x0
r1      0x1000
r2      0x26c2c000
r3      0x0
r4      0x4
r5      0x0
r6      0x26c2cfff
r7      0x0
r8      0x7
r9      0x1
r10     0x4000e0
fp      0x14d078
ip      0xd
sp      0x422740
lr      0x119f0
pc      0xa2728
cpsr    0x20000010
fault   0x0

@odeke-em
Member

odeke-em commented Feb 6, 2021

Punting to Go1.17, thank you all for the patience, and for the discussion, please keep it going.

@odeke-em odeke-em modified the milestones: Go1.16, Go1.17 Feb 6, 2021
@ianlancetaylor
Contributor

I don't understand why the kernel would send a signal with si_code set to SI_USER. That seems like a kernel bug. The SI_USER code is supposed to indicate an explicit use of the kill system call. I don't mind working around a kernel bug but we don't want to treat all SIGBUS signals with si_code == SI_USER as indicating an actual bus error.

@ianlancetaylor ianlancetaylor modified the milestones: Go1.17, Backlog Apr 30, 2021
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 7, 2022
@shakefu

shakefu commented Mar 18, 2024

Since this has been around forever, I'd just like to add that you can reliably trigger a crashing SIGBUS, even when trying to recover the panic, by writing to PROT_READ mmap'd memory. I can trigger it 100% of the time using gommap on Darwin arm64. Not sure if that helps with debugging and finding a handler.
