runtime: crash on 1.14 with unexpected return pc, fatal error: unknown caller pc #37664

Closed
apmckinlay opened this issue Mar 4, 2020 · 19 comments
Labels: FrozenDueToAge, NeedsInvestigation (Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.)

@apmckinlay

What version of Go are you using (go version)?

$ go version
go1.14 windows/amd64

Also happens on darwin

Works with 1.13.8

Does this issue reproduce with the latest release?

Yes, and also with tip as of Mar 4, 2020

What operating system and processor architecture are you using (go env)?

Happens on Windows and Mac OS X (darwin)
I haven't tried Linux

go env Output
This is showing 1.13.8 since it's my "main" installation. I am testing with go1.14 and gotip.
$ go env
set GO111MODULE=
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\Andrew\AppData\Local\go-build
set GOENV=C:\Users\Andrew\AppData\Roaming\go\env
set GOEXE=.exe
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GONOPROXY=
set GONOSUMDB=
set GOOS=windows
set GOPATH=C:\Users\Andrew\go
set GOPRIVATE=
set GOPROXY=https://proxy.golang.org,direct
set GOROOT=c:\go
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLDIR=c:\go\pkg\tool\windows_amd64
set GCCGO=gccgo
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=C:\Dropbox\gsuneido\go.mod
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0 -fdebug-prefix-map=C:\Users\Andrew\AppData\Local\Temp\go-build176488038=/tmp/go-build -gno-record-gcc-switches
GOROOT/bin/go version: go version go1.13.8 windows/amd64
GOROOT/bin/go tool compile -V: compile version go1.13.8
gdb --version: GNU gdb (GDB) 8.1

What did you do?

I built my program with 1.14 and now it crashes. Works fine with 1.13

There is no cgo or assembler involved.
(There is some cgo in the project, but I'm building without it.)
There is minor (read-only) use of unsafe, but it does not appear to be involved.
It is running a single goroutine, no concurrency (other than internal Go stuff).
It still crashes with GOGC=off.

It is consistent and repeatable - running the same thing crashes the same way every time.
Crashes the same way on darwin, so presumably it's a cross platform issue.
However it is "touchy". Making slight changes to what I'm running can eliminate or change the crash.

It appears to be related to panic/defer/recover.
Possibly something to do with re-panicking with the result of recover.
The function doing the panic/defer/recover is recursive, if that makes any difference.
Possibly something to do with the defer changes in 1.14.
This code makes heavier than normal use of panic/defer/recover, which may be why I'm running into this and other people are not.
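
For readers following along, the pattern described here (a recursive function whose deferred handler recovers and then re-panics with the recovered value) can be sketched roughly as below. This is a minimal illustration only: `descend` and `unwindOrder` are hypothetical names, not code from gsuneido, and the sketch is not known to reproduce the crash.

```go
package main

import "fmt"

// descend is a hypothetical stand-in for a recursive interpreter loop.
// Each level defers a handler that recovers the panic, records its depth,
// and re-panics with the recovered value so the caller's handler sees it too.
func descend(depth int, trail *[]int) {
	defer func() {
		if r := recover(); r != nil {
			*trail = append(*trail, depth)
			if depth > 0 {
				panic(r) // re-panic with the result of recover
			}
			// At depth 0 the panic is swallowed and descend returns normally.
		}
	}()
	if depth == 3 {
		panic("boom") // raised in the same frame that defers the handler
	}
	descend(depth+1, trail)
}

// unwindOrder runs one panic/recover/re-panic cascade and reports
// the depths at which the handlers ran.
func unwindOrder() []int {
	var trail []int
	descend(0, &trail)
	return trail
}

func main() {
	fmt.Println(unwindOrder()) // first cascade
	fmt.Println(unwindOrder()) // second cascade, after a successful recover
}
```

Both calls print `[3 2 1 0]`: the handlers run from the innermost frame outwards, and only the outermost one stops the panic.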

Unfortunately, it's a large complex system and so far I have not been able to come up with a small Go example that reproduces it. (I could provide the necessary files and configuration if desired.)

I searched the issues but couldn't find anything that looked related.

I assume that nothing I do in normal single-threaded Go code should cause this?

I would welcome any suggestions on how to debug this issue.
e.g. Is there any way to control the compilation of defer handling?

What did you expect to see?

no crash

What did you see instead?

crash

example crash output
runtime: unexpected return pc for github.com/apmckinlay/gsuneido/runtime.(*Thread).interp
called from 0xc000011610
stack: frame={sp:0xc000127208, fp:0xc0001275d0} stack=[0xc000120000,0xc000128000)
000000c000127108:  0000000000618e00  000000000064bec8
000000c000127118:  000000c00017ba40  000000c0001271a0
000000c000127128:  00000000004f0504   000000c000198000
000000c000127138:  00000000000003ff  00000000006a93a0
000000c000127148:  000000c000169a80  0000000000000000
000000c000127158:  000000c0001271f8  000000c0001c8000
000000c000127168:  0000000000000000  000000c00019c020
000000c000127178:  0000000000000000  00000000000003ff
000000c000127188:  0000000000000000  0000000000000000
000000c000127198:  0000000000000000  000000c000127240
000000c0001271a8:  0000000000507414   000000c000198000
000000c0001271b8:  000000c0001c8000  0000000000000000
000000c0001271c8:  0000000000000000  00000000000003ff
000000c0001271d8:  00000000000003ff  000000c000188500
000000c0001271e8:  000000c00019c020  0000000000894b40
000000c0001271f8:  000000c0001275c0  00000000004f5e5a 
000000c000127208: <000000c000198000  00000000000003ff
000000c000127218:  000000c000127610  0000000000000000
000000c000127228:  000000c0001275e8  000000c00019c020
000000c000127238:  0000000000894b40  000000c000127608
000000c000127248:  00000000004f513c   000000c0001c8000
000000c000127258:  000000c000198000  0000000000000000
000000c000127268:  0000000000000000  000000000084e3c0
000000c000127278:  000000c000186b80  000000c0001272a8
000000c000127288:  00000000006a7f00  000000c00017e470
000000c000127298:  000000c000186b80  000000c000169ac0
000000c0001272a8:  000001003e6a7ea0  00000000005e9fe6 
000000c0001272b8:  000000c000127330  0000000000000000
000000c0001272c8:  0000000000000000  00000000006a95e0
000000c0001272d8:  000000c0001c80e0  000000c000127320
000000c0001272e8:  000000000044d4ce   00000000001273a0
000000c0001272f8:  000000c000127338  000000c000127320
000000c000127308:  00000000004eff88   000000000084d780
000000c000127318:  0000000000000012  000000c0001273f0
000000c000127328:  00000000005e9ad4   000000c0000114b1
000000c000127338:  000000000000000b  000000c0001c80e0
000000c000127348:  0000000000000000  000000c00013c180
000000c000127358:  0000000000000077  00000000006a93a0
000000c000127368:  000000c0001c80e0  0000000000000012
000000c000127378:  0000000000000000  00000000006aa060
000000c000127388:  000000000054f95b   0000000000000000
000000c000127398:  0000000000000005  0000000000000000
000000c0001273a8:  757465526b636f6c  0000000000006e72
000000c0001273b8:  000000000051d9b6   0000000000000040
000000c0001273c8:  0000000000000000  0000000000000000
000000c0001273d8:  0000000000669a00  0000000000000009
000000c0001273e8:  000000c000127508  0000000000000000
000000c0001273f8:  00000000004ef890   0000000000000001
000000c000127408:  0000000000000040  000000c0000114b1
000000c000127418:  000000000000000b  00000000006a95e0
000000c000127428:  0000000000010000  0000000000000000
000000c000127438:  0000000000000000  0000000000000000
000000c000127448:  0000000000000000  0000000000000000
000000c000127458:  0000000000000000  000000c0001c80e0
000000c000127468:  000000000040ccd8   000000c00017d540
000000c000127478:  000000c0001274a8  0000000000669910
000000c000127488:  000000c0001274b8  00000000004ef714 
000000c000127498:  000000c000198000  0000000000000069
000000c0001274a8:  00000000006a95e0  000000c0001c80e0
000000c0001274b8:  000000c000127530  0000000000547543 
000000c0001274c8:  000000c000169a80  0000000000010000
000000c0001274d8:  000000c00017e250  0000000000000000
000000c0001274e8:  000000c000198040  000000c000011500
000000c0001274f8:  000000000084e3c0  0000000000000000
000000c000127508:  0000000000010000  0000000000000000
000000c000127518:  0000000000000000  0000000000000000
000000c000127528:  0000000000000000  000000c000198040
000000c000127538:  000000c000011610  000000000051dae0 
000000c000127548:  000000c000127263  000000c000198000
000000c000127558:  0000000000000000  0000000000000000
000000c000127568:  0000000000000000  000000000051da70 
000000c000127578:  000000c000198040  000000c000011500
000000c000127588:  000000000051dae0   000000c0001272ab
000000c000127598:  000000c000198000  000000c000127610
000000c0001275a8:  000000c000198040  000000c000198000
000000c0001275b8:  000000000051da70   000000c000198040
000000c0001275c8: !000000c000011610 >0000000000000009
000000c0001275d8:  000000c000127630  000000c000127558
000000c0001275e8:  000000c000127658  000000c000198040
000000c0001275f8:  000000c000198000  00000000006698a0
000000c000127608:  000000c000127690  00000000004f06d6 
000000c000127618:  000000c000198000  000000c000127658
000000c000127628:  000000c000127650  0000000000000000
000000c000127638:  0000000000000000  0000000000550227 
000000c000127648:  0000000000000001  ffffffffffffffff
000000c000127658:  0000000000000000  00000000006a9be0
000000c000127668:  0000000000000002  0000000000000001
000000c000127678:  0000000000000000  0000000000000000
000000c000127688:  000000c00017b960  000000c000127710
000000c000127698:  00000000004f0504   000000c000198000
000000c0001276a8:  0000000000000400  0000000000411c54 
000000c0001276b8:  000000c0001276f0  0000000028df0dc3
000000c0001276c8:  5bfc77cc605e10fb
fatal error: unknown caller pc

runtime stack:
runtime.throw(0x65e504, 0x11)
C:/Users/Andrew/sdk/gotip/src/runtime/panic.go:1112 +0x79
runtime.gentraceback(0x4f5d02, 0xc000127208, 0x0, 0xc000056000, 0x0, 0x0, 0x7fffffff, 0xc5feb0, 0x0, 0x0, ...)
C:/Users/Andrew/sdk/gotip/src/runtime/traceback.go:273 +0x1a09
runtime.addOneOpenDeferFrame.func1()
C:/Users/Andrew/sdk/gotip/src/runtime/panic.go:719 +0x98
runtime.systemstack(0x0)
C:/Users/Andrew/sdk/gotip/src/runtime/asm_amd64.s:370 +0x6b
runtime.mstart()
C:/Users/Andrew/sdk/gotip/src/runtime/proc.go:1042

goroutine 1 [running]:
runtime.systemstack_switch()
C:/Users/Andrew/sdk/gotip/src/runtime/asm_amd64.s:330 fp=0xc0001269d0 sp=0xc0001269c8 pc=0x4603b0
runtime.addOneOpenDeferFrame(0xc000056000, 0x0, 0x0)
C:/Users/Andrew/sdk/gotip/src/runtime/panic.go:718 +0x82 fp=0xc000126a10 sp=0xc0001269d0 pc=0x435132
panic(0x64cbc0, 0xc000004780)
C:/Users/Andrew/sdk/gotip/src/runtime/panic.go:969 +0x344 fp=0xc000126ab8 sp=0xc000126a10 pc=0x435a24
github.com/apmckinlay/gsuneido/runtime.(*Thread).interp.func6(0xc000198000, 0xc000198080, 0xc0001270e8, 0xc000126fe8, 0xc0001270c0)
C:/Dropbox/gsuneido/runtime/interp.go:140 +0x33a fp=0xc000126b28 sp=0xc000126ab8 pc=0x51df8a
runtime.call64(0x0, 0x6698a0, 0xc00012c898, 0x2800000028)
C:/Users/Andrew/sdk/gotip/src/runtime/asm_amd64.s:540 +0x42 fp=0xc000126b78 sp=0xc000126b28 pc=0x460802
runtime.reflectcallSave(0xc000126c98, 0x6698a0, 0xc00012c898, 0xc000000028)
C:/Users/Andrew/sdk/gotip/src/runtime/panic.go:879 +0x5f fp=0xc000126ba8 sp=0xc000126b78 pc=0x43562f
runtime.runOpenDeferFrame(0xc000056000, 0xc00012c850, 0xc000126ce0)
C:/Users/Andrew/sdk/gotip/src/runtime/panic.go:853 +0x2c0 fp=0xc000126c38 sp=0xc000126ba8 pc=0x4354f0
panic(0x64cbc0, 0xc000004780)
C:/Users/Andrew/sdk/gotip/src/runtime/panic.go:967 +0x16b fp=0xc000126ce0 sp=0xc000126c38 pc=0x43584b
github.com/apmckinlay/gsuneido/runtime.(*Thread).interp(0xc000198000, 0xc0001270e8, 0xc0001270e0, 0x0, 0x0)
C:/Dropbox/gsuneido/runtime/interp.go:438 +0x5c64 fp=0xc0001270a8 sp=0xc000126ce0 pc=0x4f65e4
github.com/apmckinlay/gsuneido/runtime.(*Thread).run(0xc000198000, 0x3ff, 0x6a93a0)
C:/Dropbox/gsuneido/runtime/interp.go:67 +0x156 fp=0xc000127130 sp=0xc0001270a8 pc=0x4f06d6
github.com/apmckinlay/gsuneido/runtime.(*Thread).Start(0xc000198000, 0xc0001c8000, 0x0, 0x0, 0x3ff, 0x3ff)
C:/Dropbox/gsuneido/runtime/interp.go:31 +0x204 fp=0xc0001271b0 sp=0xc000127130 pc=0x4f0504
github.com/apmckinlay/gsuneido/runtime.(*SuFunc).Call(0xc0001c8000, 0xc000198000, 0x0, 0x0, 0x84e3c0, 0xc000186b80, 0xc0001272a8)
C:/Dropbox/gsuneido/runtime/sufunc.go:56 +0x2d4 fp=0xc000127250 sp=0xc0001271b0 pc=0x507414
github.com/apmckinlay/gsuneido/runtime.(*Thread).interp(0xc000198000, 0xc000127658, 0xc000127650, 0x0, 0x0)
C:/Dropbox/gsuneido/runtime/interp.go:450 +0x47bc fp=0xc000127618 sp=0xc000127250 pc=0x4f513c
github.com/apmckinlay/gsuneido/runtime.(*Thread).run(0xc000198000, 0x400, 0x411c54)
C:/Dropbox/gsuneido/runtime/interp.go:67 +0x156 fp=0xc0001276a0 sp=0xc000127618 pc=0x4f06d6
github.com/apmckinlay/gsuneido/runtime.(*Thread).Start(0xc000198000, 0xc0001c80e0, 0x0, 0x0, 0x400, 0x400)
C:/Dropbox/gsuneido/runtime/interp.go:31 +0x204 fp=0xc000127720 sp=0xc0001276a0 pc=0x4f0504
github.com/apmckinlay/gsuneido/runtime.(*SuFunc).Call(0xc0001c80e0, 0xc000198000, 0x0, 0x0, 0x84e3c0, 0x6a9be0, 0x89f41b)
C:/Dropbox/gsuneido/runtime/sufunc.go:56 +0x2d4 fp=0xc0001277c0 sp=0xc000127720 pc=0x507414
github.com/apmckinlay/gsuneido/runtime.(*Thread).interp(0xc000198000, 0xc000127bc8, 0xc000127bc0, 0x0, 0x0)
C:/Dropbox/gsuneido/runtime/interp.go:450 +0x47bc fp=0xc000127b88 sp=0xc0001277c0 pc=0x4f513c
github.com/apmckinlay/gsuneido/runtime.(*Thread).run(0xc000198000, 0xc000127c80, 0x550835)
C:/Dropbox/gsuneido/runtime/interp.go:67 +0x156 fp=0xc000127c10 sp=0xc000127b88 pc=0x4f06d6
github.com/apmckinlay/gsuneido/runtime.(*Thread).Start(0xc000198000, 0xc0001c81c0, 0x0, 0x0, 0xc0001c81c0, 0x0)
C:/Dropbox/gsuneido/runtime/interp.go:31 +0x204 fp=0xc000127c90 sp=0xc000127c10 pc=0x4f0504
main.eval(0xc00000bc00, 0x17)
C:/Dropbox/gsuneido/gsuneido.go:200 +0x1ff fp=0xc000127d40 sp=0xc000127c90 pc=0x5e972f
main.repl()
C:/Dropbox/gsuneido/gsuneido.go:150 +0x34e fp=0xc000127eb0 sp=0xc000127d40 pc=0x5e926e
main.main()
C:/Dropbox/gsuneido/gsuneido.go:81 +0x23f fp=0xc000127f88 sp=0xc000127eb0 pc=0x5e89cf
runtime.main()
C:/Users/Andrew/sdk/gotip/src/runtime/proc.go:204 +0x212 fp=0xc000127fe0 sp=0xc000127f88 pc=0x438512
runtime.goexit()
C:/Users/Andrew/sdk/gotip/src/runtime/asm_amd64.s:1373 +0x1 fp=0xc000127fe8 sp=0xc000127fe0 pc=0x462531

goroutine 6 [syscall]:
os/signal.signal_recv(0x0)
C:/Users/Andrew/sdk/gotip/src/runtime/sigqueue.go:147 +0xa3
os/signal.loop()
C:/Users/Andrew/sdk/gotip/src/os/signal/signal_unix.go:23 +0x29
created by os/signal.Notify.func1
C:/Users/Andrew/sdk/gotip/src/os/signal/signal.go:127 +0x4b

@randall77
Contributor

You can try -gcflags=-N which will turn off the new defer optimizations. It turns off a lot of other stuff also, so it's not a perfect test. But if the problem remains then it wasn't the new defer stuff.

@danscales

@apmckinlay
Author

It still crashes the same way.
So presumably not the new defer stuff, but something else that changed in 1.14 ?
Thanks for the suggestion.

dmitshur added the NeedsInvestigation label Mar 4, 2020
dmitshur added this to the Go1.15 milestone Mar 4, 2020
dmitshur changed the title from "crash on 1.14 with unexpected return pc, fatal error: unknown caller pc" to "runtime: crash on 1.14 with unexpected return pc, fatal error: unknown caller pc" Mar 4, 2020
@dmitshur
Contributor

dmitshur commented Mar 4, 2020

Also /cc @aclements @ianlancetaylor from owners.

@danscales
Contributor

@apmckinlay I'm happy to work on debugging this if you can create a code example that you are able to share. Even though it is not specifically related to the defer changes, I have also recently worked on the panic/recover implementation. One change relating to panic/recover that also went into Go 1.14 is https://go-review.googlesource.com/c/go/+/200081 . It should only affect behavior if you did a panic/recover after initiating a Goexit(). I'm assuming that was not your scenario, but the change could have had some other unintended side effect.

@apmckinlay
Author

apmckinlay commented Mar 4, 2020

@danscales Thank you very much! It's open source so no problem sharing.
I will put together instructions/files to recreate the problem.

There shouldn't be an exit involved from my code.

It appears to be a two step process - a first panic/defer/recover works properly, but then a second one crashes. Running the second one by itself is fine. It's as if the first one leaves something behind that affects the second one. The second can be quite simple, but the first has to be more complex to cause the later crash.

Another point that may or may not be relevant is that the panic can be from the same frame as the defer/recover/re-panic (perhaps not typical?)

Call stack depth also appears to be relevant. Possibly stack movement is a factor?
Is there any way to trace stack movement to see when it occurs?
Or a way to set a larger initial default stack size to avoid movement?

@apmckinlay
Author

apmckinlay commented Mar 4, 2020

@danscales Here is a set of files that should allow you to recreate the crash.
Instructions in README.txt

crash37664.zip

danscales self-assigned this Mar 5, 2020
@danscales
Contributor

@apmckinlay Thanks for setting up the repro case! It actually reproduces on Linux, though I had to fix the sys_nix.go file (syscall.Sysctl no longer exists on Linux).

The bug is actually related to the new open-coded defers and their interaction with panic/recover. Confusingly, 'go build' doesn't recompile all the sub-packages with the -N option unless you do:

go build -gcflags="all=-N"

The bug goes away if you do that, since the problem is a defer in interp.go. As a more targeted work-around, you can put a 1-iteration for loop around the defer statement in interp() (and no -N option needed) and the problem goes away:

for i := 0; i < 1; i++ {
    defer func() {
        // this is an optimization ...
    }()
}

The bug does require several sequences of panics and recovers with a further re-panic, before doing the final recover.
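
As a rough, self-contained illustration of the loop workaround, the sketch below wraps a recovering defer in a one-iteration loop. `runStep` is a hypothetical function, not the real `interp()`; the only point is that a defer placed inside a loop is not open-coded by the Go 1.14 compiler and so goes through the regular defer chain.

```go
package main

import "fmt"

// runStep is a hypothetical function standing in for interp(); it converts
// a panic into an error via a deferred recover.
func runStep(fail bool) (err error) {
	// The one-iteration loop is the workaround from the comment above:
	// a defer inside a loop is not open-coded by the Go 1.14 compiler,
	// so it is handled through the regular defer chain instead.
	for i := 0; i < 1; i++ {
		defer func() {
			if r := recover(); r != nil {
				err = fmt.Errorf("recovered: %v", r)
			}
		}()
	}
	if fail {
		panic("boom")
	}
	return nil
}

func main() {
	fmt.Println(runStep(false)) // <nil>
	fmt.Println(runStep(true))  // recovered: boom
}
```

The deferred function still runs when `runStep` returns, not when the loop ends, so behavior is unchanged apart from how the compiler implements the defer.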

I think that I have the actual fix in the Go runtime, which is fairly simple, but I'm still working to verify it is the full fix, do more testing, etc.

@apmckinlay
Author

@danscales That's great, thanks! I'll add the work around so I can move to 1.14

Do you think the fix will get cherrypicked to 1.14.1 ?

The build issue crossed my mind, but I'm pretty sure I did go build -a -v and saw the package listed so I thought that covered it. Maybe cleaning the build cache would have been safer?

PS. Just listened to the GoTime podcast you were on. Including, coincidentally, the challenge of testing these kinds of changes.

@danscales
Contributor

Currently, it seems like it could be a good option for cherrypicking for 1.14.1, but will have to confirm the fix and check with release folks, etc. I'll update as I learn more.

@dmitshur
Contributor

dmitshur commented Mar 5, 2020

@danscales Once you know more, if you believe this meets the criteria for backporting documented at https://golang.org/wiki/MinorReleases, feel free to follow the process described there to open backport issues. Thanks.

@apmckinlay
Author

I added the workaround to the defers that re-panic.
But that didn't solve all the problems.
I added the workaround to a few more defers that call recover.
But I still have some lingering issues.
Do I need the workaround on every defer with a recover? Or every defer?
But would that mean I'd still have issues with defers in the Go runtime?
Or should I just stick to 1.13 until the fix comes out? (hopefully in 1.14.1 rather than 1.15)

Note: compiling with -gcflags="all=-N" does eliminate the problems

@danscales
Contributor

I would have expected that you would only need the workaround for every defer that does a recover and then possibly re-panics, or that is likely to be on the stack when such panic-recover-re-panics are happening. I don't expect you would have any trouble with the defer in the Go runtime (really just in runtime.main, I think, at the very start of the program). However, it may be a little hard to catch all such defers.

It would be helpful if you happen to try a bit more to apply the workaround to the various defers that seem to be related to the panic-recover-repanic loop, and let me know through the bug if you are successful (and how many defers you tried fixing, whether successful or not).

I have the fix and a sample test that I'm about to put out for review.

@gopherbot

Change https://golang.org/cl/222420 mentions this issue: runtime: fix problem with repeated panic/recover/re-panics and open-coded defers

@apmckinlay
Author

@danscales I made another pass over the code and added the workaround to all the defers with recover that I thought might end up on the call stack. (now 14 places) It's tricky because it's a language implementation so the behavior is very dynamic and it's hard to statically determine what might be nested. With the additional workarounds, everything seems to be working fine, although I haven't done extensive testing. I will roll it out to some beta users and see what happens.

@danscales
Contributor

@gopherbot please consider this for backport to 1.14, it's a regression (and the fix is quite simple).

@gopherbot

Backport issue(s) opened: #37782 (for 1.14).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases.

@gopherbot

Change https://golang.org/cl/222818 mentions this issue: [release-branch.go1.14] runtime: fix problem with repeated panic/recover/re-panics and open-coded defers

gopherbot pushed a commit that referenced this issue Mar 11, 2020
…ver/re-panics and open-coded defers

In the open-code defer implementation, we add defer struct entries to the defer
chain on-the-fly at panic time to represent stack frames that contain open-coded
defers. This allows us to process non-open-coded and open-coded defers in the
correct order. Also, we need somewhere to be able to store the 'started' state of
open-coded defers. However, if a recover succeeds, defers will now be processed
inline again (unless another panic happens). Any defer entry representing a frame
with open-coded defers will become stale once we run the corresponding defers
inline and exit the associated stack frame. So, we need to remove all entries for
open-coded defers at recover time.

The current code was only removing the top-most open-coded defer from the defer
chain during recovery. However, with recursive functions that do repeated
panic-recover-repanic, multiple stale entries can accumulate on the chain. So, we
just adjust the loop to process the entire chain. Since this is at panic/recover
case, it is fine to scan through the entire chain (which should usually have few
elements in it, since most defers are open-coded).

The added test fails with a SEGV without the fix, because it tries to run a stale
open-code defer entry (and the stack has changed).

Updates #37664.
Fixes #37782.

Change-Id: I8e3da5d610b5e607411451b66881dea887f7484d
Reviewed-on: https://go-review.googlesource.com/c/go/+/222420
Run-TryBot: Dan Scales <danscales@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
(cherry picked from commit fae87a2)
Reviewed-on: https://go-review.googlesource.com/c/go/+/222818
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
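
The idea in the commit message above can be pictured with a toy model. The sketch below is not runtime code: `deferRecord`, `build`, `dropFirstOpenCoded` and `dropAllOpenCoded` are hypothetical names, and the model ignores details such as the 'started' state. It only contrasts unlinking the top-most open-coded entry with scanning the entire chain.

```go
package main

import "fmt"

// deferRecord is a hypothetical, simplified stand-in for an entry on a
// goroutine's defer chain; it is not the runtime's real data structure.
type deferRecord struct {
	openCoded bool
	next      *deferRecord
}

// build creates a chain whose entries are open-coded or not, head first.
func build(openCoded ...bool) *deferRecord {
	var head *deferRecord
	for i := len(openCoded) - 1; i >= 0; i-- {
		head = &deferRecord{openCoded: openCoded[i], next: head}
	}
	return head
}

// dropFirstOpenCoded models the pre-fix behavior: only the top-most
// open-coded entry is unlinked, so deeper stale entries survive.
func dropFirstOpenCoded(head *deferRecord) *deferRecord {
	var prev *deferRecord
	for d := head; d != nil; d = d.next {
		if d.openCoded {
			if prev == nil {
				return d.next
			}
			prev.next = d.next
			return head
		}
		prev = d
	}
	return head
}

// dropAllOpenCoded models the fix: the whole chain is scanned and every
// open-coded entry is unlinked.
func dropAllOpenCoded(head *deferRecord) *deferRecord {
	var prev *deferRecord
	for d := head; d != nil; d = d.next {
		if d.openCoded {
			if prev == nil {
				head = d.next
			} else {
				prev.next = d.next
			}
			continue
		}
		prev = d
	}
	return head
}

// countOpenCoded reports how many open-coded entries remain on the chain.
func countOpenCoded(head *deferRecord) int {
	n := 0
	for d := head; d != nil; d = d.next {
		if d.openCoded {
			n++
		}
	}
	return n
}

func main() {
	fmt.Println(countOpenCoded(dropFirstOpenCoded(build(true, false, true, true)))) // 2
	fmt.Println(countOpenCoded(dropAllOpenCoded(build(true, false, true, true))))   // 0
}
```

In this model the two entries left behind by `dropFirstOpenCoded` play the role of the stale entries that accumulate under repeated panic-recover-repanic in recursive functions.
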
@apmckinlay
Author

@danscales I don't know if it's related, and I currently have no way to reproduce it, but a user had a crash: "fatal error: found bad pointer in Go heap". The program uses unsafe and cgo, so it's entirely possible it's unrelated. But one of the goroutine stack traces looked similar, with a panic inside a panic, and open defers. The object with the bad pointer is at 0xc0000e6548 and runOpenDeferFrame has an argument of 0xc0000e6500.

This is with Go 1.15.6 amd64 windows

link to the entire crash output

runtime.readvarintUnsafe(0xaf48d6, 0xc0007bbdb8, 0x8)
	c:/go/src/runtime/panic.go:793 +0xc5 fp=0xc0007bac70 sp=0xc0007bac68 pc=0x468cc5
runtime.runOpenDeferFrame(0xc000936300, 0xc0000e6500, 0x0)
	c:/go/src/runtime/panic.go:845 +0x1e5 fp=0xc0007bad00 sp=0xc0007bac70 pc=0x468ec5
panic(0xaabd80, 0xc0001103e0)
	c:/go/src/runtime/panic.go:969 +0x1c7 fp=0xc0007badc8 sp=0xc0007bad00 pc=0x469387
github.com/apmckinlay/gsuneido/runtime.(*Thread).interp.func6(0xc0007bb808, 0xc0007bb900, 0xc00175c1f8, 0xc00175c000, 0xc0007bb818, 0xc0007bb8d8)
	v:/gsuneido/runtime/interp.go:146 +0x31b fp=0xc0007bae38 sp=0xc0007badc8 pc=0x5d2ddb
runtime.call64(0x0, 0xae2090, 0xc0002a5c48, 0x3000000030)
	c:/go/src/runtime/asm_amd64.s:541 +0x45 fp=0xc0007bae88 sp=0xc0007bae38 pc=0x499de5
runtime.reflectcallSave(0xc0007bafc8, 0xae2090, 0xc0002a5c48, 0xc000000030)
	c:/go/src/runtime/panic.go:881 +0x5f fp=0xc0007baeb8 sp=0xc0007bae88 pc=0x4690ff
runtime.runOpenDeferFrame(0xc000936300, 0xc0002a5c00, 0x0)
	c:/go/src/runtime/panic.go:855 +0x2d9 fp=0xc0007baf48 sp=0xc0007baeb8 pc=0x468fb9
panic(0xaabd80, 0xc0001103e0)
	c:/go/src/runtime/panic.go:969 +0x1c7 fp=0xc0007bb010 sp=0xc0007baf48 pc=0x469387
github.com/apmckinlay/gsuneido/runtime.(*Thread).interp(0xc00175c000, 0xc0007bb3e8, 0xc0007bb3e0, 0x0, 0x0)
	v:/gsuneido/runtime/interp.go:505 +0x67e6 fp=0xc0007bb3a8 sp=0xc0007bb010 pc=0x5a71e6

@danscales
Contributor

@apmckinlay As far as I can tell, the panic inside the panic is happening in user code. That is, we have a panic in interp, and then the deferred function interp.func6 (closure in interp) run by the open defer code is causing another panic (maybe because it is accessing something that is nil related to the cause of the original panic?). So, unless we have further information, it looks like your user's crash is unrelated to this bug. Also, the other "bad pointer in Go heap" panic is unrelated to defers, and is possibly because of a bad use of unsafe or cgo. It just means that the garbage collector ran into an invalid pointer (an address in the range of the Go heap, but doesn't refer to a valid object) while doing GC marking (scanning).

golang locked and limited conversation to collaborators Jan 12, 2022