Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: frame pointer check fails on arm64 for C-to-Go calls #59401

Open
nsrip-dd opened this issue Apr 3, 2023 · 4 comments
Open

runtime: frame pointer check fails on arm64 for C-to-Go calls #59401

nsrip-dd opened this issue Apr 3, 2023 · 4 comments
Assignees
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@nsrip-dd
Copy link
Contributor

nsrip-dd commented Apr 3, 2023

What version of Go are you using (go version)?

$ go version
go version devel go1.21-8edcdddb23 Fri Mar 31 23:25:07 2023 +0000 darwin/arm64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="arm64"
GOBIN=""
GOCACHE="/Users/nick.ripley/Library/Caches/go-build"
GOENV="/Users/nick.ripley/Library/Application Support/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/nick.ripley/go/pkg/mod"
GONOPROXY="--obfuscated--"
GONOSUMDB="--obfuscated--"
GOOS="darwin"
GOPATH="/Users/nick.ripley/go"
GOPRIVATE="--obfuscated--"
GOPROXY="--obfuscated--,direct"
GOROOT="/Users/nick.ripley/repos/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/Users/nick.ripley/repos/go/pkg/tool/darwin_arm64"
GOVCS=""
GOVERSION="devel go1.21-8edcdddb23 Fri Mar 31 23:25:07 2023 +0000"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/Users/nick.ripley/repos/go/src/go.mod"
GOWORK=""
CGO_CFLAGS="-O2 -g"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-O2 -g"
CGO_FFLAGS="-O2 -g"
CGO_LDFLAGS="-O2 -g"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/f3/g91d13pd6kd3vdxts_gsgd1r0000gn/T/go-build4033147164=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

Ran the TestCgoTracebackGoroutineProfile test with debugCheckBP enabled on an arm64 machine.

What did you expect to see?

The test to pass.

What did you see instead?

The test fails:

runtime: found invalid frame pointer
bp=0x170ee6da8 min=0x140001c4000 max=0x140001c4800
fatal error: bad frame pointer

runtime stack:
runtime.throw({0x10104f25d?, 0x10104c977?})
	/Users/nick.ripley/repos/go/src/runtime/panic.go:1076 +0x40 fp=0x170ee67b0 sp=0x170ee6780 pc=0x100f71520
runtime.adjustframe(0x170ee6a80, 0x170ee6960)
	/Users/nick.ripley/repos/go/src/runtime/stack.go:714 +0x678 fp=0x170ee68d0 sp=0x170ee67b0 pc=0x100f8b7f8
runtime.copystack(0x14000184820, 0x1000)
	/Users/nick.ripley/repos/go/src/runtime/stack.go:971 +0x470 fp=0x170ee6c00 sp=0x170ee68d0 pc=0x100f8c680
runtime.newstack()
	/Users/nick.ripley/repos/go/src/runtime/stack.go:1152 +0x35c fp=0x170ee6da0 sp=0x170ee6c00 pc=0x100f8cabc
runtime.morestack()
	/Users/nick.ripley/repos/go/src/runtime/asm_arm64.s:316 +0x70 fp=0x170ee6da0 sp=0x170ee6da0 pc=0x100fa4a00

goroutine 35 [copystack, locked to thread]:
runtime.deductSweepCredit(0x2000, 0x0)
	/Users/nick.ripley/repos/go/src/runtime/mgcsweep.go:866 +0x14c fp=0x1400011cb90 sp=0x1400011cb90 pc=0x100f613cc
runtime.(*mcentral).cacheSpan(0x1011bbd18)
	/Users/nick.ripley/repos/go/src/runtime/mcentral.go:84 +0x48 fp=0x1400011cc00 sp=0x1400011cb90 pc=0x100f51548
runtime.(*mcache).refill(0x101484a68, 0x0?)
	/Users/nick.ripley/repos/go/src/runtime/mcache.go:182 +0x1bc fp=0x1400011cc50 sp=0x1400011cc00 pc=0x100f50e9c
runtime.(*mcache).nextFree(0x101484a68, 0x5)
	/Users/nick.ripley/repos/go/src/runtime/malloc.go:855 +0x84 fp=0x1400011cca0 sp=0x1400011cc50 pc=0x100f48014
runtime.mallocgc(0x8, 0x0, 0x0)
	/Users/nick.ripley/repos/go/src/runtime/malloc.go:1017 +0x2ac fp=0x1400011cd10 sp=0x1400011cca0 pc=0x100f4841c
runtime.growslice(0x0, 0x100f7c238?, 0x140001c45b8?, 0x100f7c23c?, 0x140001c45e8?)
	/Users/nick.ripley/repos/go/src/runtime/slice.go:266 +0x3bc fp=0x1400011cd70 sp=0x1400011cd10 pc=0x100f89f1c
runtime.cgocallbackg1(0x101049b10, 0x0?, 0x1)
	/Users/nick.ripley/repos/go/src/runtime/cgocall.go:278 +0xd0 fp=0x1400011ce50 sp=0x1400011cd70 pc=0x100f3e8d0
runtime.cgocallbackg(0x140001c4728?, 0x170ee6ea7, 0x14000184820?)
	/Users/nick.ripley/repos/go/src/runtime/cgocall.go:242 +0x124 fp=0x1400011cee0 sp=0x1400011ce50 pc=0x100f3e774
runtime.cgocallbackg(0x101049b10, 0x170ee6ea7, 0x1)
	<autogenerated>:1 +0x1c fp=0x1400011cf10 sp=0x1400011cee0 pc=0x100fa8b9c
runtime.cgocallback(0x140001c4788, 0x101041120, 0x10104aa28)
	/Users/nick.ripley/repos/go/src/runtime/asm_arm64.s:1104 +0xb0 fp=0x1400011cf40 sp=0x1400011cf10 pc=0x100fa6e00
runtime.systemstack_switch()
	/Users/nick.ripley/repos/go/src/runtime/asm_arm64.s:200 +0x8 fp=0x1400011cf50 sp=0x1400011cf40 pc=0x100fa48c8
runtime.cgocall(0x10104aa28, 0x140001c47c8)
	/Users/nick.ripley/repos/go/src/runtime/cgocall.go:168 +0x70 fp=0x1400011cf90 sp=0x1400011cf50 pc=0x100f3e5f0
main._Cfunc_CallGoSleep()
	_cgo_gotypes.go:177 +0x30 fp=0x1400011cfc0 sp=0x1400011cf90 pc=0x101041120
main.GoroutineProfile.func1()
	/Users/nick.ripley/repos/go/src/runtime/testdata/testprogcgo/gprof.go:33 +0x24 fp=0x1400011cfd0 sp=0x1400011cfc0 pc=0x101049944
runtime.goexit()
	/Users/nick.ripley/repos/go/src/runtime/asm_arm64.s:1196 +0x4 fp=0x1400011cfd0 sp=0x1400011cfd0 pc=0x100fa6ef4
created by main.GoroutineProfile in goroutine 1
	/Users/nick.ripley/repos/go/src/runtime/testdata/testprogcgo/gprof.go:33 +0x40

With additional stack debug logging turned on, I see that the check fails when visiting the runtime.cgocallbackg frame:

Debug logging output
copystack gp=0x14000184820 [0x140001c4000 0x140001c4390 0x140001c4800] -> [0x1400011c000 0x1400011cb90 0x1400011d000]/4096
                adjust ptr 0x14000184888:0x140001c4388 -> 0x1400011cb88
                adjust ptr 0x1400011cb88:0x140001c43f8 -> 0x1400011cbf8
            adjusting runtime.deductSweepCredit frame=[0x1400011cb90,0x1400011cb90] pc=0x100f613cc continpc=0x100f613cc
            adjusting runtime.(*mcentral).cacheSpan frame=[0x1400011cb90,0x1400011cc00] pc=0x100f51548 continpc=0x100f51548
              locals 0/3 2 words 0x1010599e4
              saved bp
                adjust ptr 0x1400011cbf8:0x140001c4448 -> 0x1400011cc48
            adjusting runtime.(*mcache).refill frame=[0x1400011cc00,0x1400011cc50] pc=0x100f50e9c continpc=0x100f50e9c
              locals 1/2 1 words 0x10105932d
              saved bp
                adjust ptr 0x1400011cc48:0x140001c4498 -> 0x1400011cc98
            adjusting runtime.(*mcache).nextFree frame=[0x1400011cc50,0x1400011cca0] pc=0x100f48014 continpc=0x100f48014
              locals 1/2 1 words 0x10105932d
              saved bp
                adjust ptr 0x1400011cc98:0x140001c4508 -> 0x1400011cd08
            adjusting runtime.mallocgc frame=[0x1400011cca0,0x1400011cd10] pc=0x100f4841c continpc=0x100f4841c
              locals 2/7 3 words 0x10105b6be
              saved bp
                adjust ptr 0x1400011cd08:0x140001c4568 -> 0x1400011cd68
              args
            adjusting runtime.growslice frame=[0x1400011cd10,0x1400011cd70] pc=0x100f89f1c continpc=0x100f89f1c
              locals 1/5 1 words 0x10105a715
              saved bp
                adjust ptr 0x1400011cd68:0x140001c4648 -> 0x1400011ce48
              args
            adjusting runtime.cgocallbackg1 frame=[0x1400011cd70,0x1400011ce50] pc=0x100f3e8d0 continpc=0x100f3e8d0
              locals 1/4 13 words 0x10105df4e
              saved bp
                adjust ptr 0x1400011ce48:0x140001c46d8 -> 0x1400011ced8
              args
            adjusting runtime.cgocallbackg frame=[0x1400011ce50,0x1400011cee0] pc=0x100f3e774 continpc=0x100f3e774
              locals 3/6 4 words 0x10105b2f7
        adjust ptr 0x140001c4750 runtime.cgocallbackg
              saved bp

It appears that the runtime.cgocall assembly routine attempts to put a frame pointer on the Go stack when switching to it. But from this test failure, it seems like we're getting the wrong frame pointer? Possibly an address on the C stack (when the thread is created by C rather than by the runtime)?

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Apr 3, 2023
@mknyszek
Copy link
Contributor

mknyszek commented Apr 3, 2023

I believe there are known issues with frame pointers on arm64, and there are some issues open about it and progress being made. I'll dig those up in a bit.

EDIT: Whoops, I just saw who posted this. 😅

@mknyszek mknyszek added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Apr 3, 2023
@mknyszek mknyszek added this to the Backlog milestone Apr 3, 2023
@nsrip-dd
Copy link
Contributor Author

nsrip-dd commented Apr 3, 2023

Ah, I forgot to mention other relevant issues, thanks! There's at least #39524, and #58432 is also related. I opened this issue to document a specific case that's not working right now :)

Also cc @felixge @aarzilli

@nsrip-dd
Copy link
Contributor Author

runtime.cgocallbackg is indeed picking up whatever was in the frame pointer register from the C code. So the frame pointer check enabled with debugCheckBP fails when it checks runtime.cgocallbackg's frame, seeing that the frame pointer is pointing outside of the goroutine's stack.

I think there are two ways we could address this:

  • Have the frame pointer check ignore the runtime.cgocallbackg frame, assuming that it might inherit the frame pointer from the C code calling into Go.
  • Have runtime.cgocallback set the frame pointer explicitly before calling into runtime.cgocallbackg. Say, to 0, or to point to the previous frame on the Go stack if there is one?

I think it would be good to start with the first option. Right now we don't use frame pointer unwinding for execution tracing if there are C calls on the stack, specifically because of this kind of problem. We could follow that pattern for future uses of frame pointer unwinding internally (e.g. #16638). And in that case, we know that runtime.cgocallbackg's frame won't be visited by frame pointer unwinding, so there would be no value in checking it.

We could consider the second option in the future. It would probably be worth a little more discussion, though. For example, if we wanted to be able to use frame pointer unwinding even with C frames on the stack, we'd still want a way to take advantage of runtime.SetCgoTraceback handlers if they're present.

@aarzilli
Copy link
Contributor

Have you considered changing asmcgocall/cgocallback so they save on the stack, instead of BP, the value of BP - curg.stack.hi: a stacktracer can recognize that the value is special (because it's negative/it has the msb set) and adjust it by reading the stack bounds from the current goroutine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
Status: In Progress
Development

No branches or pull requests

4 participants