runtime: "fatal error: missed stack barrier" failures on the darwin/arm64 android/arm64 builders #15138

Closed
eliasnaur opened this issue Apr 5, 2016 · 10 comments

@eliasnaur
Contributor

The iOS builders are not the most reliable, but I have seen quite a few GC-related builder failures that look like they could be real:

http://build.golang.org/log/eaa998d553346cee530c589e84f08eb0a2e524c0
http://build.golang.org/log/5bf02c8fea879278f4c0f9dc8d22a8caf8752f07
http://build.golang.org/log/405e3f41f9efe579885e11a4ecd4dc626017be38
http://build.golang.org/log/84cf52a8162f42452627ad54a72a04e48afb6f1d
http://build.golang.org/log/743957631e9b3af9376e63759780329aae4d0543
http://build.golang.org/log/d97088062bd9ffa500fe37c24c0d373a996ee0cb

(all darwin/arm64, I haven't seen a similar failure on darwin/arm)

The iOS builders had fallen into disarray for months until around March 25, so I don't think I can narrow this down to an offending CL, if any.

@eliasnaur
Contributor Author

I found similar failures on the more reliable android/arm64 builders:

http://build.golang.org/log/10e13869523708ea0968eb6fc2d8c0871a55825a
http://build.golang.org/log/102819b526620f5699bb1bbaba795271ada8f462
http://build.golang.org/log/767c37d013529f5d76b1c018f3f4c6710469aaee

I haven't seen anything on android/386, android/arm, or linux/arm64, which reminds me of physical page size issues like #9993 (according to #11886, the linux/arm64 builder is using 4k pages).

@eliasnaur eliasnaur changed the title Persistent GC related failures on the darwin/arm64 builder Persistent GC related failures on the {darwin,android}/arm64 builders Apr 6, 2016
@eliasnaur eliasnaur changed the title Persistent GC related failures on the {darwin,android}/arm64 builders Persistent GC related failures on the darwin/arm64 android/arm64 builders Apr 6, 2016
@eliasnaur eliasnaur changed the title Persistent GC related failures on the darwin/arm64 android/arm64 builders "fatal error: missed stack barrier" failures on the darwin/arm64 android/arm64 builders Apr 6, 2016
@eliasnaur
Contributor Author

cc @aclements @randall77

@eliasnaur eliasnaur changed the title "fatal error: missed stack barrier" failures on the darwin/arm64 android/arm64 builders runtime: "fatal error: missed stack barrier" failures on the darwin/arm64 android/arm64 builders Apr 6, 2016
@aclements
Member

The android/arm64 ones are all in TestStackBarrierProfiling, which uses a funny debug mode that may simply be broken on arm64 (it's definitely broken on ppc64).

The darwin/arm64 ones are more concerning. It looks like the stack barrier list is just wrong (e.g., [@@@ ==> *0xd=0x0] in the last one). That's a bad pointer and a bad value. The "missed stack barrier" panic is almost certainly just a symptom of the earlier nil-pointer dereference in gcRemoveStackBarrier, but I'm not sure where the bad barrier pointers came from in the first place.

Is this builder a multicore ARM64 or single core? If it's multicore, this could be a weak memory order issue.

@aclements aclements added this to the Go1.7 milestone Apr 6, 2016
@eliasnaur
Contributor Author

I don't know the exact hardware on the builder, but the builder name (darwin-arm64-a7ios) suggests it is running on an Apple A7 processor. A7 has 2 cores according to https://en.wikipedia.org/wiki/Apple_A7.

@mwhudson
Contributor

mwhudson commented Apr 6, 2016

I'm not aware of any single core arm64 implementations fwiw. There probably are some, but they're definitely not the common case.

@crawshaw
Member

TestStackBarrierProfiling just failed on android/arm with missed stack barrier:

https://build.golang.org/log/032ea3f75a48e9760600042046066e24a5e3234e

@eliasnaur
Contributor Author

@gopherbot

CL https://golang.org/cl/23134 mentions this issue.

gopherbot pushed a commit that referenced this issue May 16, 2016
This should help with debugging failures.

For #15138 and #15477.

Change-Id: I77db2b6375d8b4403d3edf5527899d076291e02c
Reviewed-on: https://go-review.googlesource.com/23134
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
@gopherbot

CL https://golang.org/cl/23291 mentions this issue.

@eliasnaur
Contributor Author

Here's a recent one from an android/arm64 builder:

https://build.golang.org/log/e7ef66722bdccf4446c52fbfbb8783883662c2b4

You mentioned that TestStackBarrierProfiling on android/arm64 might be unreliable for reasons unrelated to this issue. Should I open a new issue, and perhaps skip TestStackBarrierProfiling on android/arm64 (and possibly other GOOS/GOARCH combinations) in the meantime?
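
A platform-specific skip of that kind could look roughly like the sketch below. The `knownFlaky` table and the `shouldSkip` helper are hypothetical names made up for illustration, and the single entry reflects only the combination discussed in this thread; the actual runtime test may gate on platforms differently.

```go
package main

import (
	"fmt"
	"runtime"
)

// knownFlaky is a hypothetical table of GOOS/GOARCH pairs to skip.
// The entry is an assumption based on this thread, not a list from the Go tree.
var knownFlaky = map[string]bool{
	"android/arm64": true,
}

// shouldSkip reports whether the given platform appears in knownFlaky.
func shouldSkip(goos, goarch string) bool {
	return knownFlaky[goos+"/"+goarch]
}

func main() {
	// In a real test, the check would sit at the top of the test function:
	//   if shouldSkip(runtime.GOOS, runtime.GOARCH) {
	//       t.Skip("flaky on this platform; see the tracking issue")
	//   }
	fmt.Printf("%s/%s skip=%v\n", runtime.GOOS, runtime.GOARCH,
		shouldSkip(runtime.GOOS, runtime.GOARCH))
}
```

Keeping the list in one table makes it easy to add or remove GOOS/GOARCH combinations as builders are fixed.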

@golang golang locked and limited conversation to collaborators Jun 7, 2017