runtime: "morestack on g0" in x/perf/storage/app on windows/arm64 #47557
This is a release-blocker via #11811.
(However, it looks like a pretty severe runtime bug to me.)
cc @mknyszek
This is delightfully reproducible and fails very reliably.
cc @aclements
I'll take a look.
After a few hours of holding The following runs fine:
And the following fails:
CC @thanm maybe?
Oh, interesting! By coincidence, that difference between
I would say that you could add the
Interesting problem. It looks like one of these tests is failing: https://go.googlesource.com/go/+/24e798e2876f05d628f1e9a32ce8c7f4a3ed3268/src/cmd/link/internal/arm64/asm.go#610 meaning that the relocation won't reach, but we can't find the linker-introduced label symbol. Why it is happening with only DWARF relocations is a mystery though. I think this would be better off as another bug. Do you want to file it or should I?
I am kind of curious about how you are going to debug the test once it's built properly with DWARF. Delve doesn't support windows+arm64, so I assume gdb... does the builder actually have a gdb that works?
@thanm 🤦 yeah, you're right. I don't think it has gdb. It might have a Windows debugger. I guess it's just down to print debugging, anyway. I'll file the bug.
OK actually, I know why we're getting a Something causes a signal to land on a thread not created by Go the first time (or so the runtime thinks). This calls into I'm not yet sure what causes the original signal, though.
Coincidentally, I have a CL that fixes the recursive
Change https://golang.org/cl/321789 mentions this issue:
That's very strange. I've confirmed that binaries built with https://golang.org/cl/321789 on windows/arm64 do actually have the right code in
Furthermore, the failure appears before any tests actually get executed. Having
You might try working around the DWARF problem with go test -ldflags=-w -c
Thanks @thanm! That worked. OK, it's definitely not the compiler crashing, it's the binary. But before any tests execute, I'm afraid.
Looks like it's failing very early in runtime initialization. This explains the failure; there's no I've narrowed down the failure to this loop on the first module data encountered in moduledataverify.
I've further confirmed that on the second iteration of that loop, this check passes, so there's already something wrong. However, then the runtime crashes on the following line, specifically, the This suggests to me that something is broken about the binary. It's worth noting that this is a cgo binary; all the tests in this package that produce the failing binary are build-tagged with cgo. I have a copy of the bad binary and also steps to reproduce; this isn't my area of expertise so any help would be appreciated.
I have a similar issue, though I'm not sure if that is the exact same problem. I convert a go-library with CGO into a DLL for windows ARM64. That crashes during load/initialization with an "AccessViolation during read" followed by plenty of "AccessViolation during write" until the program quits with a "StackOverflow". The root-error seems to be around "morestack", too, and the runtime seems to try to raise a badsignal. So it kind of seems to fit. I could provide two dlls - one is working and the other is not. If one adds those to a UWP-app and tries to PInvoke into "uplink_internal_UniverseIsEmpty" on e.g. a Hololens 2, it crashes with the above described error-chain.
Friendly ping on this issue as it's currently marked as a release-blocker. Also CC @bufflig in case you're able to take a look.
Help from someone familiar with the compiler and/or linker would probably be best. Some parts of the binary being generated from these tests appear to be very broken.
I'll take a look and see if I can understand anything...
Change https://golang.org/cl/360895 mentions this issue: |
@gopherbot, please backport to Go 1.17. This failure mode is still occurring consistently on the go1.17 builder.
Backport issue(s) opened: #49479 (for 1.17). Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases.
$ greplogs --dashboard -l -md -e '(?ms)morestack on g0.*FAIL\s+golang\.org/x/perf/storage/app'
2021-02-20T03:31:36-40a54f1/6e73886/windows-arm64-10
2021-02-20T03:31:36-40a54f1/8a7ee4c/windows-arm64-10
2021-02-20T03:31:36-40a54f1/b8ca6e5/windows-arm64-10
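For anyone reading the greplogs pattern above: `(?ms)` turns on multi-line mode and lets `.` match newlines, so the expression matches a log in which "morestack on g0" appears anywhere before the `FAIL` line for this package. Below is a minimal Go sketch of that matching using the standard `regexp` package. The log excerpts and the `matchesFailure` helper are invented for illustration and are not taken from the builder logs or from greplogs itself.

```go
package main

import (
	"fmt"
	"regexp"
)

// failurePattern is the expression passed to greplogs with -e.
// (?ms): m = multi-line anchors, s = '.' also matches '\n'.
var failurePattern = regexp.MustCompile(`(?ms)morestack on g0.*FAIL\s+golang\.org/x/perf/storage/app`)

// matchesFailure is a hypothetical helper: it reports whether a build
// log contains the failure signature tracked in this issue.
func matchesFailure(log string) bool {
	return failurePattern.MatchString(log)
}

func main() {
	// Hypothetical log excerpts, made up for this example.
	failing := "fatal: morestack on g0\nexit status 2\nFAIL\tgolang.org/x/perf/storage/app\t0.5s\n"
	otherPkg := "fatal: morestack on g0\nFAIL\tgolang.org/x/perf/storage\t0.5s\n"

	fmt.Println(matchesFailure(failing))  // crash text followed by FAIL for storage/app
	fmt.Println(matchesFailure(otherPkg)) // FAIL line is for a different package
}
```

Without the `s` flag, `.*` would stop at the first newline and the pattern would only match if the crash message and the `FAIL` line were on the same line of the log.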
CC @prattmic @cherrymui @ianlancetaylor