-
Notifications
You must be signed in to change notification settings - Fork 18k
runtime: don't crash in exception handler on windows/arm64 #50877
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
cc @cherrymui @mknyszek (my guess) |
This comment from @ianlancetaylor seems very relevant here: #12465 (comment)
But this comment contradicts what the sigtramp code is doing. If getg returns nil, sigtramp calls My workaround was jut to modify sigtramp to return 0 when getg returns nil (marler8997@f645ec0). On windows this tells the caller that the exception is not handled and to continue to the next handler, which may be the functionality we want here. It's also interesting to note the naming of this function Also note that I see arm32 does the same thing here, so if this fix is correct then it should probably go to arm32 as well. |
I don't know enough about Windows exceptions off-hand to know whether the same problem occurs. It wouldn't be surprising. The windows-amd64 |
Yes the same problem occurs on Windows. On windows rather than signals they call this "Vectored Exception Handling". It's a simple mechanism where the applications can register multiple callbacks which are then called by whatever mechanism Windows uses to capture interrupts and propagate them to userspace. The handler will continue to call each one or stop depending on the return value of the previous call. What's happening in our case is another library is using exception handling as normal control flow. When the exception occurs it calls go's sigtramp on a thread that go's runtime doesn't know about and viola, here we are. |
Change https://go.dev/cl/442896 mentions this issue: |
Thanks for reporting this issue @TopperDEL @marler8997. I've been investigating it and get to the same conclusion as you: we should ignore exceptions coming from non-Go threads instead of calling |
Thanks for the update! The finding goes to @marler8997 and I'm still very thankfull for that! |
@TopperDEL The change linked above is under review (with a comment from Alex). After that comment is addressed, it could be merged (unless other issues arise). If it is merged, it'll be included in the next Go release (Go 1.20), scheduled for February 2023. Serious issues with no workarounds can be backported to a minor release. If that happens, the fix will also be available in the next 1.19 minor release (could be 1.19.3 or later). These are done roughly every month. |
Thank you very much for that info! |
@qmuntal any reason why my commit (marler8997@f645ec0) was not included in the one that was merged (6fe3ccd)? This commit was the result of many hours of work that spanned the course of a year, but, now my authorship is now gone from the commit history because you took my changes and incorporated them into your own commit. |
To be clear, I did not incorporate your changes into my commit, I didn't even read the discussions in this issue. I've recently gone through Windows Having said this, I hope you trust me when I said I did not act in bad faith. I could have done a better job by either asking you to submit your change or adding you as a co-author, but I didn't, and I'm sorry for that. Will take this a s a lesson. |
Ok fair enough. The changes looks almost exactly the same but I see there's a couple differences now that I look more closely. I had submitted this change back in January https://go-review.googlesource.com/c/go/+/381775#message-d3c3fc570adf49bb93dd96a011a1d2d20b6fa21a But, I remember there was a concern about the solution that I was going to try to look into but never came back to it. It was something about trying to tell whether the TLS was initialized or not when g was nil? You can find it in the linked discussion thread if you're interested. Anyway, I normally wouldn't care about a 7 line change, but, this is a result of dozens of hours of work over the coarse of a year so I'm a little bummed I won't get any credit in the git commit history, but, sounds like you came up with the change on your own without my help so fair is fair. |
I can only thank you both for fixing a bug in a world where I don't know anything about. :) |
That CL does not link to this issue but to 43800, probably by mistake, so gerritbot did not create the appropriate link and I couldn't know it was there. Unfortunate.
I made a conscious decision to not special-case uninitialized TLS and I was even going to submit a CL removing that from amd64 and 386, as seemed to me that that branch is not reachable (see #8224 (comment) for more discussion on this). But given the conversations that you linked, I might be wrong and will have to reconsider. Edit: |
If there is no current G while handling an exception it means the exception was originated in a non-Go thread. The best we can do is ignore the exception and let it flow through other vectored and structured error handlers. I've removed badsignal2 from sigtramp because we can't really know if the signal is bad or not, it might be handled later in the chain. Fixes golang#50877 Updates golang#56082 Change-Id: Ica159eb843629986d1fb5482f0b59a9c1ed91698 Reviewed-on: https://go-review.googlesource.com/c/go/+/442896 Reviewed-by: Alex Brainman <alex.brainman@gmail.com> Auto-Submit: Michael Pratt <mpratt@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Michael Pratt <mpratt@google.com> Run-TryBot: Quim Muntal <quimmuntal@gmail.com> Reviewed-by: David Chase <drchase@google.com>
Change https://go.dev/cl/463116 mentions this issue: |
This CL removes badsignal2 function, as it is unused on Windows. badsignal2 was originally intended to abort the process when an exception was raised on a non-Go thread, following the same approach as Linux and others. Since it was added, back on https://golang.org/cl/5797068, it has caused several issues on Windows, see #8224 and #50877. That's because we can't know wether the signal is bad or not, as our trap might not be at the end of the exception handler chain. To fix those issues, https://golang.org/cl/104200046 and CL 442896 stopped calling badsignal2, and CL 458135 removed one last incorrect call on amd64 and 386. Change-Id: I5bd31ee2672118ae0f1a2c8b46a1bb0f4893a011 Reviewed-on: https://go-review.googlesource.com/c/go/+/463116 Reviewed-by: Bryan Mills <bcmills@google.com> Run-TryBot: Quim Muntal <quimmuntal@gmail.com> Reviewed-by: Alex Brainman <alex.brainman@gmail.com> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
I have a quite complex bug but together with @marler8997 we found a workaround.
I tried to get a go-library working on a Hololens 2 via .Net. So I build the library into a dll using the awesome zig-compiler, and then tried to PInvoke into it from C#/.Net. I did all this already successfull on Windows, Linux, MacOs, Android and iOs.
On Hololens, though, my app always crashed whenever the UWP (Universal Windows Platform) DLL came together with the go-runtime. We know found out that Go seems to try to handle an exception thrown from UWP and fails to do so leading to a "badsignal" and an application crash.
@marler8997 helped me with an many in-depth-sessions debugging assembly code and created that patch:
marler8997@f645ec0
May someone with knowledge of the win-arm64-build if this is a good solution or if there might be a better way of doing it? Thank you very much!
What version of Go are you using (
go version
)?1.17.6
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (
go env
)?Windows, ARM64, Hololens 2
The text was updated successfully, but these errors were encountered: