New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
misc/cgo: TestSignalForwardingExternal sometimes fails with wrong signal SIGINT #53907
Comments
@ianlancetaylor I think you might be the most familiar with this test. Could you have a look when you get a chance? |
The point of the test is to check that The point of waiting for the When I run this on linux-amd64 the signal is delivered to a C thread, so You say that I do agree that if the signal is somehow delivered to a Go thread, we may not take the expected action. One thing we should certainly do is log the test program's standard error, so that we can see what it thinks is happening. |
Thanks for the explanation, that helps a lot. Regarding the " Lines 435 to 437 in 244c8b0
which will always set the crash flag for c-archive programs if I understand correctly. I have the strace output for a failed test run and it includes both a failing test program execution and a successful one. Because TestSignalForwardingExternal runs the test program in a loop and the check done in the SIGINT strace output first time in the loop:pid 977139 - TestSignalForwardingExternal thread
SIGSEGV strace output second time in the loop:pid 977135 - TestSignalForwardingExternal thread
One difference between the two test program executions is where the |
Thanks for the pointer about It does seem possible to the signal to be delivered to a Go thread. I think that https://go.dev/cl/419014 will fix the test problem. Thanks. |
Change https://go.dev/cl/419014 mentions this issue: |
Thanks for creating the CL @ianlancetaylor! I'm running Since this issue shows up on the new s390x builder VMs I'm preparing, could it also be backported? |
@jonathan-albrecht-ibm Which releases would you want to see the patch in? |
@ianlancetaylor I'd like to get this patch into the last two releases so that when I add the new s390x dashboard builder VMs there wouldn't be any chance of those builders failing on this test when testing security/point releases. I'm assuming that's needed but let me know if not or if there is another way of handling it. I know 1.19 is about to be released so really I'd like to get it into which ever release branches are going to have further releases. Would that be release-branch.go1.19 and release-branch.go1.18 right now? |
I ran the TestSignalForwardingExternal and TestSignalForwardingGo tests from the CL on my s390x VM that was prone to failing the original TestSignalForwardingExternal test 10,000 times and got no failures: $ go test -failfast -count=10000 -timeout=0 ./ -run TestSignalForwarding[EG].*
ok misc/cgo/testcarchive 32440.474s |
@gopherbot Please open backport to 1.18. This is a test-only fix that avoids occasional test failures on s390x and possibly other systems. |
Backport issue(s) opened: #54056 (for 1.18), #54239 (for 1.19). Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://go.dev/wiki/MinorReleases. |
Change https://go.dev/cl/425002 mentions this issue: |
Occasionally the signal will be sent to a Go thread, which will cause the program to exit with SIGQUIT rather than SIGSEGV. Add TestSignalForwardingGo to test the case where the signal is expected to be delivered to a Go thread. This is a roll forward of CL 419014 which was rolled back in CL 424954. This CL differs from 419014 in that it skips TestSignalForwardingGo on darwin-amd64. Fixes #53907 Change-Id: I5df3fd610c068df3bd48d9b3d7a9379248b97999 Reviewed-on: https://go-review.googlesource.com/c/go/+/425002 Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: Bryan Mills <bcmills@google.com> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com>
Change https://go.dev/cl/425486 mentions this issue: |
Change https://go.dev/cl/425487 mentions this issue: |
Just back from vacation so belated huge thanks @ianlancetaylor! |
…tSignalForwardingExternal Occasionally the signal will be sent to a Go thread, which will cause the program to exit with SIGQUIT rather than SIGSEGV. Add TestSignalForwardingGo to test the case where the signal is expected to be delivered to a Go thread. This is a roll forward of CL 419014 which was rolled back in CL 424954. This CL differs from 419014 in that it skips TestSignalForwardingGo on darwin-amd64. For #53907 Fixes #54056 Change-Id: I5df3fd610c068df3bd48d9b3d7a9379248b97999 Reviewed-on: https://go-review.googlesource.com/c/go/+/425002 Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: Bryan Mills <bcmills@google.com> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> (cherry picked from commit d05ce23) Reviewed-on: https://go-review.googlesource.com/c/go/+/425486 Reviewed-by: David Chase <drchase@google.com>
…tSignalForwardingExternal Occasionally the signal will be sent to a Go thread, which will cause the program to exit with SIGQUIT rather than SIGSEGV. Add TestSignalForwardingGo to test the case where the signal is expected to be delivered to a Go thread. This is a roll forward of CL 419014 which was rolled back in CL 424954. This CL differs from 419014 in that it skips TestSignalForwardingGo on darwin-amd64. For #53907 Fixes #54239 Change-Id: I5df3fd610c068df3bd48d9b3d7a9379248b97999 Reviewed-on: https://go-review.googlesource.com/c/go/+/425002 Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: Bryan Mills <bcmills@google.com> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> (cherry picked from commit d05ce23) Reviewed-on: https://go-review.googlesource.com/c/go/+/425487 Reviewed-by: David Chase <drchase@google.com>
…tSignalForwardingExternal Occasionally the signal will be sent to a Go thread, which will cause the program to exit with SIGQUIT rather than SIGSEGV. Add TestSignalForwardingGo to test the case where the signal is expected to be delivered to a Go thread. This is a roll forward of CL 419014 which was rolled back in CL 424954. This CL differs from 419014 in that it skips TestSignalForwardingGo on darwin-amd64. For golang#53907 Fixes golang#54239 Change-Id: I5df3fd610c068df3bd48d9b3d7a9379248b97999 Reviewed-on: https://go-review.googlesource.com/c/go/+/425002 Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: Bryan Mills <bcmills@google.com> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> (cherry picked from commit d05ce23) Reviewed-on: https://go-review.googlesource.com/c/go/+/425487 Reviewed-by: David Chase <drchase@google.com>
What version of Go are you using (
go version
)?Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (
go env
)?go env
OutputWhat did you do?
What did you expect to see?
Test should always pass
What did you see instead?
This test failed twice while I was testing new s390x builder hardware:
https://build.golang.org/log/b7171701f23a01295fc598a1ea596f5b1e2431e5
https://build.golang.org/log/2a0f9a0cade96e1972aed5ed315721f82c3031b0
This failure could also happen on other architectures but for some reason its more likely to happen on the new machine I'm testing. As far as I'm able to see, this failure has not happened on the existing s390x builders.
I'm able to reproduce the same failure on linux amd64 by removing the code related to waiting for the
testp
program to send theOK
trigger which makes the window where a go thread could handle the SIGSEV signal larger:go/misc/cgo/testcarchive/carchive_test.go
Line 630 in 2aa473c
I ran the test under strace and I think this could happen since this CL in go1.15 which makes SIGSEGV always crash the go program. In c-archive programs GOTRACEBACK=crash is always effectively set and that causes the go signal handler to send SIGINT to its own process:
go/src/runtime/signal_unix.go
Line 788 in 2aa473c
and that's how the tests sees the
testp
process has exited with SIGINT instead of SIGSEGV.I think the fix could be for the test to send SIGUSR1 to
testp
instead of SIGSEGV. I've tried that out on s390x and amd64 and haven't seen any failures. If that makes sense, I can submit a CL.The text was updated successfully, but these errors were encountered: