runtime: CGO code EINTR errors at random points on AIX #50521
Comments
See this note from the 1.14 release notes:
See also
Closing because this is working as expected.
Some thoughts:
Thanks for the suggestion, but I don't think that would be appropriate in general. There are many Go programs that call into C code for a few specific purposes, and we don't want those programs to lose the advantages of preemption. The problem only arises with Go programs that call into C code that does I/O. That is a less common case, as Go is perfectly able to do I/O itself.
As far as I know that wouldn't help. It is already the case that the Go runtime does not send a signal to a thread running C code. The system must be returning
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
1.16.9 is the latest release from IBM.
What operating system and processor architecture are you using (go env)?
What did you do?
Mavimax is working on the Enduro/X middleware, which has bindings for Golang. Enduro/X also uses Golang for network connectivity, etc. The core of Enduro/X is written in C and follows strict Unix semantics, as the middleware is used for mission-critical projects (real-time banking transactions, etc.). When integrating with C, the C code is invoked from Go code. We have recently seen that our unit tests which use the Golang bindings randomly fail on AIX.
When we investigated this more deeply, we found that we randomly receive EINTR errors. One could say that we should add retry loops around system calls to deal with EINTR (and this may take more than a loop: to restart a system call, timeouts must be recalculated to reflect the time already spent, etc., which is additional work). But I do not think that is the correct answer: first, somebody could be using a lot of code which relies on signals and errors having a meaning; second, there might be proprietary libraries in use which cannot be modified. I think Golang should not impose behavior in which a program's C threads see side effects caused by other threads.
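For reference, the retry-with-recalculated-timeout approach mentioned above might look roughly like the following in C. This is only a sketch, not code from Enduro/X or from the test program in this report; `now_ms` and `poll_retry` are hypothetical names, and it assumes a POSIX system that provides `poll` and `clock_gettime(CLOCK_MONOTONIC)`.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <poll.h>
#include <time.h>

/* Hypothetical helper: current time in milliseconds on the monotonic clock. */
static long long now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Hypothetical wrapper: poll() that is restarted after EINTR.
 * The deadline is fixed once, so each restart waits only for the
 * time that is still left. timeout_ms must be >= 0 in this sketch
 * (an infinite timeout is not handled).
 */
static int poll_retry(struct pollfd *fds, nfds_t nfds, int timeout_ms)
{
    long long deadline = now_ms() + timeout_ms;

    for (;;) {
        long long left = deadline - now_ms();
        if (left < 0)
            left = 0; /* deadline passed: do one last non-blocking check */

        int rc = poll(fds, nfds, (int)left);
        if (rc >= 0 || errno != EINTR)
            return rc; /* ready, timed out, or a real error */

        /* Interrupted by a signal: loop and wait for the remainder. */
    }
}
```

Every blocking call in the C code would need a wrapper of this kind, which is the additional work referred to above, and it does not help with libraries that cannot be modified.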
We have created a small program to illustrate this issue:
And when running on AIX for a few minutes, we get:
On Linux, I was unable to reproduce this output.
We also ran the test case with GODEBUG="asyncpreemptoff=1" set; with that setting the error does not appear on AIX:
(running several minutes).
What did you expect to see?
I expect that in a cgo environment, Unix system calls complete without EINTR errors at random points.
In the above example, "INTERRUPTED" should never be printed.
What did you see instead?
Unix system calls randomly return an error with errno set to EINTR.
In the above example "INTERRUPTED" was printed.
I see that there was a similar case in your unit tests:
But what I report here is a production-grade defect.