os/exec: failures with "netpollBreak write failed" on linux-amd64 since 2021-11-10 #49533
(@prattmic, @ChrisHines: Is this possibly related to #44343?) |
errno 32 is EPIPE, meaning the other end of the pipe ( |
Ooh, that sounds like another |
Yep, here it is, I think: Lines 52 to 67 in 4d8db00
|
Oof, that is quite broken. But are we sure that is the cause of these crashes? That only runs for GO_WANT_HELPER_PROCESS=1, but the test logs look like it is the main test process crashing. Either way, this may be another case of the new pacer shaking out a bug with slightly earlier GC runs. cc @mknyszek |
If I'm reading correctly, it's the opposite: it is only skipped for |
🤦 |
(I am attempting to reproduce this issue, no luck yet.) Upon closer inspection, I am still not entirely convinced this code is the issue. Note that:
I do see two potential sources of bugs:
|
I think there is a race condition here. The call to |
I did manage to reproduce this on a linux-amd64-fedora gomote after ~30min with:
|
I've also seen two failures of the form:
This is a failure attempting to add netpollBreakRd to epoll, which seems to be in the same vein. I'm going to fix the races in that init function and we'll see if that gets rid of the crashes. I think it is worthwhile either way. |
This one also looks like it could be related: 2021-11-08T17:30:10-6a9d811/illumos-amd64:
|
Change https://golang.org/cl/364035 mentions this issue: |
With https://golang.org/cl/364035, I have not reproduced a crash for >45min. Before I could repro in <5min. |
Have you seen something like
I'm investigating #49500 and managed to reproduce such a failure. (Running Full stack trace:
|
Actually yeah, given that |
greplogs --dashboard -md -l -e 'fatal error: runtime: netpollBreak write failed'
2021-11-11T04:54:05-4b27d40/linux-amd64-race
2021-11-10T21:32:50-f410786/linux-amd64-fedora
(Note the 2-year gap and difference in platforms here! This looks like a regression.)
2019-11-04T23:41:34-383b447/darwin-386-10_14
2019-11-04T16:32:38-7dcd343/darwin-amd64-nocgo
2019-11-02T21:51:21-177a36a/dragonfly-amd64
2019-11-02T21:51:14-a3ffb0d/darwin-amd64-10_14
2019-11-02T21:51:07-40b7455/darwin-amd64-nocgo
2019-11-02T05:52:33-8de0bb7/netbsd-amd64-8_0
2019-11-01T21:41:41-9bde9b4/darwin-amd64-10_14
2019-11-01T05:38:51-e96fd13/darwin-amd64-10_14