Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os/exec: "broken pipe" on testlog.txt while running test #35469

Closed
bcmills opened this issue Nov 8, 2019 · 12 comments
Closed

os/exec: "broken pipe" on testlog.txt while running test #35469

bcmills opened this issue Nov 8, 2019 · 12 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Testing An issue that has been verified to require only test changes, not just a test failure.
Milestone

Comments

@bcmills
Copy link
Contributor

bcmills commented Nov 8, 2019

Several examples with this failure mode:

testing: can't write /workdir/tmp/go-build295821694/b061/testlog.txt: write /workdir/tmp/go-build295821694/b061/testlog.txt: broken pipe
FAIL	os/exec	0.627s

I suspect that this has something to do with exec'ing the test itself as a subprocess.

CC @bradfitz @ianlancetaylor

2019-11-08T20:05:25-7a5e0fe/linux-386-sid
2019-11-08T19:28:49-f6ff806/linux-386-sid
2019-11-07T08:25:32-7a2baa9/linux-mips-rtrk
2019-11-01T03:40:20-4a43a50/linux-386-sid

@bcmills bcmills added Testing An issue that has been verified to require only test changes, not just a test failure. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Nov 8, 2019
@bcmills bcmills added this to the Go1.14 milestone Nov 8, 2019
@bradfitz
Copy link
Contributor

bradfitz commented Nov 8, 2019

  -test.testlogfile file
        write test action log to file (for use only by cmd/go)

I don't recall what that is.

os/exec seems to only run subprocesses with a fresh argv, though, not inheriting that value from cmd/go.

@ianlancetaylor
Copy link
Contributor

This likely means that the closeUnexpectedFds function in os/exec/exec_test.go accidentally closed the testlog file.

(The testlog file is used to log what the test does in order to see whether the test results can be cached.)

@bradfitz
Copy link
Contributor

bradfitz commented Nov 8, 2019

I tried to do something like,

diff --git a/src/os/exec/exec_test.go b/src/os/exec/exec_test.go
index 19d2111743..aa8bb5c97b 100644
--- a/src/os/exec/exec_test.go
+++ b/src/os/exec/exec_test.go
@@ -470,9 +470,14 @@ func basefds() uintptr {
        return n
 }
 
+func isTestLogFD(fd uintptr) bool {
+       target, _ := os.Readlink(fmt.Sprintf("/proc/self/fd/%v", fd))
+       return strings.HasSuffix(target, "testlog.txt")
+}
+
 func closeUnexpectedFds(t *testing.T, m string) {
        for fd := basefds(); fd <= 101; fd++ {
-               if poll.IsPollDescriptor(fd) {
+               if poll.IsPollDescriptor(fd) || isTestLogFD(fd) {
                        continue
                }
                err := os.NewFile(fd, "").Close()

But it never fires, because I then see we already have:

// basefds returns the number of expected file descriptors                                                                                        
// to be present in a process at start.                                                                                                           
// stdin, stdout, stderr, epoll/kqueue, epoll/kqueue pipe, maybe testlog                                                                          
func basefds() uintptr {
        n := os.Stderr.Fd() + 1
        // The poll (epoll/kqueue) descriptor can be numerically                                                                                  
        // either between stderr and the testlog-fd, or after                                                                                     
        // testlog-fd.                                                                                                                            
        for poll.IsPollDescriptor(n) {
                n++
        }
        for _, arg := range os.Args {
                if strings.HasPrefix(arg, "-test.testlogfile=") {
                        n++
                }
        }
        return n
}

So is IsPollDescriptor wrong? Seems unlikely.

I'd snapshot ls -l /proc/self/fd/ and t.Log it so we saw the state of the world before we closed anything, except the tests passes (PASS in output)... it's testing package that fails later (in writeProfiles from m.after), so we'd need to put debug stuff in there.

Perhaps we can do that conditional on being on the builders for now?

@ianlancetaylor
Copy link
Contributor

It's interesting that the write is getting an EPIPE error. The testlog file is always a file, created by testing.(*M).before. As far as I can tell it's never a pipe.

@ianlancetaylor
Copy link
Contributor

This has been happening for a long time but except for one recent exception only on linux-386-sid.

2018-05-18T18:16:50-1771edb/linux-386-sid
2018-05-22T02:58:04-87a18c6/linux-386-sid
2018-05-22T14:15:46-837ed98/linux-386-sid
2018-05-22T15:55:01-840f25b/linux-386-sid
2018-05-22T20:27:52-402dd10/linux-386-sid
2018-05-23T20:46:02-1dc20e9/linux-386-sid
2018-05-29T17:45:36-db9341a/linux-386-sid
2018-05-30T17:23:57-b12e341/linux-386-sid
2018-05-31T17:36:45-57d40f1/linux-386-sid
2018-06-01T19:33:34-c7519f0/linux-386-sid
2018-06-06T17:59:08-11f6cb4/linux-386-sid
2018-06-06T18:18:20-9e4c344/linux-386-sid
2018-06-06T23:47:44-fe8a0d1/linux-386-sid
2018-06-08T21:07:22-96faeb0/linux-386-sid
2018-06-16T13:35:45-b459e00/linux-386-sid
2018-06-28T19:37:15-578b961/linux-386-sid
2018-07-16T23:37:14-274fde9/linux-386-sid
2018-07-17T14:17:23-5890e25/linux-386-sid
2018-07-18T02:08:58-5760ffc/linux-386-sid
2018-07-18T20:11:46-798f74b/linux-386-sid
2018-07-27T21:13:43-d970519/linux-386-sid
2018-07-28T01:15:05-a74cc47/linux-386-sid
2018-07-28T01:15:16-0090c13/linux-386-sid
2018-07-31T02:02:51-68170aa/linux-386-sid
2018-07-31T22:10:54-740e589/linux-386-sid
2018-08-01T18:44:46-acd30e9/linux-386-sid
2018-08-02T20:40:06-2f46dfb/linux-386-sid
2018-08-03T18:35:45-b800f20/linux-386-sid
2018-08-08T21:37:36-f2131f6/linux-386-sid
2018-08-09T23:56:40-89e13c8/linux-386-sid
2018-08-17T19:21:57-d46587c/linux-386-sid
2018-08-17T23:12:06-0a842d5/linux-386-sid
2018-08-21T02:26:17-97c7e0e/linux-386-sid
2018-08-22T03:38:38-90f2fa0/linux-386-sid
2018-08-24T07:06:58-c907a75/linux-386-sid
2018-08-24T17:31:42-38143ba/linux-386-sid
2018-08-24T18:40:16-05c0244/linux-386-sid
2018-08-28T02:37:18-3ca3e89/linux-386-sid
2018-08-28T14:10:31-2e23475/linux-386-sid
2018-08-30T19:02:30-0dac1e2/linux-386-sid
2018-09-04T20:29:33-1018a80/linux-386-sid
2018-09-05T14:26:11-81957dd/linux-386-sid
2018-09-11T20:28:15-023dbb1/linux-386-sid
2018-09-14T16:46:11-a708353/linux-386-sid
2018-09-15T14:58:35-fb061b5/linux-386-sid
2018-09-23T06:53:10-fdefaba/linux-386-sid
2018-09-27T02:52:14-d159405/linux-386-sid
2018-09-28T09:18:14-38861b5/linux-386-sid
2018-09-28T13:45:09-bf8e6b7/linux-386-sid
2018-10-04T04:08:08-62e5215/linux-386-sid
2018-10-06T12:04:57-963776e/linux-386-sid
2018-10-09T18:19:59-7d2f46d/linux-386-sid
2018-10-09T20:10:02-6c85693/linux-386-sid
2018-10-11T18:01:17-872a547/linux-386-sid
2018-10-15T04:19:20-85066ac/linux-386-sid
2018-10-15T16:53:03-7c96d87/linux-386-sid
2018-10-20T14:19:29-035f9e8/linux-386-sid
2018-10-23T06:42:56-3b091bf/linux-386-sid
2018-10-27T01:36:57-7ec3b5e/linux-386-sid
2018-11-02T17:56:07-745ec8b/linux-386-sid
2018-11-05T22:36:24-f1a9f1d/linux-386-sid
2018-11-08T17:26:22-be5f646/linux-386-sid
2018-11-12T20:46:48-c5718b6/linux-386-sid
2018-11-13T13:42:59-9d025bd/linux-386-sid
2018-11-14T21:47:50-0a40d45/linux-386-sid
2018-11-20T20:03:29-7780709/linux-386-sid
2018-11-23T22:03:37-1ac8499/linux-386-sid
2018-11-30T20:46:23-3ce9e5a/linux-386-sid
2018-12-03T06:01:35-1adbb2b/linux-386-sid
2018-12-10T23:31:21-f64385b/linux-386-sid
2018-12-11T16:41:35-ba60724/linux-386-sid
2018-12-12T02:36:44-ef780fb/linux-386-sid
2018-12-13T00:35:06-09da280/linux-386-sid
2018-12-14T18:23:41-bdc7d56/linux-386-sid
2018-12-14T21:58:38-37b7c99/linux-386-sid
2018-12-17T21:12:59-179acf4/linux-386-sid
2019-01-08T00:35:16-e3eb2ff/linux-386-sid
2019-01-17T16:59:49-79ac638/linux-386-sid
2019-01-30T17:54:01-3a91061/linux-386-sid
2019-02-01T20:35:09-e104ebf/linux-386-sid
2019-02-04T06:53:49-3fc276c/linux-386-sid
2019-02-05T18:11:45-10faf00/linux-386-sid
2019-02-11T23:22:35-7bc2aa6/linux-386-sid
2019-02-26T05:16:56-ac51237/linux-386-sid
2019-02-27T18:04:36-c33a951/linux-386-sid
2019-02-27T18:08:03-58bf401/linux-386-sid
2019-02-27T22:38:32-e32203f/linux-386-sid
2019-03-01T22:15:57-412f659/linux-386-sid
2019-03-05T23:08:13-7ac0a8b/linux-386-sid
2019-03-07T04:46:03-fd19bc6/linux-386-sid
2019-03-07T17:53:44-9a71015/linux-386-sid
2019-03-08T22:40:10-d3dd258/linux-386-sid
2019-03-12T07:27:56-14a58d6/linux-386-sid
2019-03-15T23:27:36-7765576/linux-386-sid
2019-03-17T17:00:42-746f405/linux-386-sid
2019-03-19T03:20:43-b3cfb0b/linux-386-sid
2019-03-21T01:34:26-ba96564/linux-386-sid
2019-03-22T15:40:37-b06d212/linux-386-sid
2019-03-22T17:26:03-d923309/linux-386-sid
2019-03-27T03:41:10-08ba9c0/linux-386-sid
2019-04-02T16:03:24-3aacfce/linux-386-sid
2019-04-02T17:40:15-578e281/linux-386-sid
2019-04-04T00:26:24-964fe4b/linux-386-sid
2019-04-15T19:30:44-6b7114b/linux-386-sid
2019-04-16T13:25:16-b39d0ea/linux-386-sid
2019-04-16T15:33:18-c4953a6/linux-386-sid
2019-04-16T18:49:50-6997671/linux-386-sid
2019-04-19T15:40:42-4aeac68/linux-386-sid
2019-04-22T23:02:46-ef2806e/linux-386-sid
2019-04-26T17:34:54-6c04c26/linux-386-sid
2019-04-30T01:32:48-d021dd6/linux-386-sid
2019-04-30T22:20:57-65b89c3/linux-386-sid
2019-05-01T13:25:02-7ee2213/linux-386-sid
2019-06-09T16:23:11-323212b/linux-386-sid
2019-07-15T23:42:01-7a853b6/linux-386-sid
2019-08-23T17:16:16-5c379a4/linux-386-sid
2019-08-29T15:37:46-647dc1a/linux-386-sid
2019-09-25T18:48:21-6c15c7c/linux-386-sid
2019-10-11T15:08:12-9036351/linux-386-sid
2019-10-15T16:45:31-11d7775/linux-386-sid
2019-10-17T21:06:37-82bc00d/linux-386-sid
2019-11-01T03:40:20-4a43a50/linux-386-sid
2019-11-07T08:25:32-7a2baa9/linux-mips-rtrk
2019-11-08T19:28:49-f6ff806/linux-386-sid
2019-11-08T20:05:25-7a5e0fe/linux-386-sid

@bcmills
Copy link
Contributor Author

bcmills commented Nov 11, 2019

@gopherbot
Copy link

Change https://golang.org/cl/206939 mentions this issue: os/exec: don't run TestExtraFiles if extra files were open for the test

gopherbot pushed a commit that referenced this issue Nov 13, 2019
Our attempts to close existing open files are flaky. They will fail if,
for example, file descriptor 3 is open when the test binary starts.
Instead, report any such cases, and skip TestExtraFiles.

Updates #35469

Change-Id: I7caec083f3f4a31579bf28fc9c82ae89b1bde49a
Reviewed-on: https://go-review.googlesource.com/c/go/+/206939
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
@gopherbot
Copy link

Change https://golang.org/cl/207078 mentions this issue: os/exec: skip poll descriptors when checking for open descriptors

gopherbot pushed a commit that referenced this issue Nov 13, 2019
It turns out that there is a path that initializes netpoll and opens
file descriptors before running the os/exec init function: on some
systems, the uses of NewFile when setting os.Stdin and friends can
initialize netpoll which can open file descriptors. This in itself
is not a problem, but when we check whether the new files are open
using os.NewFile, a side-effect is to put them into non-blocking mode.
This can then break future uses of netpoll.

Updates #35469
Fixes #35566

Change-Id: I1b2e2c943695d1c2d29496b050abbce9ee710a00
Reviewed-on: https://go-review.googlesource.com/c/go/+/207078
Reviewed-by: Bryan C. Mills <bcmills@google.com>
@ianlancetaylor
Copy link
Contributor

This is rare and has been happening for a long time. The CLs should give us more information if it happens again. Kicking to 1.15.

@ianlancetaylor ianlancetaylor modified the milestones: Go1.14, Go1.15 Nov 16, 2019
@odeke-em
Copy link
Member

We haven't heard much in the past 6 months from bug reports. Moving this Go1.16 and perhaps then we can move it to Unreleased or decide how to proceed from then on.

@odeke-em odeke-em modified the milestones: Go1.15, Go1.16 May 23, 2020
@odeke-em
Copy link
Member

odeke-em commented Feb 5, 2021

Punting to Go1.17.

@odeke-em odeke-em modified the milestones: Go1.16, Go1.17 Feb 5, 2021
@ianlancetaylor
Copy link
Contributor

This hasn't happened since November, 2019. Let's say that it's fixed.

@golang golang locked and limited conversation to collaborators Apr 21, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Testing An issue that has been verified to require only test changes, not just a test failure.
Projects
None yet
Development

No branches or pull requests

5 participants