syscall: don't check result of close(fd) in forkAndExecInChild on Plan9 #12851

Closed
kennylevinsen opened this issue Oct 6, 2015 · 4 comments

@kennylevinsen

On almost every build of Go I have made so far, the "Building packages and commands" step has failed with either:

go build ...: ...: fork/exec .../pkg/tool/plan9_amd64/compile: fd out of range or not open

or:

go build ...: ...: fork/exec .../pkg/tool/plan9_amd64/asm: fd out of range or not open

Completing a build tends to take quite a few tries.

I have traced the error to this loop in forkAndExecInChild:

    // Close fds we don't need.
    for i = 0; i < len(fdsToClose); i++ {
        r1, _, _ = RawSyscall(SYS_CLOSE, uintptr(fdsToClose[i]), 0, 0)
        if int32(r1) == -1 {
            goto childerror
        }
    }

This is in the child right after the fork succeeded.

I haven't figured out why this occurs yet. The fd may have been closed already, or it may never have been valid in the first place. The failing fd and its index in "fdsToClose" change between runs, so it is not just the first fd (nor all of them), nor is it always the same fd. I do not yet know if the fds are open and valid prior to forking.

OS and Go versions

OS: plan9 (up-to-date 9front amd64)
Go: head of master
Go bootstrap: 1.4.3

Steps to reproduce

  • Install a version of Go for bootstrapping (pre master)
  • Fetch master
  • GOROOT_BOOTSTRAP=/the/other/go ./make.rc

Expected result

Go build completes and gives me a freshly baked toolchain.

Actual result

Build crashes the vast majority of the time. Completing a build takes many attempts with --no-clean.

@ianlancetaylor ianlancetaylor added this to the Go1.6 milestone Oct 6, 2015
@kennylevinsen
Author

Plan9 appears to have a close-on-exec flag (OCEXEC), which would make the fdsToClose construct unnecessary if the flag can be set after the fd has been opened.
EDIT: I can't seem to find a way to set it after open. APE just marks all the fds internally and closes them manually before exec'ing.

@0intro 0intro self-assigned this Oct 7, 2015
@0intro
Member

0intro commented Oct 7, 2015

This bug only appears on multiprocessor machines. This is a long-standing issue.
It seems there is a race condition that leads some file descriptors to be closed twice.

A simple workaround is to ignore the error returned by the close syscall.
This should be safe since we just want to be sure the fd was closed.

But, of course, it would be better to fix the race.

@rsc rsc changed the title from 'syscall: forkAndExecInChild failing with "fd out or range or not open" on Plan9' to 'syscall: don't check result of close(fd) in forkAndExecInChild on Plan9' Nov 24, 2015
@rsc
Contributor

rsc commented Nov 24, 2015

None of the other systems check the result from close. Neither should Plan 9.

@gopherbot

CL https://golang.org/cl/17188 mentions this issue.

@0intro 0intro closed this as completed in 1d3e776 Nov 24, 2015
@golang golang locked and limited conversation to collaborators Nov 27, 2016
@rsc rsc unassigned 0intro Jun 23, 2022