Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os: File.Close() regression - Close hangs while reader present #23943

Closed
gdamore opened this issue Feb 20, 2018 · 12 comments
Closed

os: File.Close() regression - Close hangs while reader present #23943

gdamore opened this issue Feb 20, 2018 · 12 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Darwin
Milestone

Comments

@gdamore
Copy link

gdamore commented Feb 20, 2018

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

1.10

Does this issue reproduce with the latest release?

Yes, and it is a regression.

What operating system and processor architecture are you using (go env)?

go version go1.10 darwin/amd64

What did you do?

tcell (github.com/gdamore/tcell) is broken by the latest go 1.10 (works fine on 1.9.x).

We have an input loop that spins reading stdin (to scan for keypresses.)

When the program wants to terminate, we call Close() on that, to clean up things. We expect the input loop to fail with io.EOF or nil and for Close() to terminate immediately.

Instead Close() now hangs, until the user presses a key, and the check for closed file then succeeds.

It would appear that Close() no longer is thread safe, and no longer just calls the underlying OS close function.

@bradfitz
Copy link
Contributor

/cc @ianlancetaylor

@gdamore
Copy link
Author

gdamore commented Feb 20, 2018

In retrospect, it may be that the new Close() behavior is causing me grief because of a bug in the macOS tty driver, that does not wake sockets in poll() when the underlying descriptor is closed. In the past I've used termios timeouts to workaround this, but the real solution (better) usually requires a separate file descriptor (like a pipe() based one) that can be used to wake the poller, which must then poll on two separate descriptors.

Unfortunately, I don't know how to set up two descriptors and poll() on them from one goroutine in golang. So what we need I suppose is an "interruptible" kind of file.

Put another way, I think the new golang behavior is now exposing me to an underlying macOS behavior (one I consider defective), that I was previously shielded from. Unfortunately I can't think of any way to work around this using purely go constructs.

@ALTree ALTree changed the title File.Close regression - file.Close() hangs while reader present os: File.Close() regression - Close hangs while reader present Feb 20, 2018
@bradfitz
Copy link
Contributor

@gdamore, do you have a repro? We're much more likely to be able to help if we have code to run to see the problem.

@gdamore
Copy link
Author

gdamore commented Feb 20, 2018

Yes, sure. The easiest way to do this is:

% go get github.com/gdamore/tcell
% cd $GOPATH/src/github.com/gdamore/tcell
% go run _demo/boxes.go

Press ESC to exit the application. In order versions this would just abort the app. But now you have to press one more keystroke to do this.

This is on macOS. It may work fine on other systems (I haven't tested yet).

I'm pretty sure at this point that this is a well known (to me at least) defect in macOS, which I used to be shielded from, but which now I am not -- basically close of a tty based file descriptor does not wake a reader in select/poll. Its interesting to me that with the new runtime, this hangs Close() -- I guess there is some effort in the go runtime to clean up its own internal pollers, but I imagine that your runtime is now getting busted by the same bug.

In C, the way I workaround this is by using a separate notification descriptor (created with pipe()), and then my poller (in poll(), select(), or similar -- and I believe kqueue() also behaves the same here) polls both the tty input fd, and the "special" notification fd I've created. This lets me wake the reader, so that it can bail out of its loop.

In the tcell code, you'll see that in tscreen_bsd.go, it gets "stuck" trying to close the tty file at line 104. I.e. t.in.Close() hangs. The other salient fact here is that the input loop is running, blocked on t.in.Read().

I'll write a test program and post it here or on play shortly.

@gdamore
Copy link
Author

gdamore commented Feb 20, 2018

Test program: https://play.golang.org/p/BXLQplc69dL

Sorry its kind of longer than I'd like, but the termios stuff to set the input into raw mode is a key aspect of this.

If this program behaved properly, and you don't type anything after starting it, it will exit after sleeping for one second. But instead, it just "hangs" until you press a key.

Furthermore, the Read() in the inputLoop actually returns a valid keystroke, so Close() isn't actually tearing down the underlying FD.

@gdamore
Copy link
Author

gdamore commented Feb 20, 2018

Interestingly enough, syscall.Close() of the underlying FD doesn't wake up the reader either -- again, this is because of the macOS tty driver bug. Working around this is a major PITA because you have to set up a separate notification descriptor.

I'd be interested to know if FreeBSD suffers from the same broken behavior.

@gdamore
Copy link
Author

gdamore commented Feb 20, 2018

In tcell I worked around this by just making file.Close() async (goroutine) on darwin. The attached program above still demonstrates the broken behavior though.

@ianlancetaylor
Copy link
Contributor

The Go runtime doesn't assume that calling close on a descriptor will wake up a read on that same descriptor. About the only way I can explain the results you are seeing is if calling read on a TTY blocks even though the descriptor is set to nonblocking mode.

@ianlancetaylor ianlancetaylor added this to the Go1.11 milestone Feb 20, 2018
@ianlancetaylor ianlancetaylor added OS-Darwin NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Feb 20, 2018
@ianlancetaylor
Copy link
Contributor

The program passes on FreeBSD. I haven't been able to get gomote to give me a Darwin instance I can test

@bradfitz
Copy link
Contributor

Macs are all messed up right now. #23859

@gdamore
Copy link
Author

gdamore commented Feb 21, 2018

Its likely that the darwin tty driver doesn't even honor nonblocking mode reasonably. Most people use termios to set up timeouts, but that has a nasty effect of chewing up CPU, basically putting the console mode app in polling mode.

What I had witnessed in C land was that poll() or select() didn't wake up, when the fd was closed. Whether a non-blocking read of the tty also blocks or not is not known to me. But if you're using poll or select to wake up (the usual case for non-blocking mode I/O!) then the code will probably break.

With any other FOSS I could just fix the broken driver and submit the fix upstream. Alas, I have no knowledge of how to do that for Apple. The tty driver has been broken in this way for years.

@gopherbot
Copy link

Change https://golang.org/cl/99015 mentions this issue: internal/poll: if poller init fails, assume blocking mode

@golang golang locked and limited conversation to collaborators Mar 6, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Darwin
Projects
None yet
Development

No branches or pull requests

4 participants