Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os: Read() blocking on named pipe despite O_NONBLOCK #66239

Closed
poundifdef opened this issue Mar 11, 2024 · 3 comments
Closed

os: Read() blocking on named pipe despite O_NONBLOCK #66239

poundifdef opened this issue Mar 11, 2024 · 3 comments
Assignees
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done. OS-Darwin
Milestone

Comments

@poundifdef
Copy link

poundifdef commented Mar 11, 2024

Go version

go version go1.22.1 darwin/arm64

Output of go env in your module/workspace:

GO111MODULE='on'
GOARCH='arm64'
GOBIN=''
GOCACHE='/Users/jgoel/Library/Caches/go-build'
GOENV='/Users/jgoel/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMODCACHE='/Users/jgoel/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='darwin'
GOPATH='/Users/jgoel/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/opt/homebrew/Cellar/go/1.22.1/libexec'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/opt/homebrew/Cellar/go/1.22.1/libexec/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.22.1'
GCCGO='gccgo'
AR='ar'
CC='cc'
CXX='c++'
CGO_ENABLED='1'
GOMOD='/Users/jgoel/github/scratchdata/scripts/duckdbcopy/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/6y/xyh8yvqx7dx_ftw31n218m940000gn/T/go-build1801743762=/tmp/go-build -gno-record-gcc-switches -fno-common'

What did you do?

When calling os.Read() on a named pipe with O_NONBLOCK, that call will sometimes block. I only see the the bug on my M1 Mac. When I run this code in Linux or in the go playground, I do not see the bug. Here is a full program to reproduce:

https://go.dev/play/p/Ryc4jmgag3t

What did you see happen?

jgoel@jgoels-Air bug % go run main.go
2024/03/10 21:37:12 Opening pipe <nil>
2024/03/10 21:37:12 About to read...
2024/03/10 21:37:12 Read from pipe 5 <nil> Hello
2024/03/10 21:37:12 About to read...
2024/03/10 21:37:13 Read from pipe 5 <nil> Hello
2024/03/10 21:37:13 About to read...
2024/03/10 21:37:14 Read from pipe 5 <nil> Hello
2024/03/10 21:37:14 About to read...
2024/03/10 21:37:15 Read from pipe 5 <nil> Hello
2024/03/10 21:37:15 About to read...
2024/03/10 21:37:16 Read from pipe 5 <nil> Hello
2024/03/10 21:37:16 About to read...
2024/03/10 21:37:17 Writer closed pipe <nil>

(...program hangs indefinitely blocking on Read() when it should loop infinitely...)

What did you expect to see?

I expected to see the same output as above, plus the following lines looped infinitely:

2024/03/11 01:35:04 About to read...
2024/03/11 01:35:04 Read from pipe 0 EOF ol
2024/03/11 01:35:04 About to read...

This program does behave the way I'd like if I make any of these changes:

  1. If I replace os.OpenFile() and os.File.Read() with syscall.Open() and syscall.Read(), then the program does work as expected on Mac. It does not block, and it loops infinitely. Here is an example: https://go.dev/play/p/cflSTMW3RCk
  2. If I remove the call to time.Sleep() in the goroutine, then the program also works as expected.
  3. Finally, the original program works as expected on Linux. I only observe the blocking Read() on Mac.
@poundifdef poundifdef changed the title os: Read() blocking on named pipe when called with O_NONBLOCK os: Read() blocking on named pipe despite O_NONBLOCK Mar 11, 2024
@panjf2000
Copy link
Member

panjf2000 commented Mar 11, 2024

!!! Latest updates:

Even so, I still don't see why removing the time.Sleep() call can fix this issue:

  1. If I remove the call to time.Sleep() in the goroutine, then the program also works as expected.

I've figured it out! When there is no time.Sleep(), the writes to FIFO could go really fast and close the writer before the reader reads all the data. Therefore, the kernel would return 0 to indicate end-of-file when the reader tried to read from an empty FIFO, which would prevent it from falling into pollDesc.waitRead(). In contrast, when the time.Sleep() is added to retard the writes to FIFO, it could lead the reader to fall into:

if err == syscall.EAGAIN && fd.pd.pollable() {
if err = fd.pd.waitRead(fd.isFile); err == nil {
continue
}

and block infinitely cuz the last writer to a FIFO does not trigger a kqueue event for the reader.


See also #24164 and #63937.

The combination of CL 118566 and CL 156379 disabled the registration to netpoller for some specific fds of kindOpenFile, which would have prevented something like FIFO with O_NONBLOCK flag in this issue from being added to netpoller. Then CL 495079 came along and enabled the registration for fds of kindNonBlock to netpoller. Therefore, FIFO with O_NONBLOCK flag will be re-added to netpoller.

Besides, the comments:

go/src/os/file_unix.go

Lines 180 to 185 in 48b10c9

// If the caller passed a non-blocking filedes (kindNonBlock),
// we assume they know what they are doing so we allow it to be
// used with kqueue.
if kind == kindOpenFile {
switch runtime.GOOS {
case "darwin", "ios", "dragonfly", "freebsd", "netbsd", "openbsd":

It seems to be outdated after all those CLs mentioned above and should be removed or updated.

Even so, I still don't see why removing the time.Sleep() call can fix this issue:

  1. If I remove the call to time.Sleep() in the goroutine, then the program also works as expected.

I've drafted a CL for this, working on it.

CC @bcmills @ianlancetaylor

@panjf2000 panjf2000 self-assigned this Mar 11, 2024
@gopherbot
Copy link

Change https://go.dev/cl/570397 mentions this issue: os: prevent FIFO with O_NONBLOCK flag from being added to netpoller

@panjf2000
Copy link
Member

panjf2000 commented Mar 11, 2024

There are two alternatives for this: option 1 and option 2, not sure which one is better.

@panjf2000 panjf2000 added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Mar 11, 2024
@panjf2000 panjf2000 added this to the Go1.23 milestone Mar 11, 2024
@panjf2000 panjf2000 added the compiler/runtime Issues related to the Go compiler and/or runtime. label Mar 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done. OS-Darwin
Projects
Development

No branches or pull requests

3 participants