Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: net: non-blocking Read on Conn #36973

Closed
jackc opened this issue Feb 2, 2020 · 6 comments
Closed

proposal: net: non-blocking Read on Conn #36973

jackc opened this issue Feb 2, 2020 · 6 comments

Comments

@jackc
Copy link

jackc commented Feb 2, 2020

I would like a way of efficiently reading any available data from a net.Conn without blocking.

I'm the creator of the PostgreSQL database driver https://github.com/jackc/pgx. The PostgreSQL protocol is synchronous in almost all cases. The server typically does not send any unexpected messages. The overwhelming majority of network activity is simply sending a query and reading the response.

However, on rare occasions the server will send an unsolicited message (e.g. server shutting down, server settings changed, listen / notify notifications, etc.). It would be very helpful to be able to check for a pending unread message before sending a query to the server.

As I understand it, the standard advice is to use a goroutine to read the net.Conn and pass received messages through a channel. (The other occasionally mentioned suggestion is to use syscall.RawConn and do non-blocking reads at a lower level, but I don't think that is possible when using a *tls.Conn). I built a proof of concept variation of the driver that used a goroutine and a buffered channel and it caused the overall runtime of a simple select to increase by 22% when connected to a server on the same host.

In addition to the unacceptable performance cost, it also greatly increases complexity to have to manage goroutines and channels. For example, the PostgreSQL copy protocol can be used to send large amounts of data to the server at once (potentially millions of rows). In theory, this could be done in a single conn.Write(...). However, in rare occasions the server may send back data before the client has finished sending data. If both the client and the server send a large amount of data a deadlock will occur when the network buffers on either side fill up.

I want to handle this with simple code like the following:

for buf, more := getNextChunk(); more {
  conn.Write(buf)
  // handle errors

  conn.ReadNonBlocking(...)
  // handle any messages the server sent us
}

Instead, I have to use a couple goroutines and do concurrent reading and writing -- much more complicated than the idealized code above.

One of major advantages of Go is that I/O operations can be written in an easy to understand synchronous style instead of using callbacks, promises, or async -- the runtime handles it in the background. Even though Go's channels and goroutines make writing concurrent code easier it is still difficult and error prone. I don't want to do it unless I really have to. And the lack of a non-blocking read forces me to write concurrent code.

@gopherbot gopherbot added this to the Proposal milestone Feb 2, 2020
@networkimprov
Copy link

Hi Jack, did you try this? And if so what did you find?

for buf, more := getNextChunk(); more {
   conn.Write(buf)
   // handle errors

   conn.SetReadDeadline(short)
   len, err := conn.Read(...)
   if err != nil {
      if err == ... {
         // io.EOF, etc
      } else if err.(*net.OpError).Timeout() {
         // no status msgs
         // note: TCP keepalive failures also cause this; keepalive is on by default
         continue
      }
      return err
   }
   // handle any messages the server sent us
}

@jackc
Copy link
Author

jackc commented Feb 3, 2020

I previously considered it, but I did not implement it for two reasons.

  1. In the normal case where the server does not send any messages this is equivalent to a sleep. I'd rather not sleep in performance sensitive code.
  2. It's not clear what amount of time is appropriate to wait. Perhaps something reasonable could be determined by trial and error -- but there is no guarantee the same value work across different platforms, versions of Go, host performance capabilities, and load levels.

@ianlancetaylor
Copy link
Contributor

Similar to #15735 (comment).

@ianlancetaylor ianlancetaylor changed the title proposal: net.Conn non-blocking Read proposal: net: non-blocking Read on Conn Feb 3, 2020
@networkimprov
Copy link

Re "equivalent to a sleep," that's true; but any syscall entails overhead (and often a context switch) so the practical difference between c.SetReadDeadline(time.Nanosecond); c.Read(...) and c.ReadNonBlocking(...) may be academic.

It may be helpful to benchmark both a SetReadDeadline() that Read() always trips, and a plain Read() that always returns a result (e.g. by using a 1-byte buffer), running against a mocked server on a TCP link. You'd limit plain Read() iterations based on the kernel's incoming TCP buffer size, and do an initial Read() to ensure it's primed before starting the benchmark.

If the runtime does internal I/O buffering, that may not fly, so vet the idea with the Go team first...

@networkimprov
Copy link

networkimprov commented Feb 3, 2020

BTW, I think this proposal is preferable to my suggestion, but extra goroutines are widely considered to be inexpensive; see discussion of #36402.

@ianlancetaylor
Copy link
Contributor

Closing as dup of #15735.

@golang golang locked and limited conversation to collaborators Jan 6, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants