net: add Buffers.ReadFrom and Buffers.Write? #17607

rsc · 2016-10-26T15:36:04Z

Go 1.8 is adding net.Buffers, with Read and WriteTo methods. These both move data from the buffers to a target, either a []byte (for Read) or a Writer (for WriteTo). The implementation of WriteTo on one of the package net-implemented net.Conns uses writev. So far so good.

What about readv? What is that going to look like? I ask because it might influence some of the finer details of Buffers. For example, if we can't support readv with the same data structure, we may want to call it WriteBuffers and have a separate ReadBuffers. I do think we can support readv, but we should make sure.

For example you could imagine that readv works by setting up a Buffers with space in the []bytes between len and cap, like:

b := Buffers{make([]byte, 0, 16), make([]byte, 0, 1024)}

to read the first 16 bytes into the first slice and the next 1k into the second, for both b.Write(data) and b.ReadFrom(conn). In order to allow repeating the reads until you get all the data you want, you would need to define that Write and ReadFrom (unlike Read and WriteTo) do not remove full slices from Buffers: they just top off what's available. Probably only top off the final non-empty buffer. That is, if you have

b := Buffers{make([]byte, 4, 10), make([]byte, 15, 20), make([]byte, 0, 1024)}

and you read 5 bytes, where do they go? It seems wrong for them to go into the first slice, since that would stick them in the middle of the 4+15 bytes already semantically in the buffer. So probably they go into the middle slice, which has 5 (20-15) spare bytes of capacity. This implies that Write/ReadFrom needs to scan backward from the end to find the first non-empty buffer.

It would be nice if Read/WriteTo, as they pull data out of the slices, didn't throw away the slices entirely, so that you could imagine using a Buffers not unlike a bytes.Buffer, where once allocated to a particular size you could repeatedly Write into it and then Read out from it (or ReadFrom into it and WriteTo out of it). Unfortunately even if the writing operations did avoid cutting slices out of the buffer, they need some way to track what is left to be written from a particular slice, and the way to do that is to advance the base pointer, reducing len and cap. At the end of a complete write the lens of the buffers are necessarily 0, and there's no way to back them back up. Concretely, if I have:

b := Buffers{make([]byte, 100)}

and I write 99 bytes, I'm left now with Buffers{slice(len=1,cap=1)}. If I write 100 bytes, I'm left with Buffers{} (no slices). The desire to reuse a Buffer would suggest that maybe Read/WriteTo shouldn't pull slices out, but it's not terribly useful to leave them in when they'd have len (and likely cap) 0 so they are basically useless anyway.

This implies that reuse of a Buffers for new reading into the buffers after writing from the buffers is probably just hard. You'd need to do something like:

allBuffers := Buffers{make([]byte, 0, 100), make([]byte, 0, 100)}
b := make(Buffers, len(allBuffers))
for {
    copy(b, allBuffers) // restore full capacity
    b.ReadFrom(conn1)
    b.WriteTo(conn2)
}

That seems OK to me. I just want to make sure we've thought about all this.

@bradfitz, what do you think? Does this seem like a reasonable basic plan for readv? Should we do it for Go 1.8 so that there's never a net.Buffers that only works for writev?

The text was updated successfully, but these errors were encountered:

nussjustin · 2016-10-26T17:03:49Z

FWIW I played with implementing readv (ReadFrom/Write) once (see cl/30102) and got some nice numbers (which I don't have anymore) in my tests. My implementation kept the current type and all calls to Write/ReadFrom just overwrote the old data. I didn't have a real use for it back than (and still currently don't have), but from the raw numbers I got it seemed like it could be easily used to speed up some tools (probably stuff like parsers, etc.)

You'd need to do something like:

As I understand your code ReadFrom would need to reslice net.Buffers to make WriteTo work properly. But what about Write? Would it append to the net.Buffers (and potentially adding a new slice) or only write as much as fits and return an error otherwise? If it appends a new slice users would need to remember to reuse it. Although Read/Write on net.Buffers will probably rarely be used, I'd like to have this clearly defined and though about.

bradfitz · 2016-10-26T20:28:16Z

Yeah, that copy is what I figured people caring about allocations could do.

I don't care whether readv support goes into Go 1.8 or later, though. @nuss-justin and @ianlancetaylor also seemed lukewarm on it. @ianlancetaylor said on that CL:

In a language like Go readv seems much less interesting than writev. It seems easy enough to read into a slice and then use slice expressions to pass around different versions to different places

ianlancetaylor · 2016-10-26T23:54:15Z

As far as I can see, the only meaningful reason to use readv in Go is to assemble a list of slices into existing arrays, and read into them. So examples like Buffers{make([]byte, 0, 16)} seem to me to be somewhat uninteresting. What you really do is Buffers{hdr[:16], data[:1024]}. In that mode, what the Buffers type looks like after Buffers.ReadFrom(conn) does not seem very interesting. All you really need to know is how much data you got.

I guess I don't see why anybody would want to use the same Buffers value for reading and writing. In the cases where a scatter-read is valuable, I'm not going to want to do a gather-write from the same set of buffers. If I wanted to send the data through I would just use a straight []byte and use byte slices to analyze it as needed.

rsc · 2016-11-03T14:20:03Z

@ianlancetaylor, I don't know that anyone would necessarily want to reuse the Buffers type, but it seemed worth thinking through what it might look like.

I do want to make sure that we know how Buffers will support readv, or else that we know we will never want to support readv. If readv will need something different, the "Buffers" may not be the right name. I agree that Buffers{hdr[:16], data[:1024]} is the interesting case. However, it matters what the Buffers looks like after ReadFrom(conn) because you might expect to get everything in one read but actually need two reads. The second b.ReadFrom(conn) needs to correctly pick up where the previous one left off.

rsc · 2016-11-03T14:20:39Z

It does sound like we agree that the current Buffers definition has a reasonable way to support readv, though, so I'll punt the actual addition of readv to Go 1.9.

noblehng · 2017-02-21T08:26:51Z

Making the Buffer a ring buffer seems better for reusing it even only for writev.

And what about sendmmsg/recvmmsg? They are more general than writev/readv but not as portable. Unlike writev, sendmmsg can write to unconnected sockets and multiple addresses, these could be useful for udp servers like QUIC. For now, you can use Buffer to write to a listening UDPConn, but will only get a error after doing the actual writev syscall, that matches the Write() behavior of UDPConn, so I guess it is not a problem.

DarienRaymond · 2017-04-21T10:14:41Z

+1 for supporting readv in net.Conn

I am working on a network proxy project. A proxy workflow (per connection) now looks like below:

Get a buffer ([]byte) from a sync.Pool
Fill in the buffer from an inbound connection.
Process the content (header, encoding, etc)
Send out the buffer through an outbound connection.
Release the buffer back to the pool.
Repeat until connection ends.

The problem is that it is difficult to determine the size of the buffer. Using a 32K or larger buffer can increase throughput for a single connection with large payload. But it increases (unnecessary) memory usage when there are many connections with little payload.

With net.Buffers.ReadFrom, I can do the following:

Get 10 of [2k]byte from the pool
Fill them in. Keep only the buffers that has content, say [5][2k]byte, and return the rest to the pool.
Process in the same way as usual.

In this way, the memory usage can automatic adapt for different network scenarios.

anacrolix · 2018-01-08T01:16:06Z

Can someone comment on the recvmmsg usage?

ianlancetaylor · 2018-09-07T12:55:04Z

I don't see how the existing net.Buffers type can be used reasonable with the sendmmsg and recvmmsg system calls. Those calls include peer information that can not be represented in net.Buffers at all. I think we need a different mechanism.

VictoriaRaymond · 2018-09-07T14:39:37Z

A quick follow-up on this. We have implemented our own version of readv() in V2Ray, based on syscall.RawConn interface. As of Go 1.11, the reader works on all major operation systems.

The library is coupled with the rest of the project for the moment. If others are interested, we can make it standalone package.

problame · 2018-12-18T23:00:09Z

@VictoriaRaymond how did you deal with EAGAIN? These are practically certain to occur because the file descriptor is in non-blocking mode.

AFAIK, we have two options:

SetNonblock(false) => connection deadlines won't work because those are implemented using the event loop (poll.FD if I'm not mistaken)
Treat EAGAIN like a zero-read and retry immediately. Effectively, this is busy-waiting.

Both of above options are not great.
We need runtime support.

I do not understand why https://go-review.googlesource.com/c/go/+/30102/ was abandoned due to missing application.
My not too atypical use case is the server implementation of a TLV-protocol (https://en.wikipedia.org/wiki/Type-length-value): One PDU is a very short header including payload length (8 bytes) followed by payload.
I use a buffer pool of round-up-to-power-of-2-buffers, which eliminates allocations and guarantees page alignment.
With that, the implementation is CPU-bound due to the two read syscalls.

One might argue that the following optimization is possbile:

On the very first PDU read:
- read one header (8 bytes)
- fall through to the following rule (as for all subsequent PDU reads)
Read the previous header's payload + the next header in one io.ReadFull call

However, on my Linux 4.9 system, above optimization results in two subsequent read syscalls: the first one is shorted by the kernel to the payload length, because the PDU-sender typically chooses payload lengths that are power-of-2.
The syscall latency induced by the second read can be fully avoided using readv.

@ianlancetaylor is it worthwile to condense that case down to an executable microbenchmark or are there other (political / resource) constraints prohibiting implementation of readv in Go?

dsnet · 2018-12-18T23:34:16Z

@problame, CL/30102 was abandoned to "missing application" by the author's own admission. It is not a statement that no such application exists.

ianlancetaylor · 2018-12-19T00:07:10Z

@problame I don't think there is any opposition to readv support. This issue is about how to implement it. I think it just hasn't been done.

(I'll note that I think it could be done today using the SyscallConn method and syscall.Syscall(SYS_READV, ...), but that is certainly not convenient.)

problame · 2018-12-19T00:29:13Z

@ianlancetaylor what are your thoughts on the EAGAIN problem pointed out above? Is the assumption that, when using raw syscalls, we need to set the FD to blocking? Or is it possible to use read inside syscall.RawConn.Read?

ianlancetaylor · 2018-12-19T00:44:46Z

The SyscallConn method gives you a way to block until a descriptor is readable. See https://golang.org/pkg/syscall/#RawConn .

rsc added this to the Go1.8 milestone Oct 26, 2016

rsc added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Oct 26, 2016

rsc modified the milestones: Go1.9, Go1.8 Nov 3, 2016

rsc added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. labels Nov 3, 2016

bradfitz modified the milestones: Go1.10, Go1.9 Jun 9, 2017

bradfitz added FeatureRequest help wanted labels Jun 9, 2017

bradfitz modified the milestones: Go1.10, Unplanned Nov 28, 2017

dsnet mentioned this issue May 17, 2018

proto: add scatter/gather API for serialization golang/protobuf#609

Open

golang deleted a comment from Terry-Mao Aug 31, 2018

spenczar mentioned this issue Mar 26, 2019

Serialization and deserialization of large messages can be slow twitchtv/twirp#166

Closed

cbandy mentioned this issue Feb 22, 2022

net: Buffers.WriteTo is prone to memory leaks #45163

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

net: add Buffers.ReadFrom and Buffers.Write? #17607

net: add Buffers.ReadFrom and Buffers.Write? #17607

rsc commented Oct 26, 2016

nussjustin commented Oct 26, 2016

bradfitz commented Oct 26, 2016

ianlancetaylor commented Oct 26, 2016

rsc commented Nov 3, 2016

rsc commented Nov 3, 2016

noblehng commented Feb 21, 2017

DarienRaymond commented Apr 21, 2017

anacrolix commented Jan 8, 2018

ianlancetaylor commented Sep 7, 2018

VictoriaRaymond commented Sep 7, 2018

problame commented Dec 18, 2018 •

edited

dsnet commented Dec 18, 2018

ianlancetaylor commented Dec 19, 2018

problame commented Dec 19, 2018 •

edited

ianlancetaylor commented Dec 19, 2018

Navigation Menu

net: add Buffers.ReadFrom and Buffers.Write? #17607

net: add Buffers.ReadFrom and Buffers.Write? #17607

Comments

rsc commented Oct 26, 2016

nussjustin commented Oct 26, 2016

bradfitz commented Oct 26, 2016

ianlancetaylor commented Oct 26, 2016

rsc commented Nov 3, 2016

rsc commented Nov 3, 2016

noblehng commented Feb 21, 2017

DarienRaymond commented Apr 21, 2017

anacrolix commented Jan 8, 2018

ianlancetaylor commented Sep 7, 2018

VictoriaRaymond commented Sep 7, 2018

problame commented Dec 18, 2018 • edited

dsnet commented Dec 18, 2018

ianlancetaylor commented Dec 19, 2018

problame commented Dec 19, 2018 • edited

ianlancetaylor commented Dec 19, 2018

problame commented Dec 18, 2018 •

edited

problame commented Dec 19, 2018 •

edited