Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net: arrange zero-copy of os.File and TCPConn to UnixConn #58808

Closed
panjf2000 opened this issue Mar 1, 2023 · 41 comments
Closed

net: arrange zero-copy of os.File and TCPConn to UnixConn #58808

panjf2000 opened this issue Mar 1, 2023 · 41 comments

Comments

@panjf2000
Copy link
Member

panjf2000 commented Mar 1, 2023

Background

For the most commonly used stream-oriented/stream-like file descriptor types that can benefit the most from Linux zero-copy tech: TCP socket, Unix domain socket, and regular file, Go currently supports these cases of zero-copy:

  • From TCP socket to TCP socket
  • From Unix socket to TCP socket
  • From file to TCP socket
  • From TCP socket to file
  • From Unix socket to file
  • From file to file
    from which the zero-copy supports whose destinations are Unix domain sockets are missing.

Since io.Copy(dst Writer, src Reader) leverages the src.(WriterTo).WriteTo or dst.(ReaderFrom).ReadFrom to try zero-copy for data transfer internally, and net.UnixConn has already implemented net.PacketConn that has WriteTo and ReadFrom methods with different function signatures, which means there is no way for us to achieve the zero-copy by implementing these two methods for net.UnixConn, thus, the zero-copy case unix-socket-to-unix-socket is off the table.

Proposal

As mentioned above, unix-socket-to-unix-socket is unlikely to accomplish, fortunately we can still support zero-copy of tcp-socket-to-unix-socket and file-to-unix-socket, therefore, I propose supporting zero-copy with destinations of Unix sockets by implementing io.WriterTo interface on os.File and net.TCPConn.

Once the implementation of this proposal is done, it will make the zero-copy technique support in Go more complete.

API changes

There will be one new method for net.TCPConn and os.File: WriteTo(io.Writer).

Benefit

Performance improvement

Benchmarks

goos: linux
goarch: amd64
pkg: net
cpu: DO-Premium-Intel
                             │      old      │                 new                  │
                             │    sec/op     │    sec/op     vs base                │
Splice/tcp-to-unix/1024-4       3.560µ ± 24%   5.165µ ± 27%  +45.10% (p=0.001 n=10)
Splice/tcp-to-unix/2048-4       4.402µ ± 21%   5.295µ ± 10%  +20.29% (p=0.007 n=10)
Splice/tcp-to-unix/4096-4       6.161µ ± 11%   5.535µ ± 19%        ~ (p=0.105 n=10)
Splice/tcp-to-unix/8192-4       7.009µ ± 22%   7.557µ ± 15%        ~ (p=0.218 n=10)
Splice/tcp-to-unix/16384-4     11.466µ ± 25%   7.228µ ± 25%  -36.96% (p=0.000 n=10)
Splice/tcp-to-unix/32768-4     19.262µ ± 34%   9.707µ ± 14%  -49.60% (p=0.000 n=10)
Splice/tcp-to-unix/65536-4      41.19µ ± 18%   19.77µ ± 16%  -52.01% (p=0.000 n=10)
Splice/tcp-to-unix/131072-4     85.08µ ± 10%   30.50µ ± 24%  -64.15% (p=0.000 n=10)
Splice/tcp-to-unix/262144-4    197.60µ ± 19%   55.84µ ± 12%  -71.74% (p=0.000 n=10)
Splice/tcp-to-unix/524288-4     345.8µ ± 16%   144.4µ ± 20%  -58.24% (p=0.000 n=10)
Splice/tcp-to-unix/1048576-4    696.2µ ± 12%   265.1µ ±  9%  -61.93% (p=0.000 n=10)
geomean                         30.94µ         18.80µ        -39.26%
                         │      old       │                  new                   │
                         │      B/s       │      B/s       vs base                 │

Splice/tcp-to-unix/1024-4 274.5Mi ± 20% 189.1Mi ± 21% -31.12% (p=0.001 n=10)
Splice/tcp-to-unix/2048-4 443.7Mi ± 20% 369.1Mi ± 9% -16.81% (p=0.007 n=10)
Splice/tcp-to-unix/4096-4 634.1Mi ± 13% 705.8Mi ± 24% ~ (p=0.105 n=10)
Splice/tcp-to-unix/8192-4 1.090Gi ± 20% 1.010Gi ± 15% ~ (p=0.218 n=10)
Splice/tcp-to-unix/16384-4 1.333Gi ± 20% 2.111Gi ± 20% +58.41% (p=0.000 n=10)
Splice/tcp-to-unix/32768-4 1.584Gi ± 25% 3.144Gi ± 12% +98.44% (p=0.000 n=10)
Splice/tcp-to-unix/65536-4 1.482Gi ± 15% 3.113Gi ± 19% +110.04% (p=0.000 n=10)
Splice/tcp-to-unix/131072-4 1.435Gi ± 11% 4.003Gi ± 19% +179.00% (p=0.000 n=10)
Splice/tcp-to-unix/262144-4 1.236Gi ± 16% 4.372Gi ± 10% +253.65% (p=0.000 n=10)
Splice/tcp-to-unix/524288-4 1.415Gi ± 14% 3.384Gi ± 25% +139.18% (p=0.000 n=10)
Splice/tcp-to-unix/1048576-4 1.403Gi ± 13% 3.691Gi ± 10% +163.08% (p=0.000 n=10)
geomean 1010.4Mi 1.625Gi +64.73%

                         │      old      │                   new                   │
                         │     B/op      │    B/op     vs base                     │

Splice/tcp-to-unix/1024-4 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Splice/tcp-to-unix/2048-4 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Splice/tcp-to-unix/4096-4 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Splice/tcp-to-unix/8192-4 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Splice/tcp-to-unix/16384-4 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Splice/tcp-to-unix/32768-4 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Splice/tcp-to-unix/65536-4 1.000 ± ? 0.000 ± 0% -100.00% (p=0.011 n=10)
Splice/tcp-to-unix/131072-4 2.000 ± 50% 0.000 ± 0% -100.00% (p=0.000 n=10)
Splice/tcp-to-unix/262144-4 5.000 ± 20% 0.000 ± 0% -100.00% (p=0.000 n=10)
Splice/tcp-to-unix/524288-4 9.000 ± 22% 0.000 ± 0% -100.00% (p=0.000 n=10)
Splice/tcp-to-unix/1048576-4 17.50 ± 20% 0.00 ± 0% -100.00% (p=0.000 n=10)
geomean ² ? ² ³
¹ all samples are equal
² summaries must be >0 to compute geomean
³ ratios must be >0 to compute geomean

                         │     old      │                 new                 │
                         │  allocs/op   │ allocs/op   vs base                 │

Splice/tcp-to-unix/1024-4 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Splice/tcp-to-unix/2048-4 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Splice/tcp-to-unix/4096-4 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Splice/tcp-to-unix/8192-4 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Splice/tcp-to-unix/16384-4 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Splice/tcp-to-unix/32768-4 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Splice/tcp-to-unix/65536-4 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Splice/tcp-to-unix/131072-4 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Splice/tcp-to-unix/262144-4 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Splice/tcp-to-unix/524288-4 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Splice/tcp-to-unix/1048576-4 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
geomean ² +0.00% ²
¹ all samples are equal
² summaries must be >0 to compute geomean

@panjf2000 panjf2000 added this to the Proposal milestone Mar 1, 2023
@gopherbot
Copy link

Change https://go.dev/cl/472475 mentions this issue: os,net: support zero-copy from file and TCP socket to Unix socket

@panjf2000
Copy link
Member Author

/cc @rsc @ianlancetaylor

@seankhliao
Copy link
Member

Quite an unfortunate collision of names.
While not automatic, would io.Copy(unixConn1.File(), unixConn2.File()) be zero copy?

@panjf2000
Copy link
Member Author

panjf2000 commented Mar 1, 2023

Quite an unfortunate collision of names. While not automatic, would io.Copy(unixConn1.File(), unixConn2.File()) be zero copy?

Interesting question, I've never verified whether that approach works but I doubt it because Unix domain sockets don’t actually write the data they send to the disk like regular files, as that would be far too slow. Instead, all the data is retained within kernel memory; the only point of the socket file on disk is maintaining a reference to the sockets and giving it filesystem permissions to control access, thus, copy_file_range(2) might not work for this kind of case.

@ianlancetaylor
Copy link
Contributor

It seems awkward for something like os.File to have both ReadFrom and WriteTo methods. io.Copy will prefer the WriteTo method, which seems to mean that if the WriteTo method can't optimize it will need to reimplement part of io.Copy. The situation with the two methods going back and forth is already rather confusing. I would prefer to not make it more confusing.

It's possible that something like #41198 can help here.

@panjf2000
Copy link
Member Author

In fact, I intended to fall back to ReadFrom in the new WriteTo method in order not to change the process of io.Copy, see the CL. But you're suggesting that we don't do the fallback in WriteTo and just return ErrUnsupported and update io.Copy to adapt it? @ianlancetaylor

@panjf2000 panjf2000 self-assigned this Mar 1, 2023
@ianlancetaylor
Copy link
Contributor

I think it's worth considering. I don't know for sure that it is better. I do think that it is confusing for WriteTo to call ReadFrom. The current code is already hard to understand and change, and I'm concerned about making that worse.

@panjf2000
Copy link
Member Author

Any other comments from the team on this proposal? I'd like to solicit more opinions here, thanks!

@gopherbot
Copy link

Change https://go.dev/cl/475535 mentions this issue: os: don't hide all methods in recursive call to io.Copy

@ianlancetaylor
Copy link
Contributor

After refreshing my memory of #41198, I don't think it necessarily helps. We can't really change ReadFrom to return ErrUnsupported, because historically it has not done so. Changing it do that would break existing callers outside of the standard library.

But we can do something like CL 475535. I think that approach will permit the cases we want here.

@panjf2000
Copy link
Member Author

panjf2000 commented Mar 11, 2023

After refreshing my memory of #41198, I don't think it necessarily helps. We can't really change ReadFrom to return ErrUnsupported, because historically it has not done so. Changing it do that would break existing callers outside of the standard library.

But we can do something like CL 475535. I think that approach will permit the cases we want here.

Actually, I didn't plan on returning ErrUnsupported from File.ReadFrom, I was talking about the new method File.WriteTo/TCPConn.WriteTo.

But look at it another way, if we can update the io.copyBuffer in a way like this

	if wt, ok := src.(WriterTo); ok {
		if n, err := wt.WriteTo(dst); n > 0 || err == nil || !errors.Is(err, errors.ErrUnsupported) {
			return n, err
		}
	}
	// Similarly, if the writer has a ReadFrom method, use it to do the copy.
	if rt, ok := dst.(ReaderFrom); ok {
		if n, err := rt.ReadFrom(src); n > 0 || err == nil || !errors.Is(err, errors.ErrUnsupported) {
			return n, err
		}
	}

then we will be able to avoid wasting the user-provided buffer via io.CopyBuffer, and also be able to eliminate all genericReadFrom and genericWriteTo, without breaking the compatibility of the current codebase.

@gopherbot
Copy link

Change https://go.dev/cl/475575 mentions this issue: io: optimize copyBuffer to make use of the user-provided buffer for fallbacks

@ianlancetaylor
Copy link
Contributor

After refreshing my memory of ErrUnsupported, I don't think that change is appropriate. We should only use ErrUnsupported when the case can't be executed at all. That is not true for ReadFrom or WriteTo. Those methods should do the right thing even when they can't implement a special case. So ErrUnsupported was a red herring. My apologies.

@panjf2000
Copy link
Member Author

After a second thought, I agree that ReadFrom and WriteTo are general implementations to transfer data and we shouldn't refit them into specific cases just for io.Copy.

Well, this is however not good news for this proposal, we still need an appropriate fallback from WriteTo to ReadFrom inside io.Copy, hoping anything better than my current implementation.

Another way to help this proposal is to add a new interface similar to io.ReaderFrom, try it before WriteTo and ReadFrom in io.Copy, and implement it on UnixConn, although the new interface does look like a compensation for the collision of io.ReaderFrom and io.WriterTo of UnixConn.

@ianlancetaylor
Copy link
Contributor

Perhaps we could support a wrapper around net.UnixConn that implements a ReadFrom method that is compatible with io.Copy.

@panjf2000
Copy link
Member Author

panjf2000 commented Mar 13, 2023

You mean an exported struct wrapper of UnixConn which is recommended to use when calling io.Copy with the destination of UnixConn?

Something like this:

package net

type UnixConnReaderFrom struct {
	*UnixConn
}

func (c UnixConnReaderFrom) ReadFrom(r io.Reader) (n int64, err error) {
	// do something with zero-copy...
	
	return
}

func NewUnixConnReaderFrom(c Conn) (*UnixConnReaderFrom, error) {
	uc, ok := c.(*UnixConn)
	if !ok {
		return nil, errors.New("must be a *net.UnixConn")
	}
	return &UnixConnReaderFrom{uc}, nil
}

package main

func main() {
	c, _ := net.Dial("unix", "/tmp/sock")
	uc, _ := net.NewUnixConnReaderFrom(c)
	f, _ := os.Create("log")
	io.Copy(uc, f)
}

@ianlancetaylor
Copy link
Contributor

Yes, something like that. That may be too horrible, though.

@panjf2000
Copy link
Member Author

You mean an exported struct wrapper of UnixConn which is recommended to use when calling io.Copy with the destination of UnixConn?

Something like this:

package net

type UnixConnReaderFrom struct {
	*UnixConn
}

func (c UnixConnReaderFrom) ReadFrom(r io.Reader) (n int64, err error) {
	// do something with zero-copy...
	
	return
}

func NewUnixConnReaderFrom(c Conn) (*UnixConnReaderFrom, error) {
	uc, ok := c.(*UnixConn)
	if !ok {
		return nil, errors.New("must be a *net.UnixConn")
	}
	return &UnixConnReaderFrom{uc}, nil
}

package main

func main() {
	c, _ := net.Dial("unix", "/tmp/sock")
	uc, _ := net.NewUnixConnReaderFrom(c)
	f, _ := os.Create("log")
	io.Copy(uc, f)
}

Pros and cons:

  • pros: we can support all three zero-copy cases: file -> unix, tcp -> unix and unix -> unix
  • cons: users have to use this wrapper explicitly with io.Copy, rather than naturally as in other zero-copy cases

gopherbot pushed a commit that referenced this issue Mar 15, 2023
In order to avoid a recursive call to ReadFrom, we were converting
a *File to an io.Writer. But all we really need to do is hide
the ReadFrom method. In particular, this gives us the option of
adding a WriteTo method.

For #58808

Change-Id: I20d3a45749d528c93c23267c467e607fc17dc83f
Reviewed-on: https://go-review.googlesource.com/c/go/+/475535
Reviewed-by: Bryan Mills <bcmills@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
@panjf2000
Copy link
Member Author

panjf2000 commented Jul 21, 2023

Since the 1.22 tree had opened and we didn't come to a final implementation of this proposal at the time, I'd like to pick up where we left off and continue to discuss it.

Any new thoughts from the team?

@panjf2000
Copy link
Member Author

Kindly ping @rsc @ianlancetaylor

@ianlancetaylor
Copy link
Contributor

What do you see as the current proposal? Thanks.

@panjf2000
Copy link
Member Author

panjf2000 commented Aug 1, 2023

What do you see as the current proposal? Thanks.

Well, we've made several attempts at this proposal:

  1. The original implementation was to fall back to ReadFrom method in WriteTo, but it seemed to be confusing, quoted from your comment
  2. The next implementation tried to use errors. ErrUnsupported but that, as it turned out, was inapposite.
  3. The following approach would bring a new wrapper, it would enable us to support all three zero-copy cases: file -> unix, tcp -> unix and unix -> unix, but it might look a bit hideous

Therefore, I'd like to suggest adding a new interface (given that there is already a collision between net.PacketConn and net.ReaderFrom on UnixConn) for trying zero-copy techniques in io.Copy and having UnixConn implement it, that way, we can get what we want in this proposal and make the zero-copy transfer transparent to developers and avoid an explicit wrapper.

How do you like this new idea? @ianlancetaylor

@ianlancetaylor
Copy link
Contributor

Sorry, I'm not sure what I was thinking earlier. I don't think we want a new exported function net.NewUnixConnReaderFrom.

I'm not yet seeing a reasonable way to do this.

@panjf2000
Copy link
Member Author

As I said in the previous comment, is it acceptable to add a new interface say io.TransferFrom with method ReadFrom where we do the zero-copy, and make the type assertion with it in io.Copy, just like ReaderFrom with ReadFrom, then have UnixConn implement it?

@ianlancetaylor
Copy link
Contributor

I don't think adding another special method that io.Copy recognizes is a great idea. The current situation with ReadFrom and WriteTo is already confusing. That said I don't see how just adding a new interface with a ReadFrom method works, so maybe I am missing something. We already check for an interface with a ReadFrom method.

@ianlancetaylor
Copy link
Contributor

OK, thanks. I don't think adding another special method that io.Copy recognizes is a good approach. It's already too complicated. If we do want to consider that, we should consider it by itself.

@panjf2000
Copy link
Member Author

I think we have to face the fact that there might not be a good approach for this proposal based on the current problems of net.UnixConn and our previous discussions. Unless someone else has some new ideas about it, we should invite more people from the team to join this discussion?

@panjf2000
Copy link
Member Author

panjf2000 commented Aug 3, 2023

Inspired by this CL, I just uploaded a new approach in which we can avoid calling ReadFrom in WriteTo, and we also don't have to modify io.Copy now.

What do you think about it? @ianlancetaylor

@gopherbot
Copy link

Change https://go.dev/cl/515595 mentions this issue: net: don't hide all methods in recursive call to io.Copy

@ianlancetaylor
Copy link
Contributor

I'm going to have to look at it more, but it seems promising. Thanks.

@panjf2000

This comment was marked as resolved.

@rsc
Copy link
Contributor

rsc commented Oct 11, 2023

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc rsc changed the title proposal: support the zero-copy of file-to-unix-socket and tcp-socket-to-unix-socket proposal: net: arrange zero-copy of os.File and TCPConn to UnixConn Oct 24, 2023
@rsc
Copy link
Contributor

rsc commented Oct 26, 2023

Have all remaining concerns about this proposal been addressed?

The API change is to define WriteTo methods on os.File and net.TCPConn.
When these are invoked by io.Copy with a net.UnixConn as the writer,
they will use a call like Linux’s sendfile(2) to make the copy, if possible.
An implementation is in https://go.dev/cl/472475.

@panjf2000
Copy link
Member Author

panjf2000 commented Oct 27, 2023

Have all remaining concerns about this proposal been addressed?

The API change is to define WriteTo methods on os.File and net.TCPConn.
When these are invoked by io.Copy with a net.UnixConn as the writer,
they will use a call like Linux’s sendfile(2) to make the copy, if possible.
An implementation is in https://go.dev/cl/472475.

I believe there are no remaining concerns about this proposal now.

@rsc
Copy link
Contributor

rsc commented Nov 2, 2023

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

The API change is to define WriteTo methods on os.File and net.TCPConn.
When these are invoked by io.Copy with a net.UnixConn as the writer,
they will use a call like Linux’s sendfile(2) to make the copy, if possible.
An implementation is in https://go.dev/cl/472475.

@rsc
Copy link
Contributor

rsc commented Nov 10, 2023

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

The API change is to define WriteTo methods on os.File and net.TCPConn.
When these are invoked by io.Copy with a net.UnixConn as the writer,
they will use a call like Linux’s sendfile(2) to make the copy, if possible.
An implementation is in https://go.dev/cl/472475.

@rsc rsc changed the title proposal: net: arrange zero-copy of os.File and TCPConn to UnixConn net: arrange zero-copy of os.File and TCPConn to UnixConn Nov 10, 2023
@rsc rsc modified the milestones: Proposal, Backlog Nov 10, 2023
@panjf2000 panjf2000 modified the milestones: Backlog, Go1.22 Nov 17, 2023
@gopherbot
Copy link

Change https://go.dev/cl/549197 mentions this issue: doc/go1.22: document zero-copy to net.UnixConn

gopherbot pushed a commit that referenced this issue Dec 12, 2023
For #58808

Change-Id: Id73b9e4b5fb96426a01b76ce7a1053a6ad61a58e
Reviewed-on: https://go-review.googlesource.com/c/go/+/549197
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Roland Shoemaker <roland@golang.org>
Auto-Submit: Damien Neil <dneil@google.com>
@gopherbot
Copy link

Change https://go.dev/cl/549335 mentions this issue: doc/go1.22: correct the system call name used for zero-copy from net.TCPConn to net.UnixConn

gopherbot pushed a commit that referenced this issue Dec 14, 2023
…TCPConn to net.UnixConn

For #58808

Change-Id: I9b27af30888aaaa9659387a32c57aaea136b1c3a
Reviewed-on: https://go-review.googlesource.com/c/go/+/549335
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
ezz-no pushed a commit to ezz-no/go-ezzno that referenced this issue Feb 18, 2024
For golang#58808

Change-Id: Id73b9e4b5fb96426a01b76ce7a1053a6ad61a58e
Reviewed-on: https://go-review.googlesource.com/c/go/+/549197
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Roland Shoemaker <roland@golang.org>
Auto-Submit: Damien Neil <dneil@google.com>
ezz-no pushed a commit to ezz-no/go-ezzno that referenced this issue Feb 18, 2024
…TCPConn to net.UnixConn

For golang#58808

Change-Id: I9b27af30888aaaa9659387a32c57aaea136b1c3a
Reviewed-on: https://go-review.googlesource.com/c/go/+/549335
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Accepted
Development

No branches or pull requests

6 participants