-
Notifications
You must be signed in to change notification settings - Fork 18k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
io: use syscalls like copy_file_range in Copy when possible #36817
Comments
Isn't splice(2) and/or sendfile(2) already called by io.Copy? copy_file_range(2) is |
|
While this can be a good performance improvement, only on kernels > 5.3 cross-file-system copies are supported. A fallback must be implemented in case copy_file_range returns -EXDEV. |
@ianlancetaylor thanks for the insight. I semeed to remember @beoran the original proposal did mention this, but in the context of |
The same fallback issues arise for func genericReadFrom(w io.Writer, r io.Reader) (n int64, err error) {
// Use wrapper to hide existing r.ReadFrom from io.Copy.
return io.Copy(writerOnly{w}, r)
} |
It seems like we generally agree that this is doable, so I'm going to replace NeedsDecision with NeedsFix. If anyone wants to work on this, feel free to send a CL. |
I am going to work on this and send a CL soon. |
Change https://golang.org/cl/229101 mentions this issue: |
Linux 4.5 introduced (and Linux 5.3 refined) the copy_file_range system call, which allows file systems the opportunity to implement copy acceleration techniques. This commit adds support for copy_file_range(2) to the os package. Introduce a new ReadFrom method on *os.File, which makes *os.File implement the io.ReaderFrom interface. If dst and src are both files, this enables io.Copy(dst, src) to call dst.ReadFrom(src), which, in turn, will call copy_file_range(2) if possible. If copy_file_range(2) is not supported by the host kernel, or if either of dst or src refers to a non-regular file, ReadFrom falls back to the regular io.Copy code path. Add internal/poll.CopyFileRange, which acquires locks on the appropriate poll.FDs and performs the actual work, as well as internal/syscall/unix.CopyFileRange, which wraps the copy_file_range system call itself at the lowest level. Rework file layout in internal/syscall/unix to accomodate the additional system call numbers needed for copy_file_range. Merge these definitions with the ones used by getrandom(2) into sysnum_linux_$GOARCH.go files. A note on additional optimizations: if dst and src both refer to pipes in the invocation dst.ReadFrom(src), we could, in theory, use the existing splice(2) code in package internal/poll to splice directly from src to dst. Attempting this runs into trouble with the poller, however. If we call splice(src, dst) and see EAGAIN, we cannot know if it came from src not being ready for reading or dst not being ready for writing. The write end of src and the read end of dst are not under our control, so we cannot reliably use the poller to wait for readiness. Therefore, it seems infeasible to use the new ReadFrom method to splice between pipes directly. In conclusion, for now, the only optimization enabled by the new ReadFrom method on *os.File is the copy_file_range optimization. Fixes golang#36817. Change-Id: I696372639fa0cdf704e3f65414f7321fc7d30adb Reviewed-on: https://go-review.googlesource.com/c/go/+/229101 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
Change https://golang.org/cl/238864 mentions this issue: |
Now that we've added a os.File.ReadFrom method, io.CopyBuffer to a os.File will no longer use the provided buffer. For #16474 For #36817 For #37419 Change-Id: I79a3bf778ff93eab88e88dd9ecbb8c7ea101e868 Reviewed-on: https://go-review.googlesource.com/c/go/+/238864 Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
There is a discussion at https://lore.kernel.org/linux-fsdevel/20210126233840.GG4626@dread.disaster.area/T/#m05753578c7f7882f6e9ffe01f981bc223edef2b0 that suggests that perhaps the Go standard library should not use CC @acln0 |
Spooky. It seems like we should remove all of the Now that we have |
Yes, from that discussion on the Linux kernel mailing list, it seem to me that copy_file_range seems to be a Linux system call that is hard to use correctly in the general case, since it's not easy to fall back to other system calls in case it fails. Getting rid of it might cost us some performance but does have the benefit of simplifying code and lowering the requirements for the Linux kernel version. |
@beoran reopened. If a future Linux release makes copy_file_range easier to use correctly, we could always try using it again. |
Change https://golang.org/cl/291989 mentions this issue: |
I've opened an issue for the recently discovered problem at #44272. Let's use that for this specific problem rather than this general issue. |
On current Linux kernels copy_file_range does not correctly handle files in certain special file systems, such as /proc. For those file systems it fails to copy any data and returns zero. This breaks Go's io.Copy for those files. Fix the problem by assuming that if copy_file_range returns 0 the first time it is called on a file, that that file is not supported. In that case fall back to just using read. This will force an extra system call when using io.Copy to copy a zero-sized normal file, but at least it will work correctly. For #36817 Fixes #44272 Change-Id: I02e81872cb70fda0ce5485e2ea712f219132e614 Reviewed-on: https://go-review.googlesource.com/c/go/+/291989 Trust: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>
Change https://golang.org/cl/292289 mentions this issue: |
…assume it failed On current Linux kernels copy_file_range does not correctly handle files in certain special file systems, such as /proc. For those file systems it fails to copy any data and returns zero. This breaks Go's io.Copy for those files. Fix the problem by assuming that if copy_file_range returns 0 the first time it is called on a file, that that file is not supported. In that case fall back to just using read. This will force an extra system call when using io.Copy to copy a zero-sized normal file, but at least it will work correctly. For #36817 For #44272 Fixes #44273 Change-Id: I02e81872cb70fda0ce5485e2ea712f219132e614 Reviewed-on: https://go-review.googlesource.com/c/go/+/291989 Trust: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org> (cherry picked from commit 30641e3) Reviewed-on: https://go-review.googlesource.com/c/go/+/292289
@stapelberg just made me aware of the copy_file_range syscall. Quoting the summary:
It's also already supported by the sys repo: https://godoc.org/golang.org/x/sys/unix#CopyFileRange
When calling
Copy
, given that the conditions below are met:runtime.GOOS == "linux"
*os.File
Then, the syscall would be used to make a faster copy. I presume that we could do something similar on other platforms in the future, if they have similar syscalls.
I think
CopyN
andCopyBuffer
could also be included here, but they are left out initially for the sake of simplicity. For example,CopyBuffer
could simply ignore the buffer argument when taking the syscall shortcut, as no user space memory is required.There is only one tricky detail here, to my mind:
To not break semantics and backwards compatibility, if either of those methods exists, the syscall shortcut can't be taken. One would rely on the method having a similar syscall shortcut, if needed. Luckily, it seems like
os.File
implements neither method, so we're in the clear.The text was updated successfully, but these errors were encountered: