syscall: Sendfile needs documentation #64044

Open
bcmills opened this issue Nov 9, 2023 · 5 comments
Labels: compiler/runtime, Documentation, help wanted, NeedsFix

@bcmills
Contributor

bcmills commented Nov 9, 2023

As of Go 1.21, the syscall.Sendfile function has no documentation.

For many functions in the syscall package, we assume POSIX semantics in the absence of explicit documentation. However, sendfile is not defined by POSIX, and its semantics vary significantly among platforms.

Notably:

On Linux, “sendfile() will transfer at most 0x7ffff000 (2,147,479,552) bytes, returning the number of bytes actually transferred”. FreeBSD, macOS, and Solaris do not document any such restriction.
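The Linux limit quoted above means that a request larger than 0x7ffff000 bytes cannot complete in a single call on that platform. A minimal sketch of the arithmetic follows; the names `maxSendfileChunk`, `clampCount`, and `callsNeeded` are hypothetical helpers for illustration, not part of the syscall package, and the constant is taken only from the Linux man page text quoted above.

```go
package main

import "fmt"

// maxSendfileChunk is the per-call cap quoted from the Linux sendfile(2)
// man page. The other platforms discussed here do not document a cap.
const maxSendfileChunk int64 = 0x7ffff000

// clampCount returns how many bytes to request in one call when
// remaining bytes are still to be sent.
func clampCount(remaining int64) int64 {
	if remaining > maxSendfileChunk {
		return maxSendfileChunk
	}
	return remaining
}

// callsNeeded returns the minimum number of calls needed to send total
// bytes, assuming every call transfers its full clamped request.
func callsNeeded(total int64) int64 {
	if total <= 0 {
		return 0
	}
	return (total + maxSendfileChunk - 1) / maxSendfileChunk
}

func main() {
	fmt.Println(maxSendfileChunk)     // 2147479552
	fmt.Println(clampCount(3 << 30))  // 2147479552
	fmt.Println(callsNeeded(3 << 30)) // 2
}
```

In practice a call may transfer less than the clamped amount, so `callsNeeded` is only a lower bound.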

The reporting of the actual number of bytes transferred varies by platform.

  • On FreeBSD and macOS, “sendfile() may send fewer bytes than requested” only “[w]hen using a socket marked for non-blocking I/O”. In that case, it sets the sbytes out-parameter to indicate the number of bytes written, returns -1, and sets errno to EAGAIN.
  • Linux documents that “a successful call to sendfile() may write fewer bytes than requested”, but does not specify what happens to the offset parameter or the input file's offset on error.
  • Illumos documents that “In some error cases sendfile() may still write some data before encountering an error and returning -1. When that occurs, off is updated to point to the byte that follows the last byte copied and should be compared with its value before calling sendfile() to determine how much data was sent.”

It appears that the return-value from Go's syscall.Sendfile on FreeBSD and macOS always reports the *sbytes (a.k.a len) out-parameter, which is always nonnegative. On Linux and Solaris, it reports the return value from the call, which is -1 on error.

The effect on the offset of the input file varies by platform.

  • Solaris and Linux document that if the offset parameter is null, “data will be read from in_fd starting at the file offset, and the file offset will be updated by the call.”
  • Illumos documents that “[t]he sendfile() function does not modify the current file pointer of in_fd, but does modify the file pointer for out_fd if it is a regular file.” It does not document any particular behavior if the off argument is null, but its error behavior seems to imply that a non-null offset pointer should always be used.
  • FreeBSD does not document whether the file offset of fd is modified by the call. (I'm guessing that it's not, though.)
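To observe the offset behavior on a given system, one option is a small experiment with an explicit offset pointer. The sketch below assumes Linux with a kernel that accepts a regular file as the output descriptor (see the list of allowed descriptors in this issue); `copyRange` and `demo` are hypothetical names. On such a system I would expect the offset variable to advance by the number of bytes written and the input file's own offset to stay where it was, but other platforms may behave differently or reject the call.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"syscall"
)

// copyRange asks sendfile to copy count bytes of src, starting at byte off,
// into dst. It returns the reported byte count and the value of the offset
// variable after the call.
func copyRange(dst, src *os.File, off int64, count int) (int, int64, error) {
	newOff := off
	written, err := syscall.Sendfile(int(dst.Fd()), int(src.Fd()), &newOff, count)
	return written, newOff, err
}

// demo writes a small temporary file, copies its tail with copyRange, and
// reports what happened to the explicit offset and to src's own file offset.
func demo() (written int, newOff, srcPos int64, copied string, err error) {
	src, err := os.CreateTemp("", "sendfile-src-*")
	if err != nil {
		return
	}
	defer os.Remove(src.Name())
	defer src.Close()
	dst, err := os.CreateTemp("", "sendfile-dst-*")
	if err != nil {
		return
	}
	defer os.Remove(dst.Name())
	defer dst.Close()

	if _, err = src.WriteString("hello, sendfile"); err != nil {
		return
	}
	// Rewind so that any change to src's own offset is easy to spot.
	if _, err = src.Seek(0, io.SeekStart); err != nil {
		return
	}
	if written, newOff, err = copyRange(dst, src, 7, 8); err != nil {
		return
	}
	if srcPos, err = src.Seek(0, io.SeekCurrent); err != nil {
		return
	}
	data, err := os.ReadFile(dst.Name())
	return written, newOff, srcPos, string(data), err
}

func main() {
	written, newOff, srcPos, copied, err := demo()
	if err != nil {
		fmt.Println("sendfile demo failed:", err)
		return
	}
	fmt.Printf("written=%d offset=%d srcPos=%d copied=%q\n", written, newOff, srcPos, copied)
}
```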

The allowed output descriptors vary by platform.

  • FreeBSD and macOS require a socket.
  • Linux 2.6.33 and above allows any file.
  • Solaris and Illumos allow “a file descriptor to a regular file opened for writing or to a connected AF_INET or AF_INET6 socket of SOCK_STREAM type”.

In addition, on Solaris and Illumos it appears that EAGAIN can be returned for reasons other than full send buffers — it can also occur due to file or record locking on the input or output file.

Given these variations, it seems to me that the semantics and usage of the Go syscall wrapper should be documented — especially given that the signature of Go's syscall.Sendfile on FreeBSD and macOS doesn't match the signature of the corresponding system C function.


@gopherbot gopherbot added compiler/runtime Issues related to the Go compiler and/or runtime. Documentation labels Nov 9, 2023
@heschi heschi added the NeedsFix The path to resolution is known, but the work has not been done. label Nov 9, 2023
@heschi heschi added this to the Go1.22 milestone Nov 9, 2023
@bcmills
Contributor Author

bcmills commented Nov 9, 2023

(CC @panjf2000)

@paulzhol
Member

paulzhol commented Nov 9, 2023

“FreeBSD does not document whether the file offset of fd is modified by the call. (I'm guessing that it's not, though.)”

I also don't think it modifies the file offset of fd. I didn't find any fo_seek() calls in vn_sendfile(); linux_sendfile_common(), however, does make them.
That function is part of the Linuxulator (Linux emulation / Linux binary compatibility). Its code also carries the following comment:

	/*
	 * Differences between FreeBSD and Linux sendfile:
	 * - Linux doesn't send anything when count is 0 (FreeBSD uses 0 to
	 *   mean send the whole file.)  In linux_sendfile given fds are still
	 *   checked for validity when the count is 0.
	 * - Linux can send to any fd whereas FreeBSD only supports sockets.
	 *   The same restriction follows for linux_sendfile.
	 * - Linux doesn't have an equivalent for FreeBSD's flags and sf_hdtr.
	 * - Linux takes an offset pointer and updates it to the read location.
	 *   FreeBSD takes in an offset and a 'bytes read' parameter which is
	 *   only filled if it isn't NULL.  We use this parameter to update the
	 *   offset pointer if it exists.
	 * - Linux sendfile returns bytes read on success while FreeBSD
	 *   returns 0.  We use the 'bytes read' parameter to get this value.
	 */
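The differences listed in that comment can be modeled as a small adapter that presents Linux-style results on top of a FreeBSD-style call. This is only an illustrative sketch of the comment's bullet points, not the Linuxulator code or Go's wrapper; `bsdSend`, `linuxStyle`, and `fakeFile` are hypothetical names, and a nil offset is simplified to "start at byte 0" where a real implementation would use the file's current position.

```go
package main

import "fmt"

// bsdSend models a FreeBSD-style call: it takes an offset by value, treats
// count == 0 as "send the whole file", and reports bytes sent separately
// from the error.
type bsdSend func(offset, count int64) (sbytes int64, err error)

// linuxStyle adapts a bsdSend to the Linux conventions described above:
// count == 0 sends nothing, *off is advanced past the bytes sent, and the
// byte count is the return value.
func linuxStyle(send bsdSend, off *int64, count int64) (int64, error) {
	if count == 0 {
		return 0, nil // Linux semantics: nothing to send
	}
	var start int64 // simplification: a nil off starts at byte 0
	if off != nil {
		start = *off
	}
	sbytes, err := send(start, count)
	if off != nil {
		*off += sbytes
	}
	return sbytes, err
}

// fakeFile returns a bsdSend over an imaginary file of the given size,
// for demonstration only.
func fakeFile(size int64) bsdSend {
	return func(offset, count int64) (int64, error) {
		avail := size - offset
		if avail < 0 {
			avail = 0
		}
		if count == 0 || count > avail {
			return avail, nil
		}
		return count, nil
	}
}

func main() {
	off := int64(2)
	n, err := linuxStyle(fakeFile(10), &off, 5)
	fmt.Println(n, off, err) // 5 7 <nil>
	n, err = linuxStyle(fakeFile(10), &off, 0)
	fmt.Println(n, off, err) // 0 7 <nil>
}
```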

@panjf2000
Member

panjf2000 commented Nov 10, 2023

Thank you for bringing this up, @bcmills.

As the Linux man pages state, sendfile(2) on Linux is indeed implemented differently from other UNIX systems.

As for partial writes: on BSD-like OSes, sendfile() may send fewer bytes than requested when it fails with either EAGAIN or EINTR, whereas on Linux a successful yet incomplete call to sendfile returns no error, because EAGAIN from sendfile should only happen in the "zero bytes sent" case, as with other read/write-like system calls.

Another implementation detail worth mentioning is that, since kernel v2.6.23, sendfile(2) on Linux has used splice(2) under the hood to do the zero-copy transfer, which might help us better understand the behavior of sendfile(2).

@gopherbot

Change https://go.dev/cl/546295 mentions this issue: syscall: document Sendfile with semantics and usage

@panjf2000 panjf2000 self-assigned this Nov 30, 2023
@gopherbot

Change https://go.dev/cl/537275 mentions this issue: internal/poll: revise the determination about [handled] and improve the code readability for SendFile

@gopherbot gopherbot modified the milestones: Go1.22, Go1.23 Feb 6, 2024
gopherbot pushed a commit that referenced this issue Feb 23, 2024
…he code readability for SendFile

There were a bit too many conditional branches in the old code,
resulting in a poor readability. It could be more concise by reducing
and consolidating some of the conditions.

Furthermore, how we've determined whether or not the data transmission
was handled by sendfile(2) seems inappropriate, because it marked the
operation as unhandled whenever any non-retryable error occurs from
calling sendfile(2), it doesn't look like a right approach, at least
this is an inconsistent behavior with what we've done in Splice.

Related to #64044

Change-Id: Ieb65e0879a8841654d0e64a1263a4e43179df1ba
Reviewed-on: https://go-review.googlesource.com/c/go/+/537275
TryBot-Result: Gopher Robot <gobot@golang.org>
Commit-Queue: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Andy Pan <panjf2000@gmail.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>