Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: io: io.WriterToAt and io.ReaderFromAt #52383

Open
Jorropo opened this issue Apr 16, 2022 · 4 comments
Open

proposal: io: io.WriterToAt and io.ReaderFromAt #52383

Jorropo opened this issue Apr 16, 2022 · 4 comments

Comments

@Jorropo
Copy link
Member

Jorropo commented Apr 16, 2022

Proposal

WriterToAt

// Start gives you the current head point, length is how much data is remaining.
func (*SectionWriter) Unwrap() (r WriterAt, start int64, length int64)

type WriterToAt interface {
  // WriteToAt SHOULD be capable to unpack *SectionWriter objects to accelerate copies.
  // If a copy from an unwrapped *SectionWriter is performed, *SectionWriter also need to be seeked to reflect that.
  // Off is the offset in the reader. If you want pass an offset in the Writer, wrap it in an *SectionWriter first.
  // If the length missmatch between the reader and the writer copy until the smallest one and return EOF.
  // If the copy could not be performed more efficiently than with a simple read write copy loop, then return with errors.ErrUnsupported
  // If the copy was succesfull err == nil and n == length.
  // If length == -1 that means copy until EOF.
  WriteToAt(w Writer, off int64, length int64) (n int64, err error)
}

ReaderToAt (exactly the same but reader and writer are swapped)

// Start gives you the current head point, length is how much data is remaining.
func (*SectionReader) Unwrap() (r ReaderAt, start int64, length int64)

type ReaderFromAt interface {
  // ReadFromAt SHOULD be capable to unpack *SectionReader objects to accelerate copies
  // If a copy from an unwrapped *SectionReader is performed, *SectionReader also need to be seeked to reflect that.
  // Off is the offset in the reader. If you want pass an offset in the Reader, wrap it in an *SectionReader first.
  // If the length missmatch between the writer and the reader copy until the smallest one and return EOF.
  // If the copy was succesfull err == nil and n == length.
  // If the copy could not be performed more efficiently than with a simple read write copy loop, then return with errors.ErrUnsupported
  // If length == -1 that means copy until EOF.
  ReadFromAt(w Reader, off int64, length int64) (n int64, err error)
}

Motivations

Clean (not having to deal with FDs and CopyFileRange) concurrent zerocopy.
On linux, at least, concurrent reads are writes are perfectly safe and sane as long as no writes overlap and no reads overlap with a write.

Easly composable offset based zero copy.
For example a database that use a single big file could return *SectionReader to objects pointing to objects on the disks. And even if multiple step later (after being wrapped io.MultiReader call for example), zerocopy would still just work.

Generalisable, other types like bytes.Reader could be optimised that way too.

Current WriterTo and ReaderFrom allows similar operations if you combine them with seeking and LimitedReader but that is not concurrent safe.

The reason to use io.Section* types as wrappers that way is because this avoid an big (8) matrix of required interfaces:

{ReaderFrom,WriterTo}{At,}({Writer,Reader}{At,})

With this only 4 interfaces are required. 2 of them are already in the std already.
We replace the type matrix with type assertion in the different implementations.

Related issues:

@ianlancetaylor
Copy link
Contributor

Can we get the same effect by implementing io.SectionReader.WriteTo and io.SectionWriter.ReadFrom?

@ianlancetaylor ianlancetaylor added this to Incoming in Proposals (old) Apr 17, 2022
@Jorropo
Copy link
Member Author

Jorropo commented Apr 17, 2022

Can we get the same effect by implementing io.SectionReader.WriteTo and io.SectionWriter.ReadFrom?

Assuming the goal is not add any new interface and support concurrent operations (so no Seek + WriteTo tricks).
To get zerocopy with that solution to work that would require io.SectionReader.WriteTo to understand that reader and writer are a *os.File and do CopyFileRange calls.
Same for bytes.Reader and bytes.Buffer, io.SectionReader.WriteTo would need to able to understand thoses two and copy from the underlying arrays.
Same for os.File to net.TCPConn and SendFile calls.
What if I create a custom AES-XTS wrapper that expose a ReaderAt / WriterAt managing AES-XTS encryption and writing to a an other device, are we gonna add support for that in io.SectionReader.WriteTo ? (no)

Without any standart API to expose WriterTo + offset + length, we are doomed to multiply adhoc solutions in potential consumers. (or be required to use solutions like importing unix and manually firing CopyFileRange, Splice, SendFile, ... calls from the buisness logic which has so many issues)

If this propsal is accepted, ultimately, *os.File, bytes.Buffer, bytes.Reader, *net.TCPConn, customAesXts.Wrapper would all implement io.WriterToAt and io.SectionReader.WriteTo implementation as easy as:

func (s *SectionReader) WriteTo(w io.Writer) (int64, error) {
  if wta, ok := s.r.(WriterToAt); ok {
    n, err := wta.WriteToAt(w, s.off, s.limit - s.off)
    r.off += n
    return n, err
  }
  return 0, errors.ErrUnsuported
}

From code organisation perspective, this proposal is also better looking to me, because for example, the zero copy code for files would be in *os.File.WriteToAt (where it belong, with the rest of the File code), not in *io.SectionReader.WriteTo.

@Jorropo
Copy link
Member Author

Jorropo commented Apr 17, 2022

Also, this isn't the question you asked, but still

Why should this be in the std, couldn't this be wrapped in an std alternative io module ?

  • Using such module would require using custom base io types (like custom net.TCPConn, os.File, ...) wrappers to enable thoses features.
    If this is done in std, and people use io.Copy or a friend, it magicaly works through the stack when io.Copy try the types assertions.
  • A custom io module wouldn't have access to internal/poll, and suboptimal thread management.

@CAFxX
Copy link
Contributor

CAFxX commented Jul 21, 2023

Recently ran into this while attempting to have multiple readers read (via io.Copy) from the same os.File in parallel. Wrapping the os.File in a SectionReader should do the trick, but that currently implies giving up on the WriterTo fast path, and in general on any hope of zero-copy send.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Incoming
Development

No branches or pull requests

4 participants