proposal: io: Add io.ReaderFromBuffer interface #58331

fredwangwang · 2023-02-05T04:23:34Z

When calling io.CopyBuffer(writer, reader, buf) , it tries to use the io.ReaderFrom interface on writer if it is available. For example net.TcpConn implemets the io.ReaderFrom and will try to use syscalls like splice or sendfile to avoid additional copy in the user space.
This is all well-intended and provides better performance in the case of copying from one net conn to another. However. If the reader is not a real file, but just some struct implements io.Reader, it actually falls back to using io.Copy.

This is problem-some, as the very intention of using io.CopyBuffer is that we get to control the buffer and potentially pool the buffers to avoid GCs. But the generic fallback to io.Copy makes the buffer passed in useless and makes an additional allocation.

This behavior is clearly shown in the following heap profile (left is the buffer passed in, right is the allocation made by genericReadFrom fallback (io.Copy)):

This is a very surprising behavior to the end developers without the understanding of TcpConn readfrom internals. As an obvious optimization of using io.CopyBuffer in the hot-path turns out to be even worse than just call io.Copy.

So would propose to add an interface similar to io.ReaderFrom, but allows to take additional buffer, thus the name io.ReaderFromBuffer. This way nice optimization regarding fd copies can still happen, but in case none applies, the genericReadFrom can fallback to io.CopyBuffer instead of io.Copy, to take advantage of the buffer passed in, if there is any.

The text was updated successfully, but these errors were encountered:

ianlancetaylor · 2023-02-06T00:31:33Z

See also #16474.

earthboundkid · 2023-02-06T16:50:29Z

The core issue is that the extended interface is always called if it exists. What one wants is a mechanism like errors.Unwrap or http.ResponseController.Unwrap or fs.ErrNotImplemented to signal that the method doesn't actually implement a direct ReaderFrom/WriterTo and shouldn't be used in a particular case.

fredwangwang · 2023-02-06T18:02:54Z

Thanks for the context @ianlancetaylor!

just thinking out..

What one wants is a mechanism like errors.Unwrap or http.ResponseController.Unwrap or fs.ErrNotImplemented to signal that the method doesn't actually implement a direct ReaderFrom/WriterTo

@carlmjohnson It does make sense to me. However, agree with the comment #16474 (comment) here:

... such as io.ErrNotSupported, but it would likely break callers on a massive scale

Considering v2 might not exist ever, to me at least, having something that is non-breaking yet making io.CopyBuffer less surprising has its merit.

Also interestingly mentioned here #16474 (comment)

Conceptually, it seems that ReaderFrom should have a buffer argument and ignore it if it is not needed or nil

this is actually what this proposal about, but in a non API breaking way.

there are two usecase of io.CopyBuffer that I am aware from reading the linked issue:

To optimize.
I suspect this is the reason most people use io.CopyBuffer. By adding io.ReaderFromBuffer interface and implements it in things like os.File, io.TcpConn, etc. The kernel fast path can still happen behind scene, and if it does not apply, it will gracefully fallback to io.CopyBuffer instead of io.Copy.
The net result is at least as fast as the unintuitive hack, with the exact same memory footprint

io.CopyBuffer(struct{ io.Writer }{dst}, struct{ io.Reader }{src}, buf)

To shape.
I am not positive that anyone actually relies on this. But it is mentioned here proposal: io: CopyBuffer should avoid ReadFrom/WriteTo #16474 (comment) in a test case:

... another case where the behavior of io.CopyBuffer was unexpected. The test relied upon copy being done in specific chunk sizes

Now there are two scenarios to this,

The struct implements io.ReaderFromBuffer wraps a file and fast path is available (e.g. io.TcpConn). Then shaping is not supported. But I doubt how much value there is to shape the chunk size when it is handling at the kernel level. It is important for kernel drivers for sure, but not so much with go at user space level.
The struct implements io.ReaderFromBuffer is just some code. Then shaping can still happen -- the struct can just take the passed in buffer and do whatever it needs with it.

Bottomline, having io.ReaderFromBuffer still does NOT guarantee the passed buffer will be used (if faster path available), but it will not result in unexpected additional allocations like io.ReaderFromdoes.

earthboundkid · 2023-02-06T19:02:10Z

Just starting returning io.ErrNotSupported in existing methods would break existing clients, yeah. What you'd want to do is to make a new interface, to avoid bikeshedding call it io.ReaderFromV2 and io.WriterToV2, and then those would be allowed to return io.ErrNotSupported, which io.Copy/io.CopyBuffer would know about and be able to deal with.

ianlancetaylor · 2023-02-07T00:11:28Z

ErrNotSupported is #41198, which was accepted but I've always feel a bit leery about implementing it.

earthboundkid · 2023-02-07T01:18:19Z

Yes. I wrote on #41198 that the proposal is flawed because “ErrNotSuported” is domain specific. Not supporting http.FlushError is different than not supporting io.ReaderFrom. An overhaul of io.Copy should have io-specific ErrNotSupported variables.

gopherbot · 2023-03-11T08:08:20Z

Change https://go.dev/cl/475575 mentions this issue: io: optimize copyBuffer to make use of the user-provided buffer for fallbacks

fredwangwang added the Proposal label Feb 5, 2023

gopherbot added this to the Proposal milestone Feb 5, 2023

seankhliao added the Performance label Feb 5, 2023

ianlancetaylor mentioned this issue Feb 6, 2023

proposal: io: CopyBuffer should avoid ReadFrom/WriteTo #16474

Open

costela mentioned this issue Jan 16, 2024

net/http: use Copy in ServeContent if CopyN not needed #65106

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

proposal: io: Add io.ReaderFromBuffer interface #58331

proposal: io: Add io.ReaderFromBuffer interface #58331

fredwangwang commented Feb 5, 2023

ianlancetaylor commented Feb 6, 2023

earthboundkid commented Feb 6, 2023

fredwangwang commented Feb 6, 2023

earthboundkid commented Feb 6, 2023

ianlancetaylor commented Feb 7, 2023

earthboundkid commented Feb 7, 2023

gopherbot commented Mar 11, 2023

proposal: io: Add io.ReaderFromBuffer interface #58331

proposal: io: Add io.ReaderFromBuffer interface #58331

Comments

fredwangwang commented Feb 5, 2023

ianlancetaylor commented Feb 6, 2023

earthboundkid commented Feb 6, 2023

fredwangwang commented Feb 6, 2023

earthboundkid commented Feb 6, 2023

ianlancetaylor commented Feb 7, 2023

earthboundkid commented Feb 7, 2023

gopherbot commented Mar 11, 2023