Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

archive/zip: make checksumReader implement io.WriterTo #56742

Open
aktau opened this issue Nov 15, 2022 · 5 comments
Open

archive/zip: make checksumReader implement io.WriterTo #56742

aktau opened this issue Nov 15, 2022 · 5 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Milestone

Comments

@aktau
Copy link
Contributor

aktau commented Nov 15, 2022

We've stumbled upon a case where performance is left on the table in io.CopyBuffer when copying from an archive/zip.checksumReader to an *os.File:

var dst *os.File // ...
var src *zip.checksumReader // ...
var buf [32 * 1024]byte
io.CopyBuffer(dst, src, buf[:])

In this case, profiles [1] show that buf is unused, and io.CopyBuffer ends up allocating in a child call. The allocations take up 8% of cycles in an exemplar of a fast compilation step (which uses an auxiliary Go binary featuring zip files) at our company.

We can work around this problem now that we've discovered it using the struct hack mentioned in #16474, but this leaves performance on the table for others that copy from zip files to real files.

[1]:

3xr6WSThD42tP6V

@aktau aktau added the Proposal label Nov 15, 2022
@gopherbot gopherbot added this to the Proposal milestone Nov 15, 2022
@ianlancetaylor
Copy link
Contributor

checksumReader is not an exported type, so adding a WriteTo method will not affect the public API. This can just be an ordinary issue. Taking it out of the proposal process.

@ianlancetaylor ianlancetaylor changed the title proposal: archive/zip: make checksumReader implement io.WriterTo archive/zip: make checksumReader implement io.WriterTo Nov 16, 2022
@ianlancetaylor ianlancetaylor added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. and removed Proposal labels Nov 16, 2022
@ianlancetaylor ianlancetaylor modified the milestones: Proposal, Backlog Nov 16, 2022
@ianlancetaylor
Copy link
Contributor

Oh, I see, it is returned by (*File).Open, as an io.ReadCloser. Still, I think we can make this change, if it is desirable, outside of the proposal process.

@ianlancetaylor
Copy link
Contributor

This is a specific instance of #16474.

@ianlancetaylor
Copy link
Contributor

Note that implementing checksumReader.WriteTo is still going to allocate a buffer, and is not going to use the buffer passed to io.CopyBuffer. I'm not sure how that will help.

@aktau
Copy link
Contributor Author

aktau commented Nov 16, 2022

You're right, of course. I had not considered this properly. Even in the very best case:

  1. checksumReader does streaming writes to the io.Writer, and
  2. The io.Writer buffers internally (or is wrapped with a bufio.Writer)

I can't imagine this being more (if at all) efficient than io.CopyBuffer.

So perhaps the proposal should be reversed: never implement WriteTo for checksumReader because then my current optimization io.CopyBuffer(struct{io.Writer}{f}, zr, buf[:]) would be invalidated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Projects
None yet
Development

No branches or pull requests

4 participants