compress/flate: need a way to efficiently reuse the decompressor #7836

Closed
gopherbot opened this issue Apr 22, 2014 · 12 comments

@gopherbot

by mail@joachim-bauch.de:

It seems there is no way to recover from an EOF error when decompressing a stream.

In my use case (the "permessage-deflate" extension for WebSockets), multiple
parts of a compressed stream are received, each preceded by a small header:

  Header1a Compressed1a Header1b Compressed1b ...

When I receive "Header1a", I wrap my ("io.LimitReader"-wrapped)
connection in a "flate.Reader" and decompress "Compressed1a"
until "Read" returns an EOF. Some time later I receive "Header1b"
and should be able (after resetting the "LimitReader") to use the
same "flate.Reader" for further decompression.

This, however, doesn't work: the previous EOF is cached in "f.err", so no
further reading happens on the underlying stream.

There is also no way to get the reader's current dictionary, which would allow
creating a new reader with an existing dict to decompress "Compressed1b".
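A minimal sketch of the failure mode (conn and the payload lengths are placeholders; the WebSocket framing details are elided):

  package sketch

  import (
      "compress/flate"
      "io"
      "io/ioutil"
  )

  // demo decompresses two length-delimited compressed chunks from conn,
  // as described above. Sketch only, not working framing code.
  func demo(conn io.Reader, payloadLen1a, payloadLen1b int64) {
      lr := &io.LimitedReader{R: conn, N: payloadLen1a}
      fr := flate.NewReader(lr)
      io.Copy(ioutil.Discard, fr) // decompress Compressed1a until Read returns EOF

      // Header1b arrives; point the LimitedReader at the next payload...
      lr.N = payloadLen1b
      // ...but this returns immediately with no data: the earlier EOF is
      // cached in the decompressor's f.err, so the underlying stream is
      // never read again.
      io.Copy(ioutil.Discard, fr)
  }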
@ianlancetaylor
Contributor

Comment 1:

Labels changed: added repo-main, release-go1.4.

@minux
Member

minux commented Apr 22, 2014

Comment 2:

Too bad that compress/flate.NewReader returns just an io.ReadCloser,
and compress/flate already has an exported Reader interface (an
unrelated concept).
How about exporting flate.decompressor and adding a Reset method?
Maybe we could even go further and add an io.ReadResetCloser?
(Probably not, too verbose, but I'm certainly open to introducing
io.Reseter.)
Another way to approach the goal is to move all new()-allocated
buffers into the decompressor itself and add a sync.Pool-backed
global free list for those decompressors; then we wouldn't need to
add any APIs to achieve memory reuse.
What do you think?
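For concreteness, the two ideas might look roughly like this (a sketch only; neither exists at this point, and the names are illustrative):

  // Sketch of the interface idea; name and shape are not final.
  type Resetter interface {
      // Reset discards any buffered state and switches the decompressor
      // to read a new stream from r.
      Reset(r io.Reader) error
  }

  // Sketch of the no-new-API alternative, inside compress/flate itself
  // (decompressor is the package's unexported reader type): NewReader
  // could recycle decompressors and their buffers through a free list.
  var decompressorPool = sync.Pool{
      New: func() interface{} { return new(decompressor) },
  }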

Status changed to Accepted.

@remyoudompheng
Contributor

Comment 3:

Why not chain your chunks into a single reader?
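For instance, with io.MultiReader (a sketch; chunk1a and chunk1b stand in for the compressed payloads, all assumed to be in hand):

  // Splice the compressed payloads into one stream before handing it to
  // flate; this assumes every chunk is already available.
  fr := flate.NewReader(io.MultiReader(
      bytes.NewReader(chunk1a),
      bytes.NewReader(chunk1b),
  ))
  io.Copy(ioutil.Discard, fr)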

@gopherbot
Author

Comment 4 by mail@joachim-bauch.de:

#3: I want to start decompressing as soon as I receive the first chunk,
without having to put all chunks in memory first (which might not even be
possible for a long-lived connection).
Please check my attached example, which shows the problem.
Decompression of the data works (for example with Python):
>>> import zlib
>>> d = zlib.decompressobj(-15)
>>> d.decompress("\xf2H\xcd\xc9\xc9\x07\x00\x00\x00\xff\xff")
'Hello'
>>> d.decompress("\xf2\x00\x11\n\xe5\xf9E9)\x8a\x00\x00\x00\x00\xff\xff")
'Hello world!'

Attachments:

  1. issue7836_test.go (1438 bytes)

@gopherbot
Author

Comment 5 by mail@joachim-bauch.de:

#2: A simpler fix would probably be to change "decompressor.Read" to try
reading from the underlying stream even if an EOF-like error occurred before,
instead of immediately returning the cached error.

@minux
Member

minux commented Apr 22, 2014

Comment 6:

#5: we'd better not change the semantics of an existing API. However unlikely,
people might have come to depend on that behavior.

@remyoudompheng
Contributor

Comment 7:

You can already use an io.Pipe to solve your problem, or write your own chunked Reader.
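For example (a sketch; conn and the readChunk helper, which parses one header and returns its compressed payload, are hypothetical):

  pr, pw := io.Pipe()
  go func() {
      for {
          payload, err := readChunk(conn) // hypothetical: strips HeaderN, returns CompressedN
          if err != nil {
              pw.CloseWithError(err) // propagated to the flate reader
              return
          }
          if _, err := pw.Write(payload); err != nil {
              return // reader side was closed
          }
      }
  }()
  // flate sees one continuous stream with the framing already removed.
  fr := flate.NewReader(pr)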

@gopherbot
Author

Comment 8 by mail@joachim-bauch.de:

Attached is a patch against the current development branch for this issue that
adds a Decompressor interface providing "Reset" and "Dictionary" methods
(roughly sketched below). Test cases for the new methods are also included.
Please let me know what you think.

Attachments:

  1. issue_7836_fix.diff (4904 bytes)
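The diff itself is only attached, not inlined; from the description above, the proposed interface would be roughly the following (the method shapes are guesses, the exact signatures live in the attached patch):

  // Guessed from the comment text, not copied from the diff.
  type Decompressor interface {
      io.ReadCloser
      // Reset prepares the decompressor to read a new stream from r.
      Reset(r io.Reader)
      // Dictionary returns the current sliding-window contents, so a new
      // reader could be created with NewReaderDict to continue a stream.
      Dictionary() []byte
  }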

@bradfitz
Contributor

Comment 9:

We don't do code reviews or development with patches on the issue tracker.
See golang.org/doc/contribute.html ... but the tree is now frozen until Go 1.3
is out, so please mail out a CL after approximately June 1st.

@gopherbot
Author

Comment 10 by mail@joachim-bauch.de:

Finally added a CL: https://golang.org/cl/101620044

@rsc
Contributor

rsc commented Oct 20, 2014

Comment 11:

Fixed by https://golang.org/cl/97140043/
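That CL shipped in Go 1.4 as the flate.Resetter interface, which the io.ReadCloser returned by flate.NewReader implements, so the reader from the original report can be reused per chunk (a sketch, reusing the placeholder names from above):

  lr := &io.LimitedReader{R: conn, N: payloadLen1a}
  fr := flate.NewReader(lr)
  io.Copy(ioutil.Discard, fr) // Compressed1a, until the cached EOF

  // Header1b arrives: Reset clears the cached error and reuses the
  // reader's internal buffers. A nil dict starts with an empty window.
  lr.N = payloadLen1b
  fr.(flate.Resetter).Reset(lr, nil)
  io.Copy(ioutil.Discard, fr) // Compressed1b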

@rsc
Contributor

rsc commented Oct 20, 2014

Comment 12:

Status changed to Fixed.

@rsc rsc added this to the Go1.4 milestone Apr 14, 2015
@rsc rsc removed the release-go1.4 label Apr 14, 2015
@golang golang locked and limited conversation to collaborators Jun 25, 2016
This issue was closed.