Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compress: unify Huffman logic in flate and bzip2 #20485

Open
dsnet opened this issue May 24, 2017 · 3 comments
Open

compress: unify Huffman logic in flate and bzip2 #20485

dsnet opened this issue May 24, 2017 · 3 comments
Milestone

Comments

@dsnet
Copy link
Member

dsnet commented May 24, 2017

The Huffman encoding in both flate and bzip2 is identical except for some minor differences:

  • bzip2 treats the leading bits in a bitstream as the MSB of a byte, while flate treats the leading bits as the LSB of a byte.
  • bzip2 allows Huffman codes to be up to 20-bits long, while flate allows codes up to 15-bits long.

Currently, the implementation of Huffman encoding in flate is superior (and faster) to the one in bzip2. The differences are very minor and it should be able to unify the two without any performance hit. We should extract the common logic as compress/internal/huffman.

@dsnet dsnet added this to the Go1.10 milestone May 24, 2017
@dsnet dsnet self-assigned this May 24, 2017
@dsnet dsnet modified the milestones: Go1.10, Unplanned Sep 26, 2017
@adamdrake
Copy link

Looks like an interesting performance optimization and stdlib simplification. Any plans to work on this personally or is it open for the community in general?

@dsnet
Copy link
Member Author

dsnet commented Nov 8, 2017

I re-wrote the flate decoder from the ground up in https://github.com/dsnet/compress/tree/d2570c4d5b0229583afd32c7c4767e51f314c608. It uses the unified Huffman logic idea I wrote about here, but I haven't gotten around to merging it into standard library.

The performance is significantly faster than the stdlib implementation (~1.4x).

@adamdrake
Copy link

That sounds great! I hope it makes it into a future release as those speed improvements would be welcome.

@rsc rsc unassigned dsnet Jun 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants