compress: unify Huffman logic in flate and bzip2 #20485

dsnet · 2017-05-24T21:27:51Z

The Huffman encoding in both flate and bzip2 is identical except for some minor differences:

bzip2 treats the leading bits in a bitstream as the MSB of a byte, while flate treats the leading bits as the LSB of a byte.
bzip2 allows Huffman codes to be up to 20 bits long, while flate allows codes up to 15 bits long.

Currently, the Huffman implementation in flate is superior to (and faster than) the one in bzip2. The differences are very minor, and it should be possible to unify the two without any performance hit. We should extract the common logic as compress/internal/huffman.


adamdrake · 2017-11-08T20:19:45Z

Looks like an interesting performance optimization and stdlib simplification. Any plans to work on this personally or is it open for the community in general?

dsnet · 2017-11-08T21:20:52Z

I rewrote the flate decoder from the ground up in https://github.com/dsnet/compress/tree/d2570c4d5b0229583afd32c7c4767e51f314c608. It uses the unified Huffman logic idea I wrote about here, but I haven't gotten around to merging it into the standard library.

It is significantly faster than the stdlib implementation (~1.4x).

adamdrake · 2017-11-13T17:14:55Z

That sounds great! I hope it makes it into a future release as those speed improvements would be welcome.

dsnet added this to the Go1.10 milestone May 24, 2017

dsnet self-assigned this May 24, 2017

dsnet mentioned this issue May 24, 2017

compress/bzip2: dead shifts in newHuffmanTree #17949

Closed

dsnet added the Performance label Aug 30, 2017

dsnet modified the milestones: Go1.10, Unplanned Sep 26, 2017

rsc unassigned dsnet Jun 23, 2022
