Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compress/flate: document HuffmanOnly #16489

Closed
rasky opened this issue Jul 25, 2016 · 3 comments
Closed

compress/flate: document HuffmanOnly #16489

rasky opened this issue Jul 25, 2016 · 3 comments

Comments

@rasky
Copy link
Member

rasky commented Jul 25, 2016

Go 1.7 is introducing a new very fast compressor that can be selected through the flate.HuffmanOnly constant.

I find the name suboptimal:

  • The other compressions levels are called NoCompression, BestSpeed, BestCompression, DefaultCompression. None of the cite specific algorithm names. For instance, there is no Snappy constant to select the Snappy algorithm (that is instead called through the BestSpeed constant).
  • If you're not aware of the inners of the gzip compressor, you cannot know what HuffmanOnly implies. The Only word can hint at the fact that only a portion of the algorithm is being run, and you could assume that this means more speed and less ratio, but it's not immediately clear. The inline help also is very technical (Disables match search and only does Huffman entropy reduction).
  • When I ran an informal poll between colleagues, making them read the klauspost gzip repository's README, many were worried that the modifications would alter the DEFLATE bitstream compatibility. This produced Clarify that snappy is still gzip compatible klauspost/compress#46. I worry that HuffmanOnly might raise the same concerns. Compatibility with the DEFLATE bitstream is clarified in the release notes, but not in the documentation.

My suggestion is to change the constant to a name which better describes the results achieved rather than the algorithm used. I personally like MaxSpeed, but anything would go.

@dsnet dsnet added this to the Go1.7Maybe milestone Jul 25, 2016
@dsnet
Copy link
Member

dsnet commented Jul 25, 2016

/cc @nigeltao @klauspost

While HuffmanOnly does assume a person knows a little about the internals of DEFLATE, I believe it is intentional. Part of the use case for HuffmanOnly is to compress data that has already been compressed using LZ77 string matching, and the only benefit to be gained is to apply some form of entropy encoding. This detail is crucial in choosing HuffmanOnly over BestSpeed. Also, what would the difference be between MaxSpeed and BestSpeed.

Futhermore, there is some precedence for the name HuffmanOnly:

  1. It is analogous to the Z_HUFFMAN_ONLY from the C zlib library.
  2. It is not just the name of the algorithm, but the name of the methodology used as specified in RFC1951. In the Abstract section, it specifically mentions that DEFLATE is a combination of LZ77 and Huffman.

That said, I believe there is benefit in documenting what HuffmanOnly does and what the intended use case is for it.

@dsnet dsnet changed the title compress/flate: rename HuffmanOnly to something else? compress/flate: document HuffmanOnly Jul 25, 2016
@dsnet
Copy link
Member

dsnet commented Jul 25, 2016

We'll just document this, Although, I wouldn't be opposed to naming it LudicrousSpeed ;)

@dsnet dsnet modified the milestones: Go1.7, Go1.7Maybe Jul 25, 2016
@gopherbot
Copy link

CL https://golang.org/cl/25200 mentions this issue.

@golang golang locked and limited conversation to collaborators Jul 25, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants