compress/gzip: unable to reproduce the same gzip/DEFLATE compressed output as Python/Rust #40830

Closed
evandrix opened this issue Aug 16, 2020 · 2 comments
Labels
FrozenDueToAge WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.

Comments

@evandrix

The attached file eicar.bin.gz contains 69 bytes of data, X5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*\n, including the trailing newline.

Sample code in Go:

	var buf bytes.Buffer
	zw, _ := gzip.NewWriterLevel(&buf, 6)
	x, _ := hex.DecodeString("58354f2150254041505b345c505a58353428505e2937434329377d2445494341522d5354414e444152442d414e544956495255532d544553542d46494c452124482b482a0a")
	zw.Write(x)
	zw.Close()
	fmt.Println(hex.EncodeToString(buf.Bytes()))

This produces different output in the DEFLATE part (ignoring the gzip header, and the CRC32 checksum and uncompressed size in the footer): 8a30f5570c5075700c88368909888a3035d10888d3347776d634af5571f574760cd20d0e71f473710c72d175f40bf10cf30c0a0dd60d710d0ed175f3f4715554f1d0f6d0e202040000ffff

At compression level 6, Python, Rust, and the standard Unix gzip CLI all agree on the output: 8b30f5570c5075700c88368909888a3035d10888d3347776d634af5571f574760cd20d0e71f473710c72d175f40bf10cf30c0a0dd60d710d0ed175f3f4715554f1d0f6d0e20200 (DEFLATE part only, excluding the gzip header and footer)

The difference, for the unobservant reader, is in the starting byte (8a vs 8b) and the trailing bytes (...e20200 vs ...e202040000ffff).

Is there a way to rewrite the Go code so that the output is identical?
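
The two differing spots can be read against the DEFLATE block layout in RFC 1951, where the low three bits of a block's first byte hold BFINAL (bit 0) and BTYPE (bits 1-2), and an empty stored block carries LEN=0x0000 followed by NLEN=0xffff. The sketch below is only an illustration of that reading, not a statement about how any of the libraries are implemented; the helper names `blockHeader` and `endsWithEmptyStoredBlock` are made up for this example.

```go
package main

import (
	"bytes"
	"fmt"
)

// blockHeader extracts the 3-bit DEFLATE block header from the first byte
// of a block that starts on a byte boundary (RFC 1951, section 3.2.3).
// Bit 0 is BFINAL; bits 1-2 are BTYPE (0 = stored, 1 = fixed Huffman,
// 2 = dynamic Huffman). Hypothetical helper for illustration.
func blockHeader(b byte) (bfinal, btype int) {
	return int(b & 1), int(b>>1) & 3
}

// endsWithEmptyStoredBlock reports whether the stream ends with the
// LEN=0x0000 / NLEN=0xffff pair of a zero-length stored block.
// Hypothetical helper for illustration.
func endsWithEmptyStoredBlock(deflate []byte) bool {
	return bytes.HasSuffix(deflate, []byte{0x00, 0x00, 0xff, 0xff})
}

func main() {
	// First byte and trailing bytes as quoted in the issue.
	goFirst, goTail := byte(0x8a), []byte{0xe2, 0x02, 0x04, 0x00, 0x00, 0xff, 0xff}
	zlibFirst, zlibTail := byte(0x8b), []byte{0xe2, 0x02, 0x00}

	f, t := blockHeader(goFirst)
	fmt.Printf("Go:   BFINAL=%d BTYPE=%d emptyStoredTail=%v\n", f, t, endsWithEmptyStoredBlock(goTail))
	// prints "Go:   BFINAL=0 BTYPE=1 emptyStoredTail=true"

	f, t = blockHeader(zlibFirst)
	fmt.Printf("zlib: BFINAL=%d BTYPE=%d emptyStoredTail=%v\n", f, t, endsWithEmptyStoredBlock(zlibTail))
	// prints "zlib: BFINAL=1 BTYPE=1 emptyStoredTail=false"
}
```

Read this way, both streams use a fixed-Huffman block for the data. The Go output is consistent with marking that block as non-final and then closing the stream with an empty final stored block, while the other tools mark the data block itself as final.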

@martisch
Contributor

martisch commented Aug 17, 2020

It's unclear whether this is a question about how to change the Go code to make the output match, or a proposal that the standard library produce exactly the same byte-for-byte output as some other compression tools.

If it is a question, please see https://github.com/golang/go/wiki/Questions for venues to ask questions. The Go project does not use the issue tracker for answering questions about Go usage.

If this is a proposal, please clarify why this matters (assuming the current output is otherwise correct according to the compression specifications) and why it should be aligned with Python and Rust. Are there other tools or programming languages that produce different output? Why not align with them?

@martisch martisch added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Aug 17, 2020
@odeke-em odeke-em changed the title Unable to reproduce the same gzip/DEFLATE compressed output as Python/Rust compress/gzip: unable to reproduce the same gzip/DEFLATE compressed output as Python/Rust Aug 17, 2020
@evandrix
Author

nvm, i decided to go with zstd alternatively to avoid this inconsistency. thanks

@golang golang locked and limited conversation to collaborators Aug 18, 2021