compress/flate: deflatefast produces corrupted output #41420
Comments
cc @dsnet

FWIW, this could be related to the input file size. I tested the script with a smaller file (102MB). The outcomes were the same between different Go versions.
Isn't running the decompressor sufficient to verify data corruption?

And maybe the CLs for #34121 are related.
It is unclear to me whether this issue is about the compressed output simply being different or about the compressed output actually being invalid (i.e., it cannot be decompressed). Can you please clarify?

If the output is valid (e.g., can be decompressed), then this is working as expected. The compression libraries make no guarantees that the output remains stable for all time. While that guarantee can be useful in some contexts, it unfortunately means that we could never make changes to the compression algorithm, either to improve the speed or to improve the compression ratio, both of which are properties that are generally considered more important than stability.
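To make that check concrete, here is a minimal sketch (a hypothetical helper, not code from the issue) that verifies by roundtripping rather than by comparing compressed bytes:

import (
	"bytes"
	"compress/zlib"
	"io"
)

// verifyRoundtrip reports whether compressed, when decompressed, reproduces
// original exactly. Differing *compressed* bytes across Go versions are
// expected; only a failed roundtrip indicates corruption.
func verifyRoundtrip(original, compressed []byte) (bool, error) {
	zr, err := zlib.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return false, err
	}
	defer zr.Close()
	var out bytes.Buffer
	if _, err := io.Copy(&out, zr); err != nil {
		// A corrupt stream surfaces here, e.g. "zlib: invalid checksum".
		return false, err
	}
	return bytes.Equal(original, out.Bytes()), nil
}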
Thanks for the advice. Unfortunately, it looks like the data was indeed corrupted, as I got an invalid checksum error. I added the decompression logic:

package main

import (
	"bufio"
	"compress/zlib"
	"flag"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"runtime"
	"strings"
)

var fileSplit = "---"

func getBaseName(p string) string {
	s := filepath.Base(p)
	return strings.Split(s, fileSplit)[0]
}

func doCompression(filename string) error {
	fi, err := os.Open(filename)
	if err != nil {
		return err
	}
	defer fi.Close()

	outname := fmt.Sprintf("%s%scompressed-%s", getBaseName(filename), fileSplit, runtime.Version())
	fo, err := os.Create(outname)
	if err != nil {
		return err
	}
	defer fo.Close()

	const outBufSize = 1024 * 1024
	foWr := bufio.NewWriterSize(fo, outBufSize)
	compressor, err := zlib.NewWriterLevel(foWr, zlib.BestSpeed)
	if err != nil {
		return err
	}
	fmt.Printf("Compress: %s -> %s\n", filename, outname)
	buf := make([]byte, outBufSize*3)
	if _, err := io.CopyBuffer(compressor, fi, buf); err != nil {
		compressor.Close()
		return err
	}
	// compressor needs to be closed first to flush the rest of the data
	// into the bufio.Writer
	if err := compressor.Close(); err != nil {
		return err
	}
	if err := foWr.Flush(); err != nil {
		return err
	}
	return nil
}

func doDecompression(filename string) error {
	fi, err := os.Open(filename)
	if err != nil {
		return err
	}
	defer fi.Close()

	fiRd, err := zlib.NewReader(fi)
	if err != nil {
		return err
	}
	outname := fmt.Sprintf("%s%sdecompressed-%s", getBaseName(filename), fileSplit, runtime.Version())
	fo, err := os.Create(outname)
	if err != nil {
		return err
	}
	defer fo.Close()

	const outBufSize = 1024 * 1024
	foWr := bufio.NewWriterSize(fo, outBufSize)
	fmt.Printf("Decompress: %s -> %s\n", filename, outname)
	buf := make([]byte, outBufSize*3)
	if _, err := io.CopyBuffer(foWr, fiRd, buf); err != nil {
		fiRd.Close()
		return err
	}
	if err := fiRd.Close(); err != nil {
		return err
	}
	if err := foWr.Flush(); err != nil {
		return err
	}
	return nil
}

func main() {
	var filename string
	var decompression bool
	flag.StringVar(&filename, "f", "", "filename")
	flag.BoolVar(&decompression, "d", false, "should decompress")
	flag.Parse()

	var err error
	if decompression {
		err = doDecompression(filename)
	} else {
		err = doCompression(filename)
	}
	if err != nil {
		panic(err)
	}
}

I first tested this with go1.14.2:

$ ./cli -f ~/chromium/src/out/Release/content_browsertests
Compress: ~/chromium/src/out/Release/content_browsertests -> content_browsertests---compressed-go1.14.2
$ ./cli -f content_browsertests---compressed-go1.14.2 -d
Decompress: content_browsertests---compressed-go1.14.2 -> content_browsertests---decompressed-go1.14.2
$ diff ~/chromium/src/out/Release/content_browsertests content_browsertests---decompressed-go1.14.2
# No diff

Then I switched to go1.15:

$ ./cli -f ~/chromium/src/out/Release/content_browsertests
Compress: ~/chromium/src/out/Release/content_browsertests -> content_browsertests---compressed-go1.15
$ ./cli -f content_browsertests---compressed-go1.15 -d
Decompress: content_browsertests---compressed-go1.15 -> content_browsertests---decompressed-go1.15
panic: zlib: invalid checksum
goroutine 1 [running]:
main.main()
.../main.go:105 +0x1f9

And there was a difference between the original input and the decompressed file.

A bit more of an update: I used go1.14.2 to decompress the file compressed by go1.15 and got the same invalid checksum error. On the other hand, I also tried this vice versa -- using go1.15 to decompress the go1.14.2-compressed file -- and the output matched the original input.
I need more information to investigate this. The code snippets provided only do basic compression/decompression. I need real test data that exhibits the problem.

Sent you the data link in an email.

Does this issue not happen if you use randomly generated data of a similar size?
No, it didn't happen with random data; I tested with randomly generated input of a similar size.

You might be able to shrink the test case down from 3GB with https://github.com/dgryski/go-ddmin.
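For background, ddmin-style minimization repeatedly deletes chunks of the failing input and keeps any deletion that preserves the failure. A toy sketch of the idea with a caller-supplied fails predicate (hypothetical helper; go-ddmin's real API differs):

// minimize greedily removes chunks of data while fails(data) stays true,
// halving the chunk size whenever no chunk can be removed.
func minimize(data []byte, fails func([]byte) bool) []byte {
	n := 2
	for len(data) >= 2 {
		chunk := (len(data) + n - 1) / n
		removed := false
		for start := 0; start < len(data); start += chunk {
			end := start + chunk
			if end > len(data) {
				end = len(data)
			}
			// Candidate input with data[start:end] deleted.
			cand := append(append([]byte(nil), data[:start]...), data[end:]...)
			if fails(cand) {
				data, removed = cand, true
				break // restart scanning on the smaller input
			}
		}
		switch {
		case removed:
			n = 2 // success: go back to coarse chunks
		case chunk <= 1:
			return data // nothing removable even at 1-byte chunks: done
		default:
			n *= 2 // refine granularity and retry
		}
	}
	return data
}

Here fails would compress the candidate with the broken Go version, decompress, and report whether the roundtrip still fails.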
I was at first able to reproduce only a part of this. I downloaded a small dataset and appended it to a file 28 times until it was 1139520480 bytes. I could then get a zlib file of 291858894 bytes in 1.14 and 291858627 bytes in 1.15. The first time the output changed was in 8d43307.

cmp: foo14 foo15 differ: char 275153560, line 1014680

But the zlib reader in both 1.14 and 1.15 can read each other's output, so there doesn't appear to be any corruption there, just a change in layout.

Then I tried it on a different file (~4.8 GB), sadly a proprietary dataset, and got a bit further. When decompressing the just-compressed file to calculate its checksum, I get a panic, and there's a difference here:

all14.gz all15.gz differ: char 199653755, line 768787

git bisect points to the same commit as before, though. It's a shame I can't share this file, not even privately, but I can try and run it again using a different Go build to see if it's fixed.

(collapsed in the original comment: edited code to (de)compress, bisection code)
@kokes, a difference in output is not in itself particularly interesting, since the output is not guaranteed to be stable across versions.

Thank you @k-ye. I am able to reproduce the failure, which does indeed indicate that it's producing corrupted output. I'm bisecting the root cause now (it may very well be the CLs related to #34121 that you pointed to earlier).
Adding to the Go1.15.3 milestone, since this is a regression relative to Go1.14, and a serious one.

@dsnet, requesting a backport adds it to the 1.15.x milestone. This issue should target 1.16.

Thanks @networkimprov. I've been a bit distant from Go toolchain development for a while, so I've gotten behind on the latest processes.
@dsnet Any way you can share the reproducer? My GitHub name @gmail.com

Not sure whether this is helpful, but it seems setting bufferReset to a much smaller value makes the problem reproducible with small inputs.
@klauspost should the line at https://github.com/golang/go/blob/master/src/compress/flate/deflatefast.go#L296 read differently? At least such a change seems to fix the tests for the smaller bufferReset value.
@egonelbre This also fixes my reproducer. (A 3.6GB file of concatenated Go source files from my $GOPATH.)

@egonelbre fixes my reproducer as well.

If I am reading the code right, it looks to me like any input over 2,147,352,577 bytes (bufferReset) could be corrupted, and corruption is more likely to occur at every multiple of 32,768 bytes (maxMatchOffset) over that. Is that accurate?
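For reference, those numbers line up with the constants in compress/flate (as defined in the Go 1.15 sources; reproduced here for the arithmetic, 2147483647 - 2*65535 = 2147352577):

package flate

import "math"

const (
	// Farthest back a deflate match may reach.
	maxMatchOffset = 1 << 15 // 32,768

	maxStoreBlockSize = 65535

	// deflatefast shifts its offset table once the running offset passes
	// this point: math.MaxInt32 - 2*65535 = 2,147,352,577.
	bufferReset = math.MaxInt32 - maxStoreBlockSize*2
)

That threshold is consistent with the observation that only multi-GB inputs trip the bug.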
@egonelbre's change also fixed my reproducer 🎉
@egonelbre No, that cannot be correct. The value may not matter; that will only fix 16383 out of 16384 cases, and it will still result in a false hit when hash(0)&16383 matches the index. The offset must be enough to invalidate it completely. I will see if I can figure out what is causing the false match.

My own branch has a defensive check here, so we will need to bump the offset by an extra one in this case. Let me cook up a PR.
Since matches are allowed to be up to and including at maxMatchOffset, we must offset the buffer by an additional element to prevent the first 4 bytes from matching an out-of-reach value after shiftOffsets has been called. We offset by `maxMatchOffset + 1` so offset 0 in the table will now fail the `if offset > maxMatchOffset` check in all cases.

Fixes golang#41420
Change-Id: I2f45faee4ef75a8b072af5e28d7e18b973f4bdac

Since matches are allowed to be up to and including at maxMatchOffset, we must offset the buffer by an additional element to prevent the first 4 bytes from matching an out-of-reach value after shiftOffsets has been called. We offset by `maxMatchOffset + 1` so offset 0 in the table will now fail the `if offset > maxMatchOffset` check in all cases.

Fixes golang#41420
Change-Id: Id6b054d9fd8adf2440d10490cb94962edac0bb02
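Going by that commit message, a paraphrased sketch of the adjusted shiftOffsets bookkeeping in deflatefast.go (reconstructed, not the verbatim patch; e.table maps hashes of 4-byte sequences to tableEntry offsets):

// shiftOffsets rebases every stored match offset downward so that e.cur
// can restart near zero without integer overflow. The "+ 1" is the fix:
// entries clamped to offset 0 now sit maxMatchOffset+1 behind the new
// e.cur, so the `offset > maxMatchOffset` distance check always rejects
// them instead of admitting a false 4-byte match.
func (e *deflateFast) shiftOffsets() {
	if len(e.prev) == 0 {
		// No history to match against; just clear the table.
		for i := range e.table[:] {
			e.table[i] = tableEntry{}
		}
		e.cur = maxMatchOffset + 1
		return
	}
	// Shift down everything in the table that isn't already too far away.
	for i := range e.table[:] {
		v := e.table[i].offset - e.cur + maxMatchOffset + 1
		if v < 0 {
			v = 0
		}
		e.table[i].offset = v
	}
	e.cur = maxMatchOffset + 1
}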
If someone has reproducers, please throw everything you have at #41477.

Change https://golang.org/cl/255879 mentions this issue.

@klauspost fixes mine from #41420 (comment).
Does anyone have a test case that can be included within the standard repo? It sounds like testing this requires a large (multi-GB) input, which presumably also takes considerable CPU time to deflate/reflate. So if the test needs to be disabled by default, that's fine. The test input needs to either be synthesized locally or downloaded as-needed from the network, though.

If the test could somehow change bufferReset, it could get away with a much smaller input. Also, maybe there's some input that deflates/inflates relatively fast?
Was AFK and preoccupied.

Changing bufferReset for the test: yeah, probably a reasonable thing (via shiftOffsets; reset clears the history). I will see if I can add a reasonable test with this.

As for input that deflates/inflates relatively fast: zeroes. Should also be doable. Encode a MB of semi-compressible content, then a bit more than 2GB of zeroes, then re-add some semi-compressible content that would find matches in the first part if the encoder were broken. However, decompressing 2GB will probably be by far the slowest part. I could see it taking several seconds, even with this.
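A self-contained sketch of an input with that shape (the sizes, seed, and zeros helper are illustrative choices, and this is not the test that landed in the CL; it may or may not trip the exact false match): it streams roughly 2.1GB through a zlib writer/reader pair and compares SHA-256 hashes, so a broken encoder shows up either as a checksum error or as a hash mismatch. Zeroes keep both the deflate and inflate sides fast.

package main

import (
	"bytes"
	"compress/zlib"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"math/rand"
)

// zeros is an io.Reader producing an endless stream of zero bytes,
// which deflate/inflate handle very quickly.
type zeros struct{}

func (zeros) Read(p []byte) (int, error) {
	for i := range p {
		p[i] = 0
	}
	return len(p), nil
}

func main() {
	// 1 MB of semi-compressible content: a small alphabet compresses,
	// but still seeds the encoder's match table with many entries.
	part := make([]byte, 1<<20)
	rng := rand.New(rand.NewSource(1))
	for i := range part {
		part[i] = byte(rng.Intn(16))
	}

	// part, then a bit more than 2 GiB of zeroes (pushing the encoder
	// past bufferReset), then part again: if the offset bookkeeping is
	// broken, the tail can falsely "match" the head.
	input := io.MultiReader(
		bytes.NewReader(part),
		io.LimitReader(zeros{}, 1<<31+1<<20),
		bytes.NewReader(part),
	)

	inHash := sha256.New()
	pr, pw := io.Pipe()
	go func() {
		zw, err := zlib.NewWriterLevel(pw, zlib.BestSpeed)
		if err == nil {
			// Hash the input as it is being compressed.
			_, err = io.Copy(zw, io.TeeReader(input, inHash))
		}
		if err == nil {
			err = zw.Close()
		}
		pw.CloseWithError(err)
	}()

	zr, err := zlib.NewReader(pr)
	if err != nil {
		panic(err)
	}
	outHash := sha256.New()
	if _, err := io.Copy(outHash, zr); err != nil {
		panic(err) // a broken encoder yields "zlib: invalid checksum" here
	}
	fmt.Println("in: ", hex.EncodeToString(inHash.Sum(nil)))
	fmt.Println("out:", hex.EncodeToString(outHash.Sum(nil)))
}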
PR/CL updated with a test that catches the regression and sanity-tests a few more things.

Friendly ping, as this issue has the Soon label. @mdempsky Are you able to take another look at the updated CL?

I commented on the CL. I think it's good, but I had a few questions to double-check that I understand the code correctly.

Change https://golang.org/cl/266177 mentions this issue.
The fastest compression mode can pick up a false match for every 2GB of input data, resulting in incorrectly decompressed data. Since matches are allowed to be up to and including at maxMatchOffset, we must offset the buffer by an additional element to prevent the first 4 bytes from matching an out-of-reach value after shiftOffsets has been called. We offset by `maxMatchOffset + 1` so offset 0 in the table will now fail the `if offset > maxMatchOffset` check in all cases.

Updates #41420.
Fixes #41463.

Change-Id: If1fbe01728e132b8a207e3f3f439edd832dcc710
GitHub-Last-Rev: 50fabab
GitHub-Pull-Request: #41477
Reviewed-on: https://go-review.googlesource.com/c/go/+/255879
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Joe Tsai <thebrokentoaster@gmail.com>
Trust: Matthew Dempsky <mdempsky@google.com>
(cherry picked from commit ab541a0)
Reviewed-on: https://go-review.googlesource.com/c/go/+/266177
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
Trust: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
What version of Go are you using (go version)?

(go version output omitted in the original report)

Does this issue reproduce with the latest release?

I'm able to repro this since go1.15, including the latest go1.15.2.

What operating system and processor architecture are you using (go env)?

(go env output collapsed in the original report)

What did you do?

We have compressed the same data using zlib.NewWriterLevel(writer, zlib.BestSpeed), and found that the output between go1.15 and go1.14.2 was different. This difference eventually led to data corruption when we uploaded the compressed file to GCS.

Here's the minimal reproducible example I have: (snippet collapsed in the original report)

The input data was too big to be shared (2.5G). But I can share the data internally (FYI, my LDAP is yekuang@).

What did you expect to see?

No difference in the compressed data between go1.14.2 and go1.15.

What did you see instead?

The compressed data were different. Sizes of the compressed data:

go1.14.2: 728571269
go1.15: 728570333

I also did a cmp, and they started to differ at byte 363266597.

Let me know if you need more information, thanks!