compress/zlib: error due to compress/flate #11033

Closed
toqueteos opened this issue Jun 2, 2015 · 3 comments

@toqueteos

Probably related to #11030.

Running this produces an error:

package main

import (
    "bytes"
    "compress/zlib"
    "fmt"
    "io"
    "log"
)

var blob = []byte{
    0xa2, 0x02, 0x78, 0x9c, 0x05, 0x00, 0xb1, 0x09, 0xc0, 0x20,
    0xcc, 0xa5, 0xd0, 0x17, 0xdc, 0xbc, 0xa0, 0x44, 0x88, 0x1e,
    0xe1, 0x03, 0x4e, 0xc1, 0x49, 0xc8, 0x98, 0x0b, 0xbc, 0x41,
    0x3c, 0x2a, 0xb3, 0x5b, 0xe9, 0xdc, 0xb1, 0x47, 0x34, 0x44,
    0x80, 0x8c, 0x18, 0xb8, 0x5f, 0xc2, 0xe2, 0xe8, 0x48, 0x6d,
    0x3c, 0xf5, 0x3f, 0xcb, 0x5c, 0xe4, 0xdf, 0x4f, 0xb7, 0xde,
    0x06, 0xbb, 0xbd, 0x0e, 0x21, 0xb6, 0x02, 0x78, 0x9c, 0x1d,
    0xc8, 0x51, 0x0a, 0xc0, 0x20, 0x08, 0x00, 0xd0, 0x7f, 0x4f,
    0x21, 0xb2,
}

func main() {
    fmt.Printf("blob: % x\n", blob[2:])

    var buf = bytes.NewBuffer(blob[2:40])
    r, err := zlib.NewReader(buf)
    if err != nil {
        log.Fatal(err)
    }

    var out bytes.Buffer
    n, err := io.Copy(&out, r)
    if err != nil {
        log.Fatal(err)
    }

    r.Close()

    fmt.Printf("io.Copy buf->out, %d bytes\n", n)
    fmt.Printf("out=% x\n", out.Bytes())
    fmt.Printf("buf=% x\n", buf.Bytes())
}

Playground link

I'm using compress/zlib heavily to process git packfiles and this error happens occasionally.

Versions tested:

  • go version go1.3.3 windows/amd64
  • go version go1.4.2 windows/amd64
  • go version go1.4.2 linux/amd64
  • go version go1.4.2 darwin/amd64

It's implementation-related, because Python (the example is py3) handles the same data just fine:

>>> zlib.decompress(bytearray.fromhex("78 9c 05 00 b1 09 c0 20 cc a5 d0 17 dc bc a0 44 88 1e e1 03 4e c1 49 c8 98 0b bc 41 3c 2a b3 5b e9 dc b1 47 34 44 80 8c 18 b8 5f c2 e2 e8 48 6d 3c f5 3f cb 5c e4 df 4f b7 de 06 bb bd 0e 21"))
b'100644 he.php\x00]\x055_~\xd9W\xff\x08J\x90\x92]\x19\xe3\xeb\xc6\xd0\xc6\xd7'
@dsnet
Member

dsnet commented Jun 2, 2015

I have analyzed this data segment and confirmed that it is a duplicate of issue #11030.

For brevity, I isolated the raw DEFLATE stream: http://play.golang.org/p/KFEHgR2zal
Running the snippet, we see that it complains about an error before offset 36.

I decomposed the entire DEFLATE stream into the following (LSB on right):

[
    # Last, dynamic block
    "1", "10", 

    # HLIT: 257, HDIST: 1, HCLEN: 12
    "00000", "00000", "1000", 

    # HCLEN codes
    "000", "011", "011", "010", "000", "000", "000", "011", "000", "010", "000", "011", 

    # HLIT tree
    "10", "011", "001", "101", "00", "00", "101", "111", "0000101", "10", "011", "011", "10", "111", "0000010", "101", "00", "001", "10", "00", "00", "001", "10", "10", "111", "0001000", "10", "111", "0000001", "10", "011", "010", "001", "00", "10", "011", "010", "10", "00", "00", "001", "011", "100", "001", "111", "0000010", "10", "111", "0000110", "10", "00", "10", "111", "0101000", "001", "011", "110", "10", "011", "011", "101", "00", "101", "011", "110", "101", "011", "100", "101", "111", "0001000", "101", "001",

    # HDIST tree
    "00", 

    # Compressed data
    "10001", "0000", "0000", "11001", "1000", "1000", "00001", "1100", "11101", "010111", "0010", "1100", "0010", "01110", "0100", "000111", "01001", "01101", "00011", "001111", "10101", "111111", "100111", "00101", "10011", "01011", "0100", "11110", "101111", "011111", "1010", "11011", "1010", "110111", "0110"
]
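The first three groups of the decomposition above can be cross-checked against the raw bytes. The following is a minimal sketch, assuming the raw DEFLATE stream is the zlib data from the report with its two-byte `78 9c` header removed, so that it begins `05 00 b1 09`. The `bitReader`, `blockHeader`, and `parseHeader` names are made up for this illustration and are not part of `compress/flate`.

```go
package main

import "fmt"

// bitReader pulls bits out of a byte slice least-significant bit first,
// which is the order DEFLATE (RFC 1951) uses for its header fields.
type bitReader struct {
	data []byte
	pos  int // absolute bit offset into data
}

// read returns the next n bits as an integer; the first bit read becomes
// the least significant bit of the result.
func (r *bitReader) read(n int) int {
	v := 0
	for i := 0; i < n; i++ {
		b := (r.data[r.pos/8] >> uint(r.pos%8)) & 1
		v |= int(b) << uint(i)
		r.pos++
	}
	return v
}

// blockHeader holds the fields at the start of a dynamic-Huffman block.
type blockHeader struct {
	Final   bool
	Type    int
	NumLit  int // HLIT field + 257
	NumDist int // HDIST field + 1
	NumCLen int // HCLEN field + 4
}

// parseHeader decodes BFINAL, BTYPE, HLIT, HDIST and HCLEN from data.
func parseHeader(data []byte) blockHeader {
	r := &bitReader{data: data}
	var h blockHeader
	h.Final = r.read(1) == 1
	h.Type = r.read(2)
	h.NumLit = r.read(5) + 257
	h.NumDist = r.read(5) + 1
	h.NumCLen = r.read(4) + 4
	return h
}

func main() {
	// First four bytes of the raw DEFLATE stream (assumption stated above).
	raw := []byte{0x05, 0x00, 0xb1, 0x09}
	h := parseHeader(raw)
	fmt.Printf("final=%v type=%d HLIT=%d HDIST=%d HCLEN=%d\n",
		h.Final, h.Type, h.NumLit, h.NumDist, h.NumCLen)
}
```

This should report a final block of type 2 (dynamic Huffman) with HLIT 257, HDIST 1, and HCLEN 12, matching the comments in the decomposition.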

Parsing the HCLEN codes, we generate the following Huffman table (LSB on right):

00  => 0
10  => 5
001 => 4
101 => 6
011 => 17
111 => 18
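For readers who want to reproduce this table, here is a rough sketch of the canonical Huffman construction from RFC 1951 applied to the twelve HCLEN values listed above. The function and type names (`buildTable`, `streamOrder`, `entry`) are invented for the example; only the symbol order and the code assignment rule come from the RFC.

```go
package main

import "fmt"

// clenOrder is the order in which RFC 1951 stores the code lengths of
// the code-length alphabet.
var clenOrder = []int{16, 17, 18, 0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15}

// entry pairs a code, written with the first-transmitted bit on the
// right, with the symbol it decodes to.
type entry struct {
	Code string
	Sym  int
}

// streamOrder writes an n-bit canonical code so that its most
// significant bit (the first one sent) ends up as the rightmost character.
func streamOrder(code, n int) string {
	s := ""
	for i := 0; i < n; i++ {
		s += string(rune('0' + (code>>uint(i))&1))
	}
	return s
}

// buildTable assigns canonical codes to the symbols described by the
// HCLEN values, which are 3-bit lengths given in clenOrder.
func buildTable(hclen []int) []entry {
	var lengths [19]int
	for i, l := range hclen {
		lengths[clenOrder[i]] = l
	}
	var blCount, nextCode [8]int
	for _, l := range lengths {
		if l > 0 {
			blCount[l]++
		}
	}
	code := 0
	for bits := 1; bits < 8; bits++ {
		code = (code + blCount[bits-1]) << 1
		nextCode[bits] = code
	}
	var table []entry
	for sym, l := range lengths {
		if l == 0 {
			continue // symbol not used
		}
		table = append(table, entry{streamOrder(nextCode[l], l), sym})
		nextCode[l]++
	}
	return table
}

func main() {
	// The twelve HCLEN values from the decomposition, as integers.
	hclen := []int{0, 3, 3, 2, 0, 0, 0, 3, 0, 2, 0, 3}
	for _, e := range buildTable(hclen) {
		fmt.Printf("%-3s => %d\n", e.Code, e.Sym)
	}
}
```

The entries come out in symbol order rather than in the order shown above, but the code-to-symbol pairs should be the same six.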

Thus, we can see that the HDIST tree consists of a single code length of zero (since the '00' code maps to symbol 0). To further confirm this, note that the HDIST tree starts at bit offset 280. Converting to bytes, this is byte 35, which lies before the 36-byte offset mentioned in the error.
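The offset arithmetic can be spelled out in a few lines. This is only a sketch: the 280-bit figure is taken from the analysis as given, the fixed-size field widths come from RFC 1951, and `byteAndBit` is a helper invented for the example.

```go
package main

import "fmt"

// byteAndBit splits an absolute bit offset into a byte index and the
// bit position inside that byte.
func byteAndBit(bitOffset int) (byteIdx, bitIdx int) {
	return bitOffset / 8, bitOffset % 8
}

func main() {
	const (
		blockBits  = 1 + 2     // BFINAL + BTYPE
		countBits  = 5 + 5 + 4 // HLIT, HDIST, HCLEN fields
		clenBits   = 12 * 3    // twelve 3-bit HCLEN codes
		hdistStart = 280       // bit offset quoted in the analysis
	)
	fixed := blockBits + countBits + clenBits
	fmt.Println("bits before the HLIT tree:", fixed)
	fmt.Println("bits used by the HLIT tree:", hdistStart-fixed)

	byteIdx, bitIdx := byteAndBit(hdistStart)
	fmt.Printf("HDIST tree starts at byte %d, bit %d\n", byteIdx, bitIdx)
}
```

Since 280 is a multiple of 8, the HDIST tree begins exactly on the boundary of byte 35.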

EDIT (further analysis):

As a test, I modified the HDIST tree to be a full tree. The only changes were the number of symbols in the HDIST tree (from 1 to 16) and the HDIST tree itself, which now contains 16 symbols, each 4 bits in length. This was the only way to generate a full HDIST tree without modifying the HCLENs.

[
    # Last, dynamic block
    "1", "10", 

    #  HLIT: 257, HDIST: 16 (CHANGED), HCLEN: 12
    "00000", "01111", "1000", 

    # HCLEN codes
    "000", "011", "011", "010", "000", "000", "000", "011", "000", "010", "000", "011", 

    # HLIT tree
    "10", "011", "001", "101", "00", "00", "101", "111", "0000101", "10", "011", "011", "10", "111", "0000010", "101", "00", "001", "10", "00", "00", "001", "10", "10", "111", "0001000", "10", "111", "0000001", "10", "011", "010", "001", "00", "10", "011", "010", "10", "00", "00", "001", "011", "100", "001", "111", "0000010", "10", "111", "0000110", "10", "00", "10", "111", "0101000", "001", "011", "110", "10", "011", "011", "101", "00", "101", "011", "110", "101", "011", "100", "101", "111", "0001000", "101", "001",

    # HDIST tree (CHANGED)
    "001", "001", "001", "001", "001", "001", "001", "001", "001", "001", "001", "001", "001", "001", "001", "001",

    # Compressed data
    "10001", "0000", "0000", "11001", "1000", "1000", "00001", "1100", "11101", "010111", "0010", "1100", "0010", "01110", "0100", "000111", "01001", "01101", "00011", "001111", "10101", "111111", "100111", "00101", "10011", "01011", "0100", "11110", "101111", "011111", "1010", "11011", "1010", "110111", "0110"
]
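One way to read "full tree" is through the Kraft sum: a set of code lengths forms a complete prefix code when the sum of 2^-length over the used symbols is exactly 1. The sketch below checks the original and the modified HDIST lengths under that reading. It is an illustration only and makes no claim about the check Go's flate package actually performs; `kraftSlots` and `isFull` are names invented here.

```go
package main

import "fmt"

// maxBits is the longest Huffman code length DEFLATE allows.
const maxBits = 15

// kraftSlots returns the Kraft sum scaled by 2^maxBits. A length of
// zero means the symbol is unused and contributes nothing.
func kraftSlots(lengths []int) int {
	used := 0
	for _, l := range lengths {
		if l > 0 {
			used += 1 << uint(maxBits-l)
		}
	}
	return used
}

// isFull reports whether the scaled Kraft sum equals exactly 2^maxBits,
// that is, whether the lengths describe a complete prefix code.
func isFull(lengths []int) bool {
	return kraftSlots(lengths) == 1<<maxBits
}

func main() {
	original := []int{0} // HDIST: one symbol with code length 0

	modified := make([]int, 16) // HDIST: sixteen symbols of length 4
	for i := range modified {
		modified[i] = 4
	}

	fmt.Println("original full:", isFull(original))
	fmt.Println("modified full:", isFull(modified))
}
```

The original lengths give a sum of 0 (an empty tree), while sixteen codes of length 4 give 16/16 = 1.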

When composed into a byte stream, the modified data is now decoded properly by Go's flate library, further indicating that the HDIST tree is what causes the issue.

http://play.golang.org/p/e1pLj5NKjs
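For anyone who wants to redo the "compose into a byte stream" step, here is a minimal sketch of a packer for the token notation used in this comment. It assumes that the rightmost character of each token is the first bit in the stream ("LSB on right") and that bits fill each byte starting from its least significant bit. `packBits` is a hypothetical helper, not something from the playground snippets.

```go
package main

import "fmt"

// packBits concatenates bit-string tokens into bytes. Within a token
// the rightmost character is emitted first, and each output byte is
// filled from its least significant bit upward.
func packBits(tokens []string) []byte {
	var out []byte
	n := 0 // number of bits written so far
	for _, tok := range tokens {
		for i := len(tok) - 1; i >= 0; i-- {
			if n%8 == 0 {
				out = append(out, 0) // start a new byte
			}
			if tok[i] == '1' {
				out[n/8] |= 1 << uint(n%8)
			}
			n++
		}
	}
	return out
}

func main() {
	// Block header, HLIT/HDIST/HCLEN and the first five HCLEN codes of
	// the original (unmodified) stream: 32 bits in total.
	tokens := []string{
		"1", "10",
		"00000", "00000", "1000",
		"000", "011", "011", "010", "000",
	}
	fmt.Printf("% x\n", packBits(tokens)) // prints "05 00 b1 09"
}
```

Those four bytes match the start of the raw DEFLATE data in the original report (the bytes right after `78 9c`), which suggests the notation is being read correctly. Feeding the complete modified token list through the same function should yield the stream used in the playground link above.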

@ianlancetaylor
Contributor

Closing as duplicate.

@toqueteos
Author

Thank you very much for that amazingly deep analysis, @dsnet! I'm keeping track of #11030.

@golang golang locked and limited conversation to collaborators Jun 25, 2016