Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compress/gzip: reading through subsequent gzip members allocates excessively #6317

Closed
kortschak opened this issue Sep 4, 2013 · 12 comments
Closed

Comments

@kortschak
Copy link
Contributor

What steps will reproduce the problem?
If possible, include a link to a program on play.golang.org.
1. curl
ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data/HG00096/alignment/HG00096.unmapped.ILLUMINA.bwa.GBR.low_coverage.20120522.bam
> blocked.gz
2. gunzip -c blocked.gz | gzip -c - > flat.gz
3. Using the program at http://play.golang.org/p/5j4GM-JUxl as a _test.go file, run:
   go test -gz.file flat.gz -run ^$ -bench . -benchmem > flat.text
   go test -gz.file blocked.gz -run ^$ -bench . -benchmem > blocked.text
4. benchcmp flat.text blocked.text

What is the expected output?
Similar levels of allocation.

What do you see instead?

Go1.1.2:

benchmark          old ns/op    new ns/op    delta
BenchmarkGunzip  10182915083  10557713783   +3.68%

benchmark         old allocs   new allocs    delta
BenchmarkGunzip        52474       186766  255.92%

benchmark          old bytes    new bytes    delta
BenchmarkGunzip      9004552    295581208  3182.58%


Go tip:

benchmark          old ns/op    new ns/op    delta
BenchmarkGunzip  10252001101  10576255141   +3.16%

benchmark         old allocs   new allocs    delta
BenchmarkGunzip        52422       186048  254.90%

benchmark          old bytes    new bytes    delta
BenchmarkGunzip      8896296    292322024  3185.88%


Which compiler are you using (5g, 6g, 8g, gccgo)?
6g

Which operating system are you using?
linux

Which version are you using?  (run 'go version')
go version go1.1.2 linux/amd64
go version devel +6adf94d423d6 Tue Sep 03 16:19:41 2013 -0400 linux/amd64

Please provide any additional information below.
The blocked.gz file is a gzip file constructed from multiple compressed gzip members,
each made from <64KB of input data (the spec for this format, BGZF, is on p12 of
http://samtools.sourceforge.net/SAMv1.pdf).

At the end of a gzip member, gzip.Reader creates a new flate.Reader in
(*Reader).readHeader() discarding the old one.
@robpike
Copy link
Contributor

robpike commented Sep 4, 2013

Comment 1:

Labels changed: added priority-later, performance, garbage, removed priority-triage.

Status changed to Accepted.

@bradfitz
Copy link
Contributor

bradfitz commented Sep 4, 2013

Comment 2:

It's pretty easy to reduce the allocations (see https://golang.org/cl/13416045
for maybe after Go 1.2) but the benchmark in particular doesn't spend much time in GC.
benchmark          old ns/op    new ns/op    delta
BenchmarkGunzip   8628248473   8509271514   -1.38%
benchmark         old allocs   new allocs    delta
BenchmarkGunzip       209831       172185  -17.94%
benchmark          old bytes    new bytes    delta
BenchmarkGunzip    579662000    300007152  -48.24%

@kortschak
Copy link
Contributor Author

Comment 3:

Not good on the original comparison on my machine:
$ benchcmp flat.text blocked.text
benchmark          old ns/op    new ns/op    delta
BenchmarkGunzip  10170335203  10582966012   +4.06%
benchmark         old allocs   new allocs    delta
BenchmarkGunzip        52423       207223  295.29%
benchmark          old bytes    new bytes    delta
BenchmarkGunzip      8937184    579395272  6382.97%

@bradfitz
Copy link
Contributor

bradfitz commented Sep 5, 2013

Comment 4:

I don't know what you're comparing.
I compared blocked.text w/ Go tip vs blocked.test w/ my patch.
In any case, that was just a quick sketch and I made numbers go down. More could be
done.  I also didn't even touch compress/gzip to Close the flate.Reader in
Reader.readHeader, if z.decompressor != nil.  Shouldn't matter, though, if the parts are
read to EOF, which I believe they are.

@kortschak
Copy link
Contributor Author

Comment 5:

The comparison I was doing is the original comparison in the report.
I agree you make the numbers go down in your bench; I'm confused though since they go up
in mine (repeating your comparison here):
$ benchcmp blocked-tip.text blocked-bfp.text
benchmark          old ns/op    new ns/op    delta
BenchmarkGunzip  10570728658  10487071213   -0.79%
benchmark         old allocs   new allocs    delta
BenchmarkGunzip       186042       207243   11.40%
benchmark          old bytes    new bytes    delta
BenchmarkGunzip    292321576    579397016   98.21%
I've repeated this a few times to make sure I'm not confusing the runs, and I'm pretty
confident I'm not (the numbers look reversed between yours and mine).

@gopherbot
Copy link

Comment 6 by adilh@google.com:

Dan, is it possible your GOMAXPROCS is set to >1 and Brad's is set to 1?  I was
investigating 4x slower gzip times for a server on tip vs 1.1.2 (darwin).  The 4x
regression occurs when GOMAXPROCS is == NumCPU().  If set to 1, tip is faster, but
slower with any other value.

@kortschak
Copy link
Contributor Author

Comment 7:

I don't think so since the benchmarks are reporting GOMAXPROCS=1, but I'll repeat to
confirm (this is on an i7).
First go1.1.2 v tip and then tip v tip+bfp:
$ benchcmp go1.1.2.text tip-a6d1a3f0411a.text 
benchmark            old ns/op    new ns/op    delta
BenchmarkGunzip    10517583627  10523385408   +0.06%
BenchmarkGunzip-2  10612034671  10419333628   -1.82%
BenchmarkGunzip-4  10600087072  10532142350   -0.64%
BenchmarkGunzip-8  10730867496  10457371090   -2.55%
benchmark           old allocs   new allocs    delta
BenchmarkGunzip         186792       185860   -0.50%
BenchmarkGunzip-2       186604       186014   -0.32%
BenchmarkGunzip-4       186609       186231   -0.20%
BenchmarkGunzip-8       186562       186028   -0.29%
benchmark            old bytes    new bytes    delta
BenchmarkGunzip      295655608    292306872   -1.13%
BenchmarkGunzip-2    295196696    292362632   -0.96%
BenchmarkGunzip-4    294790328    292327336   -0.84%
BenchmarkGunzip-8    294335960    292415752   -0.65%
$ benchcmp tip-a6d1a3f0411a.text tip-a6d1a3f0411a+bfp.text 
benchmark            old ns/op    new ns/op    delta
BenchmarkGunzip    10523385408  10144445206   -3.60%
BenchmarkGunzip-2  10419333628  10228561271   -1.83%
BenchmarkGunzip-4  10532142350  10351262989   -1.72%
BenchmarkGunzip-8  10457371090  10527026324   +0.67%
benchmark           old allocs   new allocs    delta
BenchmarkGunzip         185860       207221   11.49%
BenchmarkGunzip-2       186014       207517   11.56%
BenchmarkGunzip-4       186231       207436   11.39%
BenchmarkGunzip-8       186028       207933   11.78%
benchmark            old bytes    new bytes    delta
BenchmarkGunzip      292306872    579395128   98.21%
BenchmarkGunzip-2    292362632    579410216   98.18%
BenchmarkGunzip-4    292327336    579403976   98.20%
BenchmarkGunzip-8    292415752    579551480   98.19%

@rsc
Copy link
Contributor

rsc commented Nov 27, 2013

Comment 8:

Labels changed: added go1.3maybe.

@rsc
Copy link
Contributor

rsc commented Dec 4, 2013

Comment 9:

Labels changed: added release-none, removed go1.3maybe.

@rsc
Copy link
Contributor

rsc commented Dec 4, 2013

Comment 10:

Labels changed: added repo-main.

@gopherbot
Copy link

Comment 11:

CL https://golang.org/cl/97140043 mentions this issue.

@nigeltao
Copy link
Contributor

Comment 12:

This issue was closed by revision 193d09a.

Status changed to Fixed.

@golang golang locked and limited conversation to collaborators Jun 25, 2016
wheatman pushed a commit to wheatman/go-akaros that referenced this issue Jun 25, 2018
…s multiple buffers

This adds a Reset() to compress/flate's decompressor and plumbs that through
to compress/zlib and compress/gzip's Readers so callers can avoid large
allocations when performing many inflate operations. In particular this
preserves the allocation of the decompressor.hist buffer, which is 32kb and
overwritten as needed while inflating.

On the benchmark described in issue 6317, produces the following speedup on
my 2.3ghz Intel Core i7 MBP with go version devel +6b696a34e0af Sun Aug 03
15:14:59 2014 -0700 darwin/amd64:

blocked.text w/out patch vs blocked.text w/ patch:
benchmark           old ns/op      new ns/op      delta
BenchmarkGunzip     8371577533     7927917687     -5.30%

benchmark           old allocs     new allocs     delta
BenchmarkGunzip     176818         148519         -16.00%

benchmark           old bytes     new bytes     delta
BenchmarkGunzip     292184936     12739528      -95.64%

flat.text vs blocked.text w/patch:
benchmark           old ns/op      new ns/op      delta
BenchmarkGunzip     7939447827     7927917687     -0.15%

benchmark           old allocs     new allocs     delta
BenchmarkGunzip     90702          148519         +63.74%

benchmark           old bytes     new bytes     delta
BenchmarkGunzip     9959528       12739528      +27.91%

Similar speedups to those bradfitz saw in  https://golang.org/cl/13416045.

Fixes golang#6317.
Fixes golang#7950.

LGTM=nigeltao
R=golang-codereviews, bradfitz, dan.kortschak, adg, nigeltao, jamesr
CC=golang-codereviews
https://golang.org/cl/97140043
wheatman pushed a commit to wheatman/go-akaros that referenced this issue Jun 26, 2018
…s multiple buffers

This adds a Reset() to compress/flate's decompressor and plumbs that through
to compress/zlib and compress/gzip's Readers so callers can avoid large
allocations when performing many inflate operations. In particular this
preserves the allocation of the decompressor.hist buffer, which is 32kb and
overwritten as needed while inflating.

On the benchmark described in issue 6317, produces the following speedup on
my 2.3ghz Intel Core i7 MBP with go version devel +6b696a34e0af Sun Aug 03
15:14:59 2014 -0700 darwin/amd64:

blocked.text w/out patch vs blocked.text w/ patch:
benchmark           old ns/op      new ns/op      delta
BenchmarkGunzip     8371577533     7927917687     -5.30%

benchmark           old allocs     new allocs     delta
BenchmarkGunzip     176818         148519         -16.00%

benchmark           old bytes     new bytes     delta
BenchmarkGunzip     292184936     12739528      -95.64%

flat.text vs blocked.text w/patch:
benchmark           old ns/op      new ns/op      delta
BenchmarkGunzip     7939447827     7927917687     -0.15%

benchmark           old allocs     new allocs     delta
BenchmarkGunzip     90702          148519         +63.74%

benchmark           old bytes     new bytes     delta
BenchmarkGunzip     9959528       12739528      +27.91%

Similar speedups to those bradfitz saw in  https://golang.org/cl/13416045.

Fixes golang#6317.
Fixes golang#7950.

LGTM=nigeltao
R=golang-codereviews, bradfitz, dan.kortschak, adg, nigeltao, jamesr
CC=golang-codereviews
https://golang.org/cl/97140043
wheatman pushed a commit to wheatman/go-akaros that referenced this issue Jul 9, 2018
…s multiple buffers

This adds a Reset() to compress/flate's decompressor and plumbs that through
to compress/zlib and compress/gzip's Readers so callers can avoid large
allocations when performing many inflate operations. In particular this
preserves the allocation of the decompressor.hist buffer, which is 32kb and
overwritten as needed while inflating.

On the benchmark described in issue 6317, produces the following speedup on
my 2.3ghz Intel Core i7 MBP with go version devel +6b696a34e0af Sun Aug 03
15:14:59 2014 -0700 darwin/amd64:

blocked.text w/out patch vs blocked.text w/ patch:
benchmark           old ns/op      new ns/op      delta
BenchmarkGunzip     8371577533     7927917687     -5.30%

benchmark           old allocs     new allocs     delta
BenchmarkGunzip     176818         148519         -16.00%

benchmark           old bytes     new bytes     delta
BenchmarkGunzip     292184936     12739528      -95.64%

flat.text vs blocked.text w/patch:
benchmark           old ns/op      new ns/op      delta
BenchmarkGunzip     7939447827     7927917687     -0.15%

benchmark           old allocs     new allocs     delta
BenchmarkGunzip     90702          148519         +63.74%

benchmark           old bytes     new bytes     delta
BenchmarkGunzip     9959528       12739528      +27.91%

Similar speedups to those bradfitz saw in  https://golang.org/cl/13416045.

Fixes golang#6317.
Fixes golang#7950.

LGTM=nigeltao
R=golang-codereviews, bradfitz, dan.kortschak, adg, nigeltao, jamesr
CC=golang-codereviews
https://golang.org/cl/97140043
wheatman pushed a commit to wheatman/go-akaros that referenced this issue Jul 30, 2018
…s multiple buffers

This adds a Reset() to compress/flate's decompressor and plumbs that through
to compress/zlib and compress/gzip's Readers so callers can avoid large
allocations when performing many inflate operations. In particular this
preserves the allocation of the decompressor.hist buffer, which is 32kb and
overwritten as needed while inflating.

On the benchmark described in issue 6317, produces the following speedup on
my 2.3ghz Intel Core i7 MBP with go version devel +6b696a34e0af Sun Aug 03
15:14:59 2014 -0700 darwin/amd64:

blocked.text w/out patch vs blocked.text w/ patch:
benchmark           old ns/op      new ns/op      delta
BenchmarkGunzip     8371577533     7927917687     -5.30%

benchmark           old allocs     new allocs     delta
BenchmarkGunzip     176818         148519         -16.00%

benchmark           old bytes     new bytes     delta
BenchmarkGunzip     292184936     12739528      -95.64%

flat.text vs blocked.text w/patch:
benchmark           old ns/op      new ns/op      delta
BenchmarkGunzip     7939447827     7927917687     -0.15%

benchmark           old allocs     new allocs     delta
BenchmarkGunzip     90702          148519         +63.74%

benchmark           old bytes     new bytes     delta
BenchmarkGunzip     9959528       12739528      +27.91%

Similar speedups to those bradfitz saw in  https://golang.org/cl/13416045.

Fixes golang#6317.
Fixes golang#7950.

LGTM=nigeltao
R=golang-codereviews, bradfitz, dan.kortschak, adg, nigeltao, jamesr
CC=golang-codereviews
https://golang.org/cl/97140043
This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

6 participants