compress/gzip: reading through subsequent gzip members allocates excessively #6317

kortschak · 2013-09-04T00:02:26Z

What steps will reproduce the problem?
If possible, include a link to a program on play.golang.org.
1. curl
ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data/HG00096/alignment/HG00096.unmapped.ILLUMINA.bwa.GBR.low_coverage.20120522.bam
> blocked.gz
2. gunzip -c blocked.gz | gzip -c - > flat.gz
3. Using the program at http://play.golang.org/p/5j4GM-JUxl as a _test.go file, run:
   go test -gz.file flat.gz -run ^$ -bench . -benchmem > flat.text
   go test -gz.file blocked.gz -run ^$ -bench . -benchmem > blocked.text
4. benchcmp flat.text blocked.text

What is the expected output?
Similar levels of allocation.

What do you see instead?

Go1.1.2:

benchmark          old ns/op    new ns/op    delta
BenchmarkGunzip  10182915083  10557713783   +3.68%

benchmark         old allocs   new allocs    delta
BenchmarkGunzip        52474       186766  255.92%

benchmark          old bytes    new bytes    delta
BenchmarkGunzip      9004552    295581208  3182.58%


Go tip:

benchmark          old ns/op    new ns/op    delta
BenchmarkGunzip  10252001101  10576255141   +3.16%

benchmark         old allocs   new allocs    delta
BenchmarkGunzip        52422       186048  254.90%

benchmark          old bytes    new bytes    delta
BenchmarkGunzip      8896296    292322024  3185.88%


Which compiler are you using (5g, 6g, 8g, gccgo)?
6g

Which operating system are you using?
linux

Which version are you using?  (run 'go version')
go version go1.1.2 linux/amd64
go version devel +6adf94d423d6 Tue Sep 03 16:19:41 2013 -0400 linux/amd64

Please provide any additional information below.
The blocked.gz file is a gzip file constructed from multiple compressed gzip members,
each made from <64KB of input data (the spec for this format, BGZF, is on p12 of
http://samtools.sourceforge.net/SAMv1.pdf).

At the end of a gzip member, gzip.Reader creates a new flate.Reader in
(*Reader).readHeader() discarding the old one.

robpike · 2013-09-04T02:02:12Z

Comment 1:

Labels changed: added priority-later, performance, garbage, removed priority-triage.

Status changed to Accepted.

bradfitz · 2013-09-04T22:20:38Z

Comment 2:

It's pretty easy to reduce the allocations (see https://golang.org/cl/13416045
for maybe after Go 1.2) but the benchmark in particular doesn't spend much time in GC.
benchmark          old ns/op    new ns/op    delta
BenchmarkGunzip   8628248473   8509271514   -1.38%
benchmark         old allocs   new allocs    delta
BenchmarkGunzip       209831       172185  -17.94%
benchmark          old bytes    new bytes    delta
BenchmarkGunzip    579662000    300007152  -48.24%

kortschak · 2013-09-05T00:00:43Z

Comment 3:

Not good on the original comparison on my machine:
$ benchcmp flat.text blocked.text
benchmark          old ns/op    new ns/op    delta
BenchmarkGunzip  10170335203  10582966012   +4.06%
benchmark         old allocs   new allocs    delta
BenchmarkGunzip        52423       207223  295.29%
benchmark          old bytes    new bytes    delta
BenchmarkGunzip      8937184    579395272  6382.97%

bradfitz · 2013-09-05T01:27:58Z

Comment 4:

I don't know what you're comparing.
I compared blocked.text w/ Go tip vs blocked.test w/ my patch.
In any case, that was just a quick sketch and I made numbers go down. More could be
done.  I also didn't even touch compress/gzip to Close the flate.Reader in
Reader.readHeader, if z.decompressor != nil.  Shouldn't matter, though, if the parts are
read to EOF, which I believe they are.

kortschak · 2013-09-05T01:48:18Z

Comment 5:

The comparison I was doing is the original comparison in the report.
I agree you make the numbers go down in your bench; I'm confused though since they go up
in mine (repeating your comparison here):
$ benchcmp blocked-tip.text blocked-bfp.text
benchmark          old ns/op    new ns/op    delta
BenchmarkGunzip  10570728658  10487071213   -0.79%
benchmark         old allocs   new allocs    delta
BenchmarkGunzip       186042       207243   11.40%
benchmark          old bytes    new bytes    delta
BenchmarkGunzip    292321576    579397016   98.21%
I've repeated this a few times to make sure I'm not confusing the runs, and I'm pretty
confident I'm not (the numbers look reversed between yours and mine).

gopherbot · 2013-09-14T09:26:07Z

Comment 6 by adilh@google.com:

Dan, is it possible your GOMAXPROCS is set to >1 and Brad's is set to 1?  I was
investigating 4x slower gzip times for a server on tip vs 1.1.2 (darwin).  The 4x
regression occurs when GOMAXPROCS is == NumCPU().  If set to 1, tip is faster, but
slower with any other value.

kortschak · 2013-09-14T11:31:23Z

Comment 7:

I don't think so since the benchmarks are reporting GOMAXPROCS=1, but I'll repeat to
confirm (this is on an i7).
First go1.1.2 v tip and then tip v tip+bfp:
$ benchcmp go1.1.2.text tip-a6d1a3f0411a.text 
benchmark            old ns/op    new ns/op    delta
BenchmarkGunzip    10517583627  10523385408   +0.06%
BenchmarkGunzip-2  10612034671  10419333628   -1.82%
BenchmarkGunzip-4  10600087072  10532142350   -0.64%
BenchmarkGunzip-8  10730867496  10457371090   -2.55%
benchmark           old allocs   new allocs    delta
BenchmarkGunzip         186792       185860   -0.50%
BenchmarkGunzip-2       186604       186014   -0.32%
BenchmarkGunzip-4       186609       186231   -0.20%
BenchmarkGunzip-8       186562       186028   -0.29%
benchmark            old bytes    new bytes    delta
BenchmarkGunzip      295655608    292306872   -1.13%
BenchmarkGunzip-2    295196696    292362632   -0.96%
BenchmarkGunzip-4    294790328    292327336   -0.84%
BenchmarkGunzip-8    294335960    292415752   -0.65%
$ benchcmp tip-a6d1a3f0411a.text tip-a6d1a3f0411a+bfp.text 
benchmark            old ns/op    new ns/op    delta
BenchmarkGunzip    10523385408  10144445206   -3.60%
BenchmarkGunzip-2  10419333628  10228561271   -1.83%
BenchmarkGunzip-4  10532142350  10351262989   -1.72%
BenchmarkGunzip-8  10457371090  10527026324   +0.67%
benchmark           old allocs   new allocs    delta
BenchmarkGunzip         185860       207221   11.49%
BenchmarkGunzip-2       186014       207517   11.56%
BenchmarkGunzip-4       186231       207436   11.39%
BenchmarkGunzip-8       186028       207933   11.78%
benchmark            old bytes    new bytes    delta
BenchmarkGunzip      292306872    579395128   98.21%
BenchmarkGunzip-2    292362632    579410216   98.18%
BenchmarkGunzip-4    292327336    579403976   98.20%
BenchmarkGunzip-8    292415752    579551480   98.19%

rsc · 2013-11-27T18:44:57Z

Comment 8:

Labels changed: added go1.3maybe.

rsc · 2013-12-04T01:29:27Z

Comment 9:

Labels changed: added release-none, removed go1.3maybe.

rsc · 2013-12-04T01:47:02Z

Comment 10:

Labels changed: added repo-main.

gopherbot · 2014-05-08T05:49:45Z

Comment 11:

CL https://golang.org/cl/97140043 mentions this issue.

nigeltao · 2014-10-20T01:58:18Z

Comment 12:

This issue was closed by revision 193d09a.

Status changed to Fixed.

…s multiple buffers This adds a Reset() to compress/flate's decompressor and plumbs that through to compress/zlib and compress/gzip's Readers so callers can avoid large allocations when performing many inflate operations. In particular this preserves the allocation of the decompressor.hist buffer, which is 32kb and overwritten as needed while inflating. On the benchmark described in issue 6317, produces the following speedup on my 2.3ghz Intel Core i7 MBP with go version devel +6b696a34e0af Sun Aug 03 15:14:59 2014 -0700 darwin/amd64: blocked.text w/out patch vs blocked.text w/ patch: benchmark old ns/op new ns/op delta BenchmarkGunzip 8371577533 7927917687 -5.30% benchmark old allocs new allocs delta BenchmarkGunzip 176818 148519 -16.00% benchmark old bytes new bytes delta BenchmarkGunzip 292184936 12739528 -95.64% flat.text vs blocked.text w/patch: benchmark old ns/op new ns/op delta BenchmarkGunzip 7939447827 7927917687 -0.15% benchmark old allocs new allocs delta BenchmarkGunzip 90702 148519 +63.74% benchmark old bytes new bytes delta BenchmarkGunzip 9959528 12739528 +27.91% Similar speedups to those bradfitz saw in https://golang.org/cl/13416045. Fixes golang#6317. Fixes golang#7950. LGTM=nigeltao R=golang-codereviews, bradfitz, dan.kortschak, adg, nigeltao, jamesr CC=golang-codereviews https://golang.org/cl/97140043

kortschak added fixed Performance GarbageCollector labels Oct 20, 2014

golang locked and limited conversation to collaborators Jun 25, 2016

gopherbot added the FrozenDueToAge label Jun 25, 2016

This issue was closed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

compress/gzip: reading through subsequent gzip members allocates excessively #6317

compress/gzip: reading through subsequent gzip members allocates excessively #6317

kortschak commented Sep 4, 2013

robpike commented Sep 4, 2013

bradfitz commented Sep 4, 2013

kortschak commented Sep 5, 2013

bradfitz commented Sep 5, 2013

kortschak commented Sep 5, 2013

gopherbot commented Sep 14, 2013

kortschak commented Sep 14, 2013

rsc commented Nov 27, 2013

rsc commented Dec 4, 2013

rsc commented Dec 4, 2013

gopherbot commented May 8, 2014

nigeltao commented Oct 20, 2014

compress/gzip: reading through subsequent gzip members allocates excessively #6317

compress/gzip: reading through subsequent gzip members allocates excessively #6317

Comments

kortschak commented Sep 4, 2013

robpike commented Sep 4, 2013

bradfitz commented Sep 4, 2013

kortschak commented Sep 5, 2013

bradfitz commented Sep 5, 2013

kortschak commented Sep 5, 2013

gopherbot commented Sep 14, 2013

kortschak commented Sep 14, 2013

rsc commented Nov 27, 2013

rsc commented Dec 4, 2013

rsc commented Dec 4, 2013

gopherbot commented May 8, 2014

nigeltao commented Oct 20, 2014