Rietveld Code Review Tool
Help | Bug tracker | Discussion group | Source code | Sign in
(3444)

Issue 10006043: code review 10006043: compress/flate: reduce tiny allocs done by encoder. (Closed)

Can't Edit
Can't Publish+Mail
Start Review
Created:
10 years, 11 months ago by remyoudompheng
Modified:
10 years, 9 months ago
Reviewers:
r, imkrasin
CC:
imkrasin, golang-dev, bradfitz, r
Visibility:
Public.

Description

compress/flate: reduce tiny allocs done by encoder. benchmark old allocs new allocs delta BenchmarkEncodeDigitsSpeed1e4 942 91 -90.34% BenchmarkEncodeDigitsSpeed1e5 1919 178 -90.72% BenchmarkEncodeDigitsSpeed1e6 18539 1517 -91.82% BenchmarkEncodeDigitsDefault1e4 734 100 -86.38% BenchmarkEncodeDigitsDefault1e5 1958 193 -90.14% BenchmarkEncodeDigitsDefault1e6 17338 1426 -91.78% BenchmarkEncodeDigitsCompress1e4 734 100 -86.38% BenchmarkEncodeDigitsCompress1e5 1958 193 -90.14% BenchmarkEncodeDigitsCompress1e6 17338 1426 -91.78% BenchmarkEncodeTwainSpeed1e4 1865 109 -94.16% BenchmarkEncodeTwainSpeed1e5 3943 211 -94.65% BenchmarkEncodeTwainSpeed1e6 31279 1595 -94.90% BenchmarkEncodeTwainDefault1e4 1811 103 -94.31% BenchmarkEncodeTwainDefault1e5 3708 199 -94.63% BenchmarkEncodeTwainDefault1e6 26738 1330 -95.03% BenchmarkEncodeTwainCompress1e4 1811 103 -94.31% BenchmarkEncodeTwainCompress1e5 3693 190 -94.86% BenchmarkEncodeTwainCompress1e6 26902 1333 -95.04% benchmark old bytes new bytes delta BenchmarkEncodeDigitsSpeed1e4 1469438 1453920 -1.06% BenchmarkEncodeDigitsSpeed1e5 1490898 1458961 -2.14% BenchmarkEncodeDigitsSpeed1e6 1858819 1542407 -17.02% BenchmarkEncodeDigitsDefault1e4 1465903 1454160 -0.80% BenchmarkEncodeDigitsDefault1e5 1491841 1459361 -2.18% BenchmarkEncodeDigitsDefault1e6 1825424 1531545 -16.10% BenchmarkEncodeDigitsCompress1e4 1465903 1454160 -0.80% BenchmarkEncodeDigitsCompress1e5 1491681 1459361 -2.17% BenchmarkEncodeDigitsCompress1e6 1825424 1531545 -16.10% BenchmarkEncodeTwainSpeed1e4 1485308 1454400 -2.08% BenchmarkEncodeTwainSpeed1e5 1526065 1459878 -4.34% BenchmarkEncodeTwainSpeed1e6 2066627 1536296 -25.66% BenchmarkEncodeTwainDefault1e4 1484380 1454240 -2.03% BenchmarkEncodeTwainDefault1e5 1521793 1459558 -4.09% BenchmarkEncodeTwainDefault1e6 1977504 1523388 -22.96% BenchmarkEncodeTwainCompress1e4 1484380 1454240 -2.03% BenchmarkEncodeTwainCompress1e5 1521457 1459318 -4.08% BenchmarkEncodeTwainCompress1e6 1980000 1523609 -23.05% benchmark old ns/op new ns/op delta BenchmarkEncodeDigitsSpeed1e4 1472128 1384343 -5.96% BenchmarkEncodeDigitsSpeed1e5 8283663 8112304 -2.07% BenchmarkEncodeDigitsSpeed1e6 77459311 76364216 -1.41% BenchmarkEncodeDigitsDefault1e4 1813090 1746552 -3.67% BenchmarkEncodeDigitsDefault1e5 26221292 26052516 -0.64% BenchmarkEncodeDigitsDefault1e6 286512472 286099039 -0.14% BenchmarkEncodeDigitsCompress1e4 1809373 1747230 -3.43% BenchmarkEncodeDigitsCompress1e5 26231580 26038456 -0.74% BenchmarkEncodeDigitsCompress1e6 286140002 286025372 -0.04% BenchmarkEncodeTwainSpeed1e4 1594094 1438600 -9.75% BenchmarkEncodeTwainSpeed1e5 7669724 7316288 -4.61% BenchmarkEncodeTwainSpeed1e6 68731353 65938994 -4.06% BenchmarkEncodeTwainDefault1e4 2063497 1866488 -9.55% BenchmarkEncodeTwainDefault1e5 22602689 22221377 -1.69% BenchmarkEncodeTwainDefault1e6 233376842 232114297 -0.54% BenchmarkEncodeTwainCompress1e4 2062441 1949676 -5.47% BenchmarkEncodeTwainCompress1e5 28264344 27930627 -1.18% BenchmarkEncodeTwainCompress1e6 304369641 303704330 -0.22% benchmark old MB/s new MB/s speedup BenchmarkEncodeDigitsSpeed1e4 6.79 7.22 1.06x BenchmarkEncodeDigitsSpeed1e5 12.07 12.33 1.02x BenchmarkEncodeDigitsSpeed1e6 12.91 13.10 1.01x BenchmarkEncodeDigitsDefault1e4 5.52 5.73 1.04x BenchmarkEncodeDigitsDefault1e5 3.81 3.84 1.01x BenchmarkEncodeDigitsDefault1e6 3.49 3.50 1.00x BenchmarkEncodeDigitsCompress1e4 5.53 5.72 1.03x BenchmarkEncodeDigitsCompress1e5 3.81 3.84 1.01x BenchmarkEncodeDigitsCompress1e6 3.49 3.50 1.00x BenchmarkEncodeTwainSpeed1e4 6.27 6.95 1.11x BenchmarkEncodeTwainSpeed1e5 13.04 13.67 1.05x BenchmarkEncodeTwainSpeed1e6 14.55 15.17 1.04x BenchmarkEncodeTwainDefault1e4 4.85 5.36 1.11x BenchmarkEncodeTwainDefault1e5 4.42 4.50 1.02x BenchmarkEncodeTwainDefault1e6 4.28 4.31 1.01x BenchmarkEncodeTwainCompress1e4 4.85 5.13 1.06x BenchmarkEncodeTwainCompress1e5 3.54 3.58 1.01x BenchmarkEncodeTwainCompress1e6 3.29 3.29 1.00x

Patch Set 1 #

Patch Set 2 : diff -r 1c764773c6ce https://go.googlecode.com/hg/ #

Patch Set 3 : diff -r 1c764773c6ce https://go.googlecode.com/hg/ #

Total comments: 4

Patch Set 4 : diff -r be6c9de7b802 https://go.googlecode.com/hg/ #

Unified diffs Side-by-side diffs Delta from patch set Stats (+45 lines, -47 lines) Patch
M src/pkg/compress/flate/huffman_code.go View 1 2 6 chunks +45 lines, -47 lines 0 comments Download

Messages

Total messages: 14
remyoudompheng
Hello imkrasin@gmail.com, golang-dev@googlegroups.com (cc: golang-dev@googlegroups.com), I'd like you to review this change to https://go.googlecode.com/hg/
10 years, 11 months ago (2013-06-04 15:05:21 UTC) #1
remyoudompheng
This CL replaces CL 5573076. It produces even less garbage by totally eliminating the notion ...
10 years, 11 months ago (2013-06-04 15:08:41 UTC) #2
bradfitz
How's performance? I assume unchanged if you didn't cite that part of benchcmp? On Tue, ...
10 years, 11 months ago (2013-06-04 15:15:36 UTC) #3
remyoudompheng
I obtained the following benchmarks. Pooling compressors would eliminate the remaining allocations. benchmark old ns/op ...
10 years, 11 months ago (2013-06-04 15:27:21 UTC) #4
imkrasin
Hi Remi, a couple of questions. 1. How do I reproduce this benchmark results? I ...
10 years, 11 months ago (2013-06-04 15:48:41 UTC) #5
bradfitz
You don't need autobench. You just save the output of "go test -test=none -benchmark=. compress/flate" ...
10 years, 11 months ago (2013-06-04 16:14:06 UTC) #6
imkrasin
https://codereview.appspot.com/10006043/diff/1002/src/pkg/compress/flate/huffman_code.go File src/pkg/compress/flate/huffman_code.go (right): https://codereview.appspot.com/10006043/diff/1002/src/pkg/compress/flate/huffman_code.go#newcode194 src/pkg/compress/flate/huffman_code.go:194: copy(leafCounts[level][:level], leafCounts[level-1][:level]) I fail to understand how this copy ...
10 years, 11 months ago (2013-06-06 01:35:10 UTC) #7
bradfitz
R=nigeltao
10 years, 10 months ago (2013-06-17 20:06:55 UTC) #8
remyoudompheng
ping? is anybody interested?
10 years, 9 months ago (2013-07-24 06:53:50 UTC) #9
r
seems ok but there is one outstanding comment
10 years, 9 months ago (2013-07-24 08:06:00 UTC) #10
remyoudompheng
https://codereview.appspot.com/10006043/diff/1002/src/pkg/compress/flate/huffman_code.go File src/pkg/compress/flate/huffman_code.go (right): https://codereview.appspot.com/10006043/diff/1002/src/pkg/compress/flate/huffman_code.go#newcode194 src/pkg/compress/flate/huffman_code.go:194: copy(leafCounts[level][:level], leafCounts[level-1][:level]) On 2013/06/06 01:35:10, imkrasin wrote: > I ...
10 years, 9 months ago (2013-07-27 21:16:31 UTC) #11
r
LGTM
10 years, 9 months ago (2013-07-27 21:22:14 UTC) #12
imkrasin
https://codereview.appspot.com/10006043/diff/1002/src/pkg/compress/flate/huffman_code.go File src/pkg/compress/flate/huffman_code.go (right): https://codereview.appspot.com/10006043/diff/1002/src/pkg/compress/flate/huffman_code.go#newcode194 src/pkg/compress/flate/huffman_code.go:194: copy(leafCounts[level][:level], leafCounts[level-1][:level]) Thanks for the explanation. Also, I like ...
10 years, 9 months ago (2013-07-27 21:23:46 UTC) #13
remyoudompheng
10 years, 9 months ago (2013-07-28 07:43:16 UTC) #14
*** Submitted as https://code.google.com/p/go/source/detail?r=2fe813f4f3c2 ***

compress/flate: reduce tiny allocs done by encoder.

benchmark                          old allocs   new allocs    delta
BenchmarkEncodeDigitsSpeed1e4             942           91  -90.34%
BenchmarkEncodeDigitsSpeed1e5            1919          178  -90.72%
BenchmarkEncodeDigitsSpeed1e6           18539         1517  -91.82%
BenchmarkEncodeDigitsDefault1e4           734          100  -86.38%
BenchmarkEncodeDigitsDefault1e5          1958          193  -90.14%
BenchmarkEncodeDigitsDefault1e6         17338         1426  -91.78%
BenchmarkEncodeDigitsCompress1e4          734          100  -86.38%
BenchmarkEncodeDigitsCompress1e5         1958          193  -90.14%
BenchmarkEncodeDigitsCompress1e6        17338         1426  -91.78%
BenchmarkEncodeTwainSpeed1e4             1865          109  -94.16%
BenchmarkEncodeTwainSpeed1e5             3943          211  -94.65%
BenchmarkEncodeTwainSpeed1e6            31279         1595  -94.90%
BenchmarkEncodeTwainDefault1e4           1811          103  -94.31%
BenchmarkEncodeTwainDefault1e5           3708          199  -94.63%
BenchmarkEncodeTwainDefault1e6          26738         1330  -95.03%
BenchmarkEncodeTwainCompress1e4          1811          103  -94.31%
BenchmarkEncodeTwainCompress1e5          3693          190  -94.86%
BenchmarkEncodeTwainCompress1e6         26902         1333  -95.04%

benchmark                           old bytes    new bytes    delta
BenchmarkEncodeDigitsSpeed1e4         1469438      1453920   -1.06%
BenchmarkEncodeDigitsSpeed1e5         1490898      1458961   -2.14%
BenchmarkEncodeDigitsSpeed1e6         1858819      1542407  -17.02%
BenchmarkEncodeDigitsDefault1e4       1465903      1454160   -0.80%
BenchmarkEncodeDigitsDefault1e5       1491841      1459361   -2.18%
BenchmarkEncodeDigitsDefault1e6       1825424      1531545  -16.10%
BenchmarkEncodeDigitsCompress1e4      1465903      1454160   -0.80%
BenchmarkEncodeDigitsCompress1e5      1491681      1459361   -2.17%
BenchmarkEncodeDigitsCompress1e6      1825424      1531545  -16.10%
BenchmarkEncodeTwainSpeed1e4          1485308      1454400   -2.08%
BenchmarkEncodeTwainSpeed1e5          1526065      1459878   -4.34%
BenchmarkEncodeTwainSpeed1e6          2066627      1536296  -25.66%
BenchmarkEncodeTwainDefault1e4        1484380      1454240   -2.03%
BenchmarkEncodeTwainDefault1e5        1521793      1459558   -4.09%
BenchmarkEncodeTwainDefault1e6        1977504      1523388  -22.96%
BenchmarkEncodeTwainCompress1e4       1484380      1454240   -2.03%
BenchmarkEncodeTwainCompress1e5       1521457      1459318   -4.08%
BenchmarkEncodeTwainCompress1e6       1980000      1523609  -23.05%

benchmark                           old ns/op    new ns/op    delta
BenchmarkEncodeDigitsSpeed1e4         1472128      1384343   -5.96%
BenchmarkEncodeDigitsSpeed1e5         8283663      8112304   -2.07%
BenchmarkEncodeDigitsSpeed1e6        77459311     76364216   -1.41%
BenchmarkEncodeDigitsDefault1e4       1813090      1746552   -3.67%
BenchmarkEncodeDigitsDefault1e5      26221292     26052516   -0.64%
BenchmarkEncodeDigitsDefault1e6     286512472    286099039   -0.14%
BenchmarkEncodeDigitsCompress1e4      1809373      1747230   -3.43%
BenchmarkEncodeDigitsCompress1e5     26231580     26038456   -0.74%
BenchmarkEncodeDigitsCompress1e6    286140002    286025372   -0.04%
BenchmarkEncodeTwainSpeed1e4          1594094      1438600   -9.75%
BenchmarkEncodeTwainSpeed1e5          7669724      7316288   -4.61%
BenchmarkEncodeTwainSpeed1e6         68731353     65938994   -4.06%
BenchmarkEncodeTwainDefault1e4        2063497      1866488   -9.55%
BenchmarkEncodeTwainDefault1e5       22602689     22221377   -1.69%
BenchmarkEncodeTwainDefault1e6      233376842    232114297   -0.54%
BenchmarkEncodeTwainCompress1e4       2062441      1949676   -5.47%
BenchmarkEncodeTwainCompress1e5      28264344     27930627   -1.18%
BenchmarkEncodeTwainCompress1e6     304369641    303704330   -0.22%

benchmark                            old MB/s     new MB/s  speedup
BenchmarkEncodeDigitsSpeed1e4            6.79         7.22    1.06x
BenchmarkEncodeDigitsSpeed1e5           12.07        12.33    1.02x
BenchmarkEncodeDigitsSpeed1e6           12.91        13.10    1.01x
BenchmarkEncodeDigitsDefault1e4          5.52         5.73    1.04x
BenchmarkEncodeDigitsDefault1e5          3.81         3.84    1.01x
BenchmarkEncodeDigitsDefault1e6          3.49         3.50    1.00x
BenchmarkEncodeDigitsCompress1e4         5.53         5.72    1.03x
BenchmarkEncodeDigitsCompress1e5         3.81         3.84    1.01x
BenchmarkEncodeDigitsCompress1e6         3.49         3.50    1.00x
BenchmarkEncodeTwainSpeed1e4             6.27         6.95    1.11x
BenchmarkEncodeTwainSpeed1e5            13.04        13.67    1.05x
BenchmarkEncodeTwainSpeed1e6            14.55        15.17    1.04x
BenchmarkEncodeTwainDefault1e4           4.85         5.36    1.11x
BenchmarkEncodeTwainDefault1e5           4.42         4.50    1.02x
BenchmarkEncodeTwainDefault1e6           4.28         4.31    1.01x
BenchmarkEncodeTwainCompress1e4          4.85         5.13    1.06x
BenchmarkEncodeTwainCompress1e5          3.54         3.58    1.01x
BenchmarkEncodeTwainCompress1e6          3.29         3.29    1.00x

R=imkrasin, golang-dev, bradfitz, r
CC=golang-dev
https://codereview.appspot.com/10006043

https://codereview.appspot.com/10006043/diff/1002/src/pkg/compress/flate/huff...
File src/pkg/compress/flate/huffman_code.go (right):

https://codereview.appspot.com/10006043/diff/1002/src/pkg/compress/flate/huff...
src/pkg/compress/flate/huffman_code.go:194: copy(leafCounts[level][:level],
leafCounts[level-1][:level])
On 2013/07/27 21:23:47, imkrasin wrote:
> Thanks for the explanation.
> 
> Also, I like that you have included the memory profiler stats to the CL
> description. Could you please do the same for CPU profile? 
> 
> Otherwise, LGTM.

Done.
Sign in to reply to this message.

Powered by Google App Engine
RSS Feeds Recent Issues | This issue
This is Rietveld f62528b