cmd/compile: t*10 seems slower than t<<3+t+t #2671

griesemer · 2012-01-09T19:44:01Z

The code generated by 6g for multiplication with a constant (t*10) appears to run slower
then a "manual" multiplication (t<<3+t+t). Specifically, for math/big,
revision 78944b6b971c, changing line nat.go:869 from:

s[i] = charset[r-t*10]

to:

s[i] = charset[r-t<<3-t-t]

leads to a significant (> 10%) performance improvement for some code:

gotest -bench=String1000*Base10

benchmark                          old ns/op    new ns/op    delta
big.BenchmarkString100Base10            1909         1855   -2.83%
big.BenchmarkString1000Base10          12867        12243   -4.85%
big.BenchmarkString10000Base10         65974        58307  -11.62%
big.BenchmarkString100000Base10     19938530     19831100   -0.54%

The compiler should be able to do the right thing here w/o manual intervention.

rsc · 2012-09-12T21:42:49Z

Comment 2:

Labels changed: added go1.1maybe.

davecheney · 2012-11-07T04:59:45Z

Comment 3:

@gri: are you still able to reproduce this issue.
I rolled back the change to nat.go and confirmed that on 6g we get 
                                s[i] = charset[r-t*10]
  43511d:       48 89 f5                mov    %rsi,%rbp
  435120:       48 89 b4 24 d0 00 00    mov    %rsi,0xd0(%rsp)
  435127:       00 
  435128:       4c 39 f6                cmp    %r14,%rsi
  43512b:       73 43                   jae    435170 <math/big.nat.convertWords+0x580>
  43512d:       48 8d 1c 2b             lea    (%rbx,%rbp,1),%rbx
  435131:       48 89 c5                mov    %rax,%rbp
  435134:       48 6b ed 0a             imul   $0xa,%rbp,%rbp
but on my cpu, SNB core i5 there isn't much in it
benchmark                      old ns/op    new ns/op    delta
BenchmarkString100Base10            1318         1317   -0.08%
BenchmarkString1000Base10           7262         7388   +1.74%
BenchmarkString10000Base10         29226        29257   +0.11%
BenchmarkString100000Base10      9287810      9271090   -0.18%

rsc · 2012-11-07T14:41:06Z

Comment 4:

This was originally on a 2008-era Mac Pro, whatever chip they used.

griesemer · 2012-11-07T23:51:17Z

Comment 5:

I am still seeing a bit of a slowdown:
benchmark                      old ns/op    new ns/op    delta
BenchmarkString100Base10            1662         1695   +1.99%
BenchmarkString1000Base10           9732         9838   +1.09%
BenchmarkString10000Base10         42616        43845   +2.88%
BenchmarkString100000Base10     14408500     15512376   +7.66%
The new numbers are with using t*10 instead of the existing code. With 6g, the generated
code still looks like before as far as I can tell:
5420 (nat.go:887) MOVQ    SI,BP
5421 (nat.go:887) MOVQ    SI,i+-32(SP)
5422 (nat.go:887) CMPQ    SI,R14
5423 (nat.go:887) JCS     $1,5426
5424 (nat.go:887) CALL    ,runtime.panicindex+0(SB)
5425 (nat.go:887) UNDEF   ,
5426 (nat.go:887) LEAQ    (BX)(BP*1),BX
5427 (nat.go:887) MOVQ    AX,BP
5428 (nat.go:887) IMULQ   $10,BP   <<<<<<<<<< not using
shifts and adds for *10
5429 (nat.go:887) MOVQ    DI,R8
5430 (nat.go:887) SUBQ    BP,R8
5431 (nat.go:887) MOVQ    R8,BP
5432 (nat.go:887) MOVQ    charset+48(FP),R8
5433 (nat.go:887) MOVQ    charset+56(FP),R9
5434 (nat.go:887) CMPQ    BP,R9
5435 (nat.go:887) JCS     $1,5438
5436 (nat.go:887) CALL    ,runtime.panicindex+0(SB)
5437 (nat.go:887) UNDEF   ,
5438 (nat.go:887) LEAQ    (R8)(BP*1),R8
5439 (nat.go:887) MOVBQZX (R8),BP
5440 (nat.go:887) MOVB    BP,(BX)
I prefer to leave this open and no make changes to nat.go for a bit longer. For one, the
code is working fine as is and appears to be faster for large strings.

davecheney · 2012-11-07T23:52:57Z

Comment 6:

Understood. I will add a benchmark to the micro-benchmarking package
for this case.

robpike · 2013-03-07T22:10:03Z

Comment 7:

Labels changed: removed go1.1maybe.

minux · 2013-03-09T12:42:58Z

Comment 8:

Issue #4199 has been merged into this issue.

rsc · 2013-03-12T20:18:47Z

Comment 10:

[The time for maybe has passed.]

rsc · 2013-07-30T14:28:40Z

Comment 11:

Labels changed: added go1.3.

robpike · 2013-08-20T23:36:15Z

Comment 12:

Labels changed: removed go1.3.

rsc · 2013-11-27T18:50:30Z

Comment 13:

Labels changed: added go1.3maybe.

rsc · 2013-12-04T01:27:52Z

Comment 14:

Labels changed: added release-none, removed go1.3maybe.

rsc · 2013-12-04T01:43:48Z

Comment 15:

Labels changed: added repo-main.

gopherbot · 2016-05-24T07:00:22Z

CL https://golang.org/cl/23372 mentions this issue.

griesemer added accepted Performance labels Dec 4, 2013

griesemer assigned rsc Dec 4, 2013

minux mentioned this issue Dec 8, 2014

cmd/6g: investigate x*5 => x<<2 + x #4199

Closed

rsc added this to the Unplanned milestone Apr 10, 2015

rsc removed priority-later labels Apr 10, 2015

rsc changed the title ~~cmd/gc: t*10 seems slower than t<<3+t+t~~ cmd/compile: t*10 seems slower than t<<3+t+t Jun 8, 2015

gopherbot closed this as completed in ab44147 May 24, 2016

golang locked and limited conversation to collaborators May 24, 2017

gopherbot added the FrozenDueToAge label May 24, 2017

rsc removed their assignment Jun 22, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cmd/compile: t*10 seems slower than t<<3+t+t #2671

cmd/compile: t*10 seems slower than t<<3+t+t #2671

griesemer commented Jan 9, 2012

rsc commented Sep 12, 2012

davecheney commented Nov 7, 2012

rsc commented Nov 7, 2012

griesemer commented Nov 7, 2012

davecheney commented Nov 7, 2012

robpike commented Mar 7, 2013

minux commented Mar 9, 2013

rsc commented Mar 12, 2013

rsc commented Jul 30, 2013

robpike commented Aug 20, 2013

rsc commented Nov 27, 2013

rsc commented Dec 4, 2013

rsc commented Dec 4, 2013

gopherbot commented May 24, 2016

cmd/compile: t*10 seems slower than t<<3+t+t #2671

cmd/compile: t*10 seems slower than t<<3+t+t #2671

Comments

griesemer commented Jan 9, 2012

rsc commented Sep 12, 2012

davecheney commented Nov 7, 2012

rsc commented Nov 7, 2012

griesemer commented Nov 7, 2012

davecheney commented Nov 7, 2012

robpike commented Mar 7, 2013

minux commented Mar 9, 2013

rsc commented Mar 12, 2013

rsc commented Jul 30, 2013

robpike commented Aug 20, 2013

rsc commented Nov 27, 2013

rsc commented Dec 4, 2013

rsc commented Dec 4, 2013

gopherbot commented May 24, 2016