Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: too many LEAQ/LEAL instructions on amd64, 386 #1914

Closed
rsc opened this issue Jun 3, 2011 · 24 comments
Closed

cmd/compile: too many LEAQ/LEAL instructions on amd64, 386 #1914

rsc opened this issue Jun 3, 2011 · 24 comments

Comments

@rsc
Copy link
Contributor

rsc commented Jun 3, 2011

More can and should be gotten rid of.
They are making the range loop here less
efficient than the explicit for loop in 8g.
There are some lingering in 6g too, but fewer.

package main

import (
    "fmt"
    "testing"
)

var siz = 1024 * 1024
var buf []byte

func init() {
    buf = make([]byte, siz)
    for i := range buf {
        buf[i] = byte(i & 0xFF)
    }
}

func BenchmarkRange(b *testing.B) {
    n := 0
    for i := 0; i < b.N; i++ {
        for _, c := range buf {
            if c == 0 {
                n++
            }
        }
    }
}

func BenchmarkLoop(b *testing.B) {
    n := 0
    for i := 0; i < b.N; i++ {
        for j := 0; j < len(buf); j++ {
            if buf[j] == 0 {
                n++
            }
        }
    }
}

func main() {
    fmt.Println(testing.Benchmark(BenchmarkRange))
    fmt.Println(testing.Benchmark(BenchmarkLoop))
}
@gopherbot
Copy link

Comment 1 by qyzhai:

zhai@zhai$ objdump -d $GOROOT/bin/godoc | awk -F'\t' 'BEGIN{OFMT="%5.2f"}/^
/{s=substr($3,1,7);if (s!="") map[s]++;n++ }END{for (s in map) print
map[s]"\t"map[s]/n*100"  "s; print n}' | sort -n 
4647    1.26883  sub    
4981    1.36002  ja     
6279    1.71443  ret    
7194    1.96426  xor    
13563   3.70327  movl   
15572   4.25181  add    
16650   4.54615  cmp    
20891   5.70412  cld       // too many cld  
24779   6.76571  call   
45227   12.3489  lea    
46036   12.5698  movsl  
109359  29.8596  mov    
366244

@lvdlvd
Copy link

lvdlvd commented Nov 7, 2011

Comment 2:

Labels changed: added compilerbug.

@rsc
Copy link
Contributor Author

rsc commented Dec 9, 2011

Comment 3:

Labels changed: added priority-later, removed priority-medium.

@remyoudompheng
Copy link
Contributor

Comment 5:

I can't reproduce this. Is it still valid?

@rsc
Copy link
Contributor Author

rsc commented Apr 13, 2012

Comment 6:

I think there are still some lingering.  Try linking with 6l -a and grep the output for
LEAQ.*SP.
Most LEAQ n(SP), R (for any register R) are indicative of a problem to be fixed.  The
most common pattern is
LEAQ 10(SP), AX
MOVQ x, 0(AX)
MOVQ y, 4(AX)
and also with the moves going the other way.  These should just use 10(SP) and 14(SP)
and avoid the LEAQ, which will registerize better.

@remyoudompheng
Copy link
Contributor

Comment 7:

See http://golang.org/cl/6489067/ for a step towards resolution of this.

@remyoudompheng
Copy link
Contributor

Comment 8:

After apply CL 6489067, 6493099, 6494107, 6501110 on today's tip
benchmark                      old ns/op    new ns/op    delta
BenchmarkCodeEncoder           118455100     99560900  -15.95%
BenchmarkCodeMarshal           123140450    102622900  -16.66%
BenchmarkCodeDecoder           360685800    355718200   -1.38%
BenchmarkCodeUnmarshal         516606800    354142200  -31.45%
BenchmarkCodeUnmarshalReuse    506730200    346208400  -31.68%
BenchmarkSkipValue              33405060     30414140   -8.95%
Notable frame size reductions:
src/pkg/fmt/print.go:851) TEXT    (*pp).printReflectValue+0(SB),$2224-56
src/pkg/fmt/print.go:851) TEXT    (*pp).printReflectValue+0(SB),$1400-56
src/pkg/encoding/json/decode.go:616) TEXT    (*decodeState).literalStore+0(SB),$960-56
src/pkg/encoding/json/decode.go:616) TEXT    (*decodeState).literalStore+0(SB),$632-56
src/pkg/reflect/value.go:350) TEXT    Value.call+0(SB),$1024-72
src/pkg/reflect/value.go:350) TEXT    Value.call+0(SB),$792-72

@nigeltao
Copy link
Contributor

nigeltao commented Sep 9, 2012

Comment 9:

Those encoding/json benchmark deltas are delicious! What's the before/after for
$GOROOT/test/bench/go1?

@davecheney
Copy link
Contributor

Comment 10:

reflect.Value.call used to consume a kilobyte of stack. wow. 
Fantastic work Rémy!

@remyoudompheng
Copy link
Contributor

Comment 11:

There's not really a difference in the shootout. I think the json benchmark differences
are entirely caused by the displacement of stack splits. The go1 benchmark suite has
very random results:
benchmark                 old ns/op    new ns/op    delta
BenchmarkBinaryTree17    7915835000   7861939000   -0.68%
BenchmarkFannkuch11      4950781000   4949182000   -0.03%
BenchmarkGobDecode         28233020     26061040   -7.69%
BenchmarkGobEncode         17074820     28036010  +64.20%
BenchmarkGzip             737569600    743740200   +0.84%
BenchmarkGunzip           209303500    204007200   -2.53%
BenchmarkJSONEncode       120488400     98826400  -17.98%
BenchmarkJSONDecode       527092400    346794200  -34.21%
BenchmarkMandelbrot200      8997515      9001385   +0.04%
BenchmarkParse             10048985      9788460   -2.59%
BenchmarkRevcomp         1500392000   1508287000   +0.53%
BenchmarkTemplate         420592000    387395600   -7.89%
In the new code, GobEncode falls into a pathological stack split that consumes 30% of
CPU time.

@remyoudompheng
Copy link
Contributor

Comment 12:

I suppose CL 6489067 is what will give the most deterministic performance improvement
(for 8g)

@rsc
Copy link
Contributor Author

rsc commented Sep 12, 2012

Comment 13:

Labels changed: added go1.1maybe.

@robpike
Copy link
Contributor

robpike commented Mar 7, 2013

Comment 14:

Labels changed: removed go1.1maybe.

@rsc
Copy link
Contributor Author

rsc commented Mar 12, 2013

Comment 15:

[The time for maybe has passed.]

@rsc
Copy link
Contributor Author

rsc commented Jul 30, 2013

Comment 16:

Labels changed: added go1.2maybe.

Owner changed to @rsc.

@rsc
Copy link
Contributor Author

rsc commented Jul 30, 2013

Comment 17:

Labels changed: added feature.

@robpike
Copy link
Contributor

robpike commented Aug 16, 2013

Comment 18:

Deferring to Go 1.3, since any "lingering" stuff is unlikely to be addressed soon.

Labels changed: added go1.3maybe, performance, removed go1.2maybe, feature.

@robpike
Copy link
Contributor

robpike commented Aug 20, 2013

Comment 19:

Labels changed: removed go1.3maybe.

@rsc
Copy link
Contributor Author

rsc commented Nov 27, 2013

Comment 20:

Labels changed: added go1.3maybe.

@rsc
Copy link
Contributor Author

rsc commented Dec 4, 2013

Comment 21:

Labels changed: added release-none, removed go1.3maybe.

@rsc
Copy link
Contributor Author

rsc commented Dec 4, 2013

Comment 22:

Labels changed: added repo-main.

@rsc
Copy link
Contributor Author

rsc commented Mar 3, 2014

Comment 23:

Issue #7014 has been merged into this issue.

josharian added a commit to josharian/go that referenced this issue Jan 9, 2015
Fix a flipped nil check.
The flipped check prevented componentgen
from zeroing a non-cadable nl.
This fix reduces the number of non-SB LEAQs
in godoc from 35323 to 34920 (-1.1%).

Update golang#1914

Change-Id: I15ea303068835f606f883ddf4a2bb4cb2287e9ae
josharian added a commit that referenced this issue Feb 13, 2015
Fix a flipped nil check.
The flipped check prevented componentgen
from zeroing a non-cadable nl.
This fix reduces the number of non-SB LEAQs
in godoc from 35323 to 34920 (-1.1%).

Update #1914

Change-Id: I15ea303068835f606f883ddf4a2bb4cb2287e9ae
Reviewed-on: https://go-review.googlesource.com/2605
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
@rsc rsc added this to the Unplanned milestone Apr 10, 2015
@rsc rsc changed the title cmd/6g, cmd/8g: too many LEAQ/LEAL instructions cmd/compile: too many LEAQ/LEAL instructions on amd64, 386 Jun 8, 2015
@randall77
Copy link
Contributor

Fixed with the SSA compiler.

@rsc
Copy link
Contributor Author

rsc commented Mar 10, 2016

Hooray!

@golang golang locked and limited conversation to collaborators Mar 13, 2017
@rsc rsc removed their assignment Jun 22, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

8 participants