Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/gc: why did json get slower with convT2I optimization? #3787

Closed
rsc opened this issue Jun 29, 2012 · 20 comments
Closed

cmd/gc: why did json get slower with convT2I optimization? #3787

rsc opened this issue Jun 29, 2012 · 20 comments
Labels
FrozenDueToAge Suggested Issues that may be good for new contributors looking for work to do.
Milestone

Comments

@rsc
Copy link
Contributor

rsc commented Jun 29, 2012

CL 6337058 made various things a little faster but made the JSON benchmark 30% slower.
Why?
@nigeltao
Copy link
Contributor

nigeltao commented Jul 2, 2012

Comment 1:

As a data point, http://golang.org/cl/6356053 adds an extra argument to convT2I
but does not cache itab lookups. That extra argument is simply ignored. The JSONDecode
benchmark shows no dramatic change, compared to the status quo before CL 6337058
"cmd/gc: cache itab lookup in convT2I" was submitted.
benchmark                 old ns/op    new ns/op    delta
BenchmarkBinaryTree17    5902507000   5918781000   +0.28%
BenchmarkFannkuch11      3926283000   3936780000   +0.27%
BenchmarkGobDecode         22264160     22356350   +0.41%
BenchmarkGobEncode         12815560     12704780   -0.86%
BenchmarkGzip             568959600    568881800   -0.01%
BenchmarkGunzip           178905400    179608400   +0.39%
BenchmarkJSONEncode        89589000     89617050   +0.03%
BenchmarkJSONDecode       310999600    316518000   +1.77%
BenchmarkMandelbrot200      7016514      7018210   +0.02%
BenchmarkParse              7865130      7985445   +1.53%
BenchmarkRevcomp         1284725000   1254541000   -2.35%
BenchmarkTemplate         567248200    571401600   +0.73%

@DanielMorsing
Copy link
Contributor

Comment 2:

It might be interesting to see if introducing the global symbol has any effect on the
benchmarks.
Dave Cheney did some benchmarks for linux/arm, which doesn't have the performance
regression. That makes me think that there is some sort of alignment or size issue.
Maybe there is value to running these benchmarks under 386, since it has the same word
size.

@nigeltao
Copy link
Contributor

nigeltao commented Jul 6, 2012

Comment 3:

JSONDecode program times in seconds, 5 iterations, with and sans itab cache, on my
desktop.
amd64 with/sans: 2.520 / 1.980; ratio = 1.27
386   with/sans: 3.550 / 3.770; ratio = 0.94
The "with itab cache" path avoids the modulus calculation in hashing.
http://golang.org/cl/6256062/ suggests that the soft mod is relatively
expensive on ARM.

@nigeltao
Copy link
Contributor

nigeltao commented Jul 6, 2012

Comment 4:

Another data point: there may be confounding factors, but with itab cache is faster than
sans itab cache when decoding a deep JSON tree. With is slower than sans when decoding a
broad JSON tree.
With:
BenchmarkJSONDecodeBroad         500       4188660 ns/op       5.74 MB/s
BenchmarkJSONDecodeDeep      500       6037996 ns/op       3.49 MB/s
Sans:
BenchmarkJSONDecodeBroad         500       3925550 ns/op       6.13 MB/s
BenchmarkJSONDecodeDeep      500       7444352 ns/op       2.83 MB/s
GOARCH=amd64. example_test.go attached.

Attachments:

  1. example_test.go (1018 bytes)

@davecheney
Copy link
Contributor

Comment 5:

> The "with itab cache" path avoids the modulus calculation in hashing.
http://golang.org/cl/6256062/ suggests that the soft mod is relatively
expensive on ARM.
It sure is. Simple changes like making the itab cache 1024 elements wide, thus avoiding
soft mod, give substantial speedups.

@nigeltao
Copy link
Contributor

nigeltao commented Jul 9, 2012

Comment 6:

Another data point (linux/amd64 desktop). Starting from revision c533f48701cb, which is
the one before "cmd/gc: cache itab lookup in convT2I"
http://code.google.com/p/go/source/detail?r=ef432c94ce9c6ed6a0c343ec0a8124237f81a522
This innocuous looking diff (which introduces no new global symbols):
----------------
$ hg identify
c533f48701cb+
$ hg diff -U 5
diff -r c533f48701cb test/bench/go1/json_test.go
--- a/test/bench/go1/json_test.go   Mon Jul 02 15:30:00 2012 -0700
+++ b/test/bench/go1/json_test.go   Mon Jul 09 14:50:03 2012 +1000
@@ -82,6 +82,9 @@
 func BenchmarkJSONDecode(b *testing.B) {
    b.SetBytes(int64(len(jsonbytes)))
    for i := 0; i < b.N; i++ {
        jsondec()
    }
+   if false {
+       print([]int(nil))
+   }
 }
----------------
has this effect on the benchmark:
----------------
benchmark              old ns/op    new ns/op    delta
BenchmarkJSONDecode    312624800    403244800  +28.99%
----------------
The generated code for func BenchmarkJSONDecode is identical (the if-false is removed by
dead code elimination) except for the stack requirement. Note the $8 versus $24 in the
"6g -S" before/after:
0230 (json_test.go:82) TEXT    BenchmarkJSONDecode+0(SB),$8-8
0230 (json_test.go:82) TEXT    BenchmarkJSONDecode+0(SB),$24-8

@davecheney
Copy link
Contributor

Comment 7:

@nigeltao, very interesting diagnosis.
Ultimately this sounds like bad fortune caused the parts of the inner loop of the json
decoder to straddle a stack boundary. 
Would you suggest that there is a bug in the dead code elimination logic, in that it is
not properly adjusting the stack accounting ?

@minux
Copy link
Member

minux commented Jul 10, 2012

Comment 8:

On Monday, July 9, 2012, dfc wrote:
Very probable.
The stack frame issue is possibly issue #2734.

@rsc
Copy link
Contributor Author

rsc commented Sep 12, 2012

Comment 9:

Labels changed: added go1.1maybe.

@rsc
Copy link
Contributor Author

rsc commented Sep 14, 2012

Comment 10:

Labels changed: removed go1.1maybe.

@rsc
Copy link
Contributor Author

rsc commented Dec 10, 2012

Comment 11:

Labels changed: added size-l.

@rsc
Copy link
Contributor Author

rsc commented Mar 12, 2013

Comment 12:

Labels changed: added go1.1maybe, removed go1.1.

@robpike
Copy link
Contributor

robpike commented May 18, 2013

Comment 13:

Labels changed: added go1.2maybe, removed go1.1maybe.

@randall77
Copy link
Contributor

Comment 14:

Should get fixed with contiguous stacks, maybe for Go 1.3.  See the attached graph.

Labels changed: added go1.3, removed go1.2maybe.

Owner changed to @randall77.

Attachments:

  1. both.png (6967 bytes)

@robpike
Copy link
Contributor

robpike commented Aug 20, 2013

Comment 15:

Labels changed: removed go1.3.

@rsc
Copy link
Contributor Author

rsc commented Nov 27, 2013

Comment 16:

Labels changed: added go1.3maybe.

@rsc
Copy link
Contributor Author

rsc commented Dec 4, 2013

Comment 17:

Labels changed: added release-none, removed go1.3maybe.

@rsc
Copy link
Contributor Author

rsc commented Dec 4, 2013

Comment 18:

Labels changed: added repo-main.

@ianlancetaylor
Copy link
Contributor

Comment 19:

Keith, is this fixed now?

Labels changed: added suggested, release-go1.3maybe, removed priority-later, release-none.

@randall77
Copy link
Contributor

Comment 20:

Yes, it should be as of https://golang.org/cl/69620043

Status changed to Fixed.

@rsc rsc added fixed Suggested Issues that may be good for new contributors looking for work to do. labels Apr 3, 2014
@rsc rsc added this to the Go1.3 milestone Apr 14, 2015
@golang golang locked and limited conversation to collaborators Jun 24, 2016
This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge Suggested Issues that may be good for new contributors looking for work to do.
Projects
None yet
Development

No branches or pull requests

9 participants