Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: method expressions in json.scanner are very slow #26273

Closed
mvdan opened this issue Jul 8, 2018 · 7 comments
Closed

cmd/compile: method expressions in json.scanner are very slow #26273

mvdan opened this issue Jul 8, 2018 · 7 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Milestone

Comments

@mvdan
Copy link
Member

mvdan commented Jul 8, 2018

encoding/json.scanner has a field step func(*scanner, byte) int, and that gets assigned to different global funcs like func stateEndValue(s *scanner, c byte) int depending on the state of the scanner.

Note how the calls end up being of the form scan.step(scan, c), since the passed scanner is separate from the scanner that holds the step func itself.

I was trying to clean all of this up by replacing the field with step func(byte) int, the funcs with methods like func (s *scanner) stateEndValue(c byte) int, and the calls like scan.step(scan, c) with scan.step(c).

However, this produced a massive performance degradation, presumably because of the tons of extra allocations:

name           old time/op    new time/op    delta
CodeDecoder-4    31.0ms ± 1%    76.8ms ± 0%   +147.90%  (p=0.002 n=6+6)

name           old speed      new speed      delta
CodeDecoder-4  62.6MB/s ± 1%  25.3MB/s ± 0%    -59.66%  (p=0.002 n=6+6)

name           old alloc/op   new alloc/op   delta
CodeDecoder-4    2.74MB ± 0%   32.37MB ± 0%  +1081.23%  (p=0.004 n=6+5)

name           old allocs/op  new allocs/op  delta
CodeDecoder-4     77.5k ± 0%   1845.4k ± 0%  +2279.69%  (p=0.004 n=6+5)

I don't know if this method expression trickery confuses the escape analysis, or what exactly it is. I have tried to reproduce this with smaller, self-contained benchmarks without encoding/json, but I haven't been able to get the same slowdown.

I'll upload a sample CL with the changes, so that others can reproduce the numbers too.

@mvdan mvdan added Performance NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Jul 8, 2018
@gopherbot
Copy link

Change https://golang.org/cl/122466 mentions this issue: encoding/json: use method expressions in the scanner

@mvdan
Copy link
Member Author

mvdan commented Jul 8, 2018

Sample patch with numbers uploaded above. The benchstat is comparing the commit against its parent.

I'm assuming that I didn't do anything obviously wrong. I've read the changes twice, and all the tests pass, so I think it is correct.

@huguesb
Copy link
Contributor

huguesb commented Jul 8, 2018

My guess is that every time a method expression is stored in a function pointer a new lambda w/ capture is allocated. i.e:

s.step = s.stateBeginValue

is translated to:

s.step = func(b byte) { s.stateBeginValue(b) }

which would explain the huge increase in allocations

@mvdan
Copy link
Member Author

mvdan commented Jul 8, 2018

Yes, you are probably right. Though I couldn't replicate it with isolated benchmarks (global func variables vs method expression variables), so I wasn't sure if it was the cause.

Still, ignoring the details of the compiler, it would be a bit of a shame if method expressions were more expensive by design. I'd hope that either the slowness or allocations could be minimised.

@huguesb
Copy link
Contributor

huguesb commented Jul 8, 2018

I doubt there's a way to avoid allocations when doing what is essentially a variable capture. A method pointer is necessarily "fatter" than a straight-up function pointer has it includes an extra pointer to the instance, otherwise you're back to the old code where you have to manually provide the instance pointer at every call site.

One possible cleanup would be to make step a method, like:

func (s *Scanner) step(b byte) {
  s.stepFunction(s, b)
}

which should presumably be inlined, avoiding any slowdown, and allow all call sites to be cleaner. It's not clear to me that it's worthwhile though...

@mvdan
Copy link
Member Author

mvdan commented Jul 8, 2018

I thought about the wrapper method :) Didn't try it because, without mid-stack inlining, it's likely to make things worse right now.

@ianlancetaylor ianlancetaylor modified the milestones: 12, Go1.12 Jul 11, 2018
@mvdan
Copy link
Member Author

mvdan commented Nov 28, 2018

I don't think there's anything we can do here, and noone had any ideas either. Closing.

@mvdan mvdan closed this as completed Nov 28, 2018
@golang golang locked and limited conversation to collaborators Nov 28, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Projects
None yet
Development

No branches or pull requests

4 participants