cmd/compile: NilCheckDeep performance regression #25389

TocarIP · 2018-05-14T19:26:01Z

After 00c8e14 NilCheckDeep benchmark shows significant regression:

NilCheckDeep1-8         358ns ± 2%     374ns ± 1%   +4.64%  (p=0.000 n=10+10)
NilCheckDeep10-8        641ns ± 3%     690ns ± 2%   +7.69%  (p=0.000 n=10+10)
NilCheckDeep100-8      3.52µs ± 4%    4.03µs ± 3%  +14.44%  (p=0.000 n=10+10)
NilCheckDeep1000-8     39.1µs ± 2%    45.4µs ± 1%  +15.96%  (p=0.000 n=10+9)
NilCheckDeep10000-8     412µs ± 2%     484µs ± 2%  +17.53%  (p=0.000 n=10+10)

name                 old speed      new speed      delta
NilCheckDeep1-8      2.79MB/s ± 2%  2.67MB/s ± 1%   -4.48%  (p=0.000 n=10+10)
NilCheckDeep10-8     15.6MB/s ± 3%  14.5MB/s ± 2%   -7.14%  (p=0.000 n=10+10)
NilCheckDeep100-8    28.4MB/s ± 4%  24.8MB/s ± 3%  -12.64%  (p=0.000 n=10+10)
NilCheckDeep1000-8   25.5MB/s ± 2%  22.0MB/s ± 1%  -13.77%  (p=0.000 n=10+9)
NilCheckDeep10000-8  24.3MB/s ± 2%  20.7MB/s ± 2%  -14.92%  (p=0.000 n=10+10)

name                 old alloc/op   new alloc/op   delta
NilCheckDeep1-8          101B ± 0%       93B ± 0%   -7.92%  (p=0.000 n=10+10)
NilCheckDeep10-8         192B ± 0%      184B ± 0%   -4.17%  (p=0.000 n=10+10)
NilCheckDeep100-8      1.17kB ± 0%    1.16kB ± 0%   -0.68%  (p=0.000 n=10+10)
NilCheckDeep1000-8     10.3kB ± 0%    10.3kB ± 0%   -0.08%  (p=0.000 n=10+10)
NilCheckDeep10000-8     102kB ± 0%     102kB ± 0%   -0.01%  (p=0.000 n=10+10)

name                 old allocs/op  new allocs/op  delta
NilCheckDeep1-8          6.00 ± 0%      5.00 ± 0%  -16.67%  (p=0.000 n=10+10)
NilCheckDeep10-8         6.00 ± 0%      5.00 ± 0%  -16.67%  (p=0.000 n=10+10)
NilCheckDeep100-8        6.00 ± 0%      5.00 ± 0%  -16.67%  (p=0.000 n=10+10)
NilCheckDeep1000-8       6.00 ± 0%      5.00 ± 0%  -16.67%  (p=0.000 n=10+10)
NilCheckDeep10000-8      6.00 ± 0%      5.00 ± 0%  -16.67%  (p=0.000 n=10+10)

This is caused by extra time spent in zeroing 256 bytes for:

stores := make([]*Value, 0, 64)

Reducing 64 to 8 removes degradation

The text was updated successfully, but these errors were encountered:

TocarIP · 2018-05-14T19:29:54Z

Running compilerbench with reduced size, shows compile time degradation, so NilCheckDeep looks like a bad benchmark. Should we just remove it altogether?
cc @randall77 @ALTree

ALTree · 2018-05-14T19:50:55Z

Running compilerbench with reduced size, shows compile time degradation

This doesn't surprise me. For those stack-allocate changes I always tried to choose the smallest cap constant that would cover most of the calls. Reducing the cap will negate the effect and actually make things worse, because if we stack-allocate small (like with cap = 8) and then most calls actually go over 8, we'll have to do even more work because we'll often need to move the worklist to the heap.

NilCheckDeep looks like a bad benchmark.

Even if it doesn't reflect real-world compiler performance, it could still be useful as a microbenchmark for that specific part of the compiler. The comment says:

benchmarkNilCheckDeep is a stress test of nilcheckelim.
...
Run with multiple depths to observe big-O behavior.

which makes sense. My vote is to keep it, and also keep my stack-allocate change (since it makes the compiler faster); and ignore the benchmarkNilCheckDeep regression, since it's a stress-test by its own admission.

OneOfOne · 2018-05-15T01:37:31Z

@ALTree I don't have time to test right now, but wouldn't something like this help a bit? or would it still trigger the zeroing?

var stash [64]*Value
store := stash[:0]

ALTree · 2018-05-15T08:27:01Z

@OneOfOne Just checked, they generate basically the exact same code. Both go on the stack and then there's a DUFFZERO $152 call.

TocarIP · 2018-05-15T18:33:20Z

My vote is to keep it, and also keep my stack-allocate change (since it makes the compiler faster); and ignore the benchmarkNilCheckDeep regression, since it's a stress-test by its own admission.

Agreed, closing.

ALTree added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label May 14, 2018

ALTree added this to the Go1.11 milestone May 14, 2018

TocarIP closed this as completed May 15, 2018

golang locked and limited conversation to collaborators May 15, 2019

gopherbot added the FrozenDueToAge label May 15, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cmd/compile: NilCheckDeep performance regression #25389

cmd/compile: NilCheckDeep performance regression #25389

TocarIP commented May 14, 2018

TocarIP commented May 14, 2018

ALTree commented May 14, 2018 •

edited

OneOfOne commented May 15, 2018 •

edited

ALTree commented May 15, 2018

TocarIP commented May 15, 2018

cmd/compile: NilCheckDeep performance regression #25389

cmd/compile: NilCheckDeep performance regression #25389

Comments

TocarIP commented May 14, 2018

TocarIP commented May 14, 2018

ALTree commented May 14, 2018 • edited

OneOfOne commented May 15, 2018 • edited

ALTree commented May 15, 2018

TocarIP commented May 15, 2018

ALTree commented May 14, 2018 •

edited

OneOfOne commented May 15, 2018 •

edited