Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: possible confusion over whether phis are always the first values in a block #20178

Closed
josharian opened this issue Apr 29, 2017 · 16 comments
Milestone

Comments

@josharian
Copy link
Contributor

At some point in the distant past, it was (almost) always the case that OpPhi values, if present, were the first values in the block. And some passes appear to assume that. Here are two comments from the deadstore pass:

		for _, v := range b.Values {
			if v.Op == OpPhi {
				// Ignore phis - they will always be first and can't be eliminated
				continue
			}

/* ... */

		// walk to previous store
		if v.Op == OpPhi {
			continue // At start of block.  Move on to next block.
		}

But it is definitely no longer the case that all phis are the at the beginning of the block. Among other reasons, the nilcheck and writebarrier passes call storeOrder, which re-orders values in a way that would move some phis away from the beginning.

I tried to test this by adding code that randomizes value order before all passes prior to scheduling and running toolstash, but register selection changes made that unhelpful before I could detect anything substantive. Sample:

inconsistent log line:
/var/folders/1t/n61cbvls5bl293bbb0zyypqw0000gn/T/go-build711603550/runtime.a.log:30617:
	obj: 00154 (/var/folders/1t/n61cbvls5bl293bbb0zyypqw0000gn/T/toolstash-check-240465411/go/src/runtime/mbitmap.go:1533)	ANDQ	R14, R11
/var/folders/1t/n61cbvls5bl293bbb0zyypqw0000gn/T/go-build711603550/runtime.a.stash.log:30617:
	obj: 00154 (/var/folders/1t/n61cbvls5bl293bbb0zyypqw0000gn/T/toolstash-check-240465411/go/src/runtime/mbitmap.go:1533)	ANDQ	R11, R14
2017/04/29 14:53:10 exit status 2

Let's:

  • decide if/when we want to assume that OpPhis are always first, and if we ever do, add an ssa internal check for it
  • fix up any code that depends on faulty assumptions
  • make the schedule pass more fully deterministic
  • do some randomization-based testing to make sure that any assumptions about value order are safe/correct (perhaps always when ssa/check is on, somehow?)

cc @randall77 @cherrymui @dr2chase

@josharian josharian added this to the Go1.9 milestone Apr 29, 2017
@cherrymui
Copy link
Member

cherrymui commented Apr 29, 2017

I thought storeOrder doesn't move Phi from the beginning. I'll check. (EDIT: this is not true)
Do you have a testcase that demonstrates it?

@cherrymui
Copy link
Member

Keith has a CL https://go-review.googlesource.com/c/33909/ that does the value order randomization.

@cherrymui
Copy link
Member

@josharian you are right that storeOrder doesn't always keep Phis at the beginning. And there are other passes that doesn't keep this. For example, decompose may decompose a Phi into multiple Phis and add them to the end. Maybe we should make code don't depend on this. Many passes indeed don't need this.

@josharian
Copy link
Contributor Author

Maybe we should make code don't depend on this.

SGTM

@josharian
Copy link
Contributor Author

And in passes where this proves complicated or risky or expensive, we can just fix it up at the beginning. That's a fairly cheap O(1) single pass through the slice.

@dr2chase
Copy link
Contributor

dr2chase commented May 2, 2017

The loop-unroller also appends new phis to the per-block value slices.

josharian added a commit to josharian/go that referenced this issue May 10, 2017
Some passes assume phis are first, even though that no longer holds.
Since we're in the freeze, instead of fixing that assumption, make it true.

Updates golang#20178

Change-Id: I6e1d4499e82f6939dabbb8c17c8577964077601e
@josharian
Copy link
Contributor Author

A visual inspection of the ssa passes suggests that the only two passes that depend on phis being first are deadstore and the various rule-based rewrites (see e.g. comments in canMergeLoad).

@gopherbot
Copy link

CL https://golang.org/cl/43570 mentions this issue.

gopherbot pushed a commit that referenced this issue May 17, 2017
fuseBlockPlain was accidentally quadratic.
If you had plain blocks b1 -> b2 -> b3 -> b4,
each containing single values v1, v2, v3, and v4 respectively,
fuseBlockPlain would move v1 from b1 to b2 to b3 to b4,
then v2 from b2 to b3 to b4, etc.

There are two obvious fixes.

* Look for runs of blocks in fuseBlockPlain
  and handle them in a single go.
* Fuse from end to beginning; any given value in a run
  of blocks to fuse then moves only once.

The latter is much simpler, so that's what this CL does.

Somewhat surprisingly, this change does not pass toolstash-check.

The resulting set of blocks is the same,
and the values in them are the same,
but the order of values in them differ,
and that order of values (while arbitrary)
is enough to change the compiler's output.
This may be due to #20178; deadstore is the next pass after fuse.

Adding basic sorting to the beginning of deadstore
is enough to make this CL pass toolstash-check:

	for _, b := range f.Blocks {
		obj.SortSlice(b.Values, func(i, j int) bool { return b.Values[i].ID < b.Values[j].ID })
	}

Happily, this CL appears to result in better code on average,
if only by accident. It cuts 4k off of cmd/go; go1 benchmarks
are noisy as always but don't regress (numbers below).

No impact on the standard compilebench benchmarks.
For the code in #13554, this speeds up compilation dramatically:

name  old time/op       new time/op       delta
Pkg         53.1s ± 2%        12.8s ± 3%  -75.92%  (p=0.008 n=5+5)

name  old user-time/op  new user-time/op  delta
Pkg         55.0s ± 2%        14.9s ± 3%  -73.00%  (p=0.008 n=5+5)

name  old alloc/op      new alloc/op      delta
Pkg        2.04GB ± 0%       2.04GB ± 0%   +0.18%  (p=0.008 n=5+5)

name  old allocs/op     new allocs/op     delta
Pkg         6.21M ± 0%        6.21M ± 0%     ~     (p=0.222 n=5+5)

name  old object-bytes  new object-bytes  delta
Pkg         28.4M ± 0%        28.4M ± 0%   +0.00%  (p=0.008 n=5+5)

name  old export-bytes  new export-bytes  delta
Pkg           208 ± 0%          208 ± 0%     ~     (all equal)


Updates #13554


go1 benchmarks:

name                     old time/op    new time/op    delta
BinaryTree17-8              2.29s ± 2%     2.26s ± 2%  -1.43%  (p=0.000 n=48+50)
Fannkuch11-8                2.74s ± 2%     2.79s ± 2%  +1.63%  (p=0.000 n=50+49)
FmtFprintfEmpty-8          36.6ns ± 3%    34.6ns ± 4%  -5.29%  (p=0.000 n=49+50)
FmtFprintfString-8         58.3ns ± 3%    59.1ns ± 3%  +1.35%  (p=0.000 n=50+49)
FmtFprintfInt-8            62.4ns ± 2%    63.2ns ± 3%  +1.19%  (p=0.000 n=49+49)
FmtFprintfIntInt-8         95.1ns ± 2%    96.7ns ± 3%  +1.61%  (p=0.000 n=49+50)
FmtFprintfPrefixedInt-8     118ns ± 3%     113ns ± 2%  -4.00%  (p=0.000 n=50+49)
FmtFprintfFloat-8           191ns ± 2%     192ns ± 2%  +0.40%  (p=0.034 n=50+50)
FmtManyArgs-8               419ns ± 2%     420ns ± 2%    ~     (p=0.228 n=49+49)
GobDecode-8                5.26ms ± 3%    5.19ms ± 2%  -1.33%  (p=0.000 n=50+49)
GobEncode-8                4.12ms ± 2%    4.15ms ± 3%  +0.68%  (p=0.007 n=49+50)
Gzip-8                      198ms ± 2%     197ms ± 2%  -0.50%  (p=0.018 n=48+48)
Gunzip-8                   31.9ms ± 3%    31.8ms ± 3%  -0.47%  (p=0.024 n=50+50)
HTTPClientServer-8         64.4µs ± 0%    64.0µs ± 0%  -0.55%  (p=0.000 n=43+46)
JSONEncode-8               10.6ms ± 2%    10.6ms ± 3%    ~     (p=0.543 n=49+49)
JSONDecode-8               43.3ms ± 3%    43.1ms ± 2%    ~     (p=0.079 n=50+50)
Mandelbrot200-8            3.70ms ± 2%    3.70ms ± 2%    ~     (p=0.553 n=47+50)
GoParse-8                  2.70ms ± 2%    2.71ms ± 3%    ~     (p=0.843 n=49+50)
RegexpMatchEasy0_32-8      70.5ns ± 4%    70.4ns ± 4%    ~     (p=0.867 n=48+50)
RegexpMatchEasy0_1K-8       162ns ± 3%     162ns ± 2%    ~     (p=0.739 n=48+48)
RegexpMatchEasy1_32-8      66.1ns ± 5%    66.2ns ± 4%    ~     (p=0.970 n=50+50)
RegexpMatchEasy1_1K-8       297ns ± 7%     296ns ± 7%    ~     (p=0.406 n=50+50)
RegexpMatchMedium_32-8      105ns ± 5%     105ns ± 5%    ~     (p=0.702 n=50+50)
RegexpMatchMedium_1K-8     32.3µs ± 4%    32.2µs ± 3%    ~     (p=0.614 n=49+49)
RegexpMatchHard_32-8       1.75µs ±18%    1.74µs ±12%    ~     (p=0.738 n=50+48)
RegexpMatchHard_1K-8       52.2µs ±14%    51.3µs ±13%    ~     (p=0.230 n=50+50)
Revcomp-8                   366ms ± 3%     367ms ± 3%    ~     (p=0.745 n=49+49)
Template-8                 48.5ms ± 4%    48.5ms ± 4%    ~     (p=0.824 n=50+48)
TimeParse-8                 263ns ± 2%     256ns ± 2%  -2.98%  (p=0.000 n=48+49)
TimeFormat-8                265ns ± 3%     262ns ± 3%  -1.35%  (p=0.000 n=48+49)
[Geo mean]                 41.1µs         40.9µs       -0.48%


Change-Id: Ib35fa15b54282abb39c077d150beee27f610891a
Reviewed-on: https://go-review.googlesource.com/43570
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
@randall77 randall77 self-assigned this Jun 6, 2017
@randall77
Copy link
Contributor

Before values are ordered in a block, phis will not necessarily be first. After they are ordered (in the schedule pass), phis must be first. I mailed CL 45018 to check the latter in check.go.

CL 33909 (randomize the order of b.Values before the schedule pass) will not get in for 1.9. I intend to get that in for 1.10. It will help make sure that before the schedule pass, we don't depend on phis being first.

I looked closely at both deadstore and canMergeLoad - they don't depend on phis being first in b.Values, just being logically the first store in the store chain for a block. That should still be the case, even if they are out of order in b.Values.

@gopherbot
Copy link

CL https://golang.org/cl/45018 mentions this issue.

gopherbot pushed a commit that referenced this issue Jun 7, 2017
Update #20178

Change-Id: I603f77268ed38afdd84228c775efe006f08f14a7
Reviewed-on: https://go-review.googlesource.com/45018
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
@randall77
Copy link
Contributor

Punting the block order randomization, and this this issue, to 1.10.

@randall77 randall77 modified the milestones: Go1.10, Go1.9 Jun 7, 2017
@ianlancetaylor
Copy link
Contributor

@randall77 This issue is marked 1.10. Should it be moved to 1.11, or Unplanned?

@randall77
Copy link
Contributor

I will punt to 1.11. I have a CL that mostly, but not quite, works.

@randall77 randall77 modified the milestones: Go1.10, Go1.11 Dec 6, 2017
@dr2chase
Copy link
Contributor

1.12 now?

@randall77 randall77 modified the milestones: Go1.11, Go1.12 Jun 26, 2018
@gopherbot
Copy link

Change https://golang.org/cl/151517 mentions this issue: cmd/compile: order nil checks by source position

@gopherbot
Copy link

Change https://golang.org/cl/33909 mentions this issue: cmd/compile: randomize value order in block for testing

gopherbot pushed a commit that referenced this issue Nov 28, 2018
There's nothing enforcing ordering between redundant nil checks when
they may occur during the same memory state. Commit to using the
earliest one in source order.

Otherwise the choice of which to remove depends on the ordering of
values in a block (before scheduling). That's unfortunate when trying
to ensure that the compiler doesn't depend on that ordering for
anything.

Update #20178

Change-Id: I2cdd5be10618accd9d91fa07406c90cbd023ffba
Reviewed-on: https://go-review.googlesource.com/c/151517
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
@golang golang locked and limited conversation to collaborators Nov 28, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

6 participants