-
Notifications
You must be signed in to change notification settings - Fork 17.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmd/compile: possible confusion over whether phis are always the first values in a block #20178
Comments
I thought storeOrder doesn't move Phi from the beginning. I'll check. (EDIT: this is not true) |
Keith has a CL https://go-review.googlesource.com/c/33909/ that does the value order randomization. |
@josharian you are right that storeOrder doesn't always keep Phis at the beginning. And there are other passes that doesn't keep this. For example, decompose may decompose a Phi into multiple Phis and add them to the end. Maybe we should make code don't depend on this. Many passes indeed don't need this. |
SGTM |
And in passes where this proves complicated or risky or expensive, we can just fix it up at the beginning. That's a fairly cheap O(1) single pass through the slice. |
The loop-unroller also appends new phis to the per-block value slices. |
Some passes assume phis are first, even though that no longer holds. Since we're in the freeze, instead of fixing that assumption, make it true. Updates golang#20178 Change-Id: I6e1d4499e82f6939dabbb8c17c8577964077601e
A visual inspection of the ssa passes suggests that the only two passes that depend on phis being first are deadstore and the various rule-based rewrites (see e.g. comments in canMergeLoad). |
CL https://golang.org/cl/43570 mentions this issue. |
fuseBlockPlain was accidentally quadratic. If you had plain blocks b1 -> b2 -> b3 -> b4, each containing single values v1, v2, v3, and v4 respectively, fuseBlockPlain would move v1 from b1 to b2 to b3 to b4, then v2 from b2 to b3 to b4, etc. There are two obvious fixes. * Look for runs of blocks in fuseBlockPlain and handle them in a single go. * Fuse from end to beginning; any given value in a run of blocks to fuse then moves only once. The latter is much simpler, so that's what this CL does. Somewhat surprisingly, this change does not pass toolstash-check. The resulting set of blocks is the same, and the values in them are the same, but the order of values in them differ, and that order of values (while arbitrary) is enough to change the compiler's output. This may be due to #20178; deadstore is the next pass after fuse. Adding basic sorting to the beginning of deadstore is enough to make this CL pass toolstash-check: for _, b := range f.Blocks { obj.SortSlice(b.Values, func(i, j int) bool { return b.Values[i].ID < b.Values[j].ID }) } Happily, this CL appears to result in better code on average, if only by accident. It cuts 4k off of cmd/go; go1 benchmarks are noisy as always but don't regress (numbers below). No impact on the standard compilebench benchmarks. For the code in #13554, this speeds up compilation dramatically: name old time/op new time/op delta Pkg 53.1s ± 2% 12.8s ± 3% -75.92% (p=0.008 n=5+5) name old user-time/op new user-time/op delta Pkg 55.0s ± 2% 14.9s ± 3% -73.00% (p=0.008 n=5+5) name old alloc/op new alloc/op delta Pkg 2.04GB ± 0% 2.04GB ± 0% +0.18% (p=0.008 n=5+5) name old allocs/op new allocs/op delta Pkg 6.21M ± 0% 6.21M ± 0% ~ (p=0.222 n=5+5) name old object-bytes new object-bytes delta Pkg 28.4M ± 0% 28.4M ± 0% +0.00% (p=0.008 n=5+5) name old export-bytes new export-bytes delta Pkg 208 ± 0% 208 ± 0% ~ (all equal) Updates #13554 go1 benchmarks: name old time/op new time/op delta BinaryTree17-8 2.29s ± 2% 2.26s ± 2% -1.43% (p=0.000 n=48+50) Fannkuch11-8 2.74s ± 2% 2.79s ± 2% +1.63% (p=0.000 n=50+49) FmtFprintfEmpty-8 36.6ns ± 3% 34.6ns ± 4% -5.29% (p=0.000 n=49+50) FmtFprintfString-8 58.3ns ± 3% 59.1ns ± 3% +1.35% (p=0.000 n=50+49) FmtFprintfInt-8 62.4ns ± 2% 63.2ns ± 3% +1.19% (p=0.000 n=49+49) FmtFprintfIntInt-8 95.1ns ± 2% 96.7ns ± 3% +1.61% (p=0.000 n=49+50) FmtFprintfPrefixedInt-8 118ns ± 3% 113ns ± 2% -4.00% (p=0.000 n=50+49) FmtFprintfFloat-8 191ns ± 2% 192ns ± 2% +0.40% (p=0.034 n=50+50) FmtManyArgs-8 419ns ± 2% 420ns ± 2% ~ (p=0.228 n=49+49) GobDecode-8 5.26ms ± 3% 5.19ms ± 2% -1.33% (p=0.000 n=50+49) GobEncode-8 4.12ms ± 2% 4.15ms ± 3% +0.68% (p=0.007 n=49+50) Gzip-8 198ms ± 2% 197ms ± 2% -0.50% (p=0.018 n=48+48) Gunzip-8 31.9ms ± 3% 31.8ms ± 3% -0.47% (p=0.024 n=50+50) HTTPClientServer-8 64.4µs ± 0% 64.0µs ± 0% -0.55% (p=0.000 n=43+46) JSONEncode-8 10.6ms ± 2% 10.6ms ± 3% ~ (p=0.543 n=49+49) JSONDecode-8 43.3ms ± 3% 43.1ms ± 2% ~ (p=0.079 n=50+50) Mandelbrot200-8 3.70ms ± 2% 3.70ms ± 2% ~ (p=0.553 n=47+50) GoParse-8 2.70ms ± 2% 2.71ms ± 3% ~ (p=0.843 n=49+50) RegexpMatchEasy0_32-8 70.5ns ± 4% 70.4ns ± 4% ~ (p=0.867 n=48+50) RegexpMatchEasy0_1K-8 162ns ± 3% 162ns ± 2% ~ (p=0.739 n=48+48) RegexpMatchEasy1_32-8 66.1ns ± 5% 66.2ns ± 4% ~ (p=0.970 n=50+50) RegexpMatchEasy1_1K-8 297ns ± 7% 296ns ± 7% ~ (p=0.406 n=50+50) RegexpMatchMedium_32-8 105ns ± 5% 105ns ± 5% ~ (p=0.702 n=50+50) RegexpMatchMedium_1K-8 32.3µs ± 4% 32.2µs ± 3% ~ (p=0.614 n=49+49) RegexpMatchHard_32-8 1.75µs ±18% 1.74µs ±12% ~ (p=0.738 n=50+48) RegexpMatchHard_1K-8 52.2µs ±14% 51.3µs ±13% ~ (p=0.230 n=50+50) Revcomp-8 366ms ± 3% 367ms ± 3% ~ (p=0.745 n=49+49) Template-8 48.5ms ± 4% 48.5ms ± 4% ~ (p=0.824 n=50+48) TimeParse-8 263ns ± 2% 256ns ± 2% -2.98% (p=0.000 n=48+49) TimeFormat-8 265ns ± 3% 262ns ± 3% -1.35% (p=0.000 n=48+49) [Geo mean] 41.1µs 40.9µs -0.48% Change-Id: Ib35fa15b54282abb39c077d150beee27f610891a Reviewed-on: https://go-review.googlesource.com/43570 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@golang.org>
Before values are ordered in a block, phis will not necessarily be first. After they are ordered (in the schedule pass), phis must be first. I mailed CL 45018 to check the latter in check.go. CL 33909 (randomize the order of b.Values before the schedule pass) will not get in for 1.9. I intend to get that in for 1.10. It will help make sure that before the schedule pass, we don't depend on phis being first. I looked closely at both deadstore and canMergeLoad - they don't depend on phis being first in b.Values, just being logically the first store in the store chain for a block. That should still be the case, even if they are out of order in b.Values. |
CL https://golang.org/cl/45018 mentions this issue. |
Update #20178 Change-Id: I603f77268ed38afdd84228c775efe006f08f14a7 Reviewed-on: https://go-review.googlesource.com/45018 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Punting the block order randomization, and this this issue, to 1.10. |
@randall77 This issue is marked 1.10. Should it be moved to 1.11, or Unplanned? |
I will punt to 1.11. I have a CL that mostly, but not quite, works. |
1.12 now? |
Change https://golang.org/cl/151517 mentions this issue: |
Change https://golang.org/cl/33909 mentions this issue: |
There's nothing enforcing ordering between redundant nil checks when they may occur during the same memory state. Commit to using the earliest one in source order. Otherwise the choice of which to remove depends on the ordering of values in a block (before scheduling). That's unfortunate when trying to ensure that the compiler doesn't depend on that ordering for anything. Update #20178 Change-Id: I2cdd5be10618accd9d91fa07406c90cbd023ffba Reviewed-on: https://go-review.googlesource.com/c/151517 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Cherry Zhang <cherryyz@google.com>
At some point in the distant past, it was (almost) always the case that OpPhi values, if present, were the first values in the block. And some passes appear to assume that. Here are two comments from the
deadstore
pass:But it is definitely no longer the case that all phis are the at the beginning of the block. Among other reasons, the
nilcheck
andwritebarrier
passes callstoreOrder
, which re-orders values in a way that would move some phis away from the beginning.I tried to test this by adding code that randomizes value order before all passes prior to scheduling and running toolstash, but register selection changes made that unhelpful before I could detect anything substantive. Sample:
Let's:
cc @randall77 @cherrymui @dr2chase
The text was updated successfully, but these errors were encountered: