cmd/compile: missed opportunity to inline runtime.memmove #41662

josharian · 2020-09-27T22:22:16Z

package p

func f(b []byte, x *[8]byte) {
	_ = b[8]
	copy(b, x[:])
}

This should compile down to a pair of MOVQs, one to load from x and one to write to b. It doesn't; it still contains a call to runtime.memmove. From a quick glance at ssa.html, the problem is that the lowering of runtime.memmove to Move happens during generic.rules, but we haven't detected that the size of the memmove is a constant until after lowering.
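To make the expected result concrete, here is a minimal sketch (not part of the original report) of what the copy amounts to at the source level: one 8-byte load from x and one 8-byte store to b. The helper name copyViaWord is hypothetical, and encoding/binary's Uint64 and PutUint64 only stand in for the two MOVQs; the sketch checks that both versions leave the same bytes in the destination.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// f is the function from the report: a copy whose size is always 8 bytes.
func f(b []byte, x *[8]byte) {
	_ = b[8] // tells the compiler that b has at least 9 elements
	copy(b, x[:])
}

// copyViaWord is a hypothetical hand-written equivalent:
// one 8-byte load from x followed by one 8-byte store to b.
func copyViaWord(b []byte, x *[8]byte) {
	_ = b[8]
	v := binary.LittleEndian.Uint64(x[:]) // load
	binary.LittleEndian.PutUint64(b, v)   // store
}

func main() {
	x := [8]byte{1, 2, 3, 4, 5, 6, 7, 8}
	b1 := make([]byte, 9)
	b2 := make([]byte, 9)
	f(b1, &x)
	copyViaWord(b2, &x)
	fmt.Println(string(b1) == string(b2))
}
```

The byte order chosen in copyViaWord does not matter here, because the same order is used for the load and the store.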

To fix this, we could either improve the analysis in the generic stages or add arch-specific runtime.memmove-to-Move lowering.

cc @randall77 @dr2chase @martisch @mundaym

kels-ng · 2021-01-15T11:10:39Z

Is this issue still relevant? I'd like to take it.

dr2chase · 2021-01-15T15:25:06Z

Here's a sample of a similar optimization where small memequals calls are optimized.

josharian · 2021-01-15T15:43:59Z

@Kels9009 great! Ask here if you have questions. Please be sure to add a codegen test (see test/codegen).
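As a rough illustration of what such a codegen test could look like, here is a sketch under stated assumptions. In the Go tree the file would live under test/codegen, begin with an `// asmcheck` line, and use `package codegen`; it is wrapped in a `package main` here only so that it runs standalone. The function name moveConstSize is hypothetical and is not taken from the eventual change.

```go
package main

import "fmt"

// moveConstSize is a hypothetical test function. In test/codegen, the
// comment directly above the copy asks the harness to fail if the amd64
// assembly generated for that line matches the regexp; the leading '-'
// turns the check into "must not match".
func moveConstSize(b []byte, x *[8]byte) {
	_ = b[8]
	// amd64:-".*memmove"
	copy(b, x[:])
}

func main() {
	// Outside the asmcheck harness the directive is an ordinary comment,
	// so this only demonstrates that the function behaves like a copy.
	x := [8]byte{'g', 'o', 'p', 'h', 'e', 'r', 's', '!'}
	b := make([]byte, 9)
	moveConstSize(b, &x)
	fmt.Printf("%q\n", b[:8])
}
```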

kels-ng · 2021-01-18T15:09:25Z

@dr2chase @josharian OK, thanks for the useful information.

kels-ng · 2021-01-20T15:22:28Z

Hello,

I investigated the issue and would like some advice on which direction to take to fix it.

First, I tried to find a generic solution. Here is the SSA dump:

As you can see, at the late stage of optimization (late opt) the problem is that memmove uses a Phi operation as its size argument (v50). The Phi cannot be eliminated because the rules require both arguments to be constants (here we have Arg + Const):

// basic phi simplifications
...
(Phi (Const64  [c]) (Const64  [c])) => (Const64  [c])
...
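A toy model may help show why this rule does not fire. The sketch below is an illustration only: Op, Value, and simplifyPhi are hypothetical stand-ins, not the real cmd/compile/internal/ssa types or generated rewrite code. It folds a two-argument Phi only when both arguments are Const64 values with the same constant, so an Arg + Const pair is left alone.

```go
package main

import "fmt"

// Op and Value are toy stand-ins for SSA operations and values.
type Op int

const (
	OpArg Op = iota
	OpConst64
	OpPhi
)

type Value struct {
	Op     Op
	AuxInt int64
	Args   []*Value
}

// simplifyPhi models (Phi (Const64 [c]) (Const64 [c])) => (Const64 [c]).
// It returns the folded constant and true only when both arguments are
// Const64 values carrying the same c; otherwise it returns v unchanged.
func simplifyPhi(v *Value) (*Value, bool) {
	if v.Op != OpPhi || len(v.Args) != 2 {
		return v, false
	}
	a, b := v.Args[0], v.Args[1]
	if a.Op == OpConst64 && b.Op == OpConst64 && a.AuxInt == b.AuxInt {
		return &Value{Op: OpConst64, AuxInt: a.AuxInt}, true
	}
	return v, false
}

func main() {
	c8 := &Value{Op: OpConst64, AuxInt: 8}
	arg := &Value{Op: OpArg}

	_, ok := simplifyPhi(&Value{Op: OpPhi, Args: []*Value{c8, c8}})
	fmt.Println("Const64 + Const64 folds:", ok)

	_, ok = simplifyPhi(&Value{Op: OpPhi, Args: []*Value{arg, c8}})
	fmt.Println("Arg + Const64 folds:", ok)
}
```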

However, the memmove size argument is recognized as a constant at the next stage (before lowering), dead auto elim + generic deadcode:

In this case memmove can be inlined using the current rules without modification.

So I added another opt stage after dead auto elim + generic deadcode. As a result, memmove was inlined successfully and almost all all.bash tests passed, except for a few codegen tests (for example, writebarrier).
Another solution was to move late opt to after dead auto elim + generic deadcode. Unfortunately, that resulted in broken code generation.

So, as a result, I have a few options for next steps:

1. Add arch-specific runtime.memmove-to-Move lowering. I already have a first implementation for AMD64 in my local branch and it works fine. I don't like this solution because the rule has to be implemented for each arch, and I think the generic rules are a more appropriate place for such an optimization. But still, we can use it and ignore my doubts :).
2. Add another generic opt stage after dead auto elim + generic deadcode and fix all failing tests. I think this is the worst solution, because we can't predict how many hidden problems we would create.
3. Extract the memmove recognition rules from generic into a new file and insert a new opt stage after dead auto elim + generic deadcode. This moves memmove inlining to a later position. In this case we reduce the chance of overlooking a mistake and restructure the rules. The problem is that the current source structure doesn't have dedicated files for each type of optimization (they are all stored in the generic and arch-specific rules), and I'm not sure I can make such a change on my own without input from experts.

What do you think? Which of these directions is the right solution?

ianlancetaylor · 2021-01-20T20:14:25Z

@kels-ng A note for the future: when pasting text, please just paste plain text, not an image. At least for me your images are nearly illegible, whereas everyone using this issue tracker can read plain text. Thanks.

josharian · 2021-01-20T21:44:00Z

Thanks, @kels-ng. We should do option (1).

New rules are pretty cheap. There are a fair number of rules duplicated in generic and lowered opt, for similar reasons. We don't want to introduce new opt passes without a very compelling need, and moving around existing passes is risky.

gopherbot · 2021-02-03T09:45:49Z

Change https://golang.org/cl/289151 mentions this issue: cmd/compile: add arch-specific inlining for runtime.memmove

gopherbot · 2021-09-24T15:33:37Z

Change https://golang.org/cl/352054 mentions this issue: cmd/compile: add PPC64-specific inlining for runtime.memmove

Add rule to PPC64.rules to inline runtime.memmove in more cases, as is done for other target architectures.

Updated tests in codegen/copy.go to verify changes are done on ppc64/ppc64le.

Updates #41662

Change-Id: Id937ce21f9b4f4047b3e66dfa3c960128ee16a2a
Reviewed-on: https://go-review.googlesource.com/c/go/+/352054
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>

josharian added the Performance and help wanted labels Sep 27, 2020

josharian added this to the Unplanned milestone Sep 27, 2020

andybons added the NeedsFix label Sep 29, 2020

gopherbot closed this as completed in 3b321a9 May 12, 2021

golang locked and limited conversation to collaborators Sep 24, 2022

gopherbot added the FrozenDueToAge label Sep 24, 2022
