Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: store combine not working on ppc64 and arm64 with global variables #24242

Closed
rasky opened this issue Mar 4, 2018 · 4 comments
Closed
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.

Comments

@rasky
Copy link
Member

rasky commented Mar 4, 2018

While migrating the code generation tests for load/store combiners to to the top-level testsuite (fad31e5), I had to disable some tests for arm64 and ppc64le because it looks like that store combining is failing in some situations.

For instance:

func store_le16_idx(b []byte, idx int) {
// amd64:`MOVW\s`
// arm64(DISABLED):`MOVH`,-`MOVB`
// ppc64le(DISABLED):`MOVH\s`
binary.LittleEndian.PutUint16(b[idx:], sink16)
}

It looks like the bug is related to the fact that the store is made from a global variable rather than a local variable (as was being previously tested).

@rasky rasky added the NeedsFix The path to resolution is known, but the work has not been done. label Mar 4, 2018
@rasky
Copy link
Member Author

rasky commented Mar 4, 2018

@gopherbot
Copy link

Change https://golang.org/cl/98397 mentions this issue: cmd/compile/internal/ssa: improve store combine optimization on arm64

@laboger
Copy link
Contributor

laboger commented Mar 5, 2018

I noticed this regression in combining little endian stores while testing my current change for combining loads and stores in big endian order on ppc64le. I have a fix that I was going to include with the big endian change, but could make it separate if that is preferred.

@gopherbot
Copy link

Change https://golang.org/cl/98136 mentions this issue: cmd/compile,test: combine byte loads and stores on ppc64le

gopherbot pushed a commit that referenced this issue May 8, 2018
CL 74410 added rules to combine consecutive byte loads and
stores when the byte order was little endian for ppc64le. This
is the corresponding change for bytes that are in big endian order.
These rules are all intended for a little endian target arch.

This adds new testcases in test/codegen/memcombine.go

Fixes #22496
Updates #24242

Benchmark improvement for encoding/binary:
name                      old time/op    new time/op    delta
ReadSlice1000Int32s-16      11.0µs ± 0%     9.0µs ± 0%  -17.47%  (p=0.029 n=4+4)
ReadStruct-16               2.47µs ± 1%    2.48µs ± 0%   +0.67%  (p=0.114 n=4+4)
ReadInts-16                  642ns ± 1%     630ns ± 1%   -2.02%  (p=0.029 n=4+4)
WriteInts-16                 654ns ± 0%     653ns ± 1%   -0.08%  (p=0.629 n=4+4)
WriteSlice1000Int32s-16     8.75µs ± 0%    8.20µs ± 0%   -6.19%  (p=0.029 n=4+4)
PutUint16-16                1.16ns ± 0%    0.93ns ± 0%  -19.83%  (p=0.029 n=4+4)
PutUint32-16                1.16ns ± 0%    0.93ns ± 0%  -19.83%  (p=0.029 n=4+4)
PutUint64-16                1.85ns ± 0%    0.93ns ± 0%  -49.73%  (p=0.029 n=4+4)
LittleEndianPutUint16-16    1.03ns ± 0%    0.93ns ± 0%   -9.71%  (p=0.029 n=4+4)
LittleEndianPutUint32-16    0.93ns ± 0%    0.93ns ± 0%     ~     (all equal)
LittleEndianPutUint64-16    0.93ns ± 0%    0.93ns ± 0%     ~     (all equal)
PutUvarint32-16             43.0ns ± 0%    43.1ns ± 0%   +0.12%  (p=0.429 n=4+4)
PutUvarint64-16              174ns ± 0%     175ns ± 0%   +0.29%  (p=0.429 n=4+4)

Updates made to functions in gcm.go to enable their matching. An existing
testcase prevents these functions from being replaced by those in encoding/binary
due to import dependencies.

Change-Id: Idb3bd1e6e7b12d86cd828fb29cb095848a3e485a
Reviewed-on: https://go-review.googlesource.com/98136
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
@golang golang locked and limited conversation to collaborators Mar 6, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

3 participants