Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: unaligned load constant offsets on ppc64 #44739

Closed
randall77 opened this issue Mar 2, 2021 · 10 comments
Closed

cmd/compile: unaligned load constant offsets on ppc64 #44739

randall77 opened this issue Mar 2, 2021 · 10 comments
Labels
arch-ppc64x FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@randall77
Copy link
Contributor

package main

type T struct {
	x [4]byte
	y [8]byte
}

func f(a T, _ byte, b T) bool {
	return a.y == b.y
}
% GOOS=linux GOARCH=ppc64le ~/go1.16/bin/go tool compile -S ~/gowork/tmp1.go
compile: invalid offset for DS form load/store 00004 (/Users/khr/gowork/tmp1.go:9)	MOVD	"".b+17(R1), R4

This has been failing since at least 1.13.

My reading of the PPC64.rules is that 2 byte and 4 byte operations can be at any offset, but 8 byte operations need to be at a multiple of 4. (Search for %4 == 0 in those rules). To be precise, the operations can happen at any address, but we can't encode constant offsets that aren't a multiple of 4. Do I have that correct?

Split off from #42385

@laboger

@laboger
Copy link
Contributor

laboger commented Mar 2, 2021

@randall77 Yes, that is correct. The error is when encoding the instruction word.
After looking at this a bit I have a suspicion that the rules to combine byte loads and stores are causing the problem here. For variables on the stack, the entire offset is not being considered.

@randall77
Copy link
Contributor Author

Maybe the assembler could use REGTMP to add in odd offsets, if needed.

MOVD 19(R1), R2

to

ADD $3, R1, R31
MOVD 16(R31), R2

@laboger
Copy link
Contributor

laboger commented Mar 2, 2021

There are several options: 2 MOVWs, Create the full address 19+R1 in a base register, load the offset into an index register and use an indexed load or store. I've been avoiding this to keep the one to one mapping of Go -> power asm. But perhaps generating 2 instructions is better than failing in this way.

I'm thinking the indexed load or store would be best and we could do that in the rules, although would need a way to get the current stack offset at that point (which is found using AddAux in ppc64/ssa.go).

@randall77
Copy link
Contributor Author

Unfortunately, we don't know the offset of local variables at lowering rewrite time. But we do know the alignment, so maybe we load-combine only if the type of the local variable is at least 4-byte aligned?

I think I'd rather go the assembly modification route. Weird offsets will be rare, so we won't be adding assembly very often.
And it will let us remove the i%4==0 tests from the rewrite rules, which would more than pay for the extra instruction in many cases.

@cherrymui
Copy link
Member

I think I'd rather go the assembly modification route.

I'd go with that, too. This happened earlier on ARM64, caused quite a number of bugs, and we had to introduce several special cases around it. Eventually I changed the assembler and it's all done.

@laboger
Copy link
Contributor

laboger commented Mar 2, 2021

Yep just realized that.
I remember that those combining rules are mostly used with byte slices so if we add alignment checks most of those won't be lowered. But they could be changed to generate the full address in the base register and then have a 0 offset. That would also allow the i%4 check in those rules to be removed, allowing more cases to be matched.

I agree that we should fix this in the assembler though and get rid of those checks for alignment of 4 in the rules. And then the assembler won't fail if we hit this case. But I think @pmur has a change for asm9.go that we should wait on before changing this. I see there are multiple places to fix.

@laboger
Copy link
Contributor

laboger commented Mar 4, 2021

I think I can fix this in ppc64/ssa.go when processing the MOVD loads and stores. I'm testing out a change now.

@dmitshur dmitshur added NeedsFix The path to resolution is known, but the work has not been done. arch-ppc64x labels Mar 5, 2021
@dmitshur dmitshur added this to the Go1.17 milestone Mar 5, 2021
@gopherbot
Copy link

Change https://golang.org/cl/299789 mentions this issue: cmd/compile/internal: improve handling of DS form offsets on ppc64x

gopherbot pushed a commit that referenced this issue Mar 10, 2021
In the ppc64 ISA DS form loads and stores are restricted to offset
fields that are a multiple of 4. This is currently handled with checks
in the rules that generate MOVDload, MOVWload, MOVDstore and
MOVDstorezero to prevent invalid instructions from getting to the
assembler.

An unhandled case was discovered which led to the search for a better
solution to this problem. Now, instead of checking the offset in the
rules, this will be detected when processing these Ops in
ssaGenValues in ppc64/ssa.go. If the offset is not valid, the address
of the symbol to be loaded or stored will be computed using the base
register + offset, and that value used in the new base register.
With the full address in the base register, the offset field can be
zero in the instruction.

Updates #44739

Change-Id: I4f3c0c469ae70a63e3add295c9b55ea0e30ef9b3
Reviewed-on: https://go-review.googlesource.com/c/go/+/299789
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
@laboger
Copy link
Contributor

laboger commented Mar 22, 2021

I think this is done?

@cherrymui
Copy link
Member

I think so. Thanks.

@golang golang locked and limited conversation to collaborators Mar 22, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
arch-ppc64x FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

5 participants