New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmd/compile: combine AND 0xff and MOVQload into MOVBQZXload #15105
Comments
|
CL https://golang.org/cl/21508 mentions this issue. |
Where possible replace ANDQ with MOV?ZX. Takes care that we don't regress wrt bounds checking, for example [1000]int{}[i&255]. According to "Intel 64 and IA-32 Architectures Optimization Reference Manual" Section: "3.5.1.13 Zero-Latency MOV Instructions" MOV?ZX instructions have zero latency on newer processors. Updates #15105 Change-Id: I63539fdbc5812d5563aa1ebc49eca035bd307997 Reviewed-on: https://go-review.googlesource.com/21508 Reviewed-by: Айнар Гарипов <gugl.zadolbal@gmail.com> Reviewed-by: David Chase <drchase@google.com>
code from
code from
So we are now rewriting the AND to a MOVZX. Alexandru is right, combining the extension with the load in |
The LoadReg case reminds me a lot of #15300. I know you have serious reservations about a post-regalloc peep pass, Keith, but it does seem like it might be the right way to handle this small, specific class of optimizations. And perhaps with docs and code review we could keep it from growing past the bare minimum necessary. |
CL https://golang.org/cl/28190 mentions this issue. |
CL https://golang.org/cl/31639 mentions this issue. |
According to "Intel 64 and IA-32 Architectures Optimization Reference Manual" Section: "3.5.1.13 Zero-Latency MOV Instructions" MOV?ZX instructions have zero latency on newer processors. during make.bash: (ANDLconst [0xFF] x) -> (MOVBQZX x) applies 422 times (ANDLconst [0xFFFF] x) -> (MOVWQZX x) applies 114 times Updates #15105 Change-Id: I10933af599de3c26449c52f4b5cd859331028f39 Reviewed-on: https://go-review.googlesource.com/31639 TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com> Run-TryBot: David Chase <drchase@google.com>
For cases where we already have the ops, combine sign or zero extension with the previous load (even if the load is larger width). Update #15105 Change-Id: I76c5ddd69e1f900d2a17d35503083bd3b4978e48 Reviewed-on: https://go-review.googlesource.com/28190 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
I noticed that for uint&0xFF go tip on amd64 generates e.g.:
0x001d 00029 (main.go:10) MOVQ "".u+8(FP), DX
0x0022 00034 (main.go:10) ANDQ $255, DX
http://play.golang.org/p/CKhi_nt615
It might be possible to make these a zero latency MOV that does not require an ALU by using MOVZX for the applicable AND values or it might be possible to change the previous load to be MOVZX.
0x0022 00034 (main.go:10) MOVBQZX DL, DX
See "Intel 64 and IA-32 Architectures Optimization Reference Manual" Section: "3.5.1.13 Zero-Latency MOV Instructions"
The text was updated successfully, but these errors were encountered: