Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: loop optimization #24240

Open
MichaelTJones opened this issue Mar 4, 2018 · 10 comments
Open

cmd/compile: loop optimization #24240

MichaelTJones opened this issue Mar 4, 2018 · 10 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Milestone

Comments

@MichaelTJones
Copy link
Contributor

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

master (1.11 dev)

Does this issue reproduce with the latest release?

this is a compilation strategy proposal

What operating system and processor architecture are you using (go env)?

amd64 / Darwin (macbook pro)

What did you do?

I have code that iterates without referencing the loop variable. Changing that from:

for i := 0; i < n; i++ {
// no access to i
}

...to...

for i := -n; i < 0; i++ {
// no access to i
}

saves me 4%-5% in my test cases.

Looking at the .S files, I see that there are two gains (as expected): one is that the loop limit needs only be addressable and referenced at the start of the loop--which frees a register over the body of the loop; and also, because the test is against zero and there are dedicated instructions for that rather than loading immediate literals or otherwise.

It seems (from afar) that there could be a rule along the lines of:

if loop variable not referenced in body
and assignment has block structure (i:=)
and type of loop variable is signed
and incr RHS does not depend on additive offset to variable (i++ ok, i*=2 not),
then transform "for v := low; v OP high; incr" into "for v := low-high; v OP 0; incr"

I don't have any comprehensive data about how common this situation may be, but it does help.

What did you expect to see?

no change because i expected the compiler to be doing this.

What did you see instead?

5% speedup

@bradfitz bradfitz changed the title Loop optimization cmd/compile: loop optimization Mar 4, 2018
@bradfitz bradfitz added this to the Unplanned milestone Mar 4, 2018
@bradfitz bradfitz added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Mar 4, 2018
@rasky
Copy link
Member

rasky commented Mar 4, 2018

Since you have the code handy, can you provide a working example that includes a benchmark?

@MichaelTJones
Copy link
Contributor Author

MichaelTJones commented Mar 4, 2018 via email

@RalphCorderoy
Copy link

When n is negated, if that's the style chosen to re-write the loop, then look out for n starting off as the smallest possible value as two's complement representation is asymmetrical.

@RLH
Copy link
Contributor

RLH commented Mar 4, 2018

Isn't the decrement of i followed by a jz (or jnz) another way to write this.
for i := n; i != 0; i-- { }

@RalphCorderoy
Copy link

Writing assembly for various architectures, a pre-decrement test against zero is the normal way I'm used to seeing, yes. But I wanted to warn against a flaw in the OP's suggestion.

@MichaelTJones
Copy link
Contributor Author

MichaelTJones commented Mar 4, 2018 via email

@RalphCorderoy
Copy link

Something else to ponder in implementing this... I hear some folk use a debugger; might they try and inspect the loop's variable and be confused that it's changed sign, or running in the opposite direction? Or perhaps DWARF can represent the transformation for the debugger to reverse? If other compilers have implemented this for other languages then they might give ideas.

@dgryski
Copy link
Contributor

dgryski commented Mar 5, 2018

OTOH, stepping through C code that's been built withgcc -O3 is pretty painful. It's fine as long as you're reading the assembly and not trying to match it up to the C code that produced it.

@MichaelTJones
Copy link
Contributor Author

Ok, here is a simplified and extracted example. I called it "loop_test.go"

https://play.golang.org/p/uRdymTPtTCi

Alas, i don't see such a big difference in this case. The code is not the same, not so much register pressure.

Summary: maybe this is not so important.

@RalphCorderoy
Copy link

I think it's still worth trying to measure gains across more tests, especially as some platforms have more register pressure than others. But I've also a niggling recollection that this has been suggested before and turned down because of debugger confusion, probably by @rsc. :-)

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Projects
None yet
Development

No branches or pull requests

8 participants