-
Notifications
You must be signed in to change notification settings - Fork 17.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmd/compile: optimize TrailingZerosNN by using info from prove pass #25077
Comments
It's actually easy to add to prove, once you know where to look :) You need to add this transformation (Ctz8 -> Ctz8NonZero) within simplifyBlock. It currently just handles OpSlicemask, so change it to a switch to make it handle more ops. Once you're there, use ft.isNonNegative() to gate the transformation. |
We need |
Right, so copy the ft.limits[v.ID] check from isNonNegative, and tweak it. I'm not 100% that ft.limits has the right information though. Try dumping the limits content for the right variable at the right point. If it doesn't, we need to teach update() that v!=0 for an unsigned variable implies lim.umin>0. I can help you with that. |
And BTW, this variant surely has the right limit information already: for x > 0 {
sink = bits.TrailingZeros64(x) // use result
x &= x - 1
} |
Thanks, @rasky, this helps a lot. |
Change https://golang.org/cl/109358 mentions this issue: |
A very typical use of math/bits.TrailingZerosNN is to visit all set bits.
(Among other things, I'd like to do this in some hot code in the runtime.)
This currently compiles on amd64 to:
However, since we know that
x != 0
in the loop, we should be able to skip the CMOVEQ stuff and just use BSFQ directly. Similar (though less dramatic) optimizations apply for small int widths.This seems like information that the prove pass has readily available. I haven't been able to figure out how to use it, though; the prove pass is complicated, and I'm not sure where best to make the change.
There seem like two reasonable options for encoding the information: Add new Ctz8NonZero ops, and change the value's op from Ctz8 to Ctz8NonZero, or giving Ctz8 an auxint indicating non-zero-ness. I weakly prefer the former.
@rasky any chance you could help out on this, if it is easy?
cc also @aclements
The text was updated successfully, but these errors were encountered: