It's actually easy to add to prove, once you know where to look :)
You need to add this transformation (Ctz8 -> Ctz8NonZero) within simplifyBlock. It currently just handles OpSlicemask, so change it to a switch to make it handle more ops. Once you're there, use ft.isNonNegative() to gate the transformation.
A very typical use of math/bits.TrailingZerosNN is to visit all set bits.
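A minimal sketch of that pattern, assuming a 64-bit word. The function name `setBitIndexes` and the slice result are illustrative only and do not come from the runtime code mentioned here:

```go
package main

import "math/bits"

// setBitIndexes returns the positions of the set bits in x, lowest first.
// Hypothetical helper: the name and the slice result are for illustration.
func setBitIndexes(x uint64) []int {
	var idx []int
	for x != 0 {
		// The loop condition guarantees x is non-zero here, so the
		// zero-input case of TrailingZeros64 cannot occur in the body.
		idx = append(idx, bits.TrailingZeros64(x))
		x &= x - 1 // clear the lowest set bit
	}
	return idx
}
```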
(Among other things, I'd like to do this in some hot code in the runtime.)
This currently compiles on amd64 to:
However, since we know that x != 0 in the loop, we should be able to skip the CMOVEQ stuff and just use BSFQ directly. Similar (though less dramatic) optimizations apply for small int widths.
This seems like information that the prove pass has readily available. I haven't been able to figure out how to use it, though; the prove pass is complicated, and I'm not sure where best to make the change.
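For context on why the zero check matters: math/bits defines TrailingZeros64(0) as 64, while the x86 BSF instruction leaves its destination undefined when the source is zero, so the generic lowering needs a fix-up for that one input. A small sketch of the two cases (the helper name `tzExamples` is made up for illustration):

```go
package main

import "math/bits"

// tzExamples returns TrailingZeros64 for a zero and a non-zero input.
// Hypothetical helper, only here to show the defined zero-input result.
func tzExamples() (zero, nonZero int) {
	// Zero input: the result is defined to be the word size, 64.
	// Non-zero input: the result is the index of the lowest set bit.
	return bits.TrailingZeros64(0), bits.TrailingZeros64(0b1000)
}
```

Once the compiler can prove that the argument is non-zero, only the second case is reachable and the fix-up is dead weight.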
There seem to be two reasonable options for encoding the information: add new Ctz8NonZero ops and change the value's op from Ctz8 to Ctz8NonZero, or give Ctz8 an auxint indicating non-zero-ness. I weakly prefer the former.
@rasky any chance you could help out on this, if it is easy?
cc also @aclements