New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmd/compile: consider teaching prove about unexported integer fields #28952
Comments
The only disadvantage I can imagine with doing this is that exporting the field or adding a method that sets the offset to |
We don’t do much cross-function (whole package) analysis, and it’s all before SSA: typechecking, escape analysis, inlining. Doing whole package analysis in SSA is possible but would make concurrent compilation much less effective. And it’s not really built for it. I’d say this is probably unlikely to happen. |
I imagined that the answer would be no, but I still think it's worth writing this down in an issue :) For example, I think this makes using unsigned integers for offset/index fields reasonable, if BCE matters. |
Can the compiler hoist the bound check outside the loop, if the integer variable is only increasing and cannot overflow? In this case, if it panics it will panic on the first iteration, and if the first iteration doesn't panic, it won't panic. This would be a local optimization, and I think it would bring most of the performance benefit. |
@cherrymui you mean rewriting the code to be like:
I'm not sure how that would work, though. If there's an actual out of bounds panic, what line would the panic point to? Presumably the original Another workaround for users could be to do that transformation themselves, meaning that there wouldn't be a need for the field to be unsigned. That doesn't work at the moment, but could be a simple enough change to the prove pass, without needing to care about fields:
|
@mvdan yes. The compiler can annotate the panic as the same line number of |
Or say it this way: this statement does have a bound check (which is expected), just not in every iteration. |
This sounds eerily familiar: |
Sounds doable. I think I'll raise a separate issue about that non-field BCE failure I showed at the end of my last comment, as I don't think it's related to this issue or its potential fix. |
Separated that into #28956. |
@cherrymui if the loop iterations have observable side effects, I think it’d be strange to panic partway through at a line number previous to the loop. |
@josharian to be clear, I don't mean to do anything with line numbers. Currently the compiler generates a bound check, with the same line number, runs on every iteration. What I mean is that the compiler generates a bound check, with the same line number, runs only once. Side effect is a good point, though. I can see it is harder to transform code like
It'd probably need to unroll the loop once, which is more complex and may or may not be beneficial when the iteration number is small. |
Consider the following program:
Lengths are signed,
int
is simple, and it's the default integer type, so it's understandable why most indexes and offsets are signed. This works fine in our first example; we get no bounds checks.However, this breaks down when we start using struct field integers, if the loop logic is split between multiple methods. We can see that the second example has a bounds check. Only making the field unsigned is enough to convince prove/BCE that the offset is always within the bounds of the slice, presumably because in the signed integer field case the compiler isn't smart enough to figure out that
t.intOff >= 0
in all cases.For example, see https://go-review.googlesource.com/c/go/+/150919, where the JSON decoder gets ~1.5% faster by just making the offset field unsigned, but at the cost of adding more type conversions in a bunch of places.
I presume that we could prove that a signed integer field is never negative, granted that it's an unexported field and is only ever assigned non-negative values, such as:
t.field = N
, where N is bounded to be non-negativeif t.field < N { t.field++ }
, where N is bounded to be non-negativeThe JSON example above does use
d.off = len(d.data) + 1
to mark an EOF, which could get the field to overflow iflen(d.data)
is the maximum integer value, but I presume we could rewrite it to used.off = len(data)
. Otherwise, all the assignments seem to keep the offset within non-negative bounds.Ideally, the prove pass would be smart enough to also work with variations of these rules, such as the code below. But that could be tracked as a separate issue.
If there's a good reason why the prove pass can't ever deal with fields, please feel free to close this issue. I'm working with the assumption that it's possible, but hasn't been implemented yet.
/cc @aclements @rasky @josharian @bradfitz
The text was updated successfully, but these errors were encountered: