cmd/compile: odd inlining heuristic under mid-stack inlining #22310
Is there a metric that tries to predict whether a func call happens often enough for the call overhead to be noticeable? For example, if G is called inside a loop but F isn't, you would likely want to inline G rather than F. I'm not saying this can be measured accurately on a per-func basis, but cases like the one above could produce a clear winner when the two are compared.
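One crude static stand-in for "called often" is loop nesting depth at the call site. The sketch below is only an illustration of that idea, not how cmd/compile decides anything: it parses a small source string with `go/parser` and records the deepest loop nesting at which each plainly named function is called. The helper `loopDepths` and the sample source `src` are hypothetical names made up for this example.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// src is a hypothetical input: F is called once, G is called inside a loop.
const src = `package p

func F() {}
func G() {}

func caller(n int) {
	F()
	for i := 0; i < n; i++ {
		G()
	}
}`

// isLoop reports whether n is a for or range statement.
func isLoop(n ast.Node) bool {
	switch n.(type) {
	case *ast.ForStmt, *ast.RangeStmt:
		return true
	}
	return false
}

// loopDepths returns, for each function called by plain identifier, the
// deepest loop nesting at which a call to it appears. Calls in a loop's
// init/cond/post clauses are counted as inside the loop for simplicity.
func loopDepths(source string) (map[string]int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "example.go", source, 0)
	if err != nil {
		return nil, err
	}
	depths := map[string]int{}
	var stack []ast.Node
	depth := 0
	ast.Inspect(file, func(n ast.Node) bool {
		if n == nil {
			// Leaving the node on top of the stack.
			top := stack[len(stack)-1]
			stack = stack[:len(stack)-1]
			if isLoop(top) {
				depth--
			}
			return true
		}
		stack = append(stack, n)
		if isLoop(n) {
			depth++
		}
		if call, ok := n.(*ast.CallExpr); ok {
			if id, ok := call.Fun.(*ast.Ident); ok {
				if d, seen := depths[id.Name]; !seen || depth > d {
					depths[id.Name] = depth
				}
			}
		}
		return true
	})
	return depths, nil
}

func main() {
	d, err := loopDepths(src)
	if err != nil {
		panic(err)
	}
	fmt.Println("F:", d["F"]) // F: 0
	fmt.Println("G:", d["G"]) // G: 1
}
```

Under this toy metric G (depth 1) would be preferred over F (depth 0) when only one of them fits. A real compiler would need far more than this, since loop depth says nothing about trip counts or how often the enclosing function itself runs.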
GCC uses a number of heuristics. GCC also has multiple inliners that work at different points during the compilation. Among the heuristics are:
Beyond these heuristics (I'm probably missing some) GCC estimates the time required to execute the inlined function at the call site, and compares that to the time required to call the function (these time estimates are all based on profiling data if available). If the difference is positive, and if the size increase is acceptable based on the command line arguments (
The LLVM inliner makes use of a synthetic/static profile (via static branch prediction, Wu & Larus type stuff) to drive inlining decisions when there is no actual dynamic execution profile. I would guess that GCC does the same.
A specialization of Ian's heuristics for Go would be:
See binary.Read for an example where there are many fast-paths that could be inlined.
Here's a concrete example of interfaces with type-switch (simpler than binary.Read) that is a great candidate for inlining.

```go
func ValueOf(v interface{}) Value {
	switch v := v.(type) {
	case bool:
		return valueOfBool(v)
	case int32:
		return valueOfInt(int64(v))
	case int64:
		return valueOfInt(v)
	case uint32:
		return valueOfUint(uint64(v))
	case uint64:
		return valueOfUint(v)
	case float32:
		return valueOfFloat(float64(v))
	case float64:
		return valueOfFloat(v)
	case string:
		return valueOfString(v)
	case []byte:
		return valueOfBytes(v)
	case EnumNumber:
		return valueOfEnum(v)
	default:
		panic(fmt.Sprintf("invalid type: %T", v))
	}
}
```

Each individual
Punting to 1.13, too late for anything major in 1.12.
Suppose:
When midstack inlining (#19348) is enabled (-l=4), F is inlinable under two conditions:
However, this leaves out a third possibility:
This seems counter-intuitive to me: simplifying G from being non-inlinable (case 2) to inlinable (case 3) might cause F to no longer be inlinable.
On the other hand, maybe this is desirable: if we can't inline F+G, maybe it's better to make non-inline calls to F+G than to inline F and make a non-inline call to G.
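To make the oddity concrete, here is a toy cost model. It is a sketch under assumptions, not the compiler's actual accounting: the budget, the per-call cost, the body costs, and the function `fInlinable` are all hypothetical. The assumption is that when G is inlinable, F is charged G's whole body cost, and when G is not inlinable, F is charged only a fixed cost for the call.

```go
package main

import "fmt"

// Hypothetical illustration values, not the compiler's real numbers.
const (
	budget   = 80 // assumed inlining budget
	callCost = 10 // assumed cost charged for a non-inlined call
)

// fInlinable models the assumed rule: G is inlinable if its body fits the
// budget. If G is inlinable, F pays for G's body; otherwise F pays only
// for the call to G.
func fInlinable(costF, costG int) bool {
	gInlinable := costG <= budget
	if gInlinable {
		return costF+costG <= budget
	}
	return costF+callCost <= budget
}

func main() {
	const costF = 60 // assumed cost of F's body, excluding the call to G

	// Case 2: G is too large to inline, so F is charged 60+10 = 70.
	fmt.Println(fInlinable(costF, 100)) // true

	// Case 3: G is simplified and becomes inlinable, so F is charged
	// 60+40 = 100, which exceeds the budget.
	fmt.Println(fInlinable(costF, 40)) // false
}
```

Under these assumed numbers, shrinking G from cost 100 to cost 40 flips F from inlinable to non-inlinable, which is the non-monotonic behavior described above. Falling back to the call cost when F+G does not fit would remove the flip, at the price of inlining F while leaving the call to G in place.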
What inlining heuristics do other compilers use in situations like this?
/cc @ianlancetaylor