some updates and fixes to inlining cost #27857
@nanosoldier |
Can we get that |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan |
Yes, you can have always_inline, but beware: if we can't infer the types of arguments ( |
I suppose another alternative is to aggressively eliminate many current But there's a lot of sense in this proposal. |
return 0
end
argcost = 0
for a in ex.args
What's the motivation for making this non-recursive?
The structure of the IR has changed such that calls no longer contain other calls, so it's just unnecessary.
5acca3f to 6cb12f0
Performance summary: some nice speedups from avoiding inlining things with dynamic dispatches, but some big slowdowns in find functions. The slowdowns were mostly related to the use of |
@nanosoldier |
6cb12f0 to 19d1f8b
@nanosoldier |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan |
19d1f8b to 514767a
Fixes #26446 |
@nanosoldier |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan |
514767a to 8f2e0fb
@nanosoldier |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan |
@nanosoldier |
Your benchmark job has completed - no performance regressions were detected. A full report can be found here. cc @ararslan |
base/compiler/optimize.jl
Outdated
end
return plus_saturate(argcost, T_IFUNC_COST[iidx])
elseif ftyp isa DataType && isdefined(ftyp, :instance)
f = ftyp.instance
This branch should be dead code (also, it's available as the `singleton_type` function).
This might indicate a problem elsewhere, but it's not dead code --- in fact I needed this to fix a regression in an earlier version of the PR. A call to `Core.sizeof` had `typeof(Core.sizeof)` as the type of the function.
base/compiler/optimize.jl
Outdated
end
a = ex.args[2]
if a isa Expr
cost += statement_cost(a, -1, src, spvals, slottypes, params)
cost = plus_saturate(cost, ...)
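For readers unfamiliar with the point of this suggestion: a plain `+=` on an `Int` cost can wrap around when one operand is already near `typemax(Int)`, which would make a very expensive statement look cheap. The sketch below shows one way to write a saturating add. The name `saturating_add` is hypothetical and this is not the compiler's own `plus_saturate`, whose definition is not shown in this thread.

```julia
# Hypothetical helper illustrating saturating addition on Int.
# This is NOT the definition of plus_saturate from base/compiler.
function saturating_add(x::Int, y::Int)
    # add_with_overflow returns the wrapped sum and an overflow flag
    r, overflowed = Base.Checked.add_with_overflow(x, y)
    overflowed || return r
    # On overflow both operands have the same sign, so clamp in that direction
    return x > 0 ? typemax(Int) : typemin(Int)
end

# A cost that is already "infinite" stays at the maximum instead of wrapping
huge = saturating_add(typemax(Int), 10)
small = saturating_add(3, 4)
```

With wrapping arithmetic, `typemax(Int) + 10` would come out negative, and a threshold comparison such as `cost <= limit` would then pass by accident.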
- Instead of always inlining functions marked at-inline, increase the cost threshold 20x
- Don't inline functions inferred not to return
- statement_cost no longer needs to look at nested Exprs in general
- Fix cost of `:copyast`
8f2e0fb to e246245
Bump for Issues like JuliaArrays/StaticArrays.jl#494 are quite frustrating when you always have to worry that the threshold might have gotten reached. |
There might be something wrong with our cost model in this case --- a few dozen arithmetic ops should not be a problem. I'm worried about dynamic dispatches that take thousands of times longer than that. We can add forceinline as well though. |
The example there has a SIMD loop, though, which might be why it doesn't inline. Anyway, having a way to force inlining would be good in the inevitable cases where the cost model gets it wrong.
If the function type is maximally typed ( |
This implements the suggestion in #27857 (comment)
…29258) This implements the suggestion in #27857 (comment)
…29258) This implements the suggestion in #27857 (comment) (cherry picked from commit 8d6c1ce)
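- Instead of always inlining functions marked `@inline`, increase the cost threshold 20x
- Don't inline functions inferred not to return
- `statement_cost` no longer needs to look at nested Exprs in general
- Fix cost of `:copyast`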
This also fixes the regression in `mapslices` (#27417). It was due to inlining a function marked `@inline`, even though the argument types were abstract and the body contained many dynamic calls. I believe most compilers just increase the threshold (by a lot) for functions marked `inline`, and have a separate always_inline to truly force inlining. We should do the same thing; otherwise we end up inlining silly things the user is unlikely to have intended.
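The policy described above can be sketched as a small decision function. Everything here is an illustrative assumption rather than the code in `base/compiler/optimize.jl`: the `Candidate` struct, the `should_inline` function, and the base threshold of 100 are hypothetical; only the 20x multiplier and the "never returns" rule come from the PR description.

```julia
# Hypothetical model of the inlining decision described in this PR.
const BASE_THRESHOLD = 100      # assumed default cost threshold
const INLINE_MULTIPLIER = 20    # from the PR description: 20x for @inline

struct Candidate
    cost::Int               # estimated cost of the function body
    declared_inline::Bool   # true if the user marked it @inline
    returns::Bool           # false if inferred never to return
end

function should_inline(c::Candidate; threshold::Int = BASE_THRESHOLD)
    # Functions inferred not to return are never inlined
    c.returns || return false
    # @inline raises the limit instead of forcing inlining
    limit = c.declared_inline ? threshold * INLINE_MULTIPLIER : threshold
    return c.cost <= limit
end

# A body full of dynamic dispatches can exceed even the raised limit
should_inline(Candidate(5000, true, true))   # false under these assumptions
should_inline(Candidate(500, true, true))    # true: within 20x the base limit
```

Under this model, `@inline` is a strong hint rather than a command, which is why a separate always-inline mechanism is still discussed in the thread.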