inlining: avoid source deserialization by using volatile inference result #51934
Conversation
Your benchmark job has completed - no performance regressions were detected. A full report can be found here.
Force-pushed from 4479657 to 236295b
@nanosoldier
Your benchmark job has completed - no performance regressions were detected. A full report can be found here.
An additional interesting optimization here might be to also skip copying it the first time and instead delete it, so that only later uses of the same result make a copy (from the serialized cache), if present. Does that make sense?
Force-pushed from 236295b to da70026
@nanosoldier
Your job failed.
Force-pushed from da70026 to 1a28775
@nanosoldier
Your benchmark job has completed - no performance regressions were detected. A full report can be found here.
It looks like your idea improves the performance quite a bit. Thanks!
Force-pushed from 1a28775 to c882bf4
@nanosoldier
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.
Force-pushed from c882bf4 to d901ba0
@nanosoldier
Your benchmark job has completed - no performance regressions were detected. A full report can be found here.
Force-pushed from 9b27727 to f2bd35a
@nanosoldier
Force-pushed from f2bd35a to 4808e7a
@nanosoldier
Your benchmark job has completed - no performance regressions were detected. A full report can be found here.
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.
Force-pushed from 4808e7a to 2019fc3
Currently the inlining algorithm is allowed to use the inferred source of a const-prop'ed call, which is always locally available (since const-prop' results aren't cached globally). For non-const-prop'ed, globally cached calls, however, it undergoes a more expensive process, making a round-trip through the serialized inferred source. We can actually bypass this expensive deserialization when the inferred source of a globally cached result is available locally, i.e. when it has been inferred in the same inference shot. Note that it would be more efficient to propagate the `IRCode` object directly and skip the inflation from `CodeInfo` to `IRCode` as experimented with in #47137, but currently the round-trip through the `CodeInfo` representation is necessary because it often leads to better CFG simplification, and `cfg_simplify!` still seems to be expensive.
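A toy model of the fast path is sketched below; `VolatileInferenceResult` is the real type name from this PR, but the field layout and both helper functions are simplified stand-ins for the actual compiler internals:

```julia
# Toy model of the optimization: a result that was just inferred keeps its
# uncompressed source around, so the inliner can take it directly instead of
# decompressing the globally cached `String` blob.
struct VolatileInferenceResult
    src::Core.CodeInfo  # still-live source from the current inference shot
end

# stand-in for the runtime's expensive IR decompression
decompress_ir(blob::String) = error("deserialization round-trip happens here")

function retrieve_ir_for_inlining(volatile::Union{VolatileInferenceResult,Nothing},
                                  cached_blob::String)
    volatile === nothing || return volatile.src  # fast path: reuse live source
    return decompress_ir(cached_blob)            # slow path: round-trip
end
```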
Force-pushed from 2019fc3 to b8a0065
@nanosoldier
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.
Destructive inlining introduced by #51934 implicitly presupposes that inferred `CodeInfo` source has been compressed to the `String` format when cached, which is not generally true for external `AbstractInterpreter`s that may disable the `may_compress` and/or `may_discard_trees` settings. This commit adds a safeguard to ensure the eligibility of such `CodeInfo` source propagated by `VolatileInferenceResult` for destructive inlining.
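A hedged sketch of that safeguard (the helper name is hypothetical; only `String` and `CodeInfo` match the real types involved):

```julia
# Destructive use of the locally available source is only legal when the
# global cache holds the compressed `String` form. An external
# `AbstractInterpreter` that disables `may_compress`/`may_discard_trees`
# caches the `CodeInfo` object itself, which would alias the source we are
# about to mutate, so that case must fall back to the copying path.
function eligible_for_destructive_inlining(cached_inferred)
    return cached_inferred isa String  # compressed: the local `CodeInfo` is private
end
```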
* adjust to JuliaLang/julia#51934
* more followups
* disable some tests
Currently the inlining algorithm is allowed to use the inferred source of a
const-prop'ed call, which is always locally available (since const-prop'
results aren't cached globally). For non-const-prop'ed, globally cached
calls, however, it undergoes a more expensive process, making a
round-trip through the serialized inferred source.
We can improve efficiency by bypassing the serialization round-trip for
newly-inferred and globally-cached frames. As these frames are never
cached locally, they can be viewed as volatile. This means we can use
their source destructively while inline-expanding them.
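Roughly, "using the source destructively" amounts to skipping the defensive copy; here is a toy illustration with a hypothetical helper:

```julia
# A volatile frame is never consulted again from the local cache, so its
# `CodeInfo` can be spliced into the caller as-is, while the cached path
# must `copy` it to keep the original intact for later users.
function ir_for_inlining(src::Core.CodeInfo, is_volatile::Bool)
    return is_volatile ? src : copy(src)  # volatile: skip the defensive copy
end
```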
The benchmark results show that this optimization achieves a 2-4%
allocation reduction and about a 5% speed-up on the real-world-ish
compilation targets (`allinference`). Note that it would be more efficient
to propagate the `IRCode` object directly and skip the inflation from
`CodeInfo` to `IRCode` as experimented with in #47137, but currently the
round-trip through the `CodeInfo` representation is necessary because it
often leads to better CFG simplification while `cfg_simplify!` remains
expensive (xref: #51960).
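For reference, the `CodeInfo`-to-`IRCode` inflation mentioned above can be reproduced with Julia's compiler internals; these are unstable, version-dependent APIs, so treat this as a sketch rather than a supported recipe:

```julia
# Inflate the typed CodeInfo of a tiny function into the IRCode that the
# inliner and other SSA-IR passes operate on.
f(x) = x + 1
src, _ = only(code_typed(f, (Int,)))         # typed CodeInfo for f(::Int)
mi = only(Base.method_instances(f, (Int,)))  # its MethodInstance
ir = Core.Compiler.inflate_ir(src, mi)       # CodeInfo -> IRCode
```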