inlining: avoid source deserialization by using volatile inference result #51934
Conversation
Your benchmark job has completed - no performance regressions were detected. A full report can be found here.
Force-pushed from 4479657 to 236295b
@nanosoldier
Your benchmark job has completed - no performance regressions were detected. A full report can be found here.
An additional interesting optimization here might be to also skip copying it the first time and instead delete it, so that only later uses of the same result make a copy (from the serialized cache), if present. Does that make sense?
Force-pushed from 236295b to da70026
@nanosoldier
Your job failed.
Force-pushed from da70026 to 1a28775
@nanosoldier
Your benchmark job has completed - no performance regressions were detected. A full report can be found here.
It looks like your idea improves the performance quite a bit. Thanks!
Force-pushed from 1a28775 to c882bf4
@nanosoldier
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.
Force-pushed from c882bf4 to d901ba0
@nanosoldier
Your benchmark job has completed - no performance regressions were detected. A full report can be found here.
Force-pushed from 9b27727 to f2bd35a
@nanosoldier
Force-pushed from f2bd35a to 4808e7a
@nanosoldier
Your benchmark job has completed - no performance regressions were detected. A full report can be found here.
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.
Force-pushed from 4808e7a to 2019fc3
Currently the inlining algorithm is allowed to use the inferred source of a const-prop'ed call, which is always locally available (since const-prop' results aren't cached globally). For non-const-prop'ed, globally cached calls, however, it undergoes a more expensive process, making a round-trip through the serialized inferred source. We can actually bypass this expensive deserialization when the inferred source of a globally cached result is available locally, i.e. when it has been inferred in the same inference shot. Note that it would be more efficient to propagate the `IRCode` object directly and skip the inflation from `CodeInfo` to `IRCode` as experimented with in #47137, but currently the round-trip through the `CodeInfo` representation is necessary because it often leads to better CFG simplification, and `cfg_simplify!` still seems to be expensive.
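A toy model of the fast path is sketched below; `VolatileInferenceResult` is the real type name from this PR, but the field layout and both helper functions are simplified stand-ins for the actual compiler internals:

```julia
# Toy model of the optimization: a result that was just inferred keeps its
# uncompressed source around, so the inliner can take it directly instead of
# decompressing the globally cached `String` blob.
struct VolatileInferenceResult
    src::Core.CodeInfo  # still-live source from the current inference shot
end

# stand-in for the runtime's expensive IR decompression
decompress_ir(blob::String) = error("deserialization round-trip happens here")

function retrieve_ir_for_inlining(volatile::Union{VolatileInferenceResult,Nothing},
                                  cached_blob::String)
    volatile === nothing || return volatile.src  # fast path: reuse live source
    return decompress_ir(cached_blob)            # slow path: round-trip
end
```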
Force-pushed from 2019fc3 to b8a0065
@nanosoldier
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.
Destructive inlining introduced by #51934 implicitly presupposes that inferred `CodeInfo` source has been compressed to the `String` format when cached, which is not generally true for external `AbstractInterpreter`s that may disable the `may_compress` and/or `may_discard_trees` settings. This commit adds a safeguard to ensure the eligibility of such `CodeInfo` source propagated by `VolatileInferenceResult` for destructive inlining.
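A hedged sketch of that safeguard (the helper name is hypothetical; only `String` and `CodeInfo` match the real types involved):

```julia
# Destructive use of the locally available source is only legal when the
# global cache holds the compressed `String` form. An external
# `AbstractInterpreter` that disables `may_compress`/`may_discard_trees`
# caches the `CodeInfo` object itself, which would alias the source we are
# about to mutate, so that case must fall back to the copying path.
function eligible_for_destructive_inlining(cached_inferred)
    return cached_inferred isa String  # compressed: the local `CodeInfo` is private
end
```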
* adjust to JuliaLang/julia#51934
* more followups
* disable some tests
Currently the inlining algorithm is allowed to use the inferred source of a
const-prop'ed call, which is always locally available (since const-prop'
results aren't cached globally). For non-const-prop'ed, globally cached
calls, however, it undergoes a more expensive process, making a
round-trip through the serialized inferred source.
We can improve efficiency by bypassing the serialization round-trip for
newly-inferred and globally-cached frames. As these frames are never
cached locally, they can be viewed as volatile. This means we can use
their source destructively while inline-expanding them.
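Roughly, "using the source destructively" amounts to skipping the defensive copy; here is a toy illustration with a hypothetical helper:

```julia
# A volatile frame is never consulted again from the local cache, so its
# `CodeInfo` can be spliced into the caller as-is, while the cached path
# must `copy` it to keep the original intact for later users.
function ir_for_inlining(src::Core.CodeInfo, is_volatile::Bool)
    return is_volatile ? src : copy(src)  # volatile: skip the defensive copy
end
```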
The benchmark results show that this optimization achieves a 2-4%
allocation reduction and about a 5% speed-up on the real-world-ish
compilation targets (`allinference`). Note that it would be more efficient
to propagate the `IRCode` object directly and skip the inflation from
`CodeInfo` to `IRCode` as experimented with in #47137, but currently the
round-trip through the `CodeInfo` representation is necessary because it
often leads to better CFG simplification while `cfg_simplify!` remains
expensive (xref: #51960).
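For reference, the `CodeInfo`-to-`IRCode` inflation mentioned above can be reproduced with Julia's compiler internals; these are unstable, version-dependent APIs, so treat this as a sketch rather than a supported recipe:

```julia
# Inflate the typed CodeInfo of a tiny function into the IRCode that the
# inliner and other SSA-IR passes operate on.
f(x) = x + 1
src, _ = only(code_typed(f, (Int,)))         # typed CodeInfo for f(::Int)
mi = only(Base.method_instances(f, (Int,)))  # its MethodInstance
ir = Core.Compiler.inflate_ir(src, mi)       # CodeInfo -> IRCode
```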