inlining: avoid source deserialization by using volatile inference result #51934

Merged
aviatesk merged 5 commits into master from avi/inferred_result on Nov 6, 2023

Conversation

@aviatesk (Member) commented Oct 30, 2023

Currently the inlining algorithm is allowed to use the inferred source of a
const-prop'ed call directly, since that source is always locally available
(const-prop'ed results aren't cached globally). For non-const-prop'ed,
globally cached calls, however, inlining goes through a more expensive
process: a round-trip through the serialized inferred source.
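
For a rough picture of what that round-trip involves: globally cached source is stored in a compressed form and has to be inflated back into a `CodeInfo` before the inliner can work on it. A minimal sketch using internal reflection APIs (version-specific; shown only to illustrate the inflation step):

```julia
# Globally cached source is kept compressed; `Base.uncompressed_ir` performs
# the decompression/inflation back to a `CodeInfo` object.
m = first(methods(+, (Int, Int)))
ci = Base.uncompressed_ir(m)
@assert ci isa Core.CodeInfo
```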

We can improve efficiency by bypassing the serialization round-trip for
newly-inferred and globally-cached frames. As these frames are never
cached locally, they can be viewed as volatile. This means we can use
their source destructively while inline-expanding them.
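
To sketch the shape of the idea (the wrapper name matches the PR title's "volatile inference result"; the field layout here is an assumption, not the exact implementation):

```julia
# A minimal sketch of the "volatile" wrapper: it marks a newly inferred,
# globally cached result whose locally held `CodeInfo` will not be read
# again locally, so the inliner may consume (mutate) it in place instead
# of decompressing a fresh copy from the global cache.
struct VolatileInferenceResult
    inf_result::Core.Compiler.InferenceResult
end
```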

The benchmark results show that this optimization achieves a 2-4%
allocation reduction and about a 5% speed-up on the real-world-ish
compilation targets (`allinference`).

Note that it would be more efficient to propagate the `IRCode` object
directly and skip the inflation from `CodeInfo` to `IRCode`, as experimented
with in #47137. Currently, however, the round-trip through the
`CodeInfo` representation is necessary because it often leads to better
CFG simplification, while `cfg_simplify!` remains expensive (xref: #51960).
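
Concretely, the inflation step this PR keeps looks roughly like the following (internal compiler APIs; exact signatures are version-specific, so treat this as a sketch):

```julia
foo(x) = x + 1
foo(1)  # make sure a compiled specialization exists

# Inferred `CodeInfo` plus its `MethodInstance` (internal reflection helpers):
ci, _rettype = only(code_typed(foo, (Int,)))
mi = first(Base.method_instances(foo, (Int,)))

# The CodeInfo -> IRCode inflation that inlining still performs:
ir = Core.Compiler.inflate_ir(ci, mi)
```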

@aviatesk changed the title from "avoid source deserialization by using locally available inferred source" to "inlining: avoid source deserialization by using locally available inferred source" on Oct 30, 2023
@nanosoldier (Collaborator)

Your benchmark job has completed - no performance regressions were detected. A full report can be found here.

Base automatically changed from avi/const_results to master on October 31, 2023 01:32
@aviatesk force-pushed the avi/inferred_result branch from 4479657 to 236295b on October 31, 2023 01:36
@aviatesk (Member, Author)

@nanosoldier runbenchmarks("inference", vs=":master")

@nanosoldier (Collaborator)

Your benchmark job has completed - no performance regressions were detected. A full report can be found here.

@vtjnash (Member) commented Oct 31, 2023

An interesting additional optimization here might be to also skip copying it the first time and instead delete it, so that only later uses of the same result make a copy (from the serialized cache), if present. Does that make sense?
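
A sketch of that "take, don't copy" idea (all names here are hypothetical, purely for illustration): the first inline-expansion takes ownership of the locally held source, and only later uses pay the cost of copying/deserializing from the global cache:

```julia
# Hypothetical illustration of the suggestion above, not the PR's actual code.
mutable struct LocalSourceHolder
    src::Union{Core.CodeInfo,Nothing}  # locally held, not-yet-consumed source
    cached::Core.CodeInfo              # stands in for the serialized cache entry
end

function take_source!(holder::LocalSourceHolder)
    src = holder.src
    if src !== nothing
        holder.src = nothing  # first use: hand over ownership, no copy made
        return src
    end
    # Later uses fall back to copying from the (serialized) cache, if present.
    return copy(holder.cached)
end
```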

@aviatesk force-pushed the avi/inferred_result branch from 236295b to da70026 on October 31, 2023 07:26
@aviatesk (Member, Author)

@nanosoldier runbenchmarks("inference", vs=":master")

@nanosoldier (Collaborator)

Your job failed.

@aviatesk force-pushed the avi/inferred_result branch from da70026 to 1a28775 on October 31, 2023 07:49
@aviatesk (Member, Author)

@nanosoldier runbenchmarks("inference", vs=":master")

@nanosoldier (Collaborator)

Your benchmark job has completed - no performance regressions were detected. A full report can be found here.

@aviatesk (Member, Author)

> An interesting additional optimization here might be to also skip copying it the first time and instead delete it, so that only later uses of the same result make a copy (from the serialized cache), if present.

It looks like your idea improves performance quite a bit. Thanks!

@aviatesk force-pushed the avi/inferred_result branch from 1a28775 to c882bf4 on October 31, 2023 09:44
@aviatesk (Member, Author)

@nanosoldier runbenchmarks("inference", vs=":master")

@nanosoldier (Collaborator)

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

@aviatesk force-pushed the avi/inferred_result branch from c882bf4 to d901ba0 on October 31, 2023 14:31
@aviatesk (Member, Author)

@nanosoldier runbenchmarks("inference", vs=":master")

@nanosoldier (Collaborator)

Your benchmark job has completed - no performance regressions were detected. A full report can be found here.

@aviatesk force-pushed the avi/inferred_result branch 2 times, most recently from 9b27727 to f2bd35a on November 1, 2023 17:05
@aviatesk (Member, Author) commented Nov 1, 2023

@nanosoldier runbenchmarks("inference", vs=":master")

@aviatesk force-pushed the avi/inferred_result branch from f2bd35a to 4808e7a on November 1, 2023 17:39
@aviatesk (Member, Author) commented Nov 1, 2023

@nanosoldier runbenchmarks("inference", vs=":master")

@nanosoldier (Collaborator)

Your benchmark job has completed - no performance regressions were detected. A full report can be found here.

@nanosoldier (Collaborator)

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

@aviatesk force-pushed the avi/inferred_result branch from 4808e7a to 2019fc3 on November 2, 2023 01:17
@aviatesk changed the title from "inlining: avoid source deserialization by using locally available inferred source" to "inlining: avoid source deserialization by using volatile inference result" on Nov 2, 2023
Currently the inlining algorithm is allowed to use the inferred source of a
const-prop'ed call, which is always locally available (since const-prop'ed
results aren't cached globally). For non-const-prop'ed, globally cached
calls, however, it undergoes a more expensive process, making a
round-trip through the serialized inferred source. We can actually bypass
this expensive deserialization when the inferred source for a globally
cached result is available locally, i.e. when it has been inferred in the
same inference shot.

Note that it would be more efficient to propagate the `IRCode` object
directly and skip the inflation from `CodeInfo` to `IRCode`, as experimented
with in #47137, but currently the round-trip through the `CodeInfo`
representation is necessary because it often leads to better CFG
simplification, and `cfg_simplify!` still seems to be expensive.
@aviatesk force-pushed the avi/inferred_result branch from 2019fc3 to b8a0065 on November 6, 2023 02:41
@aviatesk (Member, Author) commented Nov 6, 2023

@nanosoldier runbenchmarks("inference", vs=":master")

@nanosoldier (Collaborator)

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

@aviatesk merged commit fae6b78 into master on Nov 6, 2023
3 checks passed
@aviatesk deleted the avi/inferred_result branch on November 6, 2023 15:34
aviatesk added a commit that referenced this pull request Nov 7, 2023
Destructive inlining introduced by #51934 implicitly
presupposes that inferred `CodeInfo` source has been compressed to the
`String` format when cached, which is not generally true for external
`AbstractInterpreter`s that may disable the `may_compress`
and/or `may_discard_trees` settings. This commit adds a safeguard to
ensure the eligibility of such `CodeInfo` source propagated by
`VolatileInferenceResult` for destructive inlining.
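
A sketch of the shape of such a safeguard (the predicate name is hypothetical; the real check lives inside the inliner):

```julia
# Hypothetical illustration: only consume the local source destructively when
# the global cache holds a compressed `String` copy that later users can
# still inflate from. If the cache instead holds the `CodeInfo` object itself
# (e.g. an external AbstractInterpreter with `may_compress` or
# `may_discard_trees` disabled), mutating the local source would corrupt
# the cached copy.
safe_to_inline_destructively(cached_source) = cached_source isa String
```
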
aviatesk added a commit to JuliaDebug/Cthulhu.jl that referenced this pull request Nov 7, 2023
aviatesk added 2 commits that referenced this pull request Nov 8, 2023
aviatesk added a commit to JuliaDebug/Cthulhu.jl that referenced this pull request Nov 9, 2023
* adjust to JuliaLang/julia#51934

* more followups

* disable some tests