Optimize maybe_macroexpand
#1991
Conversation
Try this Pull Request! Open Julia and type:

```julia
julia> import Pkg

julia> Pkg.activate(temp=true)

julia> Pkg.add(url="https://github.com/fonsp/Pluto.jl", rev="rh/maybe_macroexpand")

julia> using Pluto
```
```diff
@@ -988,9 +992,15 @@ function maybe_macroexpand(ex::Expr; recursive = false, expand_bind = true)
     end

     if recursive && (result isa Expr)
         Expr(result.head, maybe_macroexpand.(result.args; recursive = recursive, expand_bind = expand_bind)...)
```
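For reference, a loop-based alternative to the broadcast in this hunk could look like the sketch below (an illustration of the idea, not necessarily the exact code this PR ends up with):

```julia
if recursive && (result isa Expr)
    # Fill a plain vector in a loop instead of broadcasting, so the compiler
    # does not have to specialize the broadcast machinery just-in-time.
    expanded = Vector{Any}(undef, length(result.args))
    for (i, arg) in enumerate(result.args)
        expanded[i] = maybe_macroexpand(arg; recursive = recursive, expand_bind = expand_bind)
    end
    Expr(result.head, expanded...)
end
```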
Getting rid of broadcasting is totally fine if it speeds stuff up, go for loops!

BUT, did you see how it performed with `map`?

```julia
map(result.args) do x
    maybe_macroexpand(x, recursive=recursive, expand_bind=expand_bind)
end
```

Just curious if that takes similar compilation-related performance hits.
> Just curious if that takes similar compilation-related performance hits

Noice. Well spotted. `map` is fine too.

Benchmarks below. All after restarting Julia with `julia --project -ie 'using Pluto'`.
**with broadcasting (original)**

```julia
julia> @time @eval Pluto.ExpressionExplorer.maybe_macroexpand(:(@time 1));
  0.306775 seconds (894.16 k allocations: 48.205 MiB, 22.07% gc time, 99.59% compilation time)
```

I think this is so bad because the types are unknown and multiple types show up in various calls in the recursion, which then need to be compiled just-in-time. Also, broadcasting takes long to compile; see `@code_typed identity.([1, 2])` for more information.
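A quick way to eyeball that compile cost is to compare the typed IR of the broadcast with its `map` equivalent (outputs omitted here, the broadcast one is long):

```julia
julia> @code_typed identity.([1, 2]);       # typed IR going through the Broadcast machinery

julia> @code_typed map(identity, [1, 2]);   # the map equivalent, for comparison
```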
**with map**

```julia
julia> @time @eval Pluto.ExpressionExplorer.maybe_macroexpand(:(@time 1));
  0.079647 seconds (86.74 k allocations: 4.639 MiB, 101.12% compilation time)
```
**with a loop**

```julia
julia> @time @eval Pluto.ExpressionExplorer.maybe_macroexpand(:(@time 1));
  0.090846 seconds (87.07 k allocations: 4.655 MiB, 99.57% compilation time)
```
This PR optimizes and fixes inference for `maybe_macroexpand` and `split_funcname`. This reduces the time to first X (TTFX) and the running time.

The reason that better type inference lowers TTFX is that `maybe_macroexpand` is called somewhere inside `SessionActions.open`. Via the `precompile` that we've put on `SessionActions.open`, the compiler uses type inference to figure out which methods it needs to compile. However, if the compiler cannot infer some type when stepping through all the function bodies, it cannot compile the called methods. For example, we can make a function `g` which calls `f`, but make the object that is passed into `f` non-inferable via `Base.inferencebarrier`, and then call `precompile` on the outer function. When we do that, we can see that Julia didn't compile any specializations of `f`.
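A minimal sketch of that experiment; the `methodinstances` helper for listing compiled specializations is assumed to come from the MethodAnalysis.jl package:

```julia
julia> f(x) = x + 1;

julia> g(x) = f(Base.inferencebarrier(x));  # hide the type of x from inference

julia> precompile(g, (Int,))
true

julia> using MethodAnalysis

julia> methodinstances(f)  # no specializations of f were compiled
Core.MethodInstance[]
```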
The compilation of `f` will instead happen just-in-time when we call `g`. On the other hand, without the `inferencebarrier`, precompilation hits the inner and the outer function straight away, as the continuation of the sketch shows.
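Continuing the sketch (outputs illustrative):

```julia
julia> g(1)  # the runtime dispatch compiles f just-in-time
2

julia> methodinstances(f)
1-element Vector{Core.MethodInstance}:
 MethodInstance for f(::Int64)

julia> h(x) = f(x);  # like g, but without the inference barrier

julia> precompile(h, (Float64,))
true

julia> methodinstances(f)  # precompiling h also compiled f(::Float64)
2-element Vector{Core.MethodInstance}:
 MethodInstance for f(::Int64)
 MethodInstance for f(::Float64)
```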
Another thing that saved surprisingly much compilation time is getting rid of broadcasting. It sounds a bit annoying, and I love broadcasting too, but in some cases it is worth it if it means extra compilation time for every person that is running Pluto. See also Chris saying "I'm the compiler now!" (SciML/OrdinaryDiffEq.jl#1465). I don't think we have to get rid of all broadcasts, mainly just in places where type inference falls back to `Any` and gives up on precompiling. These cases can quite easily be found via JET.jl and `@report_opt annotate_types=true SessionActions.open`; just search for "failed to optimize" in the output, as in the sketch below.
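A sketch of how to run that report; the arguments to `SessionActions.open` here are placeholders (it takes a server session and a notebook path):

```julia
julia> using JET, Pluto

julia> session = Pluto.ServerSession();

# Analyzes the call via abstract interpretation without fully running it;
# grep the (long) report for "failed to optimize".
julia> @report_opt annotate_types=true Pluto.SessionActions.open(session, "notebook.jl")
```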
Anyway, time for the benchmarks:
**`maybe_macroexpand`**: branch `main` vs. branch `rh/maybe_macroexpand` (this PR). So, that's about a 50 MiB reduction in allocations on `SessionActions.open`.

cc: @ghaetinger