
0.5.0-rc2 Up to 2x performance regression loading precompiled packages #18030

Closed
ufechner7 opened this issue Aug 15, 2016 · 17 comments
Labels
compiler:precompilation (Precompilation of modules), performance (Must go faster)
Milestone
0.5.x
Comments

@ufechner7

Julia 0.4.6

tic(); using PyPlot; toc()
elapsed time: 2.569278583 seconds

Julia 0.5.0-rc2

tic(); using PyPlot; toc();
elapsed time: 5.360103935 seconds

This happens with precompiled packages, therefore (to my understanding) the root cause cannot be that the new LLVM version is slower at compiling.
It would be very nice if this regression could be fixed.
Computer: Linux 64 bit, Ubuntu 14.04, i7-3770 CPU @ 3.40GHz × 4
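
For anyone reproducing this, a minimal sketch of the measurement (the Base.compilecache call and the cache path are assumptions about a default 0.5 setup, not something from this report): force precompilation up front so that the tic()/toc() timing reflects loading the existing cache rather than generating it.

# Sketch only: regenerate the cache first, then time the load.
Base.compilecache("PyPlot")   # writes PyPlot.ji (assumed location: ~/.julia/lib/v0.5)
tic(); using PyPlot; toc()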

@yuyichao
Contributor

Precompilation doesn't remove codegen overhead. Although that can't explain all of the difference, it seems that at least 50% of the time is indeed spent in LLVM.

@ufechner7
Author

Some more results. They are not consistent: loading Gtk (master branch) is even faster on 0.5 compared to 0.4.6. The slowdown factor for loading JuMP is 1.5.

Julia 0.4.6
tic(); using PyPlot; toc()
elapsed time: 2.569278583 seconds

julia> tic(); using Gtk; toc();
elapsed time: 1.424508958 seconds

julia> tic(); using JuMP; toc();
elapsed time: 0.617230441 seconds


Julia 0.5.0-rc2

tic(); using PyPlot; toc();
elapsed time: 5.360103935 seconds

julia> tic(); using Gtk; toc();
elapsed time: 0.943486983 seconds

julia> tic(); using JuMP; toc();
elapsed time: 0.910472513 seconds

@ufechner7 ufechner7 changed the title 0.5.0-rc2 2x performance regression loading precompiled packages 0.5.0-rc2 Up to 2x performance regression loading precompiled packages Aug 15, 2016
@ufechner7
Author

Why is precompilation not removing the codegen overhead? I thought that this was exactly the purpose of precompilation?

@tkelman
Contributor

tkelman commented Aug 15, 2016

We don't currently save native code in the .ji files, so LLVM still has work to do.

@ufechner7
Author

ufechner7 commented Aug 15, 2016

Which data format is stored in the .ji files?

@yuyichao
Contributor

Serialized (inferred) AST.
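
For illustration (the cache directory below is an assumption about a default 0.5 install, not something stated in this thread): the precompile caches are the .ji files under ~/.julia/lib/v0.5, and since they hold the serialized AST rather than native code, `using` still pays for LLVM codegen when they are loaded.

# Sketch: list the precompile caches and their sizes.
cachedir = joinpath(homedir(), ".julia", "lib", "v0.5")   # assumed default location
for f in filter(x -> endswith(x, ".ji"), readdir(cachedir))
    println(rpad(f, 30), " ", filesize(joinpath(cachedir, f)), " bytes")
end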

@ViralBShah ViralBShah added performance Must go faster compiler:precompilation Precompilation of modules labels Aug 15, 2016
@ViralBShah ViralBShah added this to the 0.5.x milestone Aug 15, 2016
@vtjnash vtjnash removed their assignment Aug 15, 2016
@vtjnash
Member

vtjnash commented Aug 15, 2016

Half is due to LLVM. The other half is due to jl_recache_types. The worklist in that function is optimized for hundreds of items (as it had on v0.4); on v0.5, it now often has hundreds of thousands.
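
A rough Julia illustration of the scaling problem (jl_recache_types is C code and structured differently, so this is only an analogy): a linear scan per item is fine for hundreds of entries, but makes the whole pass roughly quadratic at hundreds of thousands, whereas a hashed lookup stays cheap.

# Toy comparison only, not the actual recache logic.
function count_hits(worklist, queries)
    hits = 0
    for q in queries
        hits += (q in worklist)   # linear scan for a Vector, hashed lookup for a Set
    end
    return hits
end

items = collect(1:100000)
queries = items[1:1000]
@time count_hits(items, queries)       # Vector: ~10^8 comparisons
@time count_hits(Set(items), queries)  # Set: ~10^3 lookups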

@ufechner7
Author

Why does it have so many items now?

@ViralBShah
Member

From a user perspective, it would be nice if the PyPlot loading time were the same as before or less. I hope there is something we can do here.

@stevengj
Member

@ufechner7, in 0.5, every function corresponds to a unique type.
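
A quick REPL illustration of that point (example code, not from the thread):

julia> f(x) = x + 1;

julia> typeof(f) === typeof(sin)   # every generic function has its own singleton type
false

julia> isa(f, Function)            # and all of those types are subtypes of Function
true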

@ufechner7
Author

Is there a way to determine the number of unique types that are defined?

@JeffBezanson
Member

Most of the types that exist in the system are not explicitly defined but derived, mostly tuples of various combinations of types. It's possible to get counts by poking into the system, but I'm not really sure how it would help to know.
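
For example (illustrative only), a tuple type like this is created on demand the moment some code needs it, without ever being declared anywhere:

julia> typeof((1, 2.0, "x"))   # a derived type; exact printed form varies slightly by version
Tuple{Int64,Float64,String}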

@ufechner7
Author

Well, if the performance of jl_recache_types for a large number of types is the bottleneck, it would be good to know the number in different scenarios. For example, is this number really so much higher after loading PyPlot than after loading Gtk?

@stevengj
Member

Maybe just add printf("flagref_list.len = %zd", flagref_list.len); in jl_recache_types?

@ufechner7
Author

Same results with 0.5-rc3.

@vtjnash
Member

vtjnash commented Aug 24, 2016

Fixed by #18191. Now there should only be dozens of types to recache.

@vtjnash vtjnash closed this as completed Aug 24, 2016
@vtjnash
Member

vtjnash commented Aug 24, 2016

fwiw, I blame @carnaval for this regression. :P
