Improving package load times #4373

Closed
amitmurthy opened this issue Sep 26, 2013 · 42 comments

Labels
performance (Must go faster)

Comments
@amitmurthy
Contributor

Some background first:

  • AWS.jl (https://github.com/amitmurthy/AWS.jl) has 500+ type definitions and 900+ function definitions.
  • The bulk of the code is currently pre-generated from the EC2 WSDL. The S3 code is handwritten, since the S3 WSDL does not map cleanly onto the S3 REST API - but I could write a spec for S3 API generation too. The generated code is around 11,000 lines.
  • @loladiro has provided a patch which moves the code generation to load time - Reorganization of Code Generation JuliaCloud/AWS.jl#6.
  • However, since we still have to process the huge number of types/functions at load time, AWS.jl still takes 10 seconds to load on my fairly current laptop - and longer on lower-spec machines.
  • One suggestion has been to only expose a higher-level, better-abstracted EC2 API. In my experience, this does not work for real-world apps, where you want the full power of the raw EC2 API for everything from tagging resources and filtering them to mounting volumes programmatically. It is not just about starting/stopping machines.
  • The other suggestion is to have a simpler generic API that takes all parameters as a key-value dict and returns a generic XML object for the user to parse. But having specific functions and input/output types for each AWS call makes user code simpler, more concise, easier to read, and less prone to typos.
  • That said, a typical use of AWS.jl may only touch 2-3% of the APIs - we just do not know in advance which 2-3% of the types/functions will be needed.

To improve the above, I just wanted to sound out whether either of the following approaches is feasible/makes sense:

Approach 1

  • Base provides a function syms_on_demand(syms::Vector{Symbol}, load_sym_cb::Function).
  • At load time, a module file calls syms_on_demand with a list of symbols that it wants defined/loaded only when used. This list is recorded by the julia interpreter.
  • load_sym_cb(s::Symbol) is a callback that executes an appropriate @eval for the specified symbol.
  • Upon encountering an undefined symbol (or a defined symbol that also exists in a syms_on_demand list), the julia interpreter executes the callback to define it, removes it from the internal syms_on_demand list, and then proceeds with dispatch (sketched below).
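
A rough, runnable approximation of what usage could look like. syms_on_demand does not exist in Base; the registry below and the explicit demand() call stand in for the proposed interpreter hook, which would fire automatically on first use of an undefined symbol:

# Stand-in registry; in the proposal this bookkeeping would live inside
# the interpreter itself.
const _on_demand = Set{Symbol}()
_load_sym_cb = s -> nothing

function syms_on_demand(syms::Vector{Symbol}, load_sym_cb::Function)
    union!(_on_demand, syms)
    global _load_sym_cb = load_sym_cb
end

# Stand-in for the interpreter hook: here it must be called explicitly
# before a registered symbol is first used.
function demand(s::Symbol)
    if s in _on_demand
        _load_sym_cb(s)          # @eval only the definition actually needed
        delete!(_on_demand, s)
    end
end

# The callback generates and evaluates just the requested definition.
syms_on_demand([:run_instances],
               s -> @eval $s(ids) = println("starting ", ids...))

demand(:run_instances)           # first use: the definition materializes now
run_instances(["i-123"])         # prints "starting i-123"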

Approach 2

  • A new macro @eval_on_demand <symbols> <code block> does the same as above, i.e., it registers the symbols (and associates them with the particular module) but does not evaluate the code block until required (again, sketched below).
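
Again purely illustrative - neither @eval_on_demand nor the automatic trigger exists. This sketch approximates the idea by storing the quoted block and evaluating it on an explicit demand2() call:

# Deferred blocks, keyed by each symbol they are registered under.
const _deferred = Dict{Symbol,Expr}()

macro eval_on_demand(syms, block)
    quote
        for s in $(esc(syms))
            _deferred[s] = $(QuoteNode(block))
        end
    end
end

# Stand-in for "evaluate when required": run the block on first use, then
# forget every symbol that the same block was registered under.
function demand2(s::Symbol)
    haskey(_deferred, s) || return
    block = pop!(_deferred, s)
    eval(block)
    filter!(kv -> kv.second !== block, _deferred)
    nothing
end

@eval_on_demand [:Reservation, :describe_instances] begin
    struct Reservation
        id::String
    end
    describe_instances() = Reservation("r-0abc")
end

demand2(:describe_instances)     # both definitions materialize together
describe_instances()             # => Reservation("r-0abc")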

I am not familiar with the intricate details/issues around code generation, but thought I'd put this up for discussion. Any alternate suggestions for improving the load time of AWS.jl are also welcome.

@JeffBezanson
Member

I'm against adding features to deal with this. We just have to make it faster.

@JeffBezanson
Member

Features like pre-compiling things are reasonable, however.

@Keno
Member

Keno commented Sep 26, 2013

I'm with Jeff on this one. I also disagree that code generation is a good way to go about wrapping APIs, but that's a different can of worms. We should just get to static pre-compilation already ;).

@JeffBezanson
Member

At least a couple of seconds (I would guess maybe 3-4 out of the 10) are spent just in the front end. We could create __jlcache__ directories in packages to hold binary pre-processed representations, without opening the full can of worms of static compilation.
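
Loosely, the lookup could work like the sketch below; every helper name in it (lower_source, load_lowered, save_lowered) is hypothetical:

# Hypothetical __jlcache__ lookup: reuse a cached pre-parsed/lowered
# representation of foo.jl when it is newer than the source; otherwise
# run the front end and refresh the cache.
function load_file(srcpath::String)
    cachepath = joinpath(dirname(srcpath), "__jlcache__",
                         basename(srcpath) * "c")   # foo.jl -> foo.jlc
    if isfile(cachepath) && mtime(cachepath) >= mtime(srcpath)
        return load_lowered(cachepath)   # hypothetical deserializer
    end
    lowered = lower_source(srcpath)      # hypothetical front-end pass
    mkpath(dirname(cachepath))
    save_lowered(cachepath, lowered)     # hypothetical serializer
    return lowered
end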

@ivarne
Member

ivarne commented Sep 26, 2013

Similar to Python's .pyc files?

Please don't make a jlcache file usable when its source file has been deleted (unless the cache file is deliberately renamed). I once spent 3 hours figuring out why a Python file caused trouble after it was deleted.

@StefanKarpinski
Member

> Please don't make a jlcache file usable when its source file has been deleted (unless the cache file is deliberately renamed). I once spent 3 hours figuring out why a Python file caused trouble after it was deleted.

Don't worry – we've all been bitten by that and will not make the same mistake.

@nhodas

nhodas commented Sep 27, 2013

The LLVM blog describes using the MCJIT to cache and pre-compile objects (http://blog.llvm.org/2013/08/object-caching-with-kaleidoscope.html).

"However, MCJIT provides a mechanism for caching generated object images. Once we’ve compiled a module, we can store the image and never have to compile it again. This is not available with the JIT execution engine and gives MCJIT a significant performance advantage when a library is used in multiple invocations of the program."

If the MCJIT and JIT can talk to each other, would this be a promising route?

@ihnorton
Member

Please see #260 #3922 #3892 (among others) for all the previous discussion of this topic.

@Keno
Member

Keno commented Sep 27, 2013

Yes, that's the idea. The problem here is not that we don't know what needs to be done, we do. The problem is that it is a significant amount of work and nobody has done it yet.

@quinnj
Member

quinnj commented Jun 4, 2014

It seems like several of the issues discussed here are already open elsewhere. In terms of package loading times, there is the usable, though fairly undocumented, technique of using userimg.jl to precompile packages. Forum thread.

What else would be needed to close this issue? userimg.jl documentation?
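
For reference, the trick amounts to listing packages in base/userimg.jl and rebuilding Julia, so they get baked into the system image - something like the following (using the 0.3-era string form of require):

# base/userimg.jl -- everything loaded here is compiled into the system
# image, so these packages load near-instantly afterwards.
require("Gadfly")
require("DataFrames")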

@tknopp
Contributor

tknopp commented Jun 4, 2014

I don't think this can be closed yet. The userimg.jl trick requires a from-source build. If I understand correctly, this is because the linking step requires a linker to be available.

IMHO a nice solution would be for any module (or package) to be compilable into its own .so file that is cached somewhere. If this is feasible, one could autogenerate these files either when a package is installed or when it is used for the first time.

@lindahua
Contributor

Requiring users to modify Julia source files (e.g., userimg.jl) is not an acceptable solution.

The system should work out of the box, which means that packages should load reasonably fast without any modification to Julia base files. I agree that we should probably cache pre-compiled images and invalidate the cache whenever the source changes in any way.
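
One way to make "invalidate whenever the source changes in any way" concrete is to key the cache on a content hash rather than timestamps. A minimal sketch (the cache location and .ji layout are made up for illustration):

using SHA  # provides sha256

# Cache entries are named by the hash of the source contents, so any edit
# to the source automatically misses the old entry.
cache_key(srcpath) = bytes2hex(sha256(read(srcpath)))
cache_path(srcpath) = joinpath(homedir(), ".julia", "cache",
                               cache_key(srcpath) * ".ji")

# Valid only while the source still exists and hashes to an existing entry.
cache_valid(srcpath) = isfile(srcpath) && isfile(cache_path(srcpath))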

@tkelman
Contributor

tkelman commented Aug 10, 2014

Can we put our heads together and figure out some form of package caching/precompilation first thing after getting LLVM 3.5 running?

@kmsquire
Member

+1

@IainNZ
Member

IainNZ commented Aug 10, 2014

+100. It's so bad that we are actually hesitant to merge a very large PR into JuMP because of the impact on loading times...

@StefanKarpinski
Member

Something like @vtjnash's #6884 change wouldn't solve the problem, but it would help a lot by allowing packages that interact with many other packages not to depend on them. Gadfly, for example, could be much faster to load.

@IainNZ
Member

IainNZ commented Aug 10, 2014

We (JuliaOpt) are looking forward to that one too, to make our conditional solver-loading code less hacky, but we're still bound mostly by our own code.
I'm not sure it'll actually help Gadfly that much (see the Gadfly deps graph) - most of that looks fairly necessary to me.

@StefanKarpinski
Member

Gadfly doesn't really need Datetime or DataFrames, which cuts out a big part of that graph. I suspect other parts aren't really required by the core of Gadfly either.

@IainNZ
Member

IainNZ commented Aug 10, 2014

Oh, if it doesn't need DataFrames, then that'd be pretty cool.

@timholy
Member

timholy commented Aug 10, 2014

As usual, if one @vtjnash PR doesn't help the problem, there's bound to be a second @vtjnash PR that will 😄. See #5061.

@StefanKarpinski
Member

precompilation, precompile

@pao
Member

pao commented Sep 8, 2014

#7977, which was on the 0.4-projects milestone, was closed as a dup. Should this be put on the 0.4-projects milestone in its stead?

@StefanKarpinski StefanKarpinski added this to the 0.4-projects milestone Sep 8, 2014
@vtjnash vtjnash modified the milestones: 0.5, 0.4 Mar 7, 2015
@IainNZ
Member

IainNZ commented Mar 17, 2015

Is putting this on the 0.5 milestone up for discussion? I'm not doing the work myself, so I don't want to dictate priorities, but I'm kinda distressed by the idea that package loading will be this slow until... December?

@vtjnash
Member

vtjnash commented Mar 17, 2015

Sounds like we need a faster 0.4 release then?

There's more than one way to skin a cat :)

But, in fact, I tried to remove a lot of "nice to haves" from the 0.4 list specifically to help with the schedule.

@JeffBezanson
Member

Please do not edit the 0.4 milestone without discussion.

@tkelman
Contributor

tkelman commented Mar 17, 2015

Should we centralize an overall scope discussion somewhere, either as an issue or on julia-dev?

@timholy
Member

timholy commented Mar 17, 2015

My personal opinion: of the things that seem "close" for 0.4 (#8745), nothing else is even remotely in the same league in terms of importance. A substantial improvement in package loading times would be my vote for the very first item mentioned in the announcement of whatever release this makes it into (unless the debugger gets merged in the same release).

If there's stuff we can do to help, please do let us know. I guess I should start checking out #8745 and playing with it.

@timholy
Member

timholy commented Mar 17, 2015

(I should add that multithreading might also be a competitor for the top spot...)

@tknopp
Contributor

tknopp commented Mar 17, 2015

My take: if it's 3 more months, we should release 0.4 without precompilation and make precompilation the sole goal of a 0.5 released as soon as possible. Maybe those who have the skills to finish it could comment on a realistic schedule (I can't).

@nalimilan
Member

If it's a non-breaking feature, it could even be introduced in a minor release.

@jiahao
Member

jiahao commented Mar 17, 2015

Faster package loading is no longer just a "nice to have" feature. It's frankly quite difficult to claim that Julia is a fast language when the second thing a user tries (after 1+1) is to plot something - and then they wait forever for Gadfly to load.

@tknopp
Contributor

tknopp commented Mar 17, 2015

@jiahao: I think we all agree on that. But we still have to do realistic release planning. It does not help to wait for a feature when it is not realistic for it to get in. So it would be good if Jameson, Jeff, and Keno made a clear decision here.

@tkelman
Contributor

tkelman commented Mar 17, 2015

That's just as much the case now as it was a year ago. We're rate-limited on implementation labor (and code review, in the case of already-open PRs) for big core features that everyone knows need to be done.

Decisions and/or plans should probably be made pretty soon on whether the 0.4.0 roadmap is going to be feature-defined or schedule-defined. If the former, by which features (and expect it to take a while); if the latter, by what target date (and expect it not to have as many finished features as everyone would like).

@tknopp
Contributor

tknopp commented Mar 17, 2015

I thought there was agreement on a time-based schedule (e.g. by @StefanKarpinski https://groups.google.com/d/msg/julia-users/aqGvjGLVaLk/CI7p8R8XZGEJ).

@elcritch

Any updates on the current status of improving package load times? I'm interested to see if I could help with anything. My LLVM skills are getting rusty, and it'd be good to have an excuse to work on them again. ;) Would that be a question to ask @vtjnash directly?

@ViralBShah
Member

@elcritch Actually, if I could convince you, a good place for LLVM skills is the Julia debugger - I believe @Keno is on the verge of outlining what remains to be done - and this may be a good reason to do so.

@vtjnash
Member

vtjnash commented Apr 12, 2015

Agreed. This issue doesn't have much to do with LLVM, but see #9336 for a list of LLVM 3.6 issues to burn down. Helping improve/fix debug info would also be a huge win (tracking inlined functions for line-number tables, and emitting the debug symbol tables needed for seeing the values of local variables in lldb/gdb).

@tkelman
Contributor

tkelman commented Apr 12, 2015

Keno also has a pile of LLVM patches up for review that are moving slowly, not sure if us bumping them will make them go any faster.

@elcritch

@ViralBShah Great, it sounds like some work on the LLVM debugging info would be helpful. I will need to look into how the Julia front end handles debugging info. @vtjnash, it looks like the serialization code only serializes the AST - or did I miss how it serializes the JIT'ed code?

@ihnorton
Member

For debug info, look at "step 5" in emit_function within codegen.cpp. For serialization, start with julia_save in init.c. There are two parts to it: the JIT'd code is written by jl_dump_bitcode, and there are some helpers to serialize global values and pointers -- start from jl_save_system_image in dump.c.

(other questions should probably go to julia-dev)

@Timmmm

Timmmm commented Jan 4, 2019

Is there an open issue for this? This is still laughably slow. Even after precompilation. Even after it is already loaded!

$ julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.0.2 (2018-11-08)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> @time using Gadfly
 10.652760 seconds (18.66 M allocations: 1.003 GiB, 6.66% gc time)

julia> @time using Gadfly
  0.653415 seconds (1.09 M allocations: 52.098 MiB, 3.30% gc time)

julia> @time using Gadfly
  0.001039 seconds (283 allocations: 15.063 KiB)

This is on a fairly high-spec MacBook Pro. I never had to wait 10 seconds to make a plot in MATLAB...

@JeffBezanson
Member

There are several open issues for it; see the "latency" label.
