Precompile everything after Pkg.update() and friends #16409

Closed
davidanthoff opened this issue May 17, 2016 · 50 comments
Labels
compiler:precompilation (Precompilation of modules), packages (Package management and loading)

Comments

@davidanthoff
Contributor

This issue is meant as a discussion starter; I'm not sure it should actually be done ;)

Here is my situation: I normally Pkg.update() in the morning. I know it will take some time, so I typically do it when I don't need a julia prompt the next second. Later that day someone comes by the office, and I want to show him/her how cool and fast julia is. I start my neat example code, and it all starts by spending a couple of minutes precompiling things. And I stutter "uh, oh, precompile, you know...". It bites me every time I try to demo julia.

At least for me, I would much prefer it if precompilation happened with Pkg.update(). I'm already in a mood that things will take a while when I run Pkg.update(), so if they take a little longer, no problem. Whereas precompile on demand always seems to hit me in moments when it is really, really inconvenient. Having precompile as part of Pkg.update() would make things much more predictable, which is something I would value.

For this example, precompile should probably also be triggered by all the other Pkg.* functions that change packages.

Alternatives:

  • a flag like Pkg.update(precompile=true). Would work but I would probably forget it, and then the precompile cost would again hit me in an inconvenient moment.
  • a global configuration flag. Could maybe go into the user julia startup script or something like that.

Thoughts?

@lobingera

Different people have different workstyles, so putting this precompilation step as default in Pkg.update doesn't sound right to me...
Pkg.update_and_compile() ?

@pkofod
Contributor

pkofod commented May 17, 2016

> Pkg.update_and_compile() ?

Pkg.update(precompile=true) seems better to me. I also think precompilation as default seems aggressive.

@JaredCrean2
Contributor

JaredCrean2 commented May 17, 2016

This would also be helpful for MPI programs on clusters with a shared file system. It would be better to have a single process precompile everything before launching the MPI tasks. Currently, every MPI task precompiles every package.

@davidanthoff
Contributor Author

I guess another option would be to just have a function precompile_everything_that_is_stale() that I could e.g. call before a presentation. That would somewhat help: when I want no surprise precompilation happening, I could just call that function, and as long as I don't touch any package, I would be guaranteed to not trigger a precompile. For my workflow I would still prefer automatic precompilation after Pkg.update(), but this might be a first simple step.

@tkelman
Contributor

tkelman commented May 17, 2016

https://github.com/staticfloat/MakePkgUpdatePrecompileInTheBackground.jl - evidently some of the repl backgrounding cleverness there requires a patch that hasn't been backported to 0.4 (yet? haven't reviewed the branch in detail).

@stevengj
Member

A separate Pkg.precompile_stale() that acts on all installed packages also makes more sense to me than putting this into update.

@kshyatt added the packages (Package management and loading) and compiler:precompilation (Precompilation of modules) labels Jun 27, 2016

@davidanthoff
Contributor Author

There will probably be people like me who prefer to have something like precompile_stale() run after any package operation, so that I never face an unexpected precompile event. Others probably don't want this. This does strike me as a case where a configuration option would be ideal, maybe in an environment variable?

Another idea (but probably more long term) might be background precompile, like the example @tkelman linked to. .NET has something like that as well with background NGen compilation. Not sure how successful that is...
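
The configuration-option idea could look roughly like the sketch below. It assumes a hypothetical environment variable named JULIA_PRECOMPILE_AFTER_PKG and a hypothetical wrapper update_with_optional_precompile; neither is an existing Pkg feature. It is written against Julia 1.x, where Pkg.precompile() is available.

```julia
using Pkg

# Hypothetical opt-out switch: "0", "false" or "no" disables the automatic
# precompile step; an unset variable means it stays enabled.
function precompile_after_pkg_ops(env = ENV)
    value = lowercase(strip(get(env, "JULIA_PRECOMPILE_AFTER_PKG", "1")))
    return !(value in ("0", "false", "no"))
end

# Hypothetical wrapper: update, then precompile unless the user opted out.
function update_with_optional_precompile()
    Pkg.update()
    precompile_after_pkg_ops() && Pkg.precompile()
    return nothing
end
```

Passing the environment as an argument keeps the switch easy to check without touching the real process environment, for example precompile_after_pkg_ops(Dict("JULIA_PRECOMPILE_AFTER_PKG" => "0")).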

@pkofod
Contributor

pkofod commented Jun 27, 2016

> There will probably be people like me who prefer to have something like precompile_stale() run after any package operation, so that I never face an unexpected precompile event. Others probably don't want this. This does strike me as a case where a configuration option would be ideal, maybe in an environment variable?

$ echo 'Pkg.precompile_stale()' >> ~/.juliarc.jl

if such a function gets added?

@davidanthoff
Contributor Author

> $ echo 'Pkg.precompile_stale()' >> ~/.juliarc.jl

Hm, that would still be unpredictable: now I could never be sure whether I would have to wait for precompilation at Julia startup. I would really prefer that it just happens automatically after any operation via the package manager.

@timholy
Member

timholy commented Jun 27, 2016

@davidanthoff, in the meantime you can always add this to your .juliarc.jl:

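# Load every installed package once so that stale precompile caches are rebuilt.
function recompile()
    # Pkg.installed() lists only the installed packages; Pkg.available()
    # would walk through every registered package.
    for pkg in keys(Pkg.installed())
        try
            pkgsym = Symbol(pkg)
            eval(:(using $pkgsym))
        catch err
            warn("Could not load $pkg: $err")
        end
    end
end

# Update all packages, then precompile them right away.
pkgupdate() = (Pkg.update(); recompile())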

Not perfect, but pretty useful.

@davidanthoff
Contributor Author

@timholy Any reason I shouldn't use Base.compilecache instead of the eval(:(using $pkgsym))?

@tkelman
Contributor

tkelman commented Jun 27, 2016

compilecache doesn't respect __precompile__(false) annotations IIRC

@nalimilan
Member

nalimilan commented Jun 28, 2016

What's the problem with precompiling packages on install by default? According to the "vote" at the top, this makes a lot of sense.

The issue with leaving it opt-in is that only advanced users will find out about this, which will give the impression that Julia is slow any time they show it to their colleagues after an update.

@lobingera

In the long run, precompilation should be hidden somewhere in a background job. Currently precompilation takes time and blocks the REPL for several minutes. As a user and developer, I'd like to control when this happens. Also, Pkg.update currently gets slower and slower (depending on the connectivity?), and spending additional time on recompilation ... I just don't know.
btw: What happened to #13487?

@nalimilan
Member

> btw: What happened to #13487?

This: #17132

@davidanthoff
Contributor Author

If precompile happened with package operations, I wonder whether one could do some clever async weaving. If Pkg.update is mainly IO (network) bound right now, shouldn't it be possible to trigger a git fetch, and while that is running, precompile some package that has already been fetched?
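
The weaving idea can be illustrated with a small producer/consumer sketch. Everything here is simulated: fetch_and_precompile is a hypothetical name, and the sleep calls stand in for the network fetch and the compile step. It is written against Julia 1.x.

```julia
# Simulated pipeline: one task "fetches" packages while the caller
# "precompiles" the ones that have already arrived.
function fetch_and_precompile(packages; fetch_time = 0.01, compile_time = 0.01)
    fetched = Channel{String}(length(packages))  # buffer between the two stages
    compiled = String[]

    producer = @async begin
        for pkg in packages
            sleep(fetch_time)      # stand-in for a network-bound git fetch
            put!(fetched, pkg)
        end
        close(fetched)             # signal that no more packages will arrive
    end

    # Consume packages as soon as they are available, overlapping with fetches.
    for pkg in fetched
        sleep(compile_time)        # stand-in for precompiling pkg
        push!(compiled, pkg)
    end

    wait(producer)
    return compiled
end
```

Note the limits of the sketch: tasks started with @async are scheduled cooperatively, so the overlap only works while one side is blocked on I/O. Real precompilation is CPU-bound and would more likely have to run in a separate process, and it would also have to respect dependency order.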

@davidanthoff
Contributor Author

Another idea is to enable automatic precompile after package operations for one release (with an option to disable it via Pkg.update(precompile=false)), and just see how people like it... If the user experience is bad, it could still be changed back in a later release.

@wildart
Member

wildart commented Jun 28, 2016

Precompilation can only be done once a package and its dependencies are updated and built. I believe there is no particular order in the update operation, so some reordering is needed to enable precompilation while downloading the rest of the updates.
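
The reordering described here amounts to a topological sort of the dependency graph. The following is a minimal sketch under stated assumptions: precompile_order is a hypothetical helper, the graph is passed in as a plain dictionary rather than read from the package manager, and every package (including ones without dependencies) must appear as a key.

```julia
# Return an order in which every package appears after all of its dependencies.
# deps maps each package name to the names of the packages it depends on.
function precompile_order(deps::Dict{String,Vector{String}})
    remaining = Dict(pkg => Set(ds) for (pkg, ds) in deps)
    order = String[]
    while !isempty(remaining)
        # Packages whose dependencies have all been handled are ready.
        ready = sort([pkg for (pkg, ds) in remaining if isempty(ds)])
        isempty(ready) && error("dependency cycle or missing package detected")
        for pkg in ready
            push!(order, pkg)
            delete!(remaining, pkg)
        end
        # Mark the packages handled in this round as satisfied everywhere else.
        for ds in values(remaining)
            setdiff!(ds, ready)
        end
    end
    return order
end
```

With a made-up graph such as Dict("Compat" => String[], "Blosc" => ["Compat"], "HDF5" => ["Compat", "Blosc"]), the result is ["Compat", "Blosc", "HDF5"]. Each round of ready packages could in principle be precompiled while later downloads are still running.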

@eschnett
Contributor

So, hypothetically, there's a new Julia user, and someone tells them about a new package they should try. So they type

Pkg.update()
Pkg.add("UnicodePlots")
using UnicodePlots

and would expect that to be as fast as possible. Having to wait for Blosc and HDF5 to download and recompile isn't what they'd expect or want to happen in that case.

My problems and workflows shouldn't enter the discussion here, since I already know how to circumvent problems and configure things. But having to explain to a new user that Pkg.update() is potentially a very slow operation isn't good. Explaining to them that using UnicodePlots might be slow the first time is much easier to justify.

@wildart
Member

wildart commented Jun 28, 2016

You do not need to run Pkg.update first.

@eschnett
Contributor

Julia often tells me to run Pkg.update(). Or maybe the other person said at JuliaCon that they just tagged a new version of the package, and recommends using that version.

Anyway -- if we first improve Pkg.update() to be more convenient (i.e. precompile), and then say it's not necessary to run it (since it's inconvenient), something is off. I assume that many people want to run other package commands after updating, so making people wait after updating isn't right.

I realize this is a discussion only about the default behaviour; all the features that people need will be there anyway. So my point is that the default should be right for beginners -- it should not surprise beginners. Letting experienced users save a few keystrokes isn't the right choice for interactive package management.

@davidanthoff
Contributor Author

> So my point is that the default should be right for beginners -- it should not surprise beginners. Letting experienced users save a few keystrokes isn't the right choice for interactive package management.

I think the current behavior of using is unpredictable, and I think this is bad for users (beginners and advanced). I've seen this with my lab group, all new users, and all annoyed with this behavior.

> But having to explain to a new user that Pkg.update() is potentially a very slow operation isn't good.

It already is, even without precompile. And it will always be unpredictable, given that packages can do whatever they like in their build phase. I can't see a scenario with the current design where a user would know beforehand whether Pkg.update will be slow or not.

> Explaining to them that using UnicodePlots might be slow the first time is much easier to justify.

I think this is a bad user experience. Users now have to remember whether they have used that package before in order to tell whether the command will be slow or not, which is really unpredictable.

@eschnett
Contributor

I'm not arguing that using should be slow. I agree there's a problem. I'm arguing that Pkg.update() is the wrong place to put the slowness.

Compare to apt-get: There is a stage where caches are rebuilt, but it doesn't happen during apt-get update (the metadata update). In fact, packages update their caches in the background, after the install has finished. In Julia, the equivalent of apt-get update is already slow, and making it slower doesn't help.

I want separate Pkg.update() (fast metadata update), Pkg.select() / Pkg.deselect() (fast operations, just remembering what needs to be done), and then a Pkg.do_everything_and_be_thorough() that I can call once in the end.

@davidanthoff
Contributor Author

> I want separate Pkg.update() (fast metadata update)

Yes, a fast metadata update would be great. But that is not what Pkg.update is: as long as it also updates all packages and builds them, it will always be slow, precompile or not.

@eschnett
Contributor

@davidanthoff It seems we are in agreement now.

@nalimilan
Member

@eschnett Looks like you're asking for an equivalent of apt-get update, but Pkg.update is currently equivalent to apt-get upgrade. Since it actually installs packages, it makes sense to also build them (which apt-get upgrade does when e.g. building system caches).

@davidanthoff
Contributor Author

I guess here is another way to split this: there could be one operation that git fetches everything (both METADATA and all local package repos), but does NOT change the checked out version of any package on the system. Even for checked out packages this would just fetch, not pull. The only folder that would be git pulled would be METADATA. So you can actually be sure that not a single one of your installed packages will be changed by that command. [at this moment of writing @nalimilan's comment appeared] And then have another command that upgrades all packages to their latest available version, like @nalimilan's upgrade suggestion.

This would enable a whole bunch of nice things: one could do a Pkg.update, and then a Pkg.status could highlight for which packages there are newer versions around. And one could also selectively upgrade by specifying a package name etc...

In that model, whenever the checked out version in a package directory is changed by the package manager, it would be precompiled.
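
The status part of this model, highlighting packages for which a newer version is known after a fetch, can be sketched as a plain version comparison. The function name outdated and both dictionaries are hypothetical stand-ins for package-manager state, not an existing Pkg API.

```julia
# Compare installed versions against the newest versions known after a fetch.
# Returns package => (installed version, newest version) for outdated packages.
function outdated(installed::Dict{String,VersionNumber},
                  latest::Dict{String,VersionNumber})
    result = Dict{String,Tuple{VersionNumber,VersionNumber}}()
    for (pkg, current) in installed
        # Packages unknown to the registry are treated as up to date.
        newest = get(latest, pkg, current)
        if newest > current
            result[pkg] = (current, newest)
        end
    end
    return result
end
```

In this split, only the separate upgrade step would change checked out versions, so it is also the only step that would need to trigger precompilation.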

@eschnett
Contributor

@davidanthoff "... it would be precompiled." You probably mean "it and all its dependents" here.

@davidanthoff
Contributor Author

@eschnett: Yes. But unless the change of version of package X also changed the checked out versions of the packages it depends on, those packages would already be precompiled and would not have to be recompiled, I believe.

@eschnett
Contributor

@davidanthoff I meant dependents as in children, not dependencies as in parents. As in, you update package X, and then you need to check all other packages whether they depend on X, and recompile them as well.

@ufechner7

Yes. Currently, each time Compat.jl is updated, nearly all the packages that I use need to be recompiled, because they all depend on Compat.jl.
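
The dependents-versus-dependencies distinction can be made concrete with a small graph walk. This is a sketch under stated assumptions: dependents_of is a hypothetical helper, and the dependency data is passed in as a dictionary instead of being read from the installed packages.

```julia
# deps maps each package to its direct dependencies.
# Returns every package that directly or indirectly depends on updated,
# i.e. the set whose precompile caches become stale when updated changes.
function dependents_of(updated::String, deps::Dict{String,Vector{String}})
    stale = Set{String}()
    queue = [updated]
    while !isempty(queue)
        current = pop!(queue)
        for (pkg, ds) in deps
            # pkg is stale if it lists the current package as a dependency.
            if current in ds && !(pkg in stale)
                push!(stale, pkg)
                push!(queue, pkg)
            end
        end
    end
    return stale
end
```

For a made-up graph where A depends on Compat and B depends on A, updating Compat marks both A and B as stale, while updating B marks nothing else.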

@davidanthoff
Contributor Author

@eschnett Ah, yes, that is right.

@Ismael-VC
Contributor

I use a slightly different version of Tim's function that also runs Pkg.build. I'm OK with naming it Pkg.upgrade.

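# SEPARATOR was not defined in the original snippet; a plain rule is assumed here.
const SEPARATOR = "-"^80

# Load every installed package once so that stale caches are recompiled.
function recompile_packages()
    for pkg in keys(Pkg.installed())
        try
            info("Compiling: $pkg")
            eval(Expr(:toplevel, Expr(:using, Symbol(pkg))))
            println(SEPARATOR)
        catch err
            warn("Unable to precompile: $pkg")
            warn(err)
            println(SEPARATOR)
        end
    end
end

# Update, rebuild, then precompile everything in one call.
emerge() = (Pkg.update(); Pkg.build(); recompile_packages())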

@miakramer

Hi, sorry for the bump, but I think this warrants further discussion. I would consider myself a beginner, and my use case (shared, I think, by a lot of people in the sciences) is using Julia in a REPL or Jupyter Notebook for data exploration tasks, the way people use Matlab. Opening up Jupyter, running using SymPy, DataFrames, Plots, and having to wait five minutes actually makes me drop Julia and just open up Python instead. I like the features of Julia for data work, but I'll just use numpy instead if it means I can actually get work done. I have the same problem as other people in this thread: I really like Julia and I want to show my colleagues the cool stuff that makes my work easier, but opening it up and going "no wait, it just does this for a few minutes and then it's fast" leaves a really bad impression.

I think that having Pkg.update() precompile (including dependents) by default is the best option. In my experience, non-programmers don't often update, and are probably fine with it taking longer if it means they can do work faster after. Updates usually take a long time (look at R or MATLAB) anyway, so I really don't think Pkg.update() being slow is an issue. As other people have brought up, doing the work in parallel would be a good idea.

@miakramer

Once finals are over, I'd gladly take this on myself if no one else wants to. Been meaning to learn more of Julia's internals!

@guihigashi

I wrote myself a batch script that runs Pkg.update() if I choose to, then writes a temporary list of all installed packages to a text file (instd_pkgs_list.txt in the temp folder). It then executes julia -e "using ..." for each package in a for loop from cmd. I leave it running from time to time.

@echo off

REM julia and list path
set jl=%LOCALAPPDATA%\Julia-0.6.0\bin\julia
set instd_pkgs=%temp%\instd_pkgs_list.txt

REM ask if update
set update_packages=y
set /p update_packages=Update packages ([%update_packages%]/n)? 
if %update_packages% equ y %jl% -e "Pkg.update()"

REM delete old list
if exist %instd_pkgs% del %instd_pkgs%

REM get new list
%jl% -e "cd(tempdir());writecsv(\"instd_pkgs_list.txt\",keys(Pkg.installed()));"

REM julia -e "using ..."
if exist %instd_pkgs% (
    for /f %%a in (%instd_pkgs%) do (
        echo using %%a
        %jl% -e "using %%a"
    )    
)

REM print list location
echo List: %instd_pkgs%

@yipinghuang1991

yipinghuang1991 commented Dec 23, 2017

My version is here; it's pretty ugly. It stops when a compilation task fails.

#!/bin/zsh

PACKAGES=$(julia -E "keys(Pkg.installed())")
PACKAGES=${PACKAGES:gs/String[/} # remove "String["
PACKAGES=${PACKAGES:gs/]/} # remove "]"
PACKAGES=${PACKAGES:gs/\"/} # remove '"'
PACKAGES=${PACKAGES:gs/,/} # remove ","
PACKAGES=("${(@s/ /)PACKAGES}")

for package in ${PACKAGES[@]}; do
	echo "Precompiling $package..."
	command="using $package"
	julia -E "$command"
	echo
done

@ufechner7

I suggest adding the emerge function mentioned above under the name Pkg.upgrade() to Julia 0.7. Very little work, very useful. Just needs documentation. So, please add the label "1.0".

@StefanKarpinski
Member

StefanKarpinski commented Dec 23, 2017

Package management and precompilation are going to be quite different in 0.7. This thread is well-intentioned, and would be useful under the premise that Pkg is going to stay much the same, but that premise isn't true, so this is not actually very helpful. Something in the spirit of this thread can and will certainly be supported, however. Changing how and when compilation happens is not a breaking change, so even if we wanted to do this, the issue would not be a 1.0-blocker and therefore does not belong on the milestone.

@holocronweaver

To be clear, we are going to have precompilation in 1.0, right? As ufechner7 said, this is a small change that would give a large user experience improvement. Papercuts like this really add up and can make or break adoption. If 1.0 does not have a smooth like butter user experience, it shouldn't be called 1.0.

@stevengj
Member

stevengj commented Dec 24, 2017

@holocronweaver, to be clear, Julia has precompilation now — the issue here is when this happens (at Pkg.update time vs. the first time you import a new/updated package … although you could easily write a function/script now that updates then imports/precompiles all of the installed packages). And a separate issue is increasing the amount of compilation work that is cached by the precompiler.

(The Julia developers have decided that 1.0 is about backwards compatibility, not "smooth like butter user experience." Certainly when 1.0 is released it should be clear that many improvements are still in the queue.)

@holocronweaver

I meant precompilation during installing/upgrading. Precompiling libraries when you run scripts should only be done as a last resort or during library development. As you say, scripts above can accomplish this, but it is absurd to ask users to write such scripts when the package manager should have it built in and it is trivial to implement.

(Releasing an unpolished product as 1.0 will surely drop Julia adoption rate and feed into the "Julia will never gain wide adoption" narrative that is common among scientists and engineers.)

@StefanKarpinski
Member

I'm not exactly sure how many times I can possibly assure people that I'm aware of this being a desire and that we'll ship with something to address this but also say that this particular implementation does not make any sense in Pkg3.

@holocronweaver

Sorry, I think there is a misunderstanding. My question was: will some implementation be included in 1.0? My concern: I think this is too important to user experience not to be a 1.0-blocker.

@StefanKarpinski
Member

Sure

@lobingera

@StefanKarpinski could you please explain that "we'll ship with something" in more detail?

@mauro3
Contributor

mauro3 commented Dec 25, 2017

@lobingera, probably easiest if you check out Pkg3.

@lobingera

@mauro3, I'm in the process of switching my package repo to Pkg3 right now. Stefan's comment sounded to me like there is "something" on top to be expected ...

@StefanKarpinski
Member

StefanKarpinski commented Dec 27, 2017

@lobingera, please be patient. I can either spend time working on it or explaining it, not both. Also the exact nature of the feature remains to be determined and will probably evolve after the feature freeze since changing how precompilation works is a non-breaking change.

@simonbyrne
Contributor

There is now a precompile statement in the new Pkg. Further feature requests should be discussed at https://github.com/JuliaLang/Pkg.jl
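
For readers arriving from the original request, a minimal sketch of the workflow with the new Pkg in Julia 1.x follows. The helper name update_and_precompile is hypothetical; Pkg.update and Pkg.precompile are functions of the Pkg standard library, and the Pkg REPL mode offers the same step as the precompile command.

```julia
using Pkg

# Hypothetical helper: run the update and pay the precompilation cost
# immediately, instead of at the first `using` afterwards.
function update_and_precompile()
    Pkg.update()       # update packages in the active environment
    Pkg.precompile()   # precompile the project's dependencies up front
    return nothing
end
```

Calling update_and_precompile() in the morning would cover the demo scenario from the opening post, assuming no package is changed afterwards.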
