Further startup time improvements #17285

Closed
Keno opened this issue Jul 5, 2016 · 28 comments
Labels
compiler:latency Compiler latency

Comments

@Keno
Member

Keno commented Jul 5, 2016

Doing some profiling of startup time, the main items are:

  • BLAS/LibGit2 initialization (each 10%)
  • Creating Function * objects for symbols in the system image (~15%)
  • Codegen for functions called by cholmod's init (~7-8%)
  • FLISP initialization (~5%)
  • Dynamic linker (~2-3%)
  • Most of the rest is in dump.c, restoring the system image.

The low-hanging fruit is perhaps creating Function * objects lazily and deferring libgit2 initialization until it's needed. It might also be worth looking into cholmod's init to try to get it to compile everything up front.
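For readers unfamiliar with the idea of deferring initialization until first use, here is a minimal sketch in Julia. It is not how Julia's runtime or LibGit2 actually implements this; `expensive_init`, `get_handle`, `HANDLE`, and `INIT_COUNT` are hypothetical names invented for illustration, and the sketch is not thread-safe.

```julia
# Hypothetical sketch: defer a costly initialization until it is first needed.
# `expensive_init` stands in for something like a library's init routine.

const INIT_COUNT = Ref(0)    # counts how many times the init actually runs

function expensive_init()
    INIT_COUNT[] += 1
    sleep(0.05)              # pretend this is the costly part
    return :handle
end

# Holds the handle once initialized; `nothing` until first use.
const HANDLE = Ref{Union{Nothing,Symbol}}(nothing)

function get_handle()
    if HANDLE[] === nothing
        HANDLE[] = expensive_init()   # cost is paid here, not at startup
    end
    return HANDLE[]::Symbol
end

get_handle()
get_handle()
println("init ran ", INIT_COUNT[], " time(s)")
```

The cost moves from process startup to the first call of `get_handle`, so sessions that never touch the resource never pay for it.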

@StefanKarpinski
Member

Loading and parsing ~/.julia_history may also be contributing when present. Doing that lazily could be a big win and shouldn't be too hard.

@Keno
Member Author

Keno commented Jul 5, 2016

Yes, it can be significant if your history file is large.

@JeffreySarnoff
Contributor

Why is it not possible to pre-create Function * objects for symbols in the system image within the system image? Would it make sense to defer BLAS initialization until it is needed?
I think many history files are large; and as the user base expands that proportion will grow.

@kshyatt kshyatt added the performance Must go faster label Jul 28, 2016
@JeffBezanson JeffBezanson added the compiler:latency Compiler latency label Jul 19, 2018
@KristofferC
Member

KristofferC commented Jul 25, 2018

Update on what takes time at startup (run at #28118)

[Screenshot: startup-time profile, 2018-07-25]

So, ~55% is restoring the sysimg, 18% is blas, 5% is initialization of frontend, 4% is a call to srand.

@KristofferC KristofferC removed the performance Must go faster label Jul 25, 2018
@StefanKarpinski
Member

True, but also about 100x faster than Ruby 😁

@ghost

ghost commented Aug 14, 2018

@StefanKarpinski that's the exact opposite of what I just said. Can you explain?

@StefanKarpinski
Member

StefanKarpinski commented Aug 14, 2018

What I'm saying is that in exchange for Julia's 20% slower startup time, you get a language that's 100x faster than Ruby. The fact that Julia is only 20% slower to startup is impressive given that compiling your C++ code would take considerably longer, for example. Startup and compilation time is and will continue to be a high priority for us and will be improved going forward, but posting comparisons between Julia's startup time and the startup time of various slow, interpreted languages doesn't really contribute anything useful. We already know how long it takes to start Julia and there's nothing we can learn from those languages since they are so technologically dissimilar. If you happen to know of a fast, JIT compiled language with really snappy startup time, then that would be potentially helpful.

@ghost

ghost commented Aug 14, 2018

@StefanKarpinski the first one that comes to mind is PowerShell, but that might not be a good match. I will link it in case it is helpful. Thanks.

http://github.com/PowerShell/PowerShell

@StefanKarpinski
Member

PowerShell command execution time is more comparable to timing evaluation in Julia's REPL, which is quite fast for everyone's favorite super useful example:

julia> @time println("Hello, world")
Hello, world
  0.002232 seconds (27 allocations: 1.750 KiB)

julia> @time println("Hello, world")
Hello, world
  0.000022 seconds (8 allocations: 240 bytes)
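The gap between the two timings above comes from compiling on the first call. A small sketch of the same effect with a fresh function, assuming a plain Julia session (the name `square_sum` is made up for illustration, and the measured numbers will vary by machine):

```julia
# A fresh function that has not been compiled yet in this session.
square_sum(xs) = sum(x -> x^2, xs)

data = rand(1000)

first_call  = @elapsed square_sum(data)   # includes JIT compilation
second_call = @elapsed square_sum(data)   # reuses the already compiled code

println("first call:  ", first_call, " s")
println("second call: ", second_call, " s")
```

The first measurement is typically much larger than the second, which is why per-command timing in a warm REPL says little about process startup.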

@ysmood

ysmood commented Aug 15, 2018

@StefanKarpinski how about luajit?

[image]

just joking 😂

BTW, here is a list of the innovative features in LuaJIT, hope it helps.

@StefanKarpinski
Member

StefanKarpinski commented Aug 17, 2018

LuaJIT is a super impressive piece of work. There's still not much to learn since Lua is notoriously minimal and LuaJIT doesn't do any of the things that are taking time above: loading and initializing a BLAS library, a high-performance RNG, multithreading infrastructure, etc.

Please keep future commentary on this thread to constructive points about Julia's startup time.

@PallHaraldsson
Contributor

PallHaraldsson commented Dec 23, 2018

"4% is a call to srand."

How about getting rid of that/MersenneTwister for Julia 1.1?

Does having to do "using Random" eliminate srand already? For compatibility, "using RandomMersenneTwister" could be a possibility, though Karpinski has said, on my issue about a replacement, that it's not to be relied on. Alternative good and much faster RNG code in Julia exists that doesn't have a huge state.
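One rough way to see what seeding a MersenneTwister costs on a given machine is to time its construction after a warm-up call. This is only a sketch: it measures the seeding itself in a warm session, not the compile time that may be part of the 4% quoted above. Note that `srand` is spelled `Random.seed!` in Julia 1.0 and later.

```julia
using Random

# Warm up first so compilation is not included in the measurement.
MersenneTwister(0)

# Time constructing and seeding a new generator.
seed_time = @elapsed MersenneTwister(1234)
println("seeding MersenneTwister took ", seed_time, " s")
```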

@ViralBShah
Member

Which codes are you referring to?

@PallHaraldsson
Contributor

PallHaraldsson commented Dec 26, 2018

https://sunoru.github.io/RandomNumbers.jl/stable/man/xorshifts/

"The successor to Xorshift128 series." (that series is also implemented, e.g. "Xorshift128Plus is presently used in the JavaScript engines of Chrome, Firefox and Safari."). #27614

@ViralBShah
Member

I thought those RNGs don't pass the RNG test suite.

@KristofferC
Member

KristofferC commented Nov 19, 2019

Use 1.3. 1.2 has a known startup time regression.

And you use some arbitrary Julia package in the Julia code. Just use Printf.

@ViralBShah
Member

Also, startup times of interpreters are not comparable with those of JIT-compiled languages.

@KristofferC
Member

KristofferC commented Nov 19, 2019

"I will try new version and also Printf later and update"

This specific example is pretty arbitrary though. If we only talk about latency you can do

❯ cat app.jl
using Printf
@printf "%05.2f\n" 1.2

❯ time julia --compile=min app.jl
01.20
julia --compile=min app.jl  0.08s user 0.06s system 133% cpu 0.104 total

Comparing random small pieces of code between different languages is not really productive.
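For anyone who wants to measure process startup itself rather than a particular script, one possible sketch is to launch a fresh Julia process that runs an empty program and time it from a running session. The helper `startup_seconds` is a hypothetical name; `Base.julia_cmd()`, `--startup-file=no`, and `--compile=min` are existing Julia facilities, and the numbers depend entirely on the machine and Julia version.

```julia
# Command for the currently running Julia binary (with its sysimage flags).
julia = Base.julia_cmd()

# Hypothetical helper: time an empty Julia process, taking the minimum of
# several runs to reduce noise from the OS and disk cache.
function startup_seconds(extra_flags::Vector{String}=String[]; samples=5)
    cmd = `$julia --startup-file=no $extra_flags -e ""`
    return minimum(@elapsed(run(cmd)) for _ in 1:samples)
end

println("default:       ", startup_seconds(), " s")
println("--compile=min: ", startup_seconds(["--compile=min"]), " s")
```

Taking the minimum rather than a single `time` reading makes the comparison between flag combinations less sensitive to one-off noise.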

@ghost

ghost commented Nov 19, 2019

@KristofferC but it is a comparison. I think you're in a better position to criticise my methods after you've presented your own comparison.

@KristofferC
Member

My point is that simple comparisons don't provide any information that helps Julia improve. Profiling like #17285 (comment), identifying hot spots where things can be optimized, etc., on the other hand, might be useful.

Saying Julia does x in 0.5s while OtherLanguage does it in 0.3s is just pointless in an issue like this. You could post the result on a blog or something.

@ghost

ghost commented Nov 19, 2019

@KristofferC you're right in that it doesn't help fix the problem.

But it does help confirm the existence of the problem, and the degree.

Something to help fix the problem would be more valuable, which I think you are
alluding to. But suggesting that comparison testing has no value is just wrong.
Comparison is literally the only way one could know a problem exists in the
first place. Without comparison, you'd have to have some objective measure of
what is fast and slow in regard to interpreter startup, and I am not aware of
any such standard.

At any rate, in order to minimize additional noise, I am editing my comment with
the following. I reran the test with the suggestions and it does make a considerable
difference. Julia is still slower than all the others, but by a smaller margin:

$ bin/julia -v
julia version 1.3.0-rc5

$ cat app.jl
using Printf
@printf "%05.2f\n" 1.2

$ time bin/julia --compile=min app.jl
01.20

real    0m0.180s

$ cat app.rb
s1 = '%05.2f' % 1.2
puts s1

$ time ruby app.rb
01.20

real    0m0.140s

$ cat app.py
s1 = format(1.2, '05.2f')
print(s1)

$ time python3 app.py
01.20

real    0m0.094s

$ cat app.php
<?php
$s1 = sprintf('%05.2f', 1.2);
var_dump($s1);

$ time php app.php
string(5) "01.20"

real    0m0.078s

@ViralBShah
Member

For cholmod, we could avoid compiling it into the system image. I imagine compiling SuiteSparse is hardly noticeable.

I imagine libgit2 can also be removed from the system image in 1.4 once Pkg does not need it.

@ViralBShah
Member

ViralBShah commented Feb 8, 2020

Not building blas into the system image will open the door to easier use of alternate blas libraries. However this one will affect a lot of people, if it is not in the system image. And these packages probably have significant compile time.

@KristofferC
Member

"I imagine libgit2 can also be removed from the system image in 1.4 once Pkg does not need it."

Pkg needs LibGit2 in 1.4.

@ViralBShah
Member

Is it 1.5 then or am I misunderstanding?

@KristofferC
Member

KristofferC commented Feb 8, 2020

Yes, we still support using git for registries and to add unregistered packages via git URLs.

@JeffreySarnoff
Contributor

"Not building blas into the system image will open the door to easier use of alternate blas libraries. However this one will affect a lot of people, if it is not in the system image. And these packages probably have significant compile time."

Why not build it into the system image and also allow an alternate blas library to be used?

@ViralBShah
Member

Should we close this now?

@vtjnash vtjnash closed this as completed May 19, 2021