Why Julia can not provide MKL? #18374

Closed
simonnier opened this issue Sep 6, 2016 · 63 comments

@simonnier

We all know Intel MKL is free now, and platforms like Anaconda already ship with MKL for free. Why can't Julia? Julia lags behind Python here. In my tests, Julia 0.4.6 with OpenBLAS is slow compared to Anaconda's NumPy at eigenvalue solving. Is there any plan to include MKL in a future version of Julia?

@KristofferC
Member

Previous discussion #10969

@yuyichao
Contributor

yuyichao commented Sep 6, 2016

Close as dup at least?

@simonnier
Author

@yuyichao Hi guys, at least give a yes or no. That will make this clear. I don't need to read a long discussion.

@KristofferC
Member

KristofferC commented Sep 6, 2016

There are plans, yes. No one I know of is actively working on it right now, though. It is possible to compile Julia with MKL yourself.

@simonnier
Author

simonnier commented Sep 6, 2016

@KristofferC Thanks for answering. I hope Julia could make MKL the default, because it is the fastest and suits the "high performance" tag of Julia. That said, I am kind of disappointed when you say "No one that I know of is actively working on it"...

@KristofferC
Member

I am sorry to disappoint you, but as with most open source projects, developer time is scarce. Many features have gotten implemented by someone who had a need for them, read up enough to implement them, and submitted a PR. For now, users will have to compile with MKL themselves, but for someone with high requirements on the performance of specific linear algebra functions, that is maybe not such a big deal.

@simonnier
Author

@KristofferC Oh, I thought that since there is a company (http://juliacomputing.com/about), there are probably people working full-time on it to push it forward as fast as possible so that Julia won't be beaten by other communities like Python.

@KristofferC
Member

Of course, but at this moment it seems that they have deemed improving the performance of the eigenvalue solver that ships with Julia by default less important than, for example, having a nicely working debugger.

@StefanKarpinski
Member

This is not a technical issue primarily, it's a legal one. You can easily build Julia with MKL, but as long as Julia ships with GPL libraries like FFTW and SuiteSparse, shipping Julia with MKL is illegal. Non-GPL licenses for both FFTW and SuiteSparse can be purchased, but obviously that's not a viable open source default. If you're willing to pay for a commercially supported Julia Pro, then you can get Julia built with non-GPL versions of these libraries and MKL.

@tkelman
Contributor

tkelman commented Sep 6, 2016

And on some platforms building Julia with MKL works much better when using the Intel compilers, which aren't quite as easy to get or redistribute as MKL itself.

@simonnier
Author

@StefanKarpinski I googled GPL, and I am not sure I understand it correctly. Do you mean that because Julia uses the GPL-licensed FFTW, everything it ships must be open source, and since MKL is not open source, Julia cannot ship GPL FFTW and MKL at the same time? On the other hand, MKL already contains FFTW and sparse matrix routines, so why not use it directly? I believe MKL is at least as good as the other packages.

@johnmyleswhite
Member

I think this conversation should continue on julia-users as it's not novel information for most of the people receiving GitHub notifications about this thread.

@StefanKarpinski
Member

Yes, it is illegal to distribute a derived work combining GPL libraries with proprietary libraries like MKL. It might be possible to use MKL to replace FFTW and SuiteSparse, but it is not a drop-in replacement (at least not for SuiteSparse), so there's some non-trivial work to be done there. More significantly, it would make the Julia distribution non-open-source by default, which is not acceptable. It's fine to provide non-open-source options for people to build, but that cannot be the default distribution for the open source Julia project.

As @johnmyleswhite said, let's move this conversation off of GitHub. You can already build Julia with MKL and you can pay for a non-GPL version of Julia that uses proprietary libraries. Note that Anaconda is not comparable to Julia: it is a specific distribution of Python and various libraries, created and hosted by Continuum Analytics. Julia, on the other hand, is an open source project.

@RoyiAvital

RoyiAvital commented Feb 8, 2017

@StefanKarpinski, any chance of having the Anaconda model with Julia?

Namely, providing a good out-of-the-box experience for developers?

I want to download a ZIP file / installer and have everything ready to work:

  • Julia engine.
  • IDE + debugger + plotting.
  • Jupyter.
  • MKL + IPP integrated.

Anaconda comes quite close to giving the user just that (for free).
MATLAB does it very well (not free).

For me, just an engineer, all this compiling stuff sounds like hacking.
I don't want to hack, I want to work.

@JeffBezanson
Member

@RoyiAvital

Well, yeah, I looked into that.

I can buy 15 licenses of MATLAB for the price of one Julia + MKL.

I asked about the Anaconda style.

@kmsquire
Member

kmsquire commented Feb 8, 2017

Isn't it close to Anaconda style? You only pay if you want enterprise support. The basic version is, if not exactly what you asked for above, at least in that vein (and free).

@RoyiAvital

@kmsquire, it comes limited, with no MKL in it.
Again, Julia is all about speed.

If it doesn't have MKL for numerical computing, what's the point?

@KristofferC
Member

True, better go back to programming in Python then.

@RoyiAvital

RoyiAvital commented Feb 8, 2017

I wasn't trying to start mayhem.
I actually do want to work with Julia and enjoy its speed and capabilities.
Yet every time I tried it, or tried to show it to someone else, the process of installing it and starting to work with it felt like "black magic" or hacking.

I just wanted to think of the regular user out there.
He might be on Windows; he doesn't do "hacking" and isn't used to compiling his own software.

I think there is a gap between the people making these kinds of amazing products and their users.
The developers are highly talented and extremely capable of "hacking" their daily computer tasks.

Yet many of the target user base aren't like that.
Many of them are daily MATLAB users (the brave ones use Anaconda + Spyder).

The question is, can we do something to make Julia accessible to them?

@JeffBezanson
Member

I believe OpenBLAS is very close in performance to MKL.

@StefanKarpinski
Member

I can buy 15 licenses of MATLAB in the price of one Julia + MKL.

Unless you have an unusually cheap source for Matlab licenses, this seems to be factually inaccurate: MKL is free and JuliaPro for Enterprise with MKL is about 1/3 the cost of Matlab. Even if paying any money is too much, you can download MKL yourself for free and download and build Julia with MKL support yourself – at no cost. We cannot legally distribute Julia with both GPL libraries and MKL.

@RoyiAvital

RoyiAvital commented Feb 8, 2017

@StefanKarpinski ,
Have a look here:

https://www.mathworks.com/products/matlab-home.html, https://www.mathworks.com/store/link/products/home/new.

You can get something with similar capabilities using MATLAB as a regular home user for $100.
Not to mention what you can get if you're a student:

https://www.mathworks.com/academia/student_version.html

Regarding the GPL:
Well, I'm really not an expert in those things.
I believe you that it can't be done easily.
Maybe there is no good solution here.

Thank You.

@StefanKarpinski
Member

Fortunately, free is still cheaper than Matlab.

@RoyiAvital

RoyiAvital commented Feb 8, 2017

@JeffBezanson ,

I believe openblas is very close in performance to MKL.

You know what, I will download JuliaPro now (regular license; I guess it comes with OpenBLAS).

I will compare it to MATLAB on a few linear algebra tests and will publish the results here.
If they are comparable, then JuliaPro, the free package, is a great thing.

I was just going by the original post stating that OpenBLAS + Julia was slower than NumPy + MKL.

@StefanKarpinski

Fortunately, free is still cheaper than Matlab.

Well, when you stated:

Unless you have an unusually cheap source for Matlab licenses, this seems to be factually inaccurate: MKL is free and JuliaPro for Enterprise with MKL is about 1/3 the cost of Matlab. Even if paying any money is too much, you can download MKL yourself for free and download and build Julia with MKL support yourself – at no cost. We cannot legally distribute Julia with both GPL libraries and MKL.

I'm pretty sure we were both talking about the case of Julia + MKL.
For an out-of-the-box experience with MKL, for a home user or student, MATLAB is noticeably cheaper than Julia.

I will try to compare both for performance.

I just want to say again:
I have nothing but appreciation for Julia and the people behind it; the accomplishment is amazing.
I still remember looking at the Julia front page a few years ago and not believing the speed.
I really think Julia is a jewel.
It just needs a better entry point for the home user.

@ararslan
Member

ararslan commented Feb 8, 2017

To have it with Out of the Box experience for home user / student MATLAB is cheaper than Julia noticeably.

Julia is free and open source, MKL is free, and you can build Julia using MKL, so Julia is infinitely cheaper than Matlab. While that's not entirely "out of the box," building Julia from source is actually very, very simple: you download a source tarball, add a file that basically says "I want to use MKL," and run make. The rest happens automatically. This is very well documented in the Julia repo.
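
For the curious, here is a minimal sketch of what that file looks like, assuming the USE_INTEL_MKL flag described in the Julia build documentation of that era (check the README for your version, and make sure MKL's environment script has been sourced first; the exact flag name can differ between versions):

# Make.user, placed at the top of the Julia source tree before running `make`
USE_INTEL_MKL = 1    # link Julia's BLAS/LAPACK against MKL instead of OpenBLAS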

I will try to compare both for performance.

There are performance benchmarks on the website that compare Julia with a number of languages, including Matlab.

If your argument is that MKL is faster, I think the more meaningful comparison is between OpenBLAS and MKL, not between Julia and Matlab.
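
To make that comparison concrete, here is one hedged way to time the BLAS rather than the language; it assumes the peakflops function available in Base at the time and the BenchmarkTools package (names and defaults may differ across Julia versions):

using BenchmarkTools

A = rand(2000, 2000); B = rand(2000, 2000);
@btime $A * $B;            # wall time here is dominated by the BLAS dgemm kernel
println(peakflops(2000))   # rough flop rate of whichever BLAS Julia is linked against

Running the same two lines on an OpenBLAS build and on an MKL build isolates the BLAS difference from any language difference.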

@RoyiAvital

@ararslan, we're in the same boat.
I believe Julia is faster than MATLAB as a language.

The context of this discussion is having MKL in Julia.
The basic assumption is that Julia is worth trying, or even worth making the go-to stop for numerical programming. No need to argue that point.

The whole point was simple: an out-of-the-box Julia experience with MKL included (on Windows; I take your word that on Linux compiling with MKL is easy) is much pricier than MATLAB for a home user.

@JeffBezanson argued OpenBLAS isn't far behind MKL.
If that holds, then everything is perfect: you get something great for free.

On a side note, please try to sympathize with me when I say that https://github.com/JuliaLang/julia/blob/master/README.windows.md is a blocker for Julia users on Windows.
Most users won't do that for MKL.

Anyhow, curiosity is the real driver here.
I already created the benchmark in Julia (I'm not experienced, so maybe my code isn't optimal); we'll soon know. I will share the results, and I will be happy to be proven that the performance (in the context of linear algebra) is there even without MKL.
It was fun anyway.

Thank You.

@tkelman
Contributor

tkelman commented Feb 9, 2017

You can actually just rebuild the system image of an existing Julia install to point to a copy of MKL if you have one, without needing to recompile anything else. At least on Windows I've gotten that to work without needing to use Intel compilers (the MKL calling conventions don't always work well against a gfortran Julia build on all platforms). I have some code for that somewhere, but I think in a private repo.

@RoyiAvital

@tkelman , That's interesting and as good as it gets.

Could you share how to do it?

@RoyiAvital

RoyiAvital commented Feb 9, 2017

@KristofferC, I'd be happy to see simplified code.
Are you looping over the vectors instead of using matrix multiplication (that is what I take from https://github.com/JuliaStats/Distances.jl/blob/master/src/generic.jl#L84) and using view in order not to make copies?

At least in MATLAB, broadcasting (namely, for one vector, calculating the distance to all the other vectors) is much slower than the method I used. I will check it in Julia (and would be happy to have simplified code for that if you want).

Yet since I do the same for both (MATLAB & Julia), it won't invalidate the results.

@KristofferC
Member

It uses matrix multiplication. 4k points in 2D takes 0.09 seconds. Just because you write the literal translation from one programming language to another doesn't mean it is a fair comparison.

julia> using Distances

julia> using BenchmarkTools

julia> times = []
0-element Array{Any,1}

julia> for n in 500:500:4000
           x = rand(2, n)
           push!(times, @belapsed pairwise(SqEuclidean(), $x, $x))
       end

julia> times'
1×8 Array{Any,2}:
 0.000684744  0.00286224  0.00855549  0.0148897  0.0266748  0.0362454  0.0651381  0.0920572

Anyway, this discussion is completely derailing the original thread and should be had on discourse.julialang.org. Happy to continue there.

@RoyiAvital

RoyiAvital commented Feb 9, 2017

@KristofferC

Could you please write out the explicit code (without the package, just something to run)?
Something I will be able to understand (I have limited knowledge of Julia).
I will add it to the test.

My computation is the following:

mX and mY are matrices where each column is a point vector.
The output is the distance from each column of mY to each column of mX.

I use the fact that ||x - y||^2 = ||x||^2 - 2 <x, y> + ||y||^2.

I'd be happy for faster code.

Anyhow, I opened a discussion: Benchmark MATLAB & Julia for Matrix Operations.

Thank You.

P. S.
The code above doesn't do what I do in the test (if I understood it correctly).
I calculate the distance between each column of two 4000 x 4000 matrices.
Namely, the output matrix has 16,000,000 elements.

Here is the code of the function (Julia):

function CalcDistanceMatrixRunTime( matrixSize )

    mX = randn(matrixSize, matrixSize);
    mY = randn(matrixSize, matrixSize);

    tic();
    # ||x - y||^2 = ||x||^2 - 2<x, y> + ||y||^2, with all the pairwise inner
    # products computed at once via the matrix product mX.' * mY
    mA = sum(mX .^ 2, 1).' .- (2 .* (mX.' * mY)) .+ sum(mY .^ 2, 1);
    runTime = toq();

    return mA, runTime;
end
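
(For completeness, a small usage sketch of the function above; the 1000 here is just an illustrative size:)

mA, runTime = CalcDistanceMatrixRunTime(1000);
println("1000 x 1000 distance matrix computed in $(runTime) seconds")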

@StefanKarpinski
Member

StefanKarpinski commented Feb 9, 2017

Just because you write the literal translation from one programming language to another doesn't mean it is a fair comparison.

Since that's exactly what we did in our microbenchmarks, I'd argue that it's entirely fair game. Regardless of whether some way of writing the code is flattering or not, it's good to see the results of the same algorithm across systems to know where the strengths and weaknesses are. The naive way of computing Euclidean distances may not be the fastest way to compute them, but it's still a perfectly reasonable benchmark.

These benchmarks are very much focusing on Matlab's strong suit: heavily vectorized operations that can be done entirely in pre-defined high performance kernels. I'd be very curious to see how much of this is due to MKL vs. OpenBLAS and how much is due to language differences.

@RoyiAvital

RoyiAvital commented Feb 9, 2017

@StefanKarpinski ,
I liked your answer 👍 .

I do think the calculation is fast and efficient, as it uses matrix multiplication, which is probably the most highly tuned operation out there.
The code @KristofferC shared wasn't doing what the test does.

See above:

My computation is following:
mX and mY are matrices where each column of them is a vector.
The output is the distance between each column of mY to each column of mX.
I use the fact that ||x - y||^2 = ||x||^2 - 2 <x, y> + ||y||^2.

The code above doesn't do what I do in the test (If I understood it correctly).
I calculate the distance between each column of two 4000 x 4000 matrices.
Namely the output Matrix has 16,000,000 elements.

I got some remarks about assumptions I had about the dot syntax and fused loops.
It seems they are not ready yet in the 0.5.0.4 I'm using, and they are not multithreaded.
I will make another version with a multithreaded, devectorized loop; a sketch of the idea follows below.

This is a case where MATLAB has an advantage not because of MKL but because of a tuned JIT engine which fuses the element-wise operations and is multithreaded.
What Julia seems to bring to the table is the ability (in the future, at least) to do this for user-defined functions.
Once you have this feature together with multithreading and loop fusion, it will be amazing.
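
As a rough sketch of that multithreaded, devectorized version (a hypothetical function written here only for illustration, assuming 0.5/0.6-era syntax and that Julia was started with JULIA_NUM_THREADS > 1):

function DistanceMatrixThreaded(mX, mY)
    # Squared distances between every column of mX and every column of mY,
    # computed with explicit loops and threads instead of BLAS.
    # Assumes mX and mY have the same size (d x n, one point per column).
    d, n = size(mX);
    mA = Array{Float64}(n, n);          # 0.5/0.6-style uninitialized array
    Threads.@threads for jj in 1:n
        for ii in 1:n
            s = 0.0;
            for kk in 1:d
                @inbounds s += (mX[kk, ii] - mY[kk, jj]) ^ 2;
            end
            @inbounds mA[ii, jj] = s;
        end
    end
    return mA;
end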

On other notes, it seems OpenBLAS is behind in some decomposition tasks.
Maybe someone should let its developers know about this test as well.

Another weak point seems to be reductions.
Are those done by BLAS or internally?

@StefanKarpinski, @JeffBezanson, or anyone else: if you have more tests you'd like me to add, I'd be happy to do it.

This is done first and foremost to improve both MATLAB and Julia.
Or at least to give their users more data.

Thank You.

@RoyiAvital

@StefanKarpinski,

If I had access to Julia with MKL I could run this, and then we'd have a clear view of where OpenBLAS holds Julia back and where Julia itself is the bottleneck.

@StefanKarpinski
Member

StefanKarpinski commented Feb 9, 2017

I will say this one last time:

WE CANNOT LEGALLY SHIP JULIA WITH MKL.

It's not that we don't want to ship Julia with MKL. We're not even telling you you can't use Julia with MKL – you can, and there are instructions how to do so in Julia's README. You can clone Julia's source, set a build flag to point at an MKL library, and build Julia to use it. It is perfectly legal for you to do that yourself. It is, however, illegal for us to distribute it to you that way. If we were to distribute Julia with both GPL licenses and MKL, we could be sued and fined large amounts of money for doing so. I do not make this stuff up – this is how the law works in the US and many other places.

@RoyiAvital

RoyiAvital commented Feb 9, 2017

@StefanKarpinski ,

My intention was that if someone has a pre-compiled version of it, I'd be able to do it.
I'm using Windows and I have little knowledge about compiling (just the basic C course from the first year of EE).
So compiling Julia with MKL on Windows isn't an option for me.

So, rephrasing it: I'm sure you have someone with access to Julia + MKL and MATLAB R2016b who can run the test to understand them better.

All good, we're here to improve things.

Thank You.

@StefanKarpinski
Member

My intention was that if someone has a pre compiled version of it I'd be able to do it.

Sending that to you is precisely what would be illegal.

@tkelman
Contributor

tkelman commented Feb 9, 2017

And you should read the license agreements of software that includes MKL carefully. It can actually be prohibited to use the copies of MKL that ship with Matlab or Anaconda with any other software. The Anaconda MKL agreement, last time I read it, said you weren't even allowed to make any copies of the files.

edit: looks like that was the old version (https://docs.continuum.io/mkl-optimizations/eula.html) which said "You may not copy any part of the Software Product except to the extent that licensed use inherently demands the creation of a temporary copy stored in computer memory and not permanently affixed on storage medium" - the new version (https://docs.continuum.io/anaconda/eula) is not as bad, but only says "You are specifically authorized to use the MKL binaries with your installation of Anaconda. You are also authorized to redistribute the MKL binaries with Anaconda or in the conda package that contains them." which isn't permission to use or redistribute them with other things.

@StefanKarpinski
Member

We can and should, however, try these benchmarks on a version of Julia built to use MKL.

@KristofferC
Member

KristofferC commented Feb 9, 2017

I can run them and compare. And yes, the code I have above does compute the distance matrix for all 4k points in 2D (= 16,000,000 values in the resulting matrix) using the matmult "trick". However, I noticed now that the code in the repo used n-dimensional points, not 2-D, so the timing was lower for my run than it should have been.

@andreasnoack
Member

@KristofferC It would be great if you could make a run with master as well. The fusion makes a big difference for some of these benchmarks.

@StirlingNewberry

StirlingNewberry commented Feb 9, 2017 via email

@KristofferC
Member

KristofferC commented Feb 9, 2017

This is what I got from just running the code. I didn't check whether the benchmarking was done carefully: http://imgur.com/a/5m1Yc

Corrected: http://imgur.com/a/rBOo8

@RoyiAvital

RoyiAvital commented Feb 9, 2017

Looking at your results, I come to these conclusions:

  1. Julia 0.6 improves some element-wise operations (as expected). See Matrix Addition and Squared Distance Matrix.
  2. Julia 0.5 + OpenBLAS and Julia 0.5 + MKL are close on all tests. This has two logical explanations:
    • They were actually using the same library (either OpenBLAS or MKL).
    • OpenBLAS is on par with Intel MKL.

Now, analyzing what happens regarding point 2 is difficult.
If the BLAS performance of Julia 0.6 is supposed to be the same as 0.5, I'd say both Julia 0.5 versions were actually using MKL (namely, some error in the test itself).

If there is a degradation of BLAS performance in 0.6 and MKL doesn't improve the performance of 0.5, it means MATLAB is doing something better in utilizing MKL (a better choice of which function to run?).
Which is good news, since it can be fixed.

Update
@KristofferC updated the test results.
It seems, as I assumed, both 0.5 and 0.5 + MKL were actually using MKL.
So my results stand: OpenBLAS, at the moment, keeps Julia behind MATLAB on many important tasks.

@StefanKarpinski
Member

There are a number of known regressions in 0.6, which will be addressed before 0.6 final is released (we haven't even released 0.6-alpha yet). This is a good set of tests to add to the list.

@martin-frbg

@KristofferC, thank you (and RoyiAvital) for what look like very instructive results. Could you name the hardware and operating system you used, please? (Apologies if it was mentioned somewhere and I overlooked it.)

@hiccup7

hiccup7 commented May 23, 2017

I see that Intel has updated the MKL license agreements.
This page was updated Feb. 8th, 2017: https://software.intel.com/en-us/articles/free-mkl
This page was added March 22nd, 2017: https://software.intel.com/en-us/articles/end-user-license-agreement

@StefanKarpinski - Do the legal statements made in this thread still apply? If so, please link to the applicable licenses so we can look for changes in the future.

If distributing MKL and GPL together is still a problem, how about making an optional MKL package (with no GPL code)? This way, MKL is not included in the official Julia installer. The Julia language would then use OpenBLAS by default, and MKL instead only if the MKL package is installed (by each user).

@ararslan
Member

Do the legal statements made in this thread still apply?

Yes

please link to the applicable licenses so we can look for changes in the future.

MKL is under the Intel Simplified Software License. Though more permissive, AFAICT this is still incompatible with GPL, both version 2 and version 3.

I'm actively working on removing the GPL-licensed dependencies from Base, which will simplify things somewhat, though I doubt it will ever really make sense for us to distribute Julia with MKL by default.

@ViralBShah
Member

Replacing BLAS so that a different one can be picked up as desired without compiling things is not too difficult, but a fair amount of work. Contributions welcome.

@StefanKarpinski
Member

Unless Intel open sources MKL, it will remain incompatible with the GPL. If that happens, it will be a much bigger news item than a minor license update.

@jeffhammond

If MKL and OpenBLAS are ABI-compatible, then LD_PRELOAD should do the trick. If they are not ABI-compatible, then one can imagine writing a shim that dispatches to one or the other at runtime based upon the availability of shared libraries. That way, you can ship a binary that can take advantage of MKL if the user has it, but otherwise falls back on the OSS options that exist today.
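
As a rough illustration of that dispatch idea (not how Julia actually selects its BLAS; the function and library names here are chosen for the example), one could probe for the libraries at load time and route calls through whichever is found, provided both expose the same LP64 Fortran ABI:

import Base.Libdl   # `using Libdl` on Julia 1.0 and later

# Prefer MKL's single dynamic library if present, otherwise fall back to OpenBLAS.
const libblas = Libdl.find_library(["libmkl_rt", "libopenblas"])
isempty(libblas) && error("neither MKL nor OpenBLAS was found on this system")

# One BLAS call routed through whichever library was selected.
# dnrm2 computes the Euclidean norm of a vector (32-bit integer interface assumed).
function blas_nrm2(x::Vector{Float64})
    n, inc = Ref{Cint}(length(x)), Ref{Cint}(1)
    ccall((:dnrm2_, libblas), Float64,
          (Ref{Cint}, Ptr{Float64}, Ref{Cint}), n, x, inc)
end

blas_nrm2([3.0, 4.0])   # ≈ 5.0 with either backend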

There is also the option of building Julia from source locally against whatever libraries are available, but I suppose this thread wouldn't be this long if everyone loved building from source as much as I do 😆

As MKL ships an FFTW-compatible interface, it sounds like SuiteSparse is your only mandatory GPL dependency interfering with MKL binary redistribution.

@ocsobservatory

Intel MKL:

https://juliacomputing.com/products/juliapro.html

The personal edition (v0.6.2.2) is free:

https://shop.juliacomputing.com/Products/

  • JuliaPro-0.6.2.2 – MKL (for Windows) – 762.17 MB
  • JuliaPro-0.6.2.2 – MKL (for Linux) – 1.02 GB
  • JuliaPro-0.6.2.2 – MKL (for Mac) – 2.47 GB
  • JuliaPro-0.6.2.2 – MKL (for Linux) – ASC – 490.00 B

@dnk8n

dnk8n commented Jul 27, 2018

Sorry to bring up an old thread. I am curious as to what changed in order for Julia to be able to distribute both an MKL and a non-MKL version?

I am faced with choosing which one to install. Correct me if I am wrong, but MKL is not applicable to an Intel(R) Core(TM)2 Duo CPU, right?

As an aside, for my personal use, software freedom is more important than slightly faster computation, so I would choose non-MKL. If I really needed fast computation I would spin up an AMI on an instance that is much more powerful than my development machine and use MKL there.

Does this mentality work, though? Is it simple to do development without considering MKL and then run the code produced in production where MKL is present?

@ViralBShah
Member

Please discuss this on discourse.

@JuliaLang JuliaLang locked and limited conversation to collaborators Jul 27, 2018
@ViralBShah
Member

I am locking this in order to not discuss this further here, and redirecting folks to discourse.
