Large, unnecessary, proprietary mkl package included in numpy and pandas install, inflates binary by 600MB #84
Comments
First we don’t currently build |
That's not quite complete. There is a potential issue here, especially when using PyInstaller or a similar such tool: it's possible that it's a GPL violation to distribute an executable with both MKL and a GPL component. The NumPy team has talked to Intel about this (answer, Intel will not give definitive legal advice) and gotten good independent advice (answer, GPL violation potentially possible here but the likelihood of that is case-specific). To add to the answer to @answerquest: MKL or another BLAS package is definitely necessary for numpy. You're getting MKL because you have installed the Anaconda default numpy. If you use |
Thanks for the clarification. What could help in this matter is if we could have a list of numpy/pandas commands that actually do need mkl, then people can have an objective way of determining whether their programs need it or not. The difference is a whopping 600MB in program size, so that is significant for any program creator (my program's binary is just 30MB when I go the no-conda way, and none of the functions are failing. It doesn't make any sense for me to include mkl just out of a sense of formality/loyalty) and is well worth the disambiguation. Also, in a conda install, if there can be a way to manually specify which dependency is to be excluded, then that can also be a good workaround, as the other benefits of conda over pip are still there and I still want to use conda. |
@answerquest that's not the best recommendation unfortunately. It works in that case, but installing
|
@rgommers my bad, sorry, I had not read the OpenBLAS line correctly. If Definitely using virtual environment to create the binary. |
Indeed, should be <10 MB. |
Use the conda-forge packages instead of the default packages. Especially for numpy, this means using OpenBLAS instead of MKL. conda-forge/numpy-feedstock#84 conda-forge/numpy-feedstock#97
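After switching channels, it can help to confirm which BLAS the installed numpy actually reports. The following is a minimal sketch, not an official check: `classify_blas` and `numpy_blas_vendor` are hypothetical helper names, and the heuristic assumes that `numpy.show_config()` lists the linked libraries on lines beginning with `libraries` (older numpy) or `name:` (newer numpy). That output format varies between numpy versions, so treat an `unknown` result as "inspect the output by hand".

```python
import contextlib
import io


def classify_blas(config_text):
    """Guess the BLAS vendor from numpy build-config text (heuristic).

    Only lines starting with 'libraries' or 'name:' are inspected, so that
    section headers such as 'blas_mkl_info:' followed by 'NOT AVAILABLE'
    are not mistaken for an MKL build. This line format is an assumption.
    """
    for line in config_text.splitlines():
        stripped = line.strip().lower()
        if not (stripped.startswith("libraries") or stripped.startswith("name:")):
            continue
        if "mkl" in stripped:
            return "mkl"
        if "openblas" in stripped:
            return "openblas"
    return "unknown"


def numpy_blas_vendor():
    """Capture the text printed by numpy.show_config() and classify it."""
    import numpy as np  # imported lazily so classify_blas works without numpy

    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        np.show_config()
    return classify_blas(buffer.getvalue())


if __name__ == "__main__":
    print(numpy_blas_vendor())
```

Running the script inside the environment used for packaging prints `mkl`, `openblas`, or `unknown`.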
The latest update on the anaconda `default` channel has a broken, 32bit Windows MKL FFT lib that crashes the numpy import on it. conda-forge/numpy-feedstock#84 conda-forge/numpy-feedstock#97
For anyone trying to do as @rgommers suggests (option 1; it worked in the end!), the following might save you an hour of puzzling: stackoverflow thread. I was having difficulty installing pyinstaller AND numpy with openblas just now because my `conda install -c conda-forge pyinstaller` command resulted in numpy being "upgraded" to an mkl-linked one. The link explained a great deal, and pyinstaller now turns my .py containing "import numpy" into an exe (on Windows) of <14 MB :) Still, it is scary to be so dependent on what version is available/downloaded via conda. It would be a shame not to be able to make small executables that use numpy. Should I be worried? |
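Since an unrelated install can silently swap numpy for an mkl-linked build, one way to guard against it is to check the environment's `conda-meta` folder before running PyInstaller. This is only a sketch under stated assumptions: it assumes each installed package has a JSON record in `conda-meta` with a `name` field, and `installed_packages` and `has_mkl` are hypothetical helper names, not part of conda.

```python
import json
import sys
from pathlib import Path


def installed_packages(conda_meta_dir):
    """Return {name: version} read from the JSON records in a conda-meta folder."""
    packages = {}
    for record in Path(conda_meta_dir).glob("*.json"):
        data = json.loads(record.read_text(encoding="utf-8"))
        if "name" in data:
            packages[data["name"]] = data.get("version", "")
    return packages


def has_mkl(conda_meta_dir):
    """True if a package named exactly 'mkl' is recorded in the environment."""
    return "mkl" in installed_packages(conda_meta_dir)


if __name__ == "__main__":
    # Assumption: the active environment keeps its records in <prefix>/conda-meta.
    meta_dir = Path(sys.prefix) / "conda-meta"
    if meta_dir.is_dir() and has_mkl(meta_dir):
        print("mkl is installed in this environment; the bundle may be large.")
    else:
        print("No mkl record found in", meta_dir)
```

The check matches the exact package name, so `mkl-fft` or `mkl-random` alone would not trigger it.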
FWIW what I typically do is |
Yes, or add `blas=*=openblas` to your condarc: https://conda.io/docs/user-guide/configuration/use-condarc.html#always-add-packages-by-default-create-default-packages |
@msarahan How do I add this? I tried adding it at the bottom of the .condarc in my environment, but then I get the following error
|
Hi, just FYI (not replying to any earlier post here), I've since had no problems in using just Earlier pip was having a problem with pandas, which was why I was using conda, but that got resolved just some days after I had posted here. This update isn't relevant for this repo but seeing that there's activity here and I was the OP, I have an obligation to disclose how I finally solved the problem on my end. I went with pip and it worked out fine. |
This issue is fixed in newer versions, so installing numpy from conda-forge should get you an OpenBLAS version. But there is no openblas/nomkl version of scipy on Windows yet, so I'm using pip to install scipy. I have the same experience as you, no issues, but something is probably not getting installed correctly. But I prefer not mixing pip and conda, so I'd love a |
Ref:
The `mkl` package is co-installed when we install either pandas or numpy using conda. It is a very large package, clocking in at ~200MB for download and ~600MB when installed in the `pkgs` folder of my MiniConda installation. The `pip` installer does not include this package when installing pandas. It is not in the conda feedstocks list, and no description is given on https://pypi.org/project/mkl/ .

I do not know more about this subject, but when I searched for `mkl` I came across more results for `mkl-fft` and `mkl-random`, which are not the same as `mkl` and are under free licenses. `mkl-fft`'s description on PyPI also seems more numpy-involved: https://pypi.org/project/mkl-fft/

My hunch is that `mkl-fft` and `mkl-random` were the ones supposed to be included in the `numpy` installs, and `mkl` got included by accident.

Where this is really causing a problem: when generating self-contained binaries for distribution, the `mkl` package gets roped in for programs that import either `numpy` or `pandas` if conda has installed it in the Python environment. For the Windows binary that the `PyInstaller` program creates, it balloons the dist by about 600MB.

Please investigate this, and if it's not essential to numpy, remove `mkl` from the numpy installation by conda.

Info: Conda version: 4.5.1, on Windows 7 64-bit. As part of MiniConda Python3 64-bit.

Sharing lines from the numpy json file I found in my MiniConda installation's `conda-meta` folder:

Sharing lines from `[Miniconda3]\pkgs\mkl-2018.0.2-1\info\LICENSE.txt`: