Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using multiple threads/cores on OSX #18

Open
daducci opened this issue Mar 11, 2022 · 26 comments · May be fixed by #20
Open

Using multiple threads/cores on OSX #18

daducci opened this issue Mar 11, 2022 · 26 comments · May be fixed by #20

Comments

@daducci
Copy link
Collaborator

daducci commented Mar 11, 2022

Is it possible to use multiple threads/cores under OSX? It works on Linux, but I'm not able to use this feature on the Mac.

@gdurif
Copy link
Contributor

gdurif commented Mar 14, 2022

It should be possible and it should be working.

Could try to run the function check_openmp() defined in the setup.py file here.

It returns 0 if linking against OpenMP works and 1 if not.

@samuelstjean
Copy link
Collaborator

It is, but you need to compile with not the default mac toolchain since that does not work (I think they have a non standard implementation or something similar). I am personally using another module to check for it automatically here https://github.com/samuelstjean/spams-python/blob/master/setup.py#L114 Seems like it was also added to a special package to do just that if it's cleaner to use https://github.com/astropy/extension-helpers

@gdurif
Copy link
Contributor

gdurif commented Mar 15, 2022

Thanks @samuelstjean, I'll look into what you coded, we should reuse it.

extension-helpers seems great, the github repository seems to be still active but there has not been any release on PyPI since december 2019, and that worries me a little.

@samuelstjean
Copy link
Collaborator

I am personally using the older version, directly stolen from astropy, so just including this one file could be an idea. If stuff breaks we would have to fix it though, or copy an updated one, so it should not require too much work if it works as is and nobody touches it.

@gdurif
Copy link
Contributor

gdurif commented Mar 15, 2022

I asked them about releasing on PyPI more often (c.f. astropy/extension-helpers#38).

Depending on their answer, we will either directly integrate their file as you did or just use extension-helpers as a requirement (which would be less maintenance on our side if possible).

@gdurif gdurif linked a pull request Mar 15, 2022 that will close this issue
@gdurif
Copy link
Contributor

gdurif commented Mar 15, 2022

Should be fixed by #20

@daducci could you try with the corresponding version (c.f. below) and tell me if the mutli-threading works?

pip install git+https://github.com/getspams/spams-python.git@fix_openmp_req_macos

@lukeolson lukeolson mentioned this issue Mar 24, 2022
2 tasks
@daducci
Copy link
Collaborator Author

daducci commented Mar 26, 2022

Sorry for the delay. I just tested this version, but it does not work, only 1 core is used. I found that I could fix the issue with OpenMP on OSX if I change the following in the version on this repository:

  • setup.py:34 from
['cc', '-fopenmp' ,'-o',

to

['cc', '-Xpreprocessor', '-fopenmp', '-lomp', '-o',
  • setup.py:133 from
if check_openmp() == 0:
    cc_flags.append('-fopenmp')
    link_flags.append('-fopenmp')

to

if check_openmp() == 0:
    cc_flags.append('-Xpreprocessor')
    cc_flags.append('-fopenmp')
    link_flags.append('-lomp')

@gdurif
Copy link
Contributor

gdurif commented Mar 28, 2022

@daducci Thanks for the report. I'll do some tests (and eventually report to https://github.com/astropy/extension-helpers if needed).

@samuelstjean
Copy link
Collaborator

Seems like someone also found it astropy/extension-helpers#40, my guess is that those flags are for gcc, and apple is using clang as the default compiler. For a very long time, their version had issues or something else, and people had to install a third party gcc to get stuff working (like by using homebrew for example).

Seems like the easy way here is to let extension helper do the job and put the correct flag for the correct compiler, or hack it in somehow, but that sounds like trouble in the future to make it work properly, and on old versions also at the same time

@gdurif
Copy link
Contributor

gdurif commented Mar 31, 2022

I am digging a bit on this. As long as extension-helpers does not support this (no answer on the issue), it will be tricky on our part.

setuptools.command.build_ext.new_compiler does not discriminate^[1] between gcc and clang (even less between Apple clang and LLVM clang I would guess). So we will have to do the check manually, and edit the flags accordingly (after calling extension-helpers) for the moment.

[1] See

$ python setup.py build_ext --help-compiler
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -c test_openmp.c -o objects/test_openmp.o -fopenmp
x86_64-linux-gnu-gcc -pthread objects/test_openmp.o -o test_openmp -fopenmp
Compiling Cython/C/C++ extension with OpenMP support
List of available compilers:
  --compiler=bcpp     Borland C++ Compiler
  --compiler=cygwin   Cygwin port of GNU C Compiler for Win32
  --compiler=mingw32  Mingw32 port of GNU C Compiler for Win32
  --compiler=msvc     Microsoft Visual C++
  --compiler=unix     standard UNIX-style compiler

@gdurif
Copy link
Contributor

gdurif commented Apr 1, 2022

I am trying to fix extension-helpers regarding this matter, c.f. https://github.com/gdurif/extension-helpers/tree/support_apple_clang

I'll keep you updated (I will propose a PR when it is working).

Update: PR on the way astropy/extension-helpers#42

@gdurif
Copy link
Contributor

gdurif commented Apr 1, 2022

@daducci could you tell me if installing the modified version of extension-helpers before installing spams solves the problem?

pip install git+https://github.com/gdurif/extension-helpers.git@support_apple_clang
pip install git+https://github.com/getspams/spams-python.git@fix_openmp_req_macos

@daducci
Copy link
Collaborator Author

daducci commented Apr 4, 2022

No, in my case only 1 core is used. :-( Multiple cores are used only when installing the version with the manual patch I suggested few days ago. Dunno why this one does not work, as it seems all the modifications in there match mine.

@gdurif
Copy link
Contributor

gdurif commented Apr 4, 2022

@daducci ok thanks, I'll continue digging. Since the manual patch works, I hope we are not foo far from finding the solution!

@gdurif
Copy link
Contributor

gdurif commented Apr 11, 2022

@daducci I think the issue was found (astropy/extension-helpers#42 (comment)). You are using the new Apple CPUs if I recall correctly ? (hence not Intel-based, hence a different location for libomp)

@daducci
Copy link
Collaborator Author

daducci commented Apr 11, 2022

Yes, I am! Sorry, I didn't think Apple would change paths/etc... based on the vendor of its CPUs!

@gdurif
Copy link
Contributor

gdurif commented Apr 11, 2022

No problem, we will find a solution, the person suggested one in his response.

@gdurif
Copy link
Contributor

gdurif commented Apr 11, 2022

@daducci Could you tell me the output of brew --prefix libomp on your machine please?

Edit: and also python -c "import sys; print(sys.platform)" ? (just to be sure) Thanks

@daducci
Copy link
Collaborator Author

daducci commented Apr 11, 2022

brew --prefix libomp --> /usr/local/opt/libomp

python -c "import sys; print(sys.platform)" --> darwin

@gdurif
Copy link
Contributor

gdurif commented Apr 12, 2022

@daducci could you retry ? (maybe uninstall every thing before or try in a clean environment)

pip install git+https://github.com/gdurif/extension-helpers.git@support_apple_clang
pip install git+https://github.com/getspams/spams-python.git@fix_openmp_req_macos

On my MacOS VM, multi-threading seems to work.

@daducci
Copy link
Collaborator Author

daducci commented Apr 21, 2022

Hi @gdurif ,

sorry for the long delay, I have been off for Easter. Unfortunately, it doesn't work; I can make it work only when using the "manual trick" from a previous message. Here is the code snippet I use to benchmark:

import time
import spams
import numpy as np
from tqdm import tqdm

# data generation
np.random.seed(0)
X = np.asfortranarray(np.random.normal(size=(100,200)))
X = np.asfortranarray(X / np.tile(np.sqrt((X*X).sum(axis=0)),(X.shape[0],1)))
D = np.asfortranarray(np.random.normal(size=(100,1000)))
D = np.asfortranarray(D / np.tile(np.sqrt((D*D).sum(axis=0)),(D.shape[0],1)))

tic = time.time()
for i in tqdm(range(500)):
    alpha = spams.lasso( X, D=D, lambda1=0.15, numThreads=-1 )
print( f"{time.time() - tic:.2f}s")

If needed, we can chat one of these days and make some test in real time. Just let me know if this can help!

@gdurif
Copy link
Contributor

gdurif commented Apr 21, 2022

Hi @daducci
No problem ;-) Thanks for the code snippet. I will use it to improve my tests. And also we could definitely chat about it, I'll send you an email.
Best

@daducci
Copy link
Collaborator Author

daducci commented Aug 3, 2022

Hi guys, what about resuming this discussion to try getting to the bottom of it?

@samuelstjean
Copy link
Collaborator

Looks like it will be complicated for now, since building on arm is done by cross-compiling with the thing I am using, and that means homebrew pulls stuff for x64 instead. It does build without putting in the libs, so people might be able to work around it by installing openblas themselves.
If you want to try it (or even the old x64 version), that should be it https://we.tl/t-zeELwQpqmC And of course it could not be tested for the same reasons, so it might not even start.

@samuelstjean
Copy link
Collaborator

Well if we wait a bit normally there will be some buildbots directly on arm64 for macs. As of now, this means it will be impossible to install a premade blas, unless we want to compile our own each time, but I would not do that because it means we would need to check the test and fix ourselves every little bugs since we build from source. Hopefully this will fix it pypa/cibuildwheel#1204

As of now, I am using the conda version on windows, the cent os version on linux and the homebrew version for mac to avoid that, but that means we won't have a blas version for mac M1 until someone else makes it for me. With a bit of luck maybe the M1 version can use the libs from the x64 version as it apparently emulates stuff behind the scene for you, so if those builds work we could do that (and normally it should be multithreaded, but I don't have a mac so...).

@samuelstjean
Copy link
Collaborator

Well I tried some new arm mac build, but I don't have access to one to check if everything works as supposed. They are built with some emulation layer, so I can not test them either on the build machines. It should be fine if you want to give it a try before putting them back here https://github.com/samuelstjean/spams-python/releases

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants