feat: install CUDA support on Windows if available #339
Converting to a draft because my code is pretty hacky since I wrote it rather hastily. Would prefer someone to review it and test it before merging.
result: fail:
So, apparently, it is looking for a file. I am on Ubuntu 20 and I can do this:
So the file is clearly there and installed locally, and the code still doesn't find it.
UPDATE #1: Adding the following to the command line helps, and it begins to work correctly:
That's of course because I happen to know precisely where my files are. This will have to be adapted to the particular way of installing that you have used there.
UPDATE #2: Even though the process now finishes with no error, there is still something wrong.
@jerzydziewierz Thanks for testing. It looks like the libraries get placed in the
Alright, I wasn't able to figure out Linux support the way I did for Windows, so I decided to just update this PR to only install precompiled CUDA and llama.cpp binaries on Windows. A new change also detects whether the CPU supports AVX2, AVX, or neither and installs the appropriate precompiled llama-cpp-python package. For now, Linux NVIDIA users will need to install CUDA themselves.
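For readers following along, the AVX2 / AVX / neither choice described above could be sketched roughly as follows. This is not the code from this PR: the function name `pick_cpu_variant` and the variant labels `"avx2"`, `"avx"`, and `"basic"` are hypothetical, and the sketch assumes the CPU feature flags have already been collected by some other means.

```python
from typing import Iterable


def pick_cpu_variant(cpu_flags: Iterable[str]) -> str:
    """Return the most capable build variant that the given CPU flags allow.

    The labels "avx2", "avx", and "basic" are hypothetical names for this
    sketch, not the package names used by the PR.
    """
    # Normalize case so "AVX2" and "avx2" are treated the same.
    flags = {flag.lower() for flag in cpu_flags}
    if "avx2" in flags:
        return "avx2"
    if "avx" in flags:
        return "avx"
    # Neither AVX2 nor AVX is present: fall back to the plain build.
    return "basic"


# Example: a CPU that reports both AVX and AVX2 gets the AVX2 build.
print(pick_cpu_variant(["sse4_2", "avx", "avx2"]))  # prints "avx2"
```

Checking AVX2 before AVX matters because CPUs with AVX2 also report AVX, so the more capable build has to win.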
I got the same issue too, on macOS Ventura 13.2.1. Please help me fix this.
I’m going to go ahead and close this one as stale and we’ll revisit it.
This PR installs official NVIDIA wheels for CUDA support so the CUDA Toolkit does not need to be installed.
It also installs precompiled llama-cpp-python wheels that support CUDA so VS / dev tools don't need to be present on the computer.
The install is also much faster since nothing needs to be compiled.
Based on #338 which should be merged first.
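As a rough illustration of the "if available" part of the title, an installer could gate the CUDA path on the platform and on whether an NVIDIA driver appears to be present. This is only a sketch and not the check used in this PR: both function names are hypothetical, and treating `nvidia-smi` on the `PATH` as a proxy for a usable NVIDIA GPU is an assumption.

```python
import platform
import shutil
from typing import Optional


def cuda_install_applicable(system: str, nvidia_smi_path: Optional[str]) -> bool:
    """Decide whether to attempt the precompiled CUDA install.

    Hypothetical helper: True only on Windows and only when an nvidia-smi
    executable was found (assumed here to indicate an NVIDIA driver).
    """
    return system == "Windows" and nvidia_smi_path is not None


def detect_cuda_install_applicable() -> bool:
    """Run the check against the current machine."""
    # platform.system() returns "Windows", "Linux", or "Darwin".
    # shutil.which() returns the executable's path, or None if not on PATH.
    return cuda_install_applicable(platform.system(), shutil.which("nvidia-smi"))


if __name__ == "__main__":
    print("Attempt CUDA install:", detect_cuda_install_applicable())
```

Keeping the decision in a pure function separate from the machine probing makes it easy to test on a computer without a GPU.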
Steps for testing:
Choose any model, choose yes for GPU, ensure llama-cpp-python installs without error, and ensure the GPU is utilized after the first request.
To test again, uninstall llama-cpp-python first: