[Enhancement] Add the Ability to Use NVIDIA for Docker #43
Comments
I too would like to see this capability. Looks like it is possible with this project: https://github.com/ggerganov/llama.cpp |
I am willing to test GPU / Nvidia / Docker |
Hey guys, please reach out to me if you want to test it. |
This is in the TODO list, we have to split the Dockerfiles first though. Having GPU support will make the image huge. The plan is to publish separate images for GPU/Non-GPU support |
@gaby, any idea when could these be expected? Happy to test too. |
I got it running. Feel free to borrow: https://gist.github.com/Jonpro03/604430a3e64735a0a9df6b7e385d15be |
Ideally, at runtime the image should be "-runtime", not "-devel". My plan is to make a compose file that uses the Dockerfile for the final stage |
This week |
Groovy. Can confirm it works with the runtime image too. |
@Jonpro03 llama-cpp-python compiles correctly even using the "-runtime" tag? |
I see that you are adding CMAKE flags, not sure what those do, haha |
Possibly a me problem, since I'm running older CPUs (Xeon E5-2640's), but these flags were necessary to get it to compile with the pip. |
Thanks for the info! |
Oops, I spoke too soon. It only works with the devel tag. |
Yeah, that's what I figured. The devel tag is needed to compile the llama-cpp-python wheel. After that, the runtime tag can be used. My main problem has been that the devel tag is about 6.5GB |
We're also willing to test Nvidia support. We have 2x RTX 3090 w/nvlink set up on Pop!_OS and are eager to see Nvidia support in Serge! |
Any progress? I would love to be able to run a local LLM on unraid with GPU acceleration |
I'm on vacation, will get this done in like 2 weeks. :-) |
I noticed the GPU was not being used when I started a chat (30 GPU layers). Should I install CUDA inside Docker? Below are my proposed modifications:
step2: replace the installation method for "llama-cpp-python" in the "/scripts/deploy.sh" file
step3: docker build
step4: install nvidia-container-toolkit
step5: start docker |
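A quick way to tell whether the container can see the GPU at all, before looking at layer offloading, is to query `nvidia-smi` from inside it. The following is only a minimal sketch, not part of Serge: the helper name `visible_gpus` is hypothetical, and it assumes `nvidia-smi` is on the PATH inside the container (which is normally the case when the container is started through nvidia-container-toolkit).

```python
import shutil
import subprocess


def visible_gpus(smi_path=None):
    """Return the GPU names reported by nvidia-smi, or [] if none are visible.

    smi_path is an optional explicit path to the nvidia-smi binary
    (hypothetical parameter, for illustration); by default the PATH is searched.
    """
    exe = smi_path or shutil.which("nvidia-smi")
    if exe is None:
        # nvidia-smi is not installed or not mounted into the container.
        return []
    try:
        result = subprocess.run(
            [exe, "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=10,
            check=True,
        )
    except (OSError, subprocess.SubprocessError):
        # Binary missing, driver not reachable, or the call timed out.
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    gpus = visible_gpus()
    if gpus:
        print("GPUs visible in this environment:", ", ".join(gpus))
    else:
        print("No GPU visible; check the CUDA install and nvidia-container-toolkit.")
```

An empty result only shows that the driver is not exposed to the container. A non-empty result does not by itself prove that llama-cpp-python was built with CUDA support.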
@creed2415 Did you get GPU to work? |
Cool stuff. Does anyone have numbers on the inference speed on GPU vs. CPU using this? |
Yes! After I installed CUDA inside Docker, it worked with the GPU. |
Well, the GPU Creed showed has 3840 CUDA cores, compared to the 24 cores/32 threads a current consumer Intel CPU provides. |
@creed2415 did you do this in unraid by chance? Not sure how to actually make those edits to the package since it comes from the unraid community applications. @gaby any tips on how to make that happen? Or is that something you still planned on ? |
@gaby If possible, I would like to submit my solution (linux for nvidia) to the repository. |
@creed2415 Serge has to work for both gpu/non-gpu, so images have to be separate, etc. So, it's a bit more complicated than just adding nvidia-toolkit. If you got a solution that works for everyone, sure. |
@gaby Well, here are the adjustments I made. Just take a look and let me know if they are useful. |
any news on this pls? |
@mkeshav This week :-) |
Who knows the best Nvidia GPU IaaS provider? XD Looking forward to it! P.S. Maybe hardware really is worth it, but I need medium scale. |
Any luck with this? |
Still in progress |
Can I help? |
@TheQuickestFox Not yet, i'm helping the |
any ETA on this? and also on the latest IPv6 fixes? The last release is from Feb :( |
@JuniperChris929 Within the next 2 weeks. The main library used by Serge now has built-in support for CUDA Python wheels which will streamline our approach to support multiple platforms. Even though last release was in February, there's over 100 commits between then and now. |
Are there any updates regarding Serge using GPU acceleration? |
@gaby is this gunna happen? |
Any news here? We keep hearing "within a week or two" over and over again, but the last release is from February, and neither CUDA support nor IPv6-only support works easily :( |
Allow running large language models on graphics cards with large amounts of memory.
https://github.com/NVIDIA/nvidia-docker