Container image fails to start with 'Unable to dynamically load the "cuda" shared library' #478
I have a similar issue; see this comment for the details: huggingface/candle#353 (comment)
@sammcj have you been able to reproduce this?
The latest mistralrs container image seems to fail to start with the following error when using any Qwen 2.5 models, which are all I'm running at the moment (just because they're so much better than anything else):
I'll download an older model and see if it works; will let you know.
Oh, I also use Mistral Large; I just realised I had a GGUF for that, and it fails as well:
No go; it looks like it fails with Llama 3.2 as well:
@sammcj we don't support the I-quants yet (that explains Mistral Large). They will be added soon with the upcoming imatrix support 😉! Can you please update your CUDA Docker container? I just released some new images (our image for compute cap 75 is now deprecated).
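For anyone following along, pulling the refreshed image and recreating the container looks roughly like the sketch below. The registry path, tag, and port mapping are assumptions for illustration; check the project's releases/packages page for the exact names.

```bash
# Pull the rebuilt CUDA image and recreate the container.
# NOTE: the image name, tag, and port below are assumptions for
# illustration; use whatever the mistral.rs releases actually publish.
docker pull ghcr.io/ericlbuehler/mistral.rs:cuda-90-latest
docker stop mistralrs && docker rm mistralrs
docker run -d --name mistralrs --gpus all -p 1234:1234 \
  ghcr.io/ericlbuehler/mistral.rs:cuda-90-latest
```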
Hey, that fixed Llama 3.2! Nice work! 🎉
FYI I'm currently running 2x 3090 (compute 86,86+PTX); still trying to decide what to do with the 2x A4000 and 2x P100s I've got sitting here that won't fit in my case 😅
FYI Qwen 2.5 32B Q6_K starts to load with that updated container image, then crashes with CUDA out of memory. It looks like it's only using one of the two GPUs. I'll check that I don't have any configuration issues; if I don't, I'll log a separate bug for it.
FYI, manually specifying the layers to place on each GPU correctly used both (see the sketch below for roughly how the split was specified).
But it then fails:
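For reference, the per-GPU layer split was specified along these lines. The `--num-device-layers "ORD:NUM;..."` syntax matches my reading of the mistral.rs CLI, but treat it (and the model id/filename) as assumptions and verify against `mistralrs-server --help` for your build.

```bash
# Sketch: pin GGUF layers to specific GPUs with mistralrs-server.
# The --num-device-layers "ordinal:count;..." syntax is an assumption;
# the model id and filename are placeholders for illustration.
mistralrs-server \
  --num-device-layers "0:32;1:32" \
  gguf \
  -m Qwen/Qwen2.5-32B-Instruct-GGUF \
  -f qwen2.5-32b-instruct-q6_k.gguf
```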
Describe the bug
When the mistralrs container built from the official Dockerfile.cuda-all starts, it crashes with:
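The 'Unable to dynamically load the "cuda" shared library' message from the title typically means the CUDA driver library (libcuda.so) could not be loaded inside the container. A quick way to confirm that the host's NVIDIA Container Toolkit is exposing the driver to containers at all, independent of mistral.rs, is a check like:

```bash
# If this fails, GPU passthrough on the host is the problem,
# not the mistral.rs image itself. The CUDA base image tag is
# just an example; any recent nvidia/cuda base tag works.
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```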