AssertionError: Torch not compiled with CUDA enabled #156
Comments
If I run `python run_localGPT.py`, I get `AssertionError: Torch not compiled with CUDA enabled`.
I have the exact same problem.
@endolith I will have a look at the code and see what is causing this.
I do get this error running on an Apple M1 with PyTorch compiled against MPS and running the script with `--device_type mps`.
I fixed this issue by installing requirements.txt as described in the Install section, then I ran the commands from the YouTube video. If it worked, the output of the check below should confirm that PyTorch was built with CUDA.
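The exact check wasn't preserved in the thread; a minimal sketch, assuming PyTorch is importable:

```python
import torch

# True means the installed wheel was built with CUDA support;
# torch.version.cuda is the CUDA version it was built against (None on CPU-only builds).
print(torch.cuda.is_available())
print(torch.version.cuda)
```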
Should I create a separate issue for this regarding the Apple M1/M2 MPS support? @PromtEngineer
Here is the traceback:
Some updates and partial success on my M1: "cuda" is hardcoded at line 63 in 979f912. This probably should take the `device_type` argument instead. I changed this locally and it starts. Trying to chase this one now...
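As an illustration (not the repository's actual code; the function name and fallback order are assumptions), the hardcoded string could be replaced by something that honours the `--device_type` flag and degrades gracefully:

```python
import torch

def resolve_device(device_type: str) -> str:
    """Map the --device_type flag to a backend that is actually available."""
    if device_type == "cuda" and torch.cuda.is_available():
        return "cuda"
    if device_type == "mps" and torch.backends.mps.is_available():  # needs a recent PyTorch
        return "mps"
    return "cpu"

device = resolve_device("mps")  # in the script this would come from the CLI flag
```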
@ChristianWeyer this seems to be a bug, thanks for highlighting it. I am not sure if auto_gptq supports M1/M2. Will need to test that.
Seems it does not, which means we cannot use LocalGPT on M1/M2 with quantized models for now.
@ChristianWeyer I finally got an M2 and just tested it; that is the case. Need to figure out if there is another way.
BTW @PromtEngineer: the current code checks for CUDA explicitly for full models, which makes it unusable for MPS.
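A rough sketch of what a device-agnostic load path for full models could look like; the model id is only a placeholder and this is not the project's actual loading code:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/vicuna-7B-1.1-HF"  # placeholder; any full (non-quantized) model
device = "mps"                          # or "cuda" / "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.to(device)  # .to() handles cuda, mps and cpu the same way
```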
@PromtEngineer did you find the cause of the error that @endolith mentioned earlier? Even though I have a conda environment, I still get an `AssertionError: Torch not compiled with CUDA enabled` when I run `python run_localGPT.py --device_type cpu`.
Yeah, same here: conda on Windows, CPU.
I did have the "Torch not compiled with CUDA enabled" error on my Windows 11 machine with an Nvidia RTX 4070, using conda. This solved my issue.
Do you already have an idea here?
@OssBozier @mindwellsolutions Are you trying to run it with CUDA, though? My GPU doesn't have enough memory, so I'm trying to run it without. Yours might count as a different bug? Needs that particular version added to requirements.txt? (Actually, I guess in my original comment I was trying to run it with CUDA, so maybe my second comment is what should be in a separate bug.)
I was just trying to get it to work at all. I am happy to run it with my GPU, as I have enough GPU memory. I had only read your first comment before responding here; I do see your desire to run the CPU option now in your second post. Not sure how to help with that one.
Just pushed a fix for it. Let me know if there is still the same issue.
For my M2, I get better performance with `--device_type mps`.
If I run it, I get an error. So then I manually downloaded the .bin files and tried to put them in the Hugging Face hub cache folder, but then I get another error, etc. So I think it's just not going to work. #186
What worked for me:
--check if CUDA is installed; should be visible in the top right corner
--check if torch is installed
--clean up
--go to the PyTorch site and get the proper command for your system and CUDA installation
--check again (see the sketch below)
--try to ingest
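The original commands weren't preserved. The "clean up" and reinstall steps are presumably `pip uninstall torch torchvision torchaudio` followed by the `pip install` line generated on pytorch.org for your CUDA version, and the "check" steps can be scripted roughly like this (all of this is an assumption, not the commenter's exact commands):

```python
import torch

# "check again": after reinstalling, these should report a CUDA-enabled build
print("torch version  :", torch.__version__)
print("built with CUDA:", torch.version.cuda)        # e.g. "11.8"; None means a CPU-only wheel
print("CUDA available :", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU            :", torch.cuda.get_device_name(0))
```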
I tried this and it worked, thanks for laying it out step by step! But now I have the next problem... I was able to ingest the document, but I couldn't run it because I ran out of GPU memory.
I tried changing the EMBEDDING_MODEL_NAME to instructor-base, but it seems it's still not small enough. I have an RTX 3070, so only 8 GB of VRAM, unfortunately... does anyone know which instructor model would work, or something else that might be done?
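For reference, the change described above would look roughly like this (the `hkunlp/...` ids are the usual Hugging Face names for the instructor models; the exact file and variable placement is assumed):

```python
# constants.py (exact location assumed)
# EMBEDDING_MODEL_NAME = "hkunlp/instructor-large"  # larger model, needs more VRAM
EMBEDDING_MODEL_NAME = "hkunlp/instructor-base"     # smaller footprint for 8 GB cards
```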
Hi @adjiap!
I'm rather new to the intricacies of machine learning models and embeddings, but I'm learning a few things here :) In the end I was able to run the project by using the base instructor (on my GPU) for ingest.py, but running run_localGPT.py on the CPU. This brings me to the next problem of having too little RAM, as Vicuna-7B puts a 30 GB load on my 32 GB of RAM (not GPU VRAM, btw). Though it works, the questions are really slow.

I haven't tried PromtEngineer's suggestion about setting the RAM there (as I'm not sure yet what the argument actually does), because Vicuna-7B is about 30 GB afaik, and if it's limited to something smaller, like 5 GB, it would probably not work as intended. A colleague of mine helped me with his machine with dual RTX 2080 Ti cards, 12 GB VRAM each, and he was able to run ingest.py and run_localGPT.py with no issue, though he did show me that while run_localGPT.py was running, both of his GPUs maintained a 9 GB load.

tl;dr: Vicuna-7B plus the large instructor doesn't work without at least 20 GB of VRAM in total. The embedding step (ingest.py) still works if I use the base instructor, but the actual model execution doesn't.
Thanks, this solved it for me.
There is a nice pip install command generator on the PyTorch website to ensure that you download torch with CUDA enabled... it takes care of different operating systems, versions of CUDA, etc., and generates the right pip install command and the correct torch version for you.
In my case, for some reason, I had to force-reinstall the torch packages with the no-cache option:
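The exact command wasn't preserved; presumably something like `pip install --force-reinstall --no-cache-dir torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118`, with the index URL taken from the pytorch.org selector for your CUDA version, but treat the specifics as an assumption.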
After trying this, ingest.py used my GPU, on Windows 11.
I have PyTorch with CUDA enabled:
This error message needs improvement. What is the actual problem?
requirements.txt needs to be updated to include the correct pytorch version?

OS Name: Microsoft Windows 10 Pro
Version: 10.0.19045 Build 19045