8-bit precision not working on Windows #20
bitsandbytes currently does not support Windows, but there are some workarounds.
Thanks, I managed to somehow make it work.
How did you manage to make it work? Can you share the method?
Basically you have to download these 2 DLL files from here. Then you move those files into …
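For reference, a minimal sketch of what this kind of workaround looked like, assuming the commonly circulated DLL fix for bitsandbytes builds of that era; the DLL names, the `downloads` folder, and the edit targets in `cuda_setup/main.py` are assumptions, not confirmed from this thread:

```python
# Sketch of the Windows DLL workaround (assumed details, adjust to your version).
import shutil
import site
from pathlib import Path

# Locate the installed bitsandbytes package.
bnb_dir = Path(site.getsitepackages()[0]) / "bitsandbytes"

# 1) Copy the two prebuilt DLLs (downloaded separately) into the package folder.
for dll in ("libbitsandbytes_cuda116.dll", "libbitsandbytes_cpu.dll"):
    shutil.copy(Path("downloads") / dll, bnb_dir / dll)  # "downloads" is an example path

# 2) The circulated instructions then had you hand-edit
#    bitsandbytes/cuda_setup/main.py, roughly:
#      ct.cdll.LoadLibrary(binary_path)  ->  ct.cdll.LoadLibrary(str(binary_path))
#      "libbitsandbytes_cpu.so"          ->  "libbitsandbytes_cuda116.dll"
print(f"DLLs copied to {bnb_dir}; now edit {bnb_dir / 'cuda_setup' / 'main.py'}")
```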
@minipasila THANK YOU SO MUCH! Your instructions plus the prebuilt bitsandbytes for older GPUs (https://github.com/james-things/bitsandbytes-prebuilt-all_arch) are helping me run Pygmalion 2.7B on my GTX 1060 6GB, and it's taking only 3.8 GB VRAM (of which probably 0.4 is being used by the system, as I don't have integrated graphics).

EDIT: I celebrated too early, it gives me a cuBLAS error on trying to generate lol
@minipasila Thank you for sharing, your instructions worked perfectly for GPT-J-6B on a 3070 Ti.
For future reference, the 8-bit Windows fix required me to navigate to my Python310 install folder instead of the env, as bitsandbytes was not installed in the conda env.
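A quick way to check which environment bitsandbytes is actually installed in, so the DLL fix lands in the right folder:

```python
# Print where bitsandbytes is actually installed (conda env vs. a system-wide
# Python310 install), so the fix is applied to the right copy.
import os
import bitsandbytes

print(os.path.dirname(bitsandbytes.__file__))
```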
For anybody still having trouble, you can try using a newer library: https://github.com/james-things/bitsandbytes-prebuilt-all_arch
I still have the same issue. I tried everything linked except the v37 fix; I downloaded the DLL and put it in the bitsandbytes folder. What next?
Just change …
I followed the suggestions above, but when I try to run it with 8-bit precision I get an error window pop-up: "Bad Image".

EDIT: Okay, this is weird. I copied the DLL from my stable diffusion bitsandbytes folder and it seems to work now.
I got the same error before. |
When attempting to generate with 8-bit using the new libraries suggested by VertexMachine, I get this error:

Loading llama-7b-hf...
How do you find the stable diffusion bitsandbytes folder? |
It's in this folder: stable-diffusion-webui\venv\Lib\site-packages\bitsandbytes
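For reference, a minimal sketch of that copy step; both install locations are examples and depend entirely on where the two apps live on your machine:

```python
# Copy the working DLLs from the stable-diffusion-webui venv into the
# text-generation-webui environment (example paths, adjust to your setup).
import shutil
from pathlib import Path

src = Path(r"C:\stable-diffusion-webui\venv\Lib\site-packages\bitsandbytes")
dst = Path(r"C:\text-generation-webui\installer_files\env\Lib\site-packages\bitsandbytes")

for dll in src.glob("*.dll"):
    shutil.copy(dll, dst / dll.name)  # overwrite the broken or missing DLLs
    print("copied", dll.name)
```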
8-bit should work out of the box with the new one-click installer: https://github.com/oobabooga/text-generation-webui#one-click-installers
This seems to be solving the issue for me; still working on it.
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 160.00 MiB. GPU 0 has a total capacity of 16.00 GiB of which 0 bytes is free. Of the allocated memory 15.06 GiB is allocated by PyTorch, and 54.28 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
It means you ran out of memory. Try 4-bit precision, another quantization method like GPTQ/EXL2/GGUF, or a smaller model. This error is unrelated to this issue, though. (Sorry for another notification.)
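For reference, a hedged sketch of the 4-bit suggestion using the transformers BitsAndBytesConfig API; the model name is just an example, and in text-generation-webui the same thing is exposed as a loader option rather than code:

```python
import os

# Optional: reduce fragmentation as the error message itself suggests
# (must be set before torch initializes CUDA; requires a recent PyTorch).
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-7b",                                  # example model
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
)
```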
Is it not compatible? Even slower would be faster than llama...
RuntimeError: CUDA error: no kernel image is available for execution on the device
OK, this is a different error. You should probably give a bit more information about your issue, like how you installed textgen and what your setup is. Potentially just make a new issue, because this is probably a different problem than what I originally opened this issue for.
Loading directly with exllamav2, also on CUDA 12.4, but it does not support the M40 GPU's Maxwell architecture with compute capability 5.2.
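A quick check that lines up with the error above; on a Tesla M40 this reports compute capability 5.2 (Maxwell), which the thread indicates exllamav2 kernel builds do not support:

```python
# Print the GPU's compute capability; "no kernel image is available" means
# the installed kernels were not built for this architecture.
import torch

major, minor = torch.cuda.get_device_capability(0)
print(f"compute capability: {major}.{minor}")  # Tesla M40 -> 5.2
print(torch.cuda.get_device_name(0))
```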
Running on local URL: http://127.0.0.1:7860/
01:55:11-441029 INFO Loading "14b-exl"
01:55:54-858096 INFO Loading "14b-exl"
...
File "D:\text-generation-webui\installer_files\env\Lib\site-packages\exllamav2\fasttensors.py", line 204, in get_tensor
It seems like it just doesn't want to work on Windows and is unable to detect my CUDA installation.
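For anyone hitting the detection problem, a minimal sanity check of whether PyTorch can see CUDA at all; if it prints False/None, the issue is the environment (CPU-only wheel or driver), not the model:

```python
import torch

print(torch.cuda.is_available())   # False -> CPU-only build or driver problem
print(torch.version.cuda)          # None on CPU-only builds
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```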