Cannot load exllamav2 models #44
Thanks for reporting, and I appreciate your building the official repo to verify! I had a quick look and can replicate the issue. I suspect it may be a problem with the wheels for exllamav2... will look into it further and see about building it from source in the image. The following commit may be the root of the issue: oobabooga/text-generation-webui@bde7f00.
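For reference, a minimal way to reproduce the import failure without going through the UI - a sketch only: the image name is an assumption, the `base-nvidia` tag comes from the report below, and the `/venv` interpreter path is taken from the traceback:

```sh
# Hypothetical repro: run the failing import directly in the container,
# bypassing the web UI. Image name/tag are placeholders - use whichever
# variant exhibits the problem.
docker run --rm --gpus all atinoda/text-generation-webui:base-nvidia \
  /venv/bin/python -c "import exllamav2"
```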
I don't think that is the issue; the HQQ loader did not work for me either. It had not been working for some time, but I thought the original repo was faulty. Now I really wanted to upgrade to try out exllamav2 0.15, which has some great memory management improvements.
I will have to see when I have time to debug it properly. I do not think it is 'missing' the CUDA runtime - the step you suggested refers to setting up a conda environment, which this image does not use. Have you previously loaded an HQQ model successfully?
Yes, I have used one successfully; this was the model: https://huggingface.co/mobiuslabsgmbh/Mixtral-8x7B-Instruct-v0.1-hf-attn-4bit-moe-2bit-HQQ
Thanks very much - I tried it out and got an error about flash attention:
This is again some C library error; I still suspect that we are somehow missing the CUDA runtime.
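One way to test the missing-runtime theory directly, sketched as a shell session inside the container (the extension path is taken from the traceback below):

```sh
# Is the CUDA runtime visible to the dynamic linker at all?
ldconfig -p | grep -i libcudart
# Which shared libraries does the failing extension link against,
# and do they all resolve?
ldd /venv/lib/python3.10/site-packages/exllamav2_ext.cpython-310-x86_64-linux-gnu.so
# If every library resolves but the import still dies on an undefined
# symbol, the cause is an ABI mismatch rather than a missing runtime.
```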
Thank you for the heads up - it was a library version mismatch, and thankfully a simple fix! New stable images are building and will be up in about an hour.
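For anyone hitting a similar error, a quick sketch of how to surface that kind of version mismatch (the interpreter path is assumed from the traceback below; the correct version pairing depends on the image):

```sh
# Print the torch build the image ships with...
/venv/bin/python -c "import torch; print(torch.__version__, torch.version.cuda)"
# ...and the exllamav2 wheel installed alongside it. A wheel compiled
# against a different torch release fails at import time with undefined
# C++ symbols like the one in this issue.
/venv/bin/pip show exllamav2 | grep -i '^version'
```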
Thank you very much for the quick fix!
This happens with the two most recent consecutive nightly versions, and I have also built an image from the 2024-03-10 snapshot version:
https://github.com/oobabooga/text-generation-webui/releases/tag/snapshot-2024-03-10 . The issue happens with both of them.
This is the base-nvidia version.
When I try to load an exllamav2 model, I receive this error message:
File "/app/modules/ui_model_menu.py", line 245, in load_model_wrapper shared.model, shared.tokenizer = load_model(selected_model, loader) File "/app/modules/models.py", line 87, in load_model output = load_func_map[loader](model_name) File "/app/modules/models.py", line 378, in ExLlamav2_HF_loader from modules.exllamav2_hf import Exllamav2HF File "/app/modules/exllamav2_hf.py", line 7, in from exllamav2 import ( File "/venv/lib/python3.10/site-packages/exllamav2/init.py", line 3, in from exllamav2.model import ExLlamaV2 File "/venv/lib/python3.10/site-packages/exllamav2/model.py", line 23, in from exllamav2.config import ExLlamaV2Config File "/venv/lib/python3.10/site-packages/exllamav2/config.py", line 2, in from exllamav2.fasttensors import STFile File "/venv/lib/python3.10/site-packages/exllamav2/fasttensors.py", line 5, in from exllamav2.ext import exllamav2_ext as ext_c File "/venv/lib/python3.10/site-packages/exllamav2/ext.py", line 15, in import exllamav2_ext ImportError: /venv/lib/python3.10/site-packages/exllamav2_ext.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3c107WarningC1ESt7variantIJNS0_11UserWarningENS0_18DeprecationWarningEEERKNS_14SourceLocationESsb
I built an image from the official repo as well, and that worked flawlessly.
I think the issue could be this step from the official repository:
```
conda install -y -c "nvidia/label/cuda-12.1.1" cuda-runtime
```
I couldn't find this step in the Dockerfile here.
Thanks for the help!