Cannot load exllamav2 models #44

Closed
MrMojoR opened this issue Mar 11, 2024 · 8 comments
Labels: bug (Something isn't working)


MrMojoR commented Mar 11, 2024

This happens with the two most recent consecutive nightly versions, and I have also built an image from the 2024-03-10 snapshot: https://github.com/oobabooga/text-generation-webui/releases/tag/snapshot-2024-03-10. The issue occurs with both of them.
This is the base-nvidia variant.
When I try to load an exllamav2 model, I receive this error message:

File "/app/modules/ui_model_menu.py", line 245, in load_model_wrapper shared.model, shared.tokenizer = load_model(selected_model, loader) File "/app/modules/models.py", line 87, in load_model output = load_func_map[loader](model_name) File "/app/modules/models.py", line 378, in ExLlamav2_HF_loader from modules.exllamav2_hf import Exllamav2HF File "/app/modules/exllamav2_hf.py", line 7, in from exllamav2 import ( File "/venv/lib/python3.10/site-packages/exllamav2/init.py", line 3, in from exllamav2.model import ExLlamaV2 File "/venv/lib/python3.10/site-packages/exllamav2/model.py", line 23, in from exllamav2.config import ExLlamaV2Config File "/venv/lib/python3.10/site-packages/exllamav2/config.py", line 2, in from exllamav2.fasttensors import STFile File "/venv/lib/python3.10/site-packages/exllamav2/fasttensors.py", line 5, in from exllamav2.ext import exllamav2_ext as ext_c File "/venv/lib/python3.10/site-packages/exllamav2/ext.py", line 15, in import exllamav2_ext ImportError: /venv/lib/python3.10/site-packages/exllamav2_ext.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3c107WarningC1ESt7variantIJNS0_11UserWarningENS0_18DeprecationWarningEEERKNS_14SourceLocationESsb

I built an image from the official repo as well, and that worked flawlessly.
I think the issue could be this step from the official repository:

```
conda install -y -c "nvidia/label/cuda-12.1.1" cuda-runtime
```

I couldn't find this step in the Dockerfile here.
Thanks for the help!

Atinoda (Owner) commented Mar 11, 2024

Thanks for reporting, and I appreciate your building the official repo to verify! I had a quick look and can replicate the issue.

My guess is that it may be a problem with the wheels for exllamav2. I will look into it further and see about building it from source in the image. The following commit may be the root of the issue: oobabooga/text-generation-webui@bde7f00.
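
For reference, a small sketch to print the wheel versions currently installed in the image, for comparison against what that commit pins (the distribution names here are assumptions; flash-attn in particular may be registered under a slightly different name):

```python
# List the installed versions of the packages most likely involved in the
# mismatch, so they can be compared against the versions pinned upstream.
from importlib.metadata import PackageNotFoundError, version

for pkg in ("torch", "exllamav2", "flash-attn"):
    try:
        print(pkg, version(pkg))
    except PackageNotFoundError:
        print(pkg, "not installed")
```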

Atinoda added the bug label on Mar 11, 2024
MrMojoR (Author) commented Mar 11, 2024

I don't think that is the issue; the HQQ loader did not work for me either. It had not been working for some time, but I assumed the original repo was at fault. Now I really wanted to upgrade to try out exllamav2 0.15, which has some great memory management improvements.

Atinoda (Owner) commented Mar 11, 2024

I will have to see when I have time to debug it properly. I do not think it is 'missing' the CUDA runtime - the step you suggested refers to setting up a conda environment, and this image uses venv. Have you successfully used the HQQ loader in the official image? If so, could you please point me to the model and settings you used? I will check that out as well when I'm looking at the exllamav2 issue in more detail.

MrMojoR (Author) commented Mar 11, 2024

Yes, I have used one successfully; this was the model: https://huggingface.co/mobiuslabsgmbh/Mixtral-8x7B-Instruct-v0.1-hf-attn-4bit-moe-2bit-HQQ
There is only one relevant setting: I used the PyTorch backend.

Atinoda (Owner) commented Mar 11, 2024

Thanks very much - I tried it out and got an error about flash attention:

```
/venv/lib/python3.10/site-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops15sum_IntList_out4callERKNS_6TensorEN3c1016OptionalArrayRefIlEEbSt8optionalINS5_10ScalarTypeEERS2_
```
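
The symbol can be demangled to see which libtorch function signature the wheel expects - a minimal sketch, assuming c++filt (from binutils) is available inside the image:

```python
# Demangle the undefined C++ symbol reported above. The demangled name shows
# the libtorch signature the flash-attn wheel was compiled against; if the
# installed torch exports a different signature, the import fails like this.
import subprocess

sym = ("_ZN2at4_ops15sum_IntList_out4callERKNS_6TensorEN3c10"
       "16OptionalArrayRefIlEEbSt8optionalINS5_10ScalarTypeEERS2_")
print(subprocess.run(["c++filt", sym], capture_output=True, text=True).stdout)
```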

MrMojoR (Author) commented Mar 11, 2024

This is again some C++ library error; I still suspect that we are somehow missing that CUDA runtime.

Atinoda (Owner) commented Mar 11, 2024

Thank you for the heads up - it was a library version mismatch, and thankfully a simple fix! New stable images are building and will be up in about an hour.
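
For anyone verifying the rebuilt image, a quick sanity-check sketch (the module names are taken from the two tracebacks in this thread) confirming that both compiled extensions now import cleanly:

```python
# Import the two compiled extensions that previously failed with undefined
# symbols; a clean import confirms the wheels now match the torch build.
for mod in ("exllamav2_ext", "flash_attn_2_cuda"):
    try:
        __import__(mod)
        print(mod, "imports OK")
    except ImportError as e:
        print(mod, "failed:", e)
```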

MrMojoR (Author) commented Mar 11, 2024 via email
