Failed to load 4-bits weights from HuggingFace #51
Comments
hi, try to set use_safetensors=False
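(A minimal sketch of that change, using the wrapper class from the example code quoted in the reply below; the flag makes auto_gptq look for a .bin/.pt checkpoint instead of a .safetensors one:)

# Sketch of the suggested change: pass use_safetensors=False so auto_gptq
# searches for a .bin/.pt checkpoint rather than a .safetensors file.
model = InternLMXComposerQForCausalLM.from_quantized(
    "internlm/internlm-xcomposer-7b-4bit",
    trust_remote_code=True,
    device="cuda:0",
    use_safetensors=False,
)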
Thanks for the fast reply. I still have a similar error, though. It was able to download the .bin file but then failed to load it, I guess. Any idea what the problem is? Or any recommendation on which GPTQ version to use?

Environment
Python 3.9.18
accelerate==0.24.1
aiofiles==23.2.1
aiohttp==3.8.6
aiosignal==1.3.1
altair==5.1.2
annotated-types==0.6.0
anyio==3.7.1
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.3.0
asttokens==2.4.1
async-lru==2.0.4
async-timeout==4.0.3
attrs==23.1.0
auto-gptq==0.5.0
Babel==2.13.1
beautifulsoup4==4.12.2
bleach==6.1.0
certifi==2023.7.22
cffi==1.16.0
charset-normalizer==3.3.2
click==8.1.7
comm==0.1.4
contourpy==1.2.0
cycler==0.12.1
datasets==2.14.6
debugpy==1.8.0
decorator==5.1.1
defusedxml==0.7.1
dill==0.3.7
einops==0.7.0
exceptiongroup==1.1.3
executing==2.0.1
fastapi==0.104.1
fastjsonschema==2.18.1
ffmpy==0.3.1
filelock==3.13.1
fonttools==4.44.0
fqdn==1.5.1
frozenlist==1.4.0
fsspec==2023.10.0
gekko==1.0.6
gradio==3.44.4
gradio_client==0.5.1
h11==0.14.0
httpcore==1.0.1
httpx==0.25.1
huggingface-hub==0.17.3
idna==3.4
importlib-metadata==6.8.0
importlib-resources==6.1.0
ipykernel==6.26.0
ipython==8.17.2
ipywidgets==8.1.1
isoduration==20.11.0
jedi==0.19.1
Jinja2==3.1.2
json5==0.9.14
jsonpointer==2.4
jsonschema==4.19.2
jsonschema-specifications==2023.7.1
jupyter==1.0.0
jupyter-console==6.6.3
jupyter-events==0.8.0
jupyter-lsp==2.2.0
jupyter_client==8.5.0
jupyter_core==5.5.0
jupyter_server==2.9.1
jupyter_server_terminals==0.4.4
jupyterlab==4.0.8
jupyterlab-pygments==0.2.2
jupyterlab-widgets==3.0.9
jupyterlab_server==2.25.0
kiwisolver==1.4.5
markdown2==2.4.10
MarkupSafe==2.1.3
matplotlib==3.8.1
matplotlib-inline==0.1.6
mistune==3.0.2
mpmath==1.3.0
multidict==6.0.4
multiprocess==0.70.15
nbclient==0.8.0
nbconvert==7.10.0
nbformat==5.9.2
nest-asyncio==1.5.8
networkx==3.2.1
notebook==7.0.6
notebook_shim==0.2.3
numpy==1.26.1
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu12==2.18.1
nvidia-nvjitlink-cu12==12.3.52
nvidia-nvtx-cu12==12.1.105
orjson==3.9.10
overrides==7.4.0
packaging==23.2
pandas==2.1.2
pandocfilters==1.5.0
parso==0.8.3
peft==0.6.0
pexpect==4.8.0
Pillow==10.1.0
platformdirs==3.11.0
polars==0.19.12
prometheus-client==0.18.0
prompt-toolkit==3.0.39
psutil==5.9.6
ptyprocess==0.7.0
pure-eval==0.2.2
pyarrow==14.0.0
pycparser==2.21
pydantic==2.4.2
pydantic_core==2.10.1
pydub==0.25.1
Pygments==2.16.1
pyparsing==3.1.1
python-dateutil==2.8.2
python-json-logger==2.0.7
python-multipart==0.0.6
pytz==2023.3.post1
PyYAML==6.0.1
pyzmq==25.1.1
qtconsole==5.5.0
QtPy==2.4.1
referencing==0.30.2
regex==2023.10.3
requests==2.31.0
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rouge==1.0.1
rpds-py==0.12.0
safetensors==0.4.0
semantic-version==2.10.0
Send2Trash==1.8.2
sentencepiece==0.1.99
six==1.16.0
sniffio==1.3.0
soupsieve==2.5
stack-data==0.6.3
starlette==0.27.0
sympy==1.12
terminado==0.17.1
timm==0.4.12
tinycss2==1.2.1
tokenizers==0.13.3
tomli==2.0.1
toolz==0.12.0
torch==2.1.0
torchvision==0.16.0
tornado==6.3.3
tqdm==4.66.1
traitlets==5.13.0
transformers==4.33.1
triton==2.1.0
types-python-dateutil==2.8.19.14
typing_extensions==4.8.0
tzdata==2023.3
uri-template==1.3.0
urllib3==2.0.7
uvicorn==0.24.0.post1
wcwidth==0.2.9
webcolors==1.13
webencodings==0.5.1
websocket-client==1.6.4
websockets==11.0.3
widgetsnbextension==4.0.9
XlsxWriter==3.1.2
xxhash==3.4.1
yarl==1.9.2
zipp==3.17.0

Code
I added use_safetensors=False:
import torch
from transformers import AutoModel, AutoTokenizer
import auto_gptq
from auto_gptq.modeling import BaseGPTQForCausalLM
# Register the custom architecture so auto_gptq accepts it
auto_gptq.modeling._base.SUPPORTED_MODELS = ["InternLMXComposer"]
torch.set_grad_enabled(False)
class InternLMXComposerQForCausalLM(BaseGPTQForCausalLM):
    layers_block_name = "internlm_model.model.layers"
    outside_layer_modules = [
        "query_tokens",
        "flag_image_start",
        "flag_image_end",
        "visual_encoder",
        "Qformer",
        "internlm_model.model.embed_tokens",
        "internlm_model.model.norm",
        "internlm_proj",
        "internlm_model.lm_head",
    ]
    inside_layer_modules = [
        ["self_attn.k_proj", "self_attn.v_proj", "self_attn.q_proj"],
        ["self_attn.o_proj"],
        ["mlp.gate_proj"],
        ["mlp.up_proj"],
        ["mlp.down_proj"],
    ]

# init model and tokenizer
model = InternLMXComposerQForCausalLM.from_quantized(
    "internlm/internlm-xcomposer-7b-4bit",
    trust_remote_code=True,
    device="cuda:0",
    use_safetensors=False,
)
model = model.eval()
tokenizer = AutoTokenizer.from_pretrained(
    "internlm/internlm-xcomposer-7b-4bit", trust_remote_code=True
)
model.model.tokenizer = tokenizer
# example image
image = "examples/images/aiyinsitan.jpg"
# Multi-Turn Text-Image Dialogue
# 1st turn
text = 'Describe this image in detail.'
image = "./norway.png"
response, history = model.chat(text, image)
print(f"User: {text}")
print(f"Bot: {response}")
# The image features a black and white portrait of Albert Einstein, the famous physicist and mathematician.
# Einstein is seated in the center of the frame, looking directly at the camera with a serious expression on his face.
# He is dressed in a suit, which adds a touch of professionalism to his appearance.

Error
Downloading (…)_model-4bit-128g.bin: 100%|█████████████████████████████████████████████████████| 7.25G/7.25G [03:20<00:00, 36.1MB/s]
Traceback (most recent call last):
File "/mnt/bd/dev-pierre-oreistein-st/sandbox/test_internlm_vl/test_internlm_vl_4bits", line 35, in <module>
File "/home/pierre/.pyenv/versions/dev3.9/lib/python3.9/site-packages/auto_gptq/modeling/_base.py", line 847, in from_quantized
raise FileNotFoundError(f"Could not find a model in {model_name_or_path} with a name in {', '.join(searched_files)}. Please specify the argument model_basename to use a custom file name.")
FileNotFoundError: Could not find a model in internlm/internlm-xcomposer-7b-4bit with a name in gptq_model-4bit-128g.bin, gptq_model-4bit-128g.pt, model.pt. Please specify the argument model_basename to use a custom file name.
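For what it's worth, auto_gptq's from_quantized does expose a model_basename argument for exactly this case. A minimal sketch, assuming a hypothetical file name (the "(…)" in the download log above hides the real prefix, so the actual basename would have to be read off the Hub file list):

# model_basename is the checkpoint file name without its extension.
# "some_model-4bit-128g" is a placeholder, not the real name.
model = InternLMXComposerQForCausalLM.from_quantized(
    "internlm/internlm-xcomposer-7b-4bit",
    trust_remote_code=True,
    device="cuda:0",
    use_safetensors=False,
    model_basename="some_model-4bit-128g",
)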
It seems the downloaded files are not found. Please try downloading https://huggingface.co/internlm/internlm-xcomposer-7b-4bit/tree/main to a local path, and change "internlm/internlm-xcomposer-7b-4bit" in the example code to that local path.
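A sketch of that workaround using huggingface_hub (already in the environment listed above; the local directory name is arbitrary):

from huggingface_hub import snapshot_download

# Download the full repository to a local directory.
local_path = snapshot_download(
    "internlm/internlm-xcomposer-7b-4bit",
    local_dir="./internlm-xcomposer-7b-4bit",
)

# Then load from the local copy instead of the Hub id.
model = InternLMXComposerQForCausalLM.from_quantized(
    local_path,
    trust_remote_code=True,
    device="cuda:0",
    use_safetensors=False,
)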
Description
Unable to load the quantized weights (4-bit) from HuggingFace.
Code
The code is a direct copy of the file examples/example_chat_4bit_en.py.
Error
Ideas
According to this similar issue, I need to specify the model file. However, I was unable to find it on HuggingFace. Could you help me with this?
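(One way to find the actual checkpoint file name is to list the repository contents with huggingface_hub; a minimal sketch:)

from huggingface_hub import list_repo_files

# Print every file in the Hub repo so the quantized checkpoint's
# name (and hence the model_basename to pass) can be read off directly.
for f in list_repo_files("internlm/internlm-xcomposer-7b-4bit"):
    print(f)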
Thanks in advance for your help!