
[BUG] The LLM service cannot correctly load a LoRA-merged model; error: EOFError: EOF when reading a line #1193

Closed
MyGitHubPigStar opened this issue Aug 22, 2023 · 5 comments
Labels: bug (Something isn't working), stale

Comments

@MyGitHubPigStar

Problem Description
Using LLaMA-Efficient-Tuning v0.1.7, I fine-tuned with LoRA and merged the adapter into the base model; the LLM service cannot be started with the merged model.

Steps to Reproduce
1. Exported the merged model from LLaMA-Efficient-Tuning and verified that it loads and runs correctly there.
2. Moved the complete model folder into the Langchain-Chatchat project.
3. Edited model_config.py and server_config.py as follows (a connectivity sanity check is sketched after this list):

  • model_config.py

```python
"chatglm2-6b": {
    "local_model_path": "/home/ubuntu/workspace/Langchain-Chatchat/model/08-22-01",
    "api_base_url": "http://localhost:8888/v1",  # URL must match FSCHAT_OPENAI_API in server_config.py where the fastchat server runs
    "api_key": "EMPTY"
}

# LLM name
LLM_MODEL = "chatglm2-6b"
```

  • server_config.py

```python
# fastchat openai_api server
FSCHAT_OPENAI_API = {
    "host": DEFAULT_BIND_HOST,
    "port": 8888,  # api_base_url in model_config.llm_model_dict must match this
}
```

4. Ran `python server/llm_api.py`.
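If the service does come up, a quick way to confirm that the two config files agree is to query the OpenAI-compatible endpoint that the fastchat server exposes. This is a minimal sketch, assuming the `requests` package is installed; the URL comes from the configs above:

```python
import requests

# FSCHAT_OPENAI_API serves an OpenAI-compatible API, so /v1/models
# should list the loaded model if the host/port pairing is correct.
resp = requests.get("http://localhost:8888/v1/models")
print(resp.status_code, resp.json())
```

A connection error or non-200 response here points at a host/port mismatch between model_config.py and server_config.py.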


The merged model's config.json:

```json
{
  "_name_or_path": "THUDM/chatglm2-6b",
  "add_bias_linear": false,
  "add_qkv_bias": true,
  "apply_query_key_layer_scaling": true,
  "apply_residual_connection_post_layernorm": false,
  "architectures": [
    "ChatGLMForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "attention_softmax_in_fp32": true,
  "auto_map": {
    "AutoConfig": "configuration_chatglm.ChatGLMConfig",
    "AutoModel": "modeling_chatglm.ChatGLMForConditionalGeneration",
    "AutoModelForCausalLM": "modeling_chatglm.ChatGLMForConditionalGeneration",
    "AutoModelForSeq2SeqLM": "modeling_chatglm.ChatGLMForConditionalGeneration"
  },
  "bias_dropout_fusion": true,
  "eos_token_id": 2,
  "ffn_hidden_size": 13696,
  "fp32_residual_connection": false,
  "hidden_dropout": 0.0,
  "hidden_size": 4096,
  "kv_channels": 128,
  "layernorm_epsilon": 1e-05,
  "model_type": "chatglm",
  "multi_query_attention": true,
  "multi_query_group_num": 2,
  "num_attention_heads": 32,
  "num_layers": 28,
  "original_rope": true,
  "pad_token_id": 0,
  "padded_vocab_size": 65024,
  "post_layer_norm": true,
  "pre_seq_len": null,
  "prefix_projection": false,
  "quantization_bit": 0,
  "rmsnorm": true,
  "seq_length": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "float16",
  "transformers_version": "4.31.0",
  "use_cache": true,
  "vocab_size": 65024
}
```

Expected Result
The model loads correctly and the LLM service starts successfully.

Actual Result
Starting the LLM service with `python server/llm_api.py` fails with the error below.

Log text:

```
(langchain) root@ubuntu-MS-7D94:/home/ubuntu/workspace/Langchain-Chatchat# python server/llm_api.py
2023-08-22 13:52:52,415 - llm_api.py[line:231] - INFO: {'local_model_path': '/home/ubuntu/workspace/Langchain-Chatchat/model/08-22-01', 'api_base_url': 'http://localhost:8888/v1', 'api_key': 'EMPTY'}
2023-08-22 13:52:52,415 - llm_api.py[line:234] - INFO: 如需查看 llm_api 日志,请前往 /home/ubuntu/workspace/Langchain-Chatchat/logs
2023-08-22 13:52:53 | ERROR | stderr | INFO: Started server process [267192]
2023-08-22 13:52:53 | ERROR | stderr | INFO: Waiting for application startup.
2023-08-22 13:52:53,469 - instantiator.py[line:21] - INFO: Created a temporary directory at /tmp/tmpg5b6wsd5
2023-08-22 13:52:53,469 - instantiator.py[line:76] - INFO: Writing /tmp/tmpg5b6wsd5/_remote_module_non_scriptable.py
2023-08-22 13:52:53 | ERROR | stderr | INFO: Application startup complete.
2023-08-22 13:52:53 | ERROR | stderr | INFO: Uvicorn running on http://0.0.0.0:20001 (Press CTRL+C to quit)
2023-08-22 13:52:53 | INFO | model_worker | Loading the model ['chatglm2-6b'] on worker 96a3e133 ...
2023-08-22 13:52:53 | INFO | stdout | Loading /home/ubuntu/workspace/Langchain-Chatchat/model/08-22-01 requires to execute some code in that repo, you can inspect the content of the repository at https://hf.co//home/ubuntu/workspace/Langchain-Chatchat/model/08-22-01. You can dismiss this prompt by passing trust_remote_code=True.
2023-08-22 13:52:53 | INFO | stdout | Do you accept? [y/N]
2023-08-22 13:52:53 | ERROR | stderr | Process model_worker(267158):
2023-08-22 13:52:53 | ERROR | stderr | Traceback (most recent call last):
2023-08-22 13:52:53 | ERROR | stderr | File "/root/anaconda3/envs/langchain/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
2023-08-22 13:52:53 | ERROR | stderr | self.run()
2023-08-22 13:52:53 | ERROR | stderr | File "/root/anaconda3/envs/langchain/lib/python3.10/multiprocessing/process.py", line 108, in run
2023-08-22 13:52:53 | ERROR | stderr | self._target(*self._args, **self._kwargs)
2023-08-22 13:52:53 | ERROR | stderr | File "/home/ubuntu/workspace/Langchain-Chatchat/server/llm_api.py", line 194, in run_model_worker
2023-08-22 13:52:53 | ERROR | stderr | app = create_model_worker_app(*args, **kwargs)
2023-08-22 13:52:53 | ERROR | stderr | File "/home/ubuntu/workspace/Langchain-Chatchat/server/llm_api.py", line 128, in create_model_worker_app
2023-08-22 13:52:53 | ERROR | stderr | worker = ModelWorker(
2023-08-22 13:52:53 | ERROR | stderr | File "/root/anaconda3/envs/langchain/lib/python3.10/site-packages/fastchat/serve/model_worker.py", line 207, in __init__
2023-08-22 13:52:53 | ERROR | stderr | self.model, self.tokenizer = load_model(
2023-08-22 13:52:53 | ERROR | stderr | File "/root/anaconda3/envs/langchain/lib/python3.10/site-packages/fastchat/model/model_adapter.py", line 268, in load_model
2023-08-22 13:52:53 | ERROR | stderr | model, tokenizer = adapter.load_model(model_path, kwargs)
2023-08-22 13:52:53 | ERROR | stderr | File "/root/anaconda3/envs/langchain/lib/python3.10/site-packages/fastchat/model/model_adapter.py", line 72, in load_model
2023-08-22 13:52:53 | ERROR | stderr | model = AutoModelForCausalLM.from_pretrained(
2023-08-22 13:52:53 | ERROR | stderr | File "/root/anaconda3/envs/langchain/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 461, in from_pretrained
2023-08-22 13:52:53 | ERROR | stderr | config, kwargs = AutoConfig.from_pretrained(
2023-08-22 13:52:53 | ERROR | stderr | File "/root/anaconda3/envs/langchain/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 986, in from_pretrained
2023-08-22 13:52:53 | ERROR | stderr | trust_remote_code = resolve_trust_remote_code(
2023-08-22 13:52:53 | ERROR | stderr | File "/root/anaconda3/envs/langchain/lib/python3.10/site-packages/transformers/dynamic_module_utils.py", line 538, in resolve_trust_remote_code
2023-08-22 13:52:53 | ERROR | stderr | answer = input(
2023-08-22 13:52:53 | ERROR | stderr | EOFError: EOF when reading a line
INFO: Started server process [267194]
INFO: Waiting for application startup.
```
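The traceback explains the failure: `from_pretrained` is reached without `trust_remote_code` set, so transformers falls back to the interactive "Do you accept? [y/N]" prompt via `input()`, and the spawned model_worker process has no stdin to answer it, hence `EOFError: EOF when reading a line`. The log itself names the workaround. As a minimal sketch (outside Langchain-Chatchat, using the model path from the config above), loading the merged model directly would look like this:

```python
from transformers import AutoModel, AutoTokenizer

model_path = "/home/ubuntu/workspace/Langchain-Chatchat/model/08-22-01"

# trust_remote_code=True suppresses the interactive prompt that the
# non-interactive worker process cannot answer; ChatGLM2 ships its own
# modeling code (see auto_map in config.json above), so the flag is needed.
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModel.from_pretrained(model_path, trust_remote_code=True)
```

Whether the fastchat worker forwards this flag depends on the fastchat and Langchain-Chatchat versions in use, which is presumably why the version upgrade discussed below resolves the issue.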

Environment Information

  • Langchain-Chatchat version: v0.2.1
  • LLaMA-Efficient-Tuning version: v0.1.7
  • Deployed with Docker: No
  • Model: ChatGLM-6B
  • Embedding model: m3e-base
  • Vector store: faiss
  • OS and version: Ubuntu 20.04
  • Python version: 3.10.12
@MyGitHubPigStar (Author)

Starting the service via PEFT already has a solution, see #1130 (sketched below for reference).
But the problem of the merged model failing to start is still unresolved.
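For context, the PEFT route referenced above loads the base model and attaches the LoRA adapter at runtime instead of merging. A minimal sketch, assuming `peft` is installed; the adapter path is hypothetical and this is an illustration, not the exact fix from #1130:

```python
from transformers import AutoModel
from peft import PeftModel

# Load the unmodified base model, then stack the LoRA adapter on top.
base = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
model = PeftModel.from_pretrained(base, "/path/to/lora-adapter")  # hypothetical adapter dir
```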

@github-actions

This issue has been marked as stale because it has had no activity for more than 30 days.

@hydai99 commented Oct 8, 2023

Is there a solution for this?

@MyGitHubPigStar (Author)

> Is there a solution for this?

Upgrading langchain to version 0.2.0 fixed it.

@hydai99 commented Oct 11, 2023

> > Is there a solution for this?
>
> Upgrading langchain to version 0.2.0 fixed it.

Isn't the latest langchain release only 0.0.312 at the moment? I updated to 0.0.312 and still get the error.
