Extremely slow startup on 8 * A100, has anyone successfully launched it? #11

Closed
CarryChang opened this issue May 8, 2024 · 2 comments

Comments

@CarryChang

No description provided.


zwd003 commented May 8, 2024

We suggest launching with vLLM instead: vllm-project/vllm#4650
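
For anyone trying this route, a minimal sketch of an offline vLLM launch on an 8-GPU node; the checkpoint name, tensor_parallel_size, and max_model_len below are illustrative assumptions, not values from this thread:

from vllm import LLM, SamplingParams

# Shard the model across all 8 A100s with tensor parallelism.
llm = LLM(
    model="deepseek-ai/DeepSeek-V2",   # placeholder checkpoint name
    tensor_parallel_size=8,
    trust_remote_code=True,
    max_model_len=8192,                # illustrative context limit
)

params = SamplingParams(temperature=0.7, max_tokens=64)
outputs = llm.generate(["Hello, how are you?"], params)
print(outputs[0].outputs[0].text)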

@stack-heap-overflow
Contributor

The accelerate library's GPU memory allocation calculation in the HuggingFace example code has a problem. The example code has now been updated, which should significantly shorten model loading time.

Change the model-loading code to:

model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True, device_map="sequential", torch_dtype=torch.bfloat16, max_memory=max_memory, attn_implementation="eager")
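
For context, a minimal self-contained version of this loading code, assuming an 8-GPU node; the checkpoint name and the 75GiB per-GPU cap are illustrative placeholders, not values from this thread:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "deepseek-ai/DeepSeek-V2"   # placeholder checkpoint name

# Cap each of the 8 GPUs explicitly so accelerate does not spend time
# rebalancing the allocation; 75GiB is an illustrative value for 80GB A100s.
max_memory = {i: "75GiB" for i in range(8)}

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    trust_remote_code=True,
    device_map="sequential",        # fill GPUs in order instead of auto-balancing
    torch_dtype=torch.bfloat16,
    max_memory=max_memory,
    attn_implementation="eager",
)
model.eval()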

@soloice closed this as completed May 10, 2024