
[User Feedback] Questions about validating DeepLink with the Llama2-7b model #1152

Open
MiaoYYu opened this issue Apr 23, 2024 · 1 comment

MiaoYYu (Contributor) commented Apr 23, 2024

A user ran into problems while validating DeepLink with the Llama2-7b model; their original message follows.
--------------------- Original email ---------------------
I am currently validating DeepLink with the Llama2-7b model and have run into two problems I could use help with:

  1. Training Llama2-7b with DeepLink on Ascend, to verify that the same code can train on both NVIDIA and Ascend without modification. I am using the llama2-chinese scripts, but while setting up the Llama2 environment I found that flash_attn is tightly coupled to CUDA and cannot be installed on Ascend. I suspect that validating DeepLink requires specific scripts, but I could not find any among the evaluated models on your GitHub. Would you be able to provide them? Many thanks!

  2. When training Llama2-7b with DeepLink on NVIDIA, I hit an IndexError: map::at. Earlier debugging traced it to the ':7' in device='cuda:7' not existing, but I saw today that that issue has been fixed. After updating deeplink, the error below still occurs. Have you encountered this, and is there a temporary workaround?

[screenshot: IndexError: map::at traceback]

MiaoYYu added the ascend label Apr 23, 2024
yangbofun (Collaborator) commented Apr 23, 2024

Regarding the Ascend questions:

  1. When using deeplink.framework on Ascend, you can train large models with flash-attention's Python-level composite operators. In that case there is no need to compile and install flash-attention; using flash-attention's Python code is enough (see the sketch after this list).
  2. If you want the large-model operators (e.g. rotary embedding, flash attention, rms_norm) to call dedicated kernel implementations rather than being composed from PyTorch ops, you can refer to how our deeplinkext integrates with internevo. May I ask which framework you are using for large-model training (e.g. deepspeed)?
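For reference, here is a minimal sketch of what a Python-level composite attention op can look like: the same math that flash-attention fuses into one CUDA kernel, expressed with plain PyTorch ops so it runs on any backend without compiling the flash_attn extension. `composite_attention` and the tensor layout are illustrative assumptions, not DeepLink's or flash-attention's actual API:

```python
# Minimal sketch (not DeepLink's actual code) of a "Python-level composite"
# attention: plain PyTorch ops instead of the fused flash_attn CUDA kernel,
# so it works on backends where flash_attn cannot be installed.
import math
import torch

def composite_attention(q, k, v, causal=True):
    """q, k, v: (batch, heads, seq_len, head_dim). Hypothetical helper."""
    # Scaled dot-product scores: (batch, heads, seq_len, seq_len).
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if causal:
        # Mask out positions above the diagonal for causal LM training.
        seq = q.size(-2)
        mask = torch.triu(
            torch.ones(seq, seq, dtype=torch.bool, device=q.device), diagonal=1
        )
        scores = scores.masked_fill(mask, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v

# Usage: swap this in where the training script would call flash_attn.
q = k = v = torch.randn(2, 8, 128, 64)
out = composite_attention(q, k, v)  # (2, 8, 128, 64)
```

The trade-off versus the fused kernel is that the full score matrix is materialized, so memory grows as O(seq_len²); numerically the result is the same attention.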
