[mthreads] deepspeed llama2 #354
Conversation
…gOpen#346)
* [kunlunxin] fix tacotron2 running error and add 1x1 & 2x8 config
* [kunlunxin] modify tacotron2 test_config
* [kunlunxin] update tacotron2 readme
* [kunlunxin] modify tacotron2 torch.load()
* update iluvatar/swin_transformer-pytorch
* update
* update
* update
* fix batch size mistake in readme
* correct val_loss to final acc1
* add final_acc1 and mem in readme
* correct readme mem
---------
Co-authored-by: 魏杰 <[email protected]>
Co-authored-by: 杨智超 <[email protected]>
Co-authored-by: clveryang <[email protected]>
Co-authored-by: zhouyu <[email protected]>
force-pushed from 4bade8b to 9c7f506
Co-authored-by: sen.li <[email protected]>
* Update README.md
* Update README.md
* iluvatar bertlarge MLM inference case
* update ixrt readme
---------
Co-authored-by: 杨智超 <[email protected]>
force-pushed from b9abb42 to 9a02a7d
force-pushed from 9a02a7d to ad1500b
* support bert_hf fp32/amp/bf16 training for mthreads
* update readme
* prevent overrun
* 1x1/2x8 not support
* support resnet50 training on mthreads
* fix typo
* support rn50 amp training on mthreads
* add test config (should revert this commit)
* update config & readme
* add get_system_info fn
* update
* 1x1/2x8 not support
---------
Co-authored-by: Zhou Yu <[email protected]>
@@ -54,18 +54,19 @@ def get_argument_parser():

def train(model_engine, dataloader):
    model_engine.train()
    device = torch.device('musa:'+str(args.local_rank))
If it's changed like this, won't other vendors be unable to run it?
Fixed.
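A minimal sketch of the kind of vendor-neutral device selection the reviewer is asking for. The `get_device` helper, the vendor string, and the fallback order are assumptions for illustration, not the code actually merged:

```python
import torch

def get_device(vendor: str, local_rank: int) -> torch.device:
    # Hypothetical helper: derive the device type from a vendor/config
    # value instead of hard-coding 'musa', so other vendors still run.
    if vendor == "mthreads":
        import torch_musa  # noqa: F401 -- registers the 'musa' device type
        return torch.device(f"musa:{local_rank}")
    if torch.cuda.is_available():
        return torch.device(f"cuda:{local_rank}")
    return torch.device("cpu")
```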
@@ -6,5 +6,6 @@ def get_llama_model(model_config_dir, flashattn):
    config = LlamaConfig.from_pretrained(model_config_dir)
    config._flash_attn_2_enabled = flashattn
    model = LlamaForCausalLM(config)
    model.gradient_checkpointing_enable()
This needs to be controlled by a switch in the config file: off by default for NVIDIA, with each vendor toggling it themselves.
Added.
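One way the requested config-driven switch could look; `use_gradient_checkpointing` is a hypothetical parameter name, not necessarily what the PR added:

```python
from transformers import LlamaConfig, LlamaForCausalLM

def get_llama_model(model_config_dir, flashattn, use_gradient_checkpointing=False):
    config = LlamaConfig.from_pretrained(model_config_dir)
    config._flash_attn_2_enabled = flashattn
    model = LlamaForCausalLM(config)
    # Off by default (the NVIDIA baseline); vendors flip it in their config.
    if use_gradient_checkpointing:
        model.gradient_checkpointing_enable()
    return model
```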
* fixllama
* add t/tflops
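For context on the "add t/tflops" commit, a rough sketch of how tokens/s can be converted to achieved TFLOPS and compared against the `theoryflops` config value; the 6·params·tokens cost model and the example numbers are illustrative assumptions, not measurements from this PR:

```python
def achieved_tflops(tokens_per_second: float, n_params: float) -> float:
    # Common approximation: ~6 FLOPs per parameter per token
    # for one forward+backward training step.
    return tokens_per_second * 6.0 * n_params / 1e12

# Illustrative only: a 7B-parameter llama2 at 1,000 tokens/s.
tflops = achieved_tflops(1_000, 7e9)  # ~42 TFLOPS
mfu = tflops * 1e12 / 98e12           # vs. theoryflops from the config
```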
force-pushed from ad1500b to 5fae218
- ##### 优化策略 (optimization strategies)

- 无 (none)
Please note the optimization strategies used here, e.g.:
- flash attention (1/2/sdp-attn)
- checkpointing
Got it; will fill this in after testing.
Added.
datafilename = "openwebtext_llama2_100M.npy"
epochs = 1
theoryflops = 98000000000000.0
flashattn = True
flashattn = True  # using sdp attention
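A hedged illustration of what the suggested annotation implies: with sdp attention, the attention computation goes through PyTorch's fused scaled-dot-product kernel rather than a separate flash-attention package. The tensor shapes below are arbitrary examples:

```python
import torch
import torch.nn.functional as F

# (batch, heads, seq_len, head_dim) -- arbitrary example shapes
q = k = v = torch.randn(1, 8, 128, 64)

# PyTorch 2.x dispatches this to a fused (flash/memory-efficient) kernel
# when one is available for the current device and dtype.
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
```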
force-pushed from a1c98e3 to 8b18240
No description provided.