Kunlunxin add stable diffusion v 1_4 case #227

Merged
8 commits merged into FlagOpen:main on Sep 25, 2023

Conversation

zjmoo123
Contributor

No description provided.

@GuangYang1573

@shh2000
Collaborator

shh2000 commented Aug 31, 2023

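Please add to the readme the kunlunxin + xtcl configuration and versions used to run this case, along with the results from the log. Among the log values, acc is required; other performance data may be left blank if you prefer not to publish it.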

@zjmoo123
Contributor Author

run20230831071022.tgz
yolov5 fp32 log

@@ -56,5 +56,6 @@
| ----------- | --------- | ---- | ---- | -------- | ----------- | ---------- | ------------- | ------------ | ----------- | ----------- |
| tensorrt | fp16 | 2 | 1674.9 | 11.4 | 45.2 | 10.6 | 60.6 | 13.2% | 17.1/25.2 | 13.3/40.0 |
| tensorrt | fp32 | 2 | 1807.4 | 8.2 | 20.6 | 7.2 | 16.1 | 7.0% | 25.2/25.3 | 39.2/40.0 |
| kunlunxin_xtcl | fp32 | 2 | / | / | / | / | / | / | 26.7/25.3 | -/40.0 |
Collaborator

Kunlun 40.0G?

@shh2000 shh2000 merged commit eb86795 into FlagOpen:main Sep 25, 2023
yuzhou03 added a commit to yuzhou03/FlagPerf that referenced this pull request Nov 14, 2023
* bert: bugfix for 1x1 training (FlagOpen#160)

Co-authored-by: zhouyu <[email protected]>

* add Efficientnet on xpu (FlagOpen#155)

* init

* add efficientnet

* modify config

* modify config

* modify config

* add efficientnet

* modify config

* add efficientnet

* bug fix

* add efficientnet

* add efficientnet

* fix code style

* fix code style

* fix code style

* Revert "fix code style"

This reverts commit ae86109.

* fix code style

* fix code style

* fix code style

* fix code style

* fix code style

* add stardard case readme

* fix code style

* add efficientnet xpu case

* add efficientnet xpu case

---------

Co-authored-by: Feilei Du <[email protected]>

* refine mobilenetv2 (FlagOpen#153)

* refine retinanet

* update case readme

* upadte case readme for bs=512

* remove 1x4 config

---------

Co-authored-by: zhouyu <[email protected]>

* bert: update case readme (FlagOpen#161)

* bert: update case readme

* remove mlm_acc

---------

Co-authored-by: zhouyu <[email protected]>

* add kunlunxin resnet50 1x1 config (FlagOpen#164)

* add kunlunxin resnet50 2x8 config (FlagOpen#166)

* transformer model, fix No module named 'fairseq.data.batch_C' (FlagOpen#163)

Co-authored-by: chenrui22 <[email protected]>

* Iluvatar update bert conf (FlagOpen#165)

* update iluvatar bert config

* update iluvatar bert README

* add iluvatar bert 1x1 2x8 conf

* update iluvatar bert README

* add faster_rcnn for kunlunxin (FlagOpen#167)

* fix bug for iluvatar fast_rcnn 1x1 conf (FlagOpen#169)

* fix bug for iluvatar fast_rcnn 1x1 conf

* adjust iluvatar fast_rcnn 1x1 batch size

* Update trainer_adapter.py (FlagOpen#171)

Update Kunlunxin bert trainer_adapter.py to fix time collecting bug under 1x1 scenario.

* refine retinanet (FlagOpen#157)

* refine retina

* fix create_model

---------

Co-authored-by: zhouyu <[email protected]>

* add efficientnet for iluvatar (FlagOpen#170)

* add efficientnet for iluvatar

* update

* add Iluvatar retinanet case. (FlagOpen#173)

* add iluvatar retinanet case

* update README

* update iluvatar retinanet config and README

---------

Co-authored-by: uuup <[email protected]>

* Iluvatar transformer (FlagOpen#174)

* add iluvatar transformer

* update

* add paddle Bert kunlunxin case (FlagOpen#172)

* add config

* update

* update

* update

* update

* fix

* add

* fix

* Update README.md

---------

Co-authored-by: WZD09 <[email protected]>

* Inference frame (FlagOpen#136)

* upd ign

* init inference

* fix trtexec

* fix trtexec

* fix

* upd pipe

* rm secret

* fix

* add 5time 4perf and summary in run_inference

* update monitor (#1)

* finish logdir

* finish merge

* format

* fix

* lic & rdm

* ur

* Update README.md

* fix log output

* fix cal perf

* fix sync

* fix output

* fix

* fixbug

* fix frame

* ur

* add skip validation

* fix

* fix kunlun

* fix

---------

Co-authored-by: uuup <[email protected]>

* Update Regularly (FlagOpen#177)

* common

* add pd

* update faster-rcnn for kunlunxin (FlagOpen#176)

* update faster-rcnn for kunlunxin

* fix configuration description

* fix iluvatar ixsmi monitor bug (FlagOpen#183)

* retinanet: fix case readme (FlagOpen#182)

* retinanet: fix case readme

* remove redudant

---------

Co-authored-by: zhouyu <[email protected]>

* refine maskrcnn (FlagOpen#168)

* refine maskrcnn: add 4 perf and 3 time

* fix var

* mask-rcnn: update case readme

* maskrcnn: fix readme

* refactor variable names

---------

Co-authored-by: zhouyu <[email protected]>

* refine cpm (FlagOpen#179)

Co-authored-by: zhouyu <[email protected]>

* update iluvatar retinaNet 1x1 2x8 config (FlagOpen#181)

* update iluvatar retinaNet 1x1 2x8 config

* fix retinaNet README info

* add mAP and mem info

* bertLarge stdcase (FlagOpen#180)

* bert

* fix

* add

* add MFU

* retinanet: update case readme (FlagOpen#184)

Co-authored-by: zhouyu <[email protected]>

* upd docs (FlagOpen#178)

* upd docs

* Update inference-case-doc.md

* Update inference-case-doc.md

* Update inference-case-doc.md

* Update inference-case-doc.md

* Iluvatar VisionTransformer repo (FlagOpen#188)

* Iluvatar Bigtransfer case

* iluvatar transformer

* mobilenetv2: add 1x1, 2x8 to case readme (FlagOpen#189)

Co-authored-by: zhouyu <[email protected]>

* Upd readme for future plan (FlagOpen#193)

* bert

* fix

* add

* add MFU

* vit

* addsrc

* ud

* dd

* Update README.md

* ud

* assets

* d

* up

* a

* a

* a

* a

* a

* update config (FlagOpen#194)

* support yolov5 (FlagOpen#190)

* upd ign

* init inference

* fix trtexec

* fix trtexec

* fix

* upd pipe

* rm secret

* fix

* add 5time 4perf and summary in run_inference

* update monitor (#1)

* finish logdir

* finish merge

* format

* fix

* lic & rdm

* ur

* Update README.md

* fix log output

* fix cal perf

* fix sync

* fix output

* fix

* fixbug

* fix frame

* ur

* add skip validation

* fix

* support yolov5l

* dev

* dev

* dev

* dev

* dev

* dev

---------

Co-authored-by: shh2000 <[email protected]>

* stable diffusion stdcase (FlagOpen#191)

* bert

* fix

* add

* add MFU

* vit

* addsrc

* sd

* ViT stdcase (FlagOpen#186)

* bert

* fix

* add

* add MFU

* vit

* addsrc

* support yolov5 fp16 (FlagOpen#197)

* upd ign

* init inference

* fix trtexec

* fix trtexec

* fix

* upd pipe

* rm secret

* fix

* add 5time 4perf and summary in run_inference

* update monitor (#1)

* finish logdir

* finish merge

* format

* fix

* lic & rdm

* ur

* Update README.md

* fix log output

* fix cal perf

* fix sync

* fix output

* fix

* fixbug

* fix frame

* ur

* add skip validation

* fix

* support yolov5l

* dev

* dev

* dev

* dev

* dev

* dev

* support fp16

* support fp16

* support fp16

* support fp16

---------

Co-authored-by: shh2000 <[email protected]>

* Update Inference Readme (FlagOpen#198)

* bert

* fix

* add

* add MFU

* vit

* addsrc

* upd

* Kunlunxin inference (FlagOpen#192)

* kunlunxin inference

* change docker version

* xtcl support fp16 onnx

* add kunlun monitor

* kunlunxin sync and remove d2h time

---------

Co-authored-by: zhaoyixuan02 <[email protected]>
Co-authored-by: zhoujiamin01 <[email protected]>

* Fix resnet50 evaluation (FlagOpen#202)

* add cpu model for nvidia training case readme (FlagOpen#199)

Co-authored-by: zhouyu <[email protected]>

* Iluvatar inference Resnet50 (FlagOpen#195)

* add ixrt

* add torch sync

* customized input & output

* merge latest

* update

* update readme

* update readme

* update

---------

Co-authored-by: stezpy <[email protected]>

* training: clean 1x2, 1x4 configs (FlagOpen#204)

Co-authored-by: zhouyu <[email protected]>

* refine GLM (FlagOpen#187)

* refine GLM

* style

* glm: add 1x1

* add MFU

* add MFU annotation for case readme

* add e2e_time for GLM 1x1

* update 1x1 e2e_time to about 2h

---------

Co-authored-by: zhouyu <[email protected]>

* Iluvatar paddle bert (FlagOpen#207)

* Iluvatar Bigtransfer case

* iluvatar transformer

* Iluvatar paddle bert case update

* swinTransformer stdcase (FlagOpen#206)

* swin

* change to base

* rm

* Update README.md

* Update README.md

* update iluvatar cpm config (FlagOpen#210)

* 1.update iluvatar cpm config.
2.update iluvatar sdk info.

* update cpm 1x1 2x8 mem info

* update cpm performance info

* Llama2 7b mmlu stdcase (FlagOpen#211)

* test

* finishfp32

* upd

* upd

* upd

* glm: add 2x8 statistics (FlagOpen#216)

Co-authored-by: zhouyu <[email protected]>

* fix cpm 1x1 for FP32 (FlagOpen#215)

Co-authored-by: zhouyu <[email protected]>

* Fix kunlunxin random seed setting issue (FlagOpen#222)

Quick fix for the kunlunxin seed issue in non-CUDA-compatible mode

* common update (FlagOpen#221)

* support special packages (FlagOpen#220)

* support special packages

* Update prepare_in_container.py

* Iluvatar update glm (FlagOpen#217)

* update iluvatar GLM

* update glm performance info

---------

Co-authored-by: sen.li <[email protected]>

* fix llama readme (FlagOpen#223)

* Update README.md

* Update README.md

* upd readme (FlagOpen#224)

* upd

* upd

* Update start_pytorch_task.py - Handle non-zero return code in process execution (FlagOpen#225)

feat: Handle non-zero return code in process execution

Refactor the code to check the return code of each process execution.
If the return code is non-zero, an exception is raised with a descriptive
error message indicating the process ID and suggesting to check the relevant
issue for further details.

* init (FlagOpen#218)

* support aquila7b (FlagOpen#209)

* support aquila7b

* support aquila7b

* modify Aquila7b according to comment

* modify Aquila7b according to comment

* modify Aquila7b according to comment

* Update Dockerfile,environment_variables.sh(kunlunxin-cpm),pytorch_install.sh (FlagOpen#219)

* one-click run.py

* update one-click run.py

* Launch 'run.py' with a single command.

* Launch 'run.py' with a single command.

* Launch 'run.py' with a single command.

---------

Co-authored-by: zhangytong04 <[email protected]>

* Enhance the execution speed of CPM dataloaders (FlagOpen#230)

* Update start_pytorch_task.py - Handle non-zero return code in process execution

feat: Handle non-zero return code in process execution

Refactor the code to check the return code of each process execution.
If the return code is non-zero, an exception is raised with a descriptive
error message indicating the process ID and suggesting to check the relevant
issue for further details.

* Enhance the execution speed of CPM dataloaders

Enhance the execution speed of CPM dataloaders, potentially reducing the time by around 30 seconds, subject to potential variations due to different environments 

Initialize jieba library using jieba.initialize()

* Kunlunxin-cpm supports fp16 training (FlagOpen#229)

* kunlunxin-cpm supports fp16

* Add cpm 1x1 2x8 configs

* Refine kunlunxin cpm configs

* Add performance in Readme

* Update environment_variables.sh for kunlunxin-cpm (FlagOpen#234)

* kunlunxin update glm config (FlagOpen#236)

* glm_config

* fix_#1

* glm-config_updated

* glm-config-updated#2

* glm_config-updated#2

* glm_config-#2

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update pytorch_install.sh

* Create config_common

* Update README.md

* Rename config_common to config_common.py

* Update config_R300x2x8.py

* Update config_R300x1x1.py

* Update config_R300x1x8.py

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update requirements.txt

* Update README.md

* Update config_R300x1x1.py

* Update config_R300x1x8.py

* Update config_R300x2x8.py

* Update config_R300x1x1.py

* Update config_R300x1x8.py

* Update config_R300x2x8.py

* Update config_common.py

* Update config_R300x1x1.py

* Update config_R300x2x8.py

* Update README.md

* Update README.md

* Update README.md

* Update config_R300x1x1.py

* Update config_R300x1x8.py

* Update config_R300x2x8.py

---------

Co-authored-by: guanlongjie <[email protected]>

* Fix kunlunxin-glm training. (FlagOpen#242)

* Fix kunlunxin GLM training configs

* Relocate xacc install logic

* Modify max_steps for config 1x1 and 2x8

* glm: fix dataset url (FlagOpen#248)

Co-authored-by: zhouyu <[email protected]>

* kunlunxin berfLarge inference configs && results (FlagOpen#212)

* kunlunxin inference : add bertLarge

* Revert "kunlunxin inference : add bertLarge"

This reverts commit cd9127c.

* kunlunxin inference : add bertLarge

* kunlunxin : remove re-install transformers

* adjust env for bertlarge

* kunlunxin: update bertLarge performance

* Update BertLarge performance

---------

Co-authored-by: zhaoyixuan02 <[email protected]>
Co-authored-by: Shi Jinxiang <[email protected]>

* update cpm 1x1 running stats (FlagOpen#238)

Co-authored-by: zhouyu <[email protected]>

* update data_dir for test_conf (FlagOpen#247)

Co-authored-by: zhouyu <[email protected]>

* Add DistilBERT model (FlagOpen#249)

* Add DistilBert with training logic under developing

* DistilBert for 1x1 GPU training

* DistilBert for 1x8 GPU training

* Add README and externel configs

* Remove non-necessary files

* Restore environment_varaibles.sh from kunlunxin-cpm

* Update training configurations in _base.py

update max_epoch and target_acc

* Update README.md

* Add nvidia pytorch1.12 docker

* Update README.md

* Add 1x1 2x8 cases

* Add p_core unit name

* Add p_core unit name

* Update README.md

---------

Co-authored-by: wangyakai <[email protected]>

* GPT2 (FlagOpen#205)

* Add gpt2 model

* Add gpt2 test case in test_conf.py

* refine README and python files

* Remove redundant codes and re-organize denpendency

* remove redundancy files

* refine gpt_dataset

* "Refine traing job"

* Refine README

* fix typo in README.md

* Update README.md

* Add config for 1x1 2x8

* Update README.md 1x1 config

* Update README.md

* Add T5-Small training model (FlagOpen#201)

* add t5 small

* t5_small use huggingface accelerate

* fix coding style for t5_small model

* update t5_small bs config

* add MFU information in t5-small nvidia README

* fix t5_small doc typo

* iluvatar_infer_resnet50 (FlagOpen#259)

Co-authored-by: 杨智超 <[email protected]>

* zixiao:add resnet50 inference configs && results (FlagOpen#256)

* zixiao:add resnet50 inference configs && results

* zixiao: modify resnet50 config & add log file

* zixiao: remote log file

* zixiao: fix resnet50 inference result

* zixiao: update zxrt.py & resnet50 result (FlagOpen#262)

* zixiao: update zxrt.py & resnet50 result

* zixiao: update resnet50 test batch_size

* kunlunxin: add BERT readme (FlagOpen#260)

* Add BERT readme

* Update 1x8 result in README.md

* Update header in README.md

* Iluvatar Ixrt environment (FlagOpen#265)

* Ixrt environment

* add touch config

---------

Co-authored-by: 杨智超 <[email protected]>

* Add ViT model for FlagPerf (FlagOpen#200)

* Add ViT model

* update the script based on zhiyuan's model

* Update script based on PR review

* Update ViT performance in README.md

* support swin_transformer on XPU (FlagOpen#255)

* support swin_transformer on XPU

* support swin_transformer on XPU

---------

Co-authored-by: wangdongyu04 <[email protected]>

* Kunlunxin add stable diffusion v 1_4  case (FlagOpen#227)

* kunlunxin inference

* xtcl support fp16 onnx

* Add stable diffusion fp32 case

* kunlunxin add yolov5 case

* update resnet50 fp16 performance

* add stable_diffusion_v1_4 kunlunxin mem_usage

---------

Co-authored-by: zhaoyixuan02 <[email protected]>
Co-authored-by: zhoujiamin01 <[email protected]>

* kunlunxin swinTransformer inference configs && results (FlagOpen#243)

* kunlunxin swinTransformer inference configs && results

* kunlunxin swinTransformer inference configs && results

{'vendor': 'kunlunxin', 'compiler': 'xtcl', 'precision': 'fp32', 'batchsize': 256, 'flops': 723982880000.0, 'e2e_time(second)': 543.745, 'p_validation_whole(qps)': None, 'p_validation_core(qps)': None, 'p_inference_whole(qps)': 166.937, '*p_inference_core(qps)': 175.724, 'val_average_acc': None, 'infer_average_acc': 0.832}

---------

Co-authored-by: SHIHONGHAO <[email protected]>

* kunlunxin sam_h (FlagOpen#244)

* add Transformer XL model (FlagOpen#258)

* add transfoxl

* update readme and add new config for 2x8

* update readme

* add 1x1 config for transformer xl

* fix nvidia readme for transformer XL

* modification of kunlunxin-RetinaNet (FlagOpen#264)

* Add kunlunxin retinanet

* Update environment_variables.sh

* Update environment_variables.sh

* Add 2x8 config

* Modify 1x1 2x8 config

* remove max_steps logic

* add readme

---------

Co-authored-by: Reiase <[email protected]>
Co-authored-by: root <[email protected]>

* [LLM-paddle] add llama1-7b pretrain with callback (FlagOpen#239)

* modify gitignore

* add paddle llama

* add recompute and sharding for llama7b

* adapte to the driver & fix start_paddle_task

* fix llama1-7b fig files and trainer
fix llama1-7b docker run cmd
modify docker paddle version

* [callback] llama1-7B pretrain

* modify the llama case config name in test_conf.py
fix llama run_pretraining.py
fix llama1-13b config
fix llama1-7b and llama1-13b readme
[LLM] add llama1-13b pretrain
[LLM] llama1-7b pretrain with callback

* update config

* update config

* add metrics in README.md

* update README.md

* remove llama 13B files

---------

Co-authored-by: DrownFish19 <[email protected]>

* [paddle] add metrics for llama-7b (FlagOpen#278)

* fix run_pretraining

* fix config

* update scale_loss

* fix warmup_steps setting

* remove evaluate

* update config

* update config for pp

* update config

* update

* add metrics of llama-7b

* update llama1-7B 80G mertics

* fix

* update

* update llama1-13b metrics

* fix

* remove 13B metrics

* Distilbert kunlunxin (FlagOpen#272)

* Fit distilbert on kunlunxin

* Add kunlunxin readme

* Refine kunlunxin readme

* Refine task kind  kunlunxin readme

* Add vendor name in config_common.py

---------

Co-authored-by: root <[email protected]>

* add KUNLUNXIN XPU t5_small config & log. (FlagOpen#269)

* add KUNLUNXIN XPU t5_small config & log.

* Update README.md

* Update README.md

* Gpt2 kunlunxin (FlagOpen#273)

* Fit gpt2 on kunlunxin

* Add kunlunxin readme

* Refine task kind  kunlunxin readme

* Fix unit of p_whole in README.md

* Refine 1x1 config

---------

Co-authored-by: root <[email protected]>

* update readme for v1.0 (FlagOpen#268)

* ur

* ur

* 11

* refine tacotron2, add nv configs and results (FlagOpen#251)

* refine tacotron2

* update test_conf && req.txt for pytorch1.13

* update 1x1 and 1x8

* update 2x8

---------

Co-authored-by: zhouyu <[email protected]>

* refine efficientnet, add configs && results (FlagOpen#252)

* refine efficientnet

* update results

---------

Co-authored-by: zhouyu <[email protected]>

* Add kunlunxin mask-rcnn (FlagOpen#276)

* Add kunlunxin mask-rcnn

* Refine mask-rcnn

---------

Co-authored-by: root <[email protected]>

* [paddle] add llama1-13b metric (FlagOpen#279)

* fix run_pretraining

* fix config

* update scale_loss

* fix warmup_steps setting

* remove evaluate

* update config

* update config for pp

* update config

* update

* add llama1-13B files

* update config

* config recompute

* update config

* add metrics of llama-7b

* add llama-13b metrics

* add test_config

* add requirements.txt for transformer_xl stdcase (FlagOpen#281)

* fix_#1

* Create config_common.py

* Update config_common.py

* transformer_xl-benchmark_req

* stdcasefix_#1

* stdcasefix_#2

* stdcasefix_#3

* stdcasefix_#4

---------

Co-authored-by: guanlongjie <[email protected]>

* add Transformer_xl configs for kunlunxin (FlagOpen#277)

* fix_#1

* Create config_common.py

* Update config_common.py

* transformer_xl-config

* transformer_xl-config-#1

* transformer_xl-config#2

* transformer_xl-config#2

* fix_#2

* fix_#3

* fix_#4

* fix_#5

* fix_#6

* fix_#7

* config_#9

* fix_#8

* fix_#9

* fix_#10

* fix_#11

---------

Co-authored-by: guanlongjie <[email protected]>

* add longformer training stdcase (FlagOpen#282)

* add longformer

* fix typos in README.md

* full resnet50 precision(bf16+amp) (FlagOpen#253)

* full resnet50

* add ieee754

* add ieee754

* refine swin transformer, fix 1x1, update results (FlagOpen#283)

Co-authored-by: zhouyu <[email protected]>

* [paddle] add gpt3 benchmark (FlagOpen#233)

* add new feature

* fix

* update

* update

* update

* update

* update

* update

* update

* update

* update

* add continue_training

* update

* rename config name with soft link

* update config

* replace nvidia-docker with docker

* update config

* add README.md

* set converged state

* update

* update target ppl metric

* update GPT-3 case config

* update base config

* add use_fused_rms_norm config

* update config

* update paddle dockerfile

* update config

* update GPT-3 config

* update config

* update configs

* update GPT-3 config

* update config

* rename GPT-3 folders

* update start_paddle_task

* update config

* update run_pretrain.py

* update

* update and add gpt3 configs

* add gpt3-13b benchmarks

* remove try and catch

* update dataloader

* update filename

* update

* update configs

* update config

* add gpt3 metrics

* update test_config

* update README.md

* add detr model (FlagOpen#266)

* add detr on GPU

* refine detr on gpu

* modify detr code and upload test data on gpu

* update the format of test data and add detr test case

* update detr test metric

* add gpu 1x1 log for detr

* update 1x1 log

* add detr in test_conf.py

---------

Co-authored-by: wangdongyu04 <[email protected]>

* update readme for Q3 (FlagOpen#285)

* u1012

* ur

* detr

* Iluvatar infer yolov5 (FlagOpen#287)

* Ixrt environment

* add touch config

* Iluvatar yolov5 case

* fix mistake

---------

Co-authored-by: 杨智超 <[email protected]>

* Kunlunxin detr (FlagOpen#288)

* add detr on xpu

* add mAP jpg

* add mAP png and rm mAP.jpg

* add xpu 2x8 log

* update memory data

* add description on mAP.png

---------

Co-authored-by: wangdongyu04 <[email protected]>

* update klx swin_transformer's data (FlagOpen#290)

Co-authored-by: wangdongyu04 <[email protected]>

* update klx bertLarge performance (FlagOpen#291)

* update klx bertLarge performance

* update klx bertLarge performance

---------

Co-authored-by: Shi Jinxiang <[email protected]>

* remove performance (FlagOpen#292)

* remove performance

* update klx stable diffusion performance

---------

Co-authored-by: zhoujiamin01 <[email protected]>

* Update the ViT model's README (FlagOpen#293)

* Add ViT model

* update the script based on zhiyuan's model

* Update script based on PR review

* Update ViT benchmark README.md

* Update ViT performance in README.md

* Update Vit model's README

* Update ViT model's README file

---------

Co-authored-by: zangzhan <[email protected]>

* llama2 7B pretrain standard case (FlagOpen#289)

* init

* fix

* upd result

* Update deepspeed-nvidia_install.sh

* Update run_pretraining.py

* kunlunxin pytorch resnet50 add requirements.txt and environment_variables.sh (FlagOpen#298)

* update gpt2 kunlunxin config (FlagOpen#300)

* gpt2 env config

* gpt2 config

* Update test_conf.py

---------

Co-authored-by: zhangyutong04 <[email protected]>

* add bert_hf openwebtext (FlagOpen#267)

* add bert_hf_small_dataset

* addtestconf

* upd exp

* Update iluvatar retinanet conf to avoid CUDA OOM. (FlagOpen#310)

* update iluvatar retinaNet 1x1 2x8 config

* fix retinaNet README info

* add mAP and mem info

* update 1*8 conf to avoid cuda OOM.

* update kunlunxin transformer_xl config (FlagOpen#307)

* update kunlunxin pytorch_install.sh (FlagOpen#311)

* update kunlunxin glm config (FlagOpen#312)

* klx: update requirements.txt and env for faster_rcnn (FlagOpen#302)

* upd (FlagOpen#313)

* refine VIT && update NV results (FlagOpen#309)

Co-authored-by: zhouyu <[email protected]>

* update kunlunxin retinanet configs (FlagOpen#304)

Co-authored-by: wangdongyu04 <[email protected]>
Co-authored-by: Zhou Yu <[email protected]>

* update kunlunxin maskrcnn configs (FlagOpen#305)

Co-authored-by: wangdongyu04 <[email protected]>
Co-authored-by: Zhou Yu <[email protected]>

* [paddle] fix paddlenlp version for llama1 and gpt3 (FlagOpen#301)

* fix paddle

* update Dockerfile

* update config for llama

* remove file

* fix kunlunxin t5_small 1x1 training error (FlagOpen#315)

* 【iluvatar】update mobilenetv2 config (FlagOpen#295)

* update iluvatar mobilenetv2 config

* update iluvatar mobilenetv2 README

* fix mobilenetv2 on kunlunxin (FlagOpen#314)

* init

* add efficientnet

* modify config

* modify config

* modify config

* add efficientnet

* modify config

* add efficientnet

* bug fix

* add efficientnet

* add efficientnet

* fix code style

* fix code style

* fix code style

* Revert "fix code style"

This reverts commit ae86109.

* fix code style

* fix code style

* fix code style

* fix code style

* fix code style

* bug fix

* add kunlunxin readme

* fix mobilenetv2 on kunlunxin

* add mobilenet config_R300x2x8.py

---------

Co-authored-by: Feilei Du <[email protected]>

* 121 (FlagOpen#319)

* refine bigtransfer (FlagOpen#317)

* refine bigtransfer, add configs and update results

* update readme

---------

Co-authored-by: zhouyu <[email protected]>

* 1107 (FlagOpen#316)

* Aquila2_7B-flagscale pretraining (FlagOpen#299)

* init

* fix

* upd result

* init code

* fix rdm

* rm llama2

* upd rpt

* upd

* 67 (FlagOpen#322)

* 【iluvatar】fix docker bug (FlagOpen#320)

* fix the error that cannot generate docker image

* fix iluvatar docker bug

* fix the spelling error

* add 'apt install -y libncursesw5'

* add klx-training-pre-pr-check.yml

---------

Co-authored-by: zhouyu <[email protected]>
Co-authored-by: Stanley <[email protected]>
Co-authored-by: Feilei Du <[email protected]>
Co-authored-by: Jianbang Yang <[email protected]>
Co-authored-by: Rain Chan <[email protected]>
Co-authored-by: chenrui22 <[email protected]>
Co-authored-by: forestlee95 <[email protected]>
Co-authored-by: Reiase <[email protected]>
Co-authored-by: KungYork <[email protected]>
Co-authored-by: stezpy <[email protected]>
Co-authored-by: uuup <[email protected]>
Co-authored-by: WZD09 <[email protected]>
Co-authored-by: WZD09 <[email protected]>
Co-authored-by: SHIHONGHAO <[email protected]>
Co-authored-by: clveryang <[email protected]>
Co-authored-by: zjm <[email protected]>
Co-authored-by: zhaoyixuan02 <[email protected]>
Co-authored-by: zhoujiamin01 <[email protected]>
Co-authored-by: stezpy <[email protected]>
Co-authored-by: sen.li <[email protected]>
Co-authored-by: clemente0420 <[email protected]>
Co-authored-by: flying tree <[email protected]>
Co-authored-by: zhangytong04 <[email protected]>
Co-authored-by: GGuanl <[email protected]>
Co-authored-by: guanlongjie <[email protected]>
Co-authored-by: jinxiangshi <[email protected]>
Co-authored-by: Shi Jinxiang <[email protected]>
Co-authored-by: wangyakai <[email protected]>
Co-authored-by: 杨智超 <[email protected]>
Co-authored-by: feldmanshan <[email protected]>
Co-authored-by: gganduu_zz <[email protected]>
Co-authored-by: TWANG07 <[email protected]>
Co-authored-by: wangdongyu04 <[email protected]>
Co-authored-by: liuyumoye <[email protected]>
Co-authored-by: Quanfeng Li <[email protected]>
Co-authored-by: root <[email protected]>
Co-authored-by: laixinyi <[email protected]>
Co-authored-by: DrownFish19 <[email protected]>
Co-authored-by: Xiao Han <[email protected]>
Co-authored-by: zangzhan <[email protected]>
Co-authored-by: zhangyutong04 <[email protected]>
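
For readers skimming the commit log above: the `start_pytorch_task.py` entry (FlagOpen#225) describes checking the return code of each launched process and raising an exception when it is non-zero. The snippet below is a minimal sketch of that idea only, not the code from the repository; the function name `run_all` and the error wording are hypothetical, and it assumes the processes are started with Python's standard `subprocess` module.

```python
import subprocess
import sys


def run_all(commands):
    """Start every command, then wait for each and fail on a non-zero exit.

    `commands` is a list of argument lists, e.g. [["python", "train.py"]].
    Hypothetical helper for illustration; not taken from FlagPerf.
    """
    # Launch all processes first so they run concurrently.
    procs = [subprocess.Popen(cmd) for cmd in commands]
    for proc in procs:
        returncode = proc.wait()  # blocks until this process exits
        if returncode != 0:
            # Surface the failing process ID and exit status to the caller.
            raise RuntimeError(
                f"process {proc.pid} exited with return code {returncode}; "
                "check the relevant issue for further details"
            )
```

Calling `run_all([[sys.executable, "-c", "pass"]])` returns silently, while a command that exits with a non-zero status raises `RuntimeError`. Note that this sketch does not terminate the remaining processes when one fails; a real launcher would likely need to decide how to handle that.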
5 participants