DeepSpeed-Chat step-1 hanging for a long time #906

lemon-little · 2024-06-19T16:31:56Z

deepspeed --hostfile ~/hostfile \
    --num_gpus 4 \
    --num_nodes 2 \
    --master_addr 172.16.4.41 \
    main.py \
    --data_path Dahoas/rm-static \
    --data_split 2,4,4 \
    --model_name_or_path shakechen/Llama-2-7b-hf/ \
    --per_device_train_batch_size 4 \
    --per_device_eval_batch_size 4 \
    --max_seq_len 512 \
    --learning_rate 9.65e-6 \
    --weight_decay 0. \
    --num_train_epochs 1 \
    --gradient_accumulation_steps 1 \
    --lr_scheduler_type cosine \
    --num_warmup_steps 0 \
    --seed 1234 \
    --gradient_checkpointing \
    --zero_stage 3 \
    --deepspeed \
    --output_dir /home/bingxing2/home/scx7avs/Deepspeed/output/
