I ran into an issue when attempting to continue pretraining:
    trainer.train(resume_from_checkpoint=True)
  File "/opt/conda/lib/python3.8/site-packages/transformers/trainer.py", line 1539, in train
    return inner_training_loop(
  File "/opt/conda/lib/python3.8/site-packages/transformers/trainer.py", line 1676, in _inner_training_loop
    deepspeed_load_checkpoint(self.model_wrapped, resume_from_checkpoint)
  File "/opt/conda/lib/python3.8/site-packages/transformers/deepspeed.py", line 389, in deepspeed_load_checkpoint
    raise ValueError(f"Can't find a valid checkpoint at {checkpoint_path}")
ValueError: Can't find a valid checkpoint at /ossfs/workspace/mnt_new/xxx/llama-vid/work_dirs/llama-vid-7b-pretrain-224-video-fps-1/checkpoint-15000
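A note on reading this traceback: as far as I can tell from the `transformers` source of that era, `deepspeed_load_checkpoint` raises this error when the checkpoint folder contains no `global_step*` entry (the subfolder DeepSpeed writes its engine state into). The sketch below checks for that condition. It assumes this glob-based lookup, and `find_deepspeed_step_dirs` is a hypothetical helper name, not part of `transformers` or DeepSpeed.

```python
import glob
import os


def find_deepspeed_step_dirs(checkpoint_path):
    """Return the sorted global_step* subdirectories of a checkpoint folder.

    Hypothetical helper: it mirrors the glob-style lookup that, by assumption,
    the Trainer performs before resuming a DeepSpeed run.
    """
    pattern = os.path.join(checkpoint_path, "global_step*")
    return sorted(p for p in glob.glob(pattern) if os.path.isdir(p))


def can_resume_with_deepspeed(checkpoint_path):
    """True if at least one global_step* subdirectory exists."""
    return len(find_deepspeed_step_dirs(checkpoint_path)) > 0
```

Calling `can_resume_with_deepspeed(".../checkpoint-15000")` on the folder from the error message should show whether the DeepSpeed state was saved there at all, or whether only the model weights were.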
This happened because the weights (the larger files) were not downloaded correctly. Use git-lfs to download the weights repo from Hugging Face:
# Ensure git-lfs is installed
git lfs --version
# prints git-lfs/2.9.2 (GitHub; linux amd64; go 1.13.5)
git clone https://huggingface.co/YanweiLi/llama-vid-7b-full-224-video-fps-1 # replace with the weights repo you need
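To confirm the large files really came down, it can help to look for leftover Git LFS pointer files. When a repo is cloned without git-lfs, each large file is replaced by a small text stub whose first line starts with `version https://git-lfs.github.com/spec/v1`. Here is a minimal sketch that scans a cloned folder for such stubs; the function names are hypothetical and only the standard library is used.

```python
import os

# First bytes of a Git LFS pointer file, per the Git LFS pointer format.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """True if the file starts with the Git LFS pointer header."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


def find_lfs_pointers(root):
    """Walk root (skipping .git) and list files that are still LFS pointers."""
    pointers = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune the .git directory so repository internals are not scanned.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            full = os.path.join(dirpath, name)
            if is_lfs_pointer(full):
                pointers.append(full)
    return sorted(pointers)
```

If `find_lfs_pointers("llama-vid-7b-full-224-video-fps-1")` returns a non-empty list, those files are stubs rather than real weights, and running `git lfs pull` inside the repo should fetch them.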