Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Load ckpt_path when given to test/validate/predict #8347

Closed
SeanNaren opened this issue Jul 9, 2021 · 2 comments · Fixed by #8352
Closed

Load ckpt_path when given to test/validate/predict #8347

SeanNaren opened this issue Jul 9, 2021 · 2 comments · Fixed by #8352
Assignees
Labels
feature Is an improvement or enhancement help wanted Open to be worked on
Milestone

Comments

@SeanNaren
Copy link
Contributor

SeanNaren commented Jul 9, 2021

🚀 Feature

When the ckpt_path is passed to the test/validation/predict functions of the Trainer, they load the weights even if a model is provided.

Motivation

I noticed that one of our DeepSpeed test was incorrect (see here). resume_from_checkpoint does not re-load the weights for test/validate/predict, which is probably the right thing to do, however when modified to pass ckpt_path to the test function I noticed the weights are not loaded, which is default behaviour.

As described by @carmocca I suggested we change the behaviour as such:

BEFORE

trainer.test(model, ckpt_path=None) # use provided model
trainer.test(model, ckpt_path='best') # use provided model, ignore ckpt_path
trainer.test(model, ckpt_path='my_path') # use provided model, ignore ckpt_path

trainer.fit(model)
# then
trainer.test(ckpt_path=None) # use latest model
trainer.test(ckpt_path='my_path') # load path

AFTER

trainer.test(model, ckpt_path=None) # use provided model
trainer.test(model, ckpt_path='best') # load best model
trainer.test(model, ckpt_path='my_path') # load path

trainer.fit(model)
# then
trainer.test(ckpt_path=None) # load best model
trainer.test(ckpt_path='my_path') # load path

This imo makes the behaviour in line with what's expected + allows deepspeed to be used as an engine in the cases where inference cannot happen without the Trainer (when there is sharding orchestration etc).

@SeanNaren SeanNaren added feature Is an improvement or enhancement help wanted Open to be worked on labels Jul 9, 2021
@SeanNaren SeanNaren self-assigned this Jul 9, 2021
@tchaton
Copy link
Contributor

tchaton commented Jul 9, 2021

Sounds good to me !

@SeanNaren
Copy link
Contributor Author

SeanNaren commented Jul 12, 2021

Behaviour now is for test, validate and predict:

trainer.test(model, ckpt_path=None) # use provided model
trainer.test(model, ckpt_path='best') # load best model
trainer.test(model, ckpt_path='my_path') # load path
trainer.fit(model)
# then
trainer.test(ckpt_path=None) # load best model
trainer.test(ckpt_path='my_path') # load path

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Is an improvement or enhancement help wanted Open to be worked on
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants