This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

Explicitly pass serialization directory and local rank to trainer in train command #5180

Merged (4 commits) on May 7, 2021

epwalsh (Member) commented May 5, 2021

The local_rank was never actually passed to the trainer.

epwalsh requested a review from dirkgr May 5, 2021 22:30

dirkgr (Member) left a comment:

There should be a test that fails when this isn't fixed.

epwalsh changed the title from "fix bug with local_rank" to "Explicitly pass serialization directory and local rank to trainer in train command" May 6, 2021

epwalsh (Member, Author) commented May 6, 2021

Ok, so this wasn't a bug after all.

Apparently FromParams does some magic here where the local_rank (and serialization_dir) parameters from TrainCommand.from_partial_objects() are implicitly passed down into the GradientDescentTrainer.from_partial_objects() method.

This blows my mind, and I'm not really comfortable with it. So I've made it explicit and added a test to ensure the behavior never breaks.
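
For readers unfamiliar with this pattern, here is a toy sketch of the difference between implicit and explicit passing. It is not AllenNLP's FromParams code: `build`, `ToyTrainer`, `train_implicit`, and `train_explicit` are hypothetical names invented for illustration, and the name-matching rule is an assumption about how such "extras" forwarding can work in general.

```python
import inspect


def build(cls, config, **extras):
    """Hypothetical generic builder: construct `cls` from `config`, silently
    forwarding any extra whose name matches a constructor parameter."""
    params = inspect.signature(cls.__init__).parameters
    kwargs = dict(config)
    for name, value in extras.items():
        # Extras that the constructor does not accept are dropped without warning.
        if name in params and name not in kwargs:
            kwargs[name] = value
    return cls(**kwargs)


class ToyTrainer:
    """Hypothetical stand-in for a trainer; not the real GradientDescentTrainer."""

    def __init__(self, num_epochs, serialization_dir=None, local_rank=0):
        self.num_epochs = num_epochs
        self.serialization_dir = serialization_dir
        self.local_rank = local_rank


def train_implicit(trainer_config, **extras):
    # Implicit: whether local_rank reaches the trainer depends on name matching
    # inside build(); nothing at this call site makes that visible.
    return build(ToyTrainer, trainer_config, **extras)


def train_explicit(trainer_config, serialization_dir, local_rank):
    # Explicit: the two arguments are named right where the trainer is built.
    return ToyTrainer(
        serialization_dir=serialization_dir,
        local_rank=local_rank,
        **trainer_config,
    )


implicit = train_implicit({"num_epochs": 1}, serialization_dir="out", local_rank=2)
explicit = train_explicit({"num_epochs": 1}, serialization_dir="out", local_rank=2)
```

Both calls produce a trainer with the same `local_rank`, but only the explicit version would fail loudly if a parameter were renamed. That is the kind of regression the test added in this PR is meant to catch.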

epwalsh requested a review from dirkgr May 6, 2021 20:16

dirkgr (Member) left a comment:

I'm a big fan of not-magic.

epwalsh merged commit d85c5c3 into main May 7, 2021
epwalsh deleted the distributed-fix branch May 7, 2021 21:46
dirkgr added a commit that referenced this pull request May 10, 2021
…train command (#5180)

* fix bug with local_rank

* fix up

Co-authored-by: Dirk Groeneveld <[email protected]>