Add curriculum learning callback #1256
@b-chu, a couple of questions about the new API:
Also, I'm worried about the loss curves in the plots you shared; they don't look fully deterministic to me. What model size and batch size were you running at, and with which datasets? Longer training runs with a bigger model and a small batch size, without shuffling, would help us determine whether the loss curves are actually deterministic. If you only look at the first few steps, most training runs will look pretty similar regardless of the data ordering.
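One way to make the determinism check concrete, rather than eyeballing plots, is to compare the logged per-step losses of two runs and report the first step where they differ. The following is only a minimal sketch: `first_divergence` and the sample loss values are hypothetical and are not part of this PR or of any library.

```python
import math


def first_divergence(losses_a, losses_b, rel_tol=0.0, abs_tol=0.0):
    """Return the first step index where two loss curves differ, or None.

    With the default tolerances of 0.0 the comparison is exact, which is
    what a fully deterministic pair of runs should satisfy.
    """
    for step, (a, b) in enumerate(zip(losses_a, losses_b)):
        if not math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol):
            return step
    # Curves of different length diverge where the shorter one ends.
    if len(losses_a) != len(losses_b):
        return min(len(losses_a), len(losses_b))
    return None


# Hypothetical logged losses from two runs with identical config and seed.
run_a = [2.31, 1.98, 1.75, 1.60]
run_b = [2.31, 1.98, 1.74, 1.61]
print(first_divergence(run_a, run_b))  # 2
```

Running this over a long run, not just the first few steps, would show whether the curves stay identical once data ordering starts to matter.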
This needs a Composer release first, right?
First pass, mostly lgtm but some minor comments.
second pass, lgtm besides a few minor comments. requiring review from @milocress
LGTM, just a couple questions
will approve after comments are addressed, overall lgtm
Curriculum learning callback
Requirements
Features
Other
Manual tests
Matches old callback behavior
Resumes correctly in the middle of the schedule
Resumes correctly when new datamix added to schedule
Resumes correctly when callback added after initial training run
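The resume cases above all come down to mapping the progress already made onto a position in the schedule. A minimal sketch of that lookup follows, assuming each schedule entry has a duration in batches and a dataset name. `Stage`, `locate_stage`, and the choice to stay on the final stage once the schedule is exhausted are hypothetical illustrations, not the callback's actual API or behavior.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    duration: int  # batches to train on this datamix (hypothetical unit)
    dataset: str   # name of the datamix (hypothetical field)


def locate_stage(schedule, batches_elapsed):
    """Return (stage_index, batches_into_stage) for a resumed run."""
    if not schedule:
        raise ValueError("schedule must contain at least one stage")
    remaining = batches_elapsed
    for index, stage in enumerate(schedule):
        if remaining < stage.duration:
            return index, remaining
        remaining -= stage.duration
    # Assumption: once the schedule is exhausted, stay on the final stage.
    last = len(schedule) - 1
    return last, remaining + schedule[last].duration


schedule = [Stage(100, "mix_a"), Stage(200, "mix_b")]
print(locate_stage(schedule, 150))  # (1, 50)

# Appending a new datamix does not change where an in-progress run resumes.
extended = schedule + [Stage(300, "mix_c")]
print(locate_stage(extended, 150))  # (1, 50)
```

Because the position is derived from elapsed progress alone, the same lookup covers resuming mid-schedule and resuming after a datamix has been appended.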
API
Old API:
Start a new run
Start a new run
New API: