QN solvers: Use different gradient norms for different loss functions. #4491
Conversation
Codecov Report

@@            Coverage Diff             @@
##           branch-22.02    #4491   +/-   ##
===============================================
  Coverage              ?    85.77%
===============================================
  Files                 ?       236
  Lines                 ?     19314
  Branches              ?         0
===============================================
  Hits                  ?     16567
  Misses                ?      2747
  Partials              ?         0
===============================================
Flags with carried forward coverage won't be shown.
rerun tests

rerun tests
LGTM. The changes are straightforward.
rerun tests

rerun tests

rerun tests
Thanks Artem for the PR, it looks good to me.
rerun tests
@gpucibot merge
…loss functions. (rapidsai#4491)

Different loss functions may scale differently with the number of features. This has an effect on the convergence criteria. To account for that, I let a loss function define its preferred metric. As a result, the number of iterations should be less dependent on the number of features for all loss functions.

Authors:
- Artem M. Chirkin (https://github.com/achirkin)

Approvers:
- Corey J. Nolet (https://github.com/cjnolet)
- Tamas Bela Feher (https://github.com/tfeher)

URL: rapidsai#4491
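To illustrate why the choice of gradient norm matters for the stopping criterion, here is a minimal NumPy sketch. It is not cuML's actual implementation; the `grad_norm` helper and the norm names are hypothetical. The point it demonstrates is that an L2 norm of a uniform per-feature gradient grows like the square root of the feature count, so a fixed tolerance would stop the solver after a different number of iterations depending on dimensionality, while a max (L-infinity) norm or an averaged L1 norm is dimension-independent.

```python
import numpy as np

def grad_norm(grad, kind="l2"):
    """Hypothetical convergence metric: a loss function could pick
    whichever of these norms scales best with its gradient."""
    if kind == "linf":
        # max absolute component: independent of the number of features
        return np.max(np.abs(grad))
    if kind == "l1_mean":
        # L1 norm averaged over features: also dimension-independent
        return np.sum(np.abs(grad)) / grad.size
    # plain L2 norm: grows ~ sqrt(n_features) for a uniform gradient
    return np.linalg.norm(grad)

# Same per-feature gradient magnitude, very different feature counts.
g_small = np.full(10, 1e-4)
g_large = np.full(1000, 1e-4)

# L2 norms differ by a factor of 10 (sqrt(1000/10)), so a fixed
# tolerance test `grad_norm(g) < tol` behaves differently; the
# dimension-independent norms give the same value for both.
```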