QN solvers: Use different gradient norms for different loss functions. #4491
Conversation
Codecov Report

@@            Coverage Diff             @@
##           branch-22.02    #4491   +/-   ##
===============================================
  Coverage              ?    85.77%
===============================================
  Files                 ?       236
  Lines                 ?     19314
  Branches              ?         0
===============================================
  Hits                  ?     16567
  Misses                ?      2747
  Partials              ?         0
===============================================
Flags with carried forward coverage won't be shown.
rerun tests

rerun tests
LGTM. The changes are straightforward.
rerun tests

rerun tests

rerun tests
Thanks Artem for the PR, it looks good to me.
rerun tests
@gpucibot merge
…loss functions. (rapidsai#4491)

Different loss functions may scale differently with the number of features. This has an effect on the convergence criteria. To account for that, I let a loss function define its preferred metric. As a result, the number of iterations should be less dependent on the number of features for all loss functions.

Authors:
- Artem M. Chirkin (https://github.com/achirkin)

Approvers:
- Corey J. Nolet (https://github.com/cjnolet)
- Tamas Bela Feher (https://github.com/tfeher)

URL: rapidsai#4491
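To illustrate why the choice of gradient norm matters for the stopping criterion, here is a minimal NumPy sketch. It is not cuML's actual implementation; the `grad_norm` helper and the norm names are hypothetical. The point it demonstrates is that an L2 norm of a uniform per-feature gradient grows like the square root of the feature count, so a fixed tolerance would stop the solver after a different number of iterations depending on dimensionality, while a max (L-infinity) norm or an averaged L1 norm is dimension-independent.

```python
import numpy as np

def grad_norm(grad, kind="l2"):
    """Hypothetical convergence metric: a loss function could pick
    whichever of these norms scales best with its gradient."""
    if kind == "linf":
        # max absolute component: independent of the number of features
        return np.max(np.abs(grad))
    if kind == "l1_mean":
        # L1 norm averaged over features: also dimension-independent
        return np.sum(np.abs(grad)) / grad.size
    # plain L2 norm: grows ~ sqrt(n_features) for a uniform gradient
    return np.linalg.norm(grad)

# Same per-feature gradient magnitude, very different feature counts.
g_small = np.full(10, 1e-4)
g_large = np.full(1000, 1e-4)

# L2 norms differ by a factor of 10 (sqrt(1000/10)), so a fixed
# tolerance test `grad_norm(g) < tol` behaves differently; the
# dimension-independent norms give the same value for both.
```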