assert GPU CPU intercept_ equal when fit_intercept is false in cuml.LogisticRegression #5567
Conversation
Force-pushed 2800f6c to 18acacf
/ok to test
Thank you for the contribution! I am a bit confused by the test change.
if fit_intercept is False:
    assert np.array_equal(culog.intercept_, sklog.intercept_)
I thought the point of this change was to ensure that intercept_ is equal in both cases. Why are we only testing for equality when fit_intercept is False?
When fit_intercept is True, intercept_ can be a list of non-zero values. It is possible that culog.intercept_ and sklog.intercept_ differ, since their implementations are not exactly the same.
When fit_intercept is False, intercept_ is a list of zeros, so the test can assert that culog.intercept_ and sklog.intercept_ are equal.
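As a rough illustration of that point (a minimal sketch with placeholder data, not the test from this PR), fitting both models with fit_intercept=False yields zero intercepts that can be compared exactly:

# Illustrative sketch only: with fit_intercept=False both libraries
# report an all-zero intercept_, so exact equality is well defined.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression as skLogisticRegression
from cuml.linear_model import LogisticRegression as cuLogisticRegression

# Placeholder dataset, not the one used by the cuML test suite.
X, y = make_classification(n_samples=500, n_features=5, random_state=0)
X = X.astype(np.float32)

culog = cuLogisticRegression(fit_intercept=False).fit(X, y)
sklog = skLogisticRegression(fit_intercept=False).fit(X, y)

# Both intercept_ arrays are all zeros, so the exact comparison passes.
assert np.array_equal(culog.intercept_, sklog.intercept_)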
But they would still be similar, no? You can use the cuml.testing.utils.array_equal
function like here.
I had tried assert array_equal(culog.intercept_, sklog.intercept_, 1e-3, with_sign=True), and some tests failed:
FAILED tests/test_linear_model.py::test_logistic_regression[column_info0-1000-2-float32-none-1.0-True-1.0-0.001] - AssertionError: assert <array_equal: [0.60477465] [0.6086404] unit_tol=0.001 total_tol=0.0001 with_sign=True>
FAILED tests/test_linear_model.py::test_logistic_regression[column_info0-1000-10-float32-l1-1.0-True-1.0-0.001] - AssertionError: assert <array_equal: [-0.15404558 0.10340142 0.12519604 ... -0.13840534 0.13014889 -0.01927175] [-0.14051172 0.11931933 0.14109698 ... -0.12780005 0.14333196 -0.0074189 ] unit_tol=0.001 t...
The mean accuracies of culog and sklog are the same, though. It seems the models converged to different optima.
Ok, that's a valid point.
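As a rough illustration of the accuracy comparison mentioned above (placeholder data and tolerance, not taken from the test suite), the two models can be compared on mean accuracy rather than on raw coefficient values:

# Illustrative sketch: since the two solvers can converge to different
# optima, compare mean accuracy instead of the fitted coefficients.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression as skLogisticRegression
from cuml.linear_model import LogisticRegression as cuLogisticRegression

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X = X.astype(np.float32)

culog = cuLogisticRegression(fit_intercept=True).fit(X, y)
sklog = skLogisticRegression(fit_intercept=True).fit(X, y)

# Mean accuracy computed from predict(); with NumPy input, cuML returns
# NumPy output, so the comparison below works on host arrays.
cu_acc = np.mean(np.asarray(culog.predict(X)).ravel() == y)
sk_acc = np.mean(sklog.predict(X) == y)

# The 1e-3 tolerance is a placeholder, not the value used in the tests.
assert abs(cu_acc - sk_acc) < 1e-3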
Force-pushed 18acacf to c251293
/ok to test
Force-pushed c251293 to 60fd4ec
Force-pushed d75f99e to 8e69f1b
/ok to test
Force-pushed 8e69f1b to f284d4f
Also adopted the code structure of the SG class to prepare for future PRs. This PR depends on and includes [PR 5567](#5567).
Authors:
- Jinfeng Li (https://github.com/lijinf2)
Approvers:
- Dante Gama Dessavre (https://github.com/dantegd)
URL: #5558
Closed this PR because it has been merged as part of the follow-up PR: #5558
There is a discrepancy in intercept_ between cuml.linear_model.LogisticRegression and sklearn.linear_model.LogisticRegression: when fit_intercept=False and n_classes > 2, the GPU intercept_ has shape (1,) while the CPU intercept_ has shape (n_classes,). This PR revises the QN class to resolve the discrepancy.
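A minimal sketch of the shape check this description implies (placeholder dataset and class count; the matching GPU shape is the expected post-fix behavior, not something asserted in the text above):

# Illustrative sketch of the multiclass intercept_ shape check.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression as skLogisticRegression
from cuml.linear_model import LogisticRegression as cuLogisticRegression

n_classes = 3  # n_classes > 2 is where the shape discrepancy appeared
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=8,
    n_classes=n_classes, random_state=0,
)
X = X.astype(np.float32)

culog = cuLogisticRegression(fit_intercept=False).fit(X, y)
sklog = skLogisticRegression(fit_intercept=False).fit(X, y)

# CPU intercept_ has shape (n_classes,); after the fix the GPU intercept_
# is expected to match it, and both are all zeros since fit_intercept=False.
assert sklog.intercept_.shape == (n_classes,)
assert np.asarray(culog.intercept_).shape == sklog.intercept_.shape
assert np.array_equal(culog.intercept_, sklog.intercept_)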