[BUG] Logistic regression does not return fit status #2546
Comments
What if we added a |
This issue has been marked rotten due to no recent activity in the past 90d. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. |
This issue has been marked stale due to no recent activity in the past 30d. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be marked rotten if there is no activity in the next 60d. |
The warning message is correctly displayed in the Jupyter notebook:
This basically solves the main issue here. We might want to add a few notes on what to do when the maximum number of iterations is reached, along the lines of what sklearn does.
I am preparing a small PR to implement this. Since sklearn does not return a fit status, I think we can also skip it. |
closes #2546

This PR improves the warning message printed when max iterations are reached during fitting a linear model.

Example:

```python
import numpy as np
from cuml.linear_model import LogisticRegression
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True)
y = y.astype(np.float64)
cls = LogisticRegression(penalty='none', C=1)
cls.fit(X, y)
```

This produces the following output, where the last line is added by this PR:

```
[W] [15:31:04.467478] L-BFGS: max iterations reached
[W] [15:31:04.467804] Maximum iterations reached before solver is converged. To increase model accuracy you can increase the number of iterations (max_iter) or improve the scaling of the input data.
```

Authors:
- Tamas Bela Feher (@tfeher)

Approvers:
- Dante Gama Dessavre (@dantegd)

URL: #3515
Describe the bug
Logistic regression uses the QN solver. The QN solver defines a set of return codes
cuml/cpp/src/glm/qn/qn_util.cuh
Lines 40 to 46 in 39e1bb2
While qn_fit returns these codes to the caller, qnFit ignores the returned error code. This way the Python layer is not informed of the fit status.
If the solver exits after the first iteration with a numerical error, the Python user is not informed of the error and only sees insufficient model accuracy, as in this issue.
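As a rough illustration of what forwarding the fit status to Python could look like, here is a minimal sketch. Everything in it is an assumption made for illustration: the `FitStatus` names and values, the `ConvergenceWarning` class, and the `report_fit_status` helper are hypothetical and are not taken from `qn_util.cuh` or from the cuML Python API.

```python
import warnings
from enum import IntEnum


class FitStatus(IntEnum):
    """Hypothetical status codes; the real ones are defined in qn_util.cuh."""
    CONVERGED = 0
    MAX_ITER_REACHED = 1
    NUMERICAL_ERROR = 2


class ConvergenceWarning(UserWarning):
    """Hypothetical warning category for a solver that did not converge."""


def report_fit_status(status, max_iter):
    """Turn a solver return code into a Python-level warning or exception."""
    status = FitStatus(status)
    if status is FitStatus.NUMERICAL_ERROR:
        # Raising is a design choice of this sketch; a warning would also work.
        raise FloatingPointError("Solver stopped because of a numerical error.")
    if status is FitStatus.MAX_ITER_REACHED:
        warnings.warn(
            f"Solver reached max_iter={max_iter} before converging. "
            "Consider increasing max_iter or scaling the input data.",
            ConvergenceWarning,
            stacklevel=2,
        )
    return status
```

Because the message goes through Python's `warnings` module rather than the C++ logger, a notebook front end would display it the same way it displays any other Python warning.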
PR #2543 improved the C++ side logging to print a warning message if the solver has not converged, and an error message if a numerical error is found. But these messages might be hidden from a user who is running a Jupyter notebook.
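One plausible reason for the hidden output, not confirmed in this thread, is that native code writes to the process-level file descriptor 1, while a notebook front end typically replaces Python's `sys.stdout` object. Whether descriptor-level output is shown can depend on the kernel version and setup. The stdlib-only sketch below shows the difference between the two paths; the `capture_python_stdout` helper is a hypothetical name used only here.

```python
import contextlib
import io
import os


def capture_python_stdout():
    """Capture sys.stdout while also writing directly to file descriptor 1."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        # Goes through sys.stdout, so the redirection sees it.
        print("written via sys.stdout")
        # Bypasses sys.stdout entirely, similar to output from native code.
        os.write(1, b"written via file descriptor 1\n")
    return buffer.getvalue()


captured = capture_python_stdout()
```

Only the `print` line ends up in `captured`; the `os.write` line goes straight to the terminal. A message emitted from Python is therefore a more reliable way to reach notebook users than one printed by the C++ layer.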
Steps/Code to reproduce bug
The C++ layer prints a warning message on the standard output:
L-BFGS: max iterations reached
but no such message is visible in a Jupyter notebook. In contrast, scikit-learn gives the following output in a Jupyter notebook:
Expected behavior
Give informative message about fit status for Jupyter notebook users.