[REVIEW] Update Mixin classes and include in estimators #2411
Please update the changelog in order to start CI tests. View the gpuCI docs here.
…er mixin to logistic regression and rips out redundant logistic regression score implementation
This is now ready for review.
rerun tests
Glad to see this being used across the board. Very minor things.
handle = None

preds = self.predict(X)
return r2_score(y, preds, handle=handle)
Should we increase compatibility by also overriding the _more_tags()
method? https://github.com/scikit-learn/scikit-learn/blob/fd237278e/sklearn/base.py#L501. It
I'm happy to implement this, though I wonder if it might be worth holding off unless there's a compelling reason. Do you know what this is generally used for in sklearn, @cjnolet? It's not clear to me, but I'm not well versed in the sklearn internals.
This was really a genuine question on my part. If you don't see a need for this yet then I'm fine holding off.
I just happened to notice this was the only difference between our mixins and those in Scikit-learn, so I figured I'd ask.
The scikit-learn documentation also gives little insight into exactly how this is used. My first thought was that it might be used for some type of tag-based model selection, where only estimators meeting particular behavioral / characteristic criteria are trained and evaluated?
I guess they are used primarily in tests and by helper functions to check/validate their inputs and outputs: https://scikit-learn.org/stable/developers/develop.html#estimator-tags
Nice find. Sharing the overview summary in this thread for readability:
Scikit-learn introduced estimator tags in version 0.21. These are annotations of estimators that allow programmatic inspection of their capabilities, such as sparse matrix support, supported output types and supported methods. The estimator tags are a dictionary returned by the method _get_tags().
Given your research into how sklearn uses them, I would vote we hold off for now. But I'm still happy to be persuaded otherwise.
I agree, I don't think these are a high priority at the moment.
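For anyone reading this thread later, here is a minimal sketch of the tag pattern discussed above: each class may contribute a `_more_tags()` dictionary, and `_get_tags()` merges them so that more specific classes override more generic ones. The class names `TagsBase` and `MyRegressor` are hypothetical stand-ins, the tag values are illustrative only, and this is not scikit-learn's or cuML's actual code.

```python
import inspect


class TagsBase:
    """Hypothetical stand-in for a base estimator class (assumption)."""

    def _more_tags(self):
        # Illustrative defaults that every estimator starts with.
        return {"requires_y": False, "allow_nan": False}

    def _get_tags(self):
        # Walk the MRO from most generic to most specific so that
        # mixins and subclasses override the base defaults.
        tags = {}
        for cls in reversed(inspect.getmro(type(self))):
            if "_more_tags" in vars(cls):
                tags.update(cls._more_tags(self))
        return tags


class RegressorMixin:
    """Sketch of a mixin that contributes its own tags."""

    def _more_tags(self):
        return {"requires_y": True}


class MyRegressor(RegressorMixin, TagsBase):
    """Hypothetical estimator that inherits tags from both parents."""


tags = MyRegressor()._get_tags()
print(tags)
```

Here the mixin's `requires_y` value replaces the base default while `allow_nan` is inherited unchanged, which is the kind of programmatic inspection the quoted summary describes.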
Co-authored-by: Corey J. Nolet <[email protected]>
rerun tests
LGTM
This PR:
common.base
- [ ] Updates the Mixin classes to use the new CumlArray (may not be necessary)
- [ ] Adds clf.score via the Mixin class to every estimator and removes newly redundant score methods (does not remove score methods that use an optimized libcuml implementation such as random forest)

Related to #2401. This closes #2393.
As @dantegd noted offline (summarized in #2393 (comment)), it would be nice to update the single GPU Mixin classes to ensure they're working well and use them for inheritance.
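A rough sketch of the idea described above, where estimators inherit `score` from a Mixin instead of each defining their own. Everything here is a plain-Python stand-in: the `r2_score` function below is a simplified substitute for the GPU metric used in the PR, `MeanRegressor` is a hypothetical toy estimator, and the `getattr` lookup for `handle` is an assumption about the branch that precedes the `handle = None` fallback in the reviewed diff.

```python
def r2_score(y_true, y_pred, handle=None):
    """Plain-Python R^2; `handle` is accepted but unused in this stand-in."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


class RegressorMixin:
    """Sketch of a mixin that supplies a shared `score` method."""

    def score(self, X, y):
        # Assumption: use the estimator's handle if it has one, else None.
        handle = getattr(self, "handle", None)
        preds = self.predict(X)
        return r2_score(y, preds, handle=handle)


class MeanRegressor(RegressorMixin):
    """Hypothetical estimator: no `score` of its own, it comes from the mixin."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        # Predict the training mean for every row.
        return [self.mean_ for _ in X]


X = [[1.0], [2.0], [3.0]]
y = [1.0, 2.0, 3.0]
score = MeanRegressor().fit(X, y).score(X, y)
print(score)
```

An estimator with an optimized native scoring routine would simply define its own `score`, which takes precedence over the mixin's version through normal method resolution.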