Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REVIEW] Update Mixin classes and include in estimators #2411

Merged

Conversation

beckernick
Copy link
Member

@beckernick beckernick commented Jun 12, 2020

This PR:

  • Moves the single GPU Mixin classes to common.base
  • Updates the set of regressors using RegressorMixin to point to the updated namespace
    - [ ] Updates the Mixin classes to use the new CumlArray (may not be necessary)
  • Adds Mixin classes to additional estimators (classifiers and regressors)
  • Adds clf.score via the Mixin class to every estimator and removes newly redundant score methods (does not remove score methods that use an optimized libcuml implementation such as random forest)

Related to #2401 . This closes #2393

As @dantegd noted offline (summarized in #2393 (comment)), it would be nice to update the single GPU Mixin classes to ensure they're working well and use them for inheritance.

@GPUtester
Copy link
Contributor

Please update the changelog in order to start CI tests.

View the gpuCI docs here.

@beckernick beckernick changed the title [WIP] Refactor mixin classes [skip-ci] [WIP] Update Mixin classes [skip-ci] Jun 12, 2020
…er mixin to logistic regression and rips out redundant logistic regression score implementation
@beckernick beckernick self-assigned this Jun 12, 2020
@beckernick beckernick added the 2 - In Progress Currenty a work in progress label Jun 12, 2020
@beckernick beckernick marked this pull request as ready for review June 15, 2020 16:52
@beckernick beckernick requested a review from a team as a code owner June 15, 2020 16:52
@beckernick beckernick changed the title [WIP] Update Mixin classes [skip-ci] [REVIEW] Update Mixin classes [skip-ci] Jun 15, 2020
@beckernick beckernick added 3 - Ready for Review Ready for review by team and removed 2 - In Progress Currenty a work in progress labels Jun 15, 2020
@beckernick beckernick changed the title [REVIEW] Update Mixin classes [skip-ci] [REVIEW] Update Mixin classes and include in estimators Jun 15, 2020
@beckernick
Copy link
Member Author

This is now ready for review.

@beckernick
Copy link
Member Author

rerun tests

Copy link
Member

@cjnolet cjnolet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Glad to see this being used across the board. Very minor things.

handle = None

preds = self.predict(X)
return r2_score(y, preds, handle=handle)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we increase compatibility by also overriding the _more_tags() method? https://github.com/scikit-learn/scikit-learn/blob/fd237278e/sklearn/base.py#L501. It

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to implement this, though I wonder if it might be worth holding off unless there's a compelling reason. Do you know what this generally used for in sklearn @cjnolet ? It's not clear to me, but I'm not well versed in the sklearn internals.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was really a genuine question in my part. If you don’t see a need for this yet then I’m fine holding off.

I just happened to notice this was the only difference between our mixins and those in Scikit-learn so I figured I’d ask.

The scikit-learn documentation also gives little evidence into exactly how this is used. My first thinking was that it might be used for some type of tag-based model selection where only estimators meeting particular behavioral / characteristic criteria are trained and evaluated?

Copy link
Member

@cjnolet cjnolet Jun 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess they are used primarily in tests and by helper functions to determine check/validate their inputs and outputs: https://scikit-learn.org/stable/developers/develop.html#estimator-tags

Copy link
Member Author

@beckernick beckernick Jun 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice find. Sharing the overview summary in this thread for readability:

Scikit-learn introduced estimator tags in version 0.21. These are annotations of estimators that allow programmatic inspection of their capabilities, such as sparse matrix support, supported output types and supported methods. The estimator tags are a dictionary returned by the method _get_tags().

Given your research into how sklearn uses them, I would vote we hold off for now. But I'm still happy to be persuaded otherwise

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, I don't these are a high priority at the moment.

python/cuml/common/base.pyx Outdated Show resolved Hide resolved
python/cuml/common/base.pyx Outdated Show resolved Hide resolved
python/cuml/common/base.pyx Outdated Show resolved Hide resolved
python/cuml/common/base.pyx Outdated Show resolved Hide resolved
beckernick and others added 2 commits June 16, 2020 15:05
beckernick and others added 2 commits June 16, 2020 15:05
@beckernick
Copy link
Member Author

rerun tests

@cjnolet cjnolet added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 3 - Ready for Review Ready for review by team labels Jun 17, 2020
Copy link
Member

@cjnolet cjnolet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEA] Add an _estimator_type attribute to estimators
3 participants