[REVIEW] Multiclass meta estimator wrappers and multiclass SVC #3092

Merged
dantegd merged 22 commits into rapidsai:branch-0.18 from fea-ext-svm-multiclass
Jan 12, 2021

Conversation

tfeher
Contributor

@tfeher tfeher commented Oct 30, 2020

This PR closes #959.

This PR implements multiclass classification meta estimators and multiclass SVC.

The multiclass meta estimators are thin wrappers around their scikit-learn counterparts; the wrappers only provide array conversion.
The conversion adds some overhead, but it should be small compared to SVM training time. More info in issue #2876.
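
As a rough illustration of the thin-wrapper idea described above, the following pure-Python sketch converts inputs to a host-side format and then delegates to a wrapped estimator. Every name here (`to_host`, `MajorityClassifier`, `ThinWrapper`) is hypothetical and chosen for the example; this is not the code in python/cuml/multiclass/multiclass.py, and plain lists stand in for real device and host arrays.

```python
# Hypothetical sketch of the "thin wrapper" pattern; not the cuML implementation.


def to_host(data):
    """Stand-in for device-to-host conversion: copy rows into plain lists."""
    return [list(row) if isinstance(row, (list, tuple)) else row for row in data]


class MajorityClassifier:
    """Toy stand-in for a host-side meta estimator: predicts the most common label."""

    def fit(self, X, y):
        self.label_ = max(set(y), key=y.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class ThinWrapper:
    """Converts inputs, then delegates fit/predict to the wrapped estimator."""

    def __init__(self, inner):
        self.inner = inner

    def fit(self, X, y):
        # The only work done by the wrapper itself is the array conversion.
        self.inner.fit(to_host(X), to_host(y))
        return self

    def predict(self, X):
        return self.inner.predict(to_host(X))


# Tuples play the role of "foreign" arrays that must be converted first.
wrapped = ThinWrapper(MajorityClassifier()).fit(((0, 1), (1, 0), (1, 1)), (1, 1, 0))
prediction = wrapped.predict(((5, 5),))
```

The wrapper adds one conversion per call, which is the overhead the description refers to.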

Limitations compared to scikit-learn multiclass SVC:

  • Currently we do not handle sample_weight in the multiclass case, and a warning is issued. To handle that, we would need to fork the multiclass meta estimators and ensure that the class weight parameter is propagated to the binary classifier.
  • The decision_function_shape argument is not supported.
  • The break_ties argument is not supported.

Using sklearn.multiclass allows us to choose between one-vs-one and one-vs-rest classifiers. I have introduced an extra argument, multiclass_strategy, to let the user choose. One-vs-one is more accurate, but more costly.
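
To make the cost difference between the two strategies concrete, here is a small standard-library sketch that enumerates the binary subproblems each strategy creates and shows a simple majority vote for one-vs-one. The helper names (`ovo_tasks`, `ovr_tasks`, `ovo_vote`) and the tie-breaking rule are assumptions made for this example; they are not part of cuML or scikit-learn.

```python
from itertools import combinations


def ovo_tasks(classes):
    """One-vs-one: one binary problem per unordered pair of classes."""
    return list(combinations(sorted(classes), 2))


def ovr_tasks(classes):
    """One-vs-rest: one binary problem per class (that class against all others)."""
    return [(c, "rest") for c in sorted(classes)]


def ovo_vote(pairwise_winners):
    """Pick the class that won the most pairwise contests.

    Ties go to the smallest label (an arbitrary choice for this sketch).
    """
    votes = {}
    for winner in pairwise_winners:
        votes[winner] = votes.get(winner, 0) + 1
    return min(votes, key=lambda c: (-votes[c], c))


classes = range(4)
n_ovo = len(ovo_tasks(classes))  # n * (n - 1) / 2 binary classifiers
n_ovr = len(ovr_tasks(classes))  # n binary classifiers
winner = ovo_vote([0, 2, 2, 1, 2, 3])
```

With n classes, one-vs-one trains n(n-1)/2 binary classifiers while one-vs-rest trains n, so the number of classifiers grows much faster for one-vs-one as n increases.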

This is marked as work in progress, because the actual way of mapping back and forth to numpy data types (in order to make use of sklearn.multiclass) is going to be simplified by #3040. Ideally we should merge this after #3040, but if that gets blocked we can bring this forward.

@tfeher tfeher requested a review from a team as a code owner October 30, 2020 15:37
@GPUtester
Contributor

Please update the changelog in order to start CI tests.

View the gpuCI docs here.

@tfeher tfeher force-pushed the fea-ext-svm-multiclass branch 2 times, most recently from 4ccb764 to 1e32cc7 Compare November 17, 2020 14:23
@tfeher tfeher changed the title [WIP] Multiclass SVC [REVIEW] Multiclass meta estimator wrappers and multiclass SVC Nov 17, 2020
Member

@dantegd dantegd left a comment

Have one question regarding object consumption

python/cuml/multiclass/multiclass.py Outdated Show resolved Hide resolved
@JohnZed JohnZed added 4 - Waiting on Author Waiting for author to respond to review 4 - Waiting on Reviewer Waiting for reviewer to review or respond and removed 4 - Waiting on Reviewer Waiting for reviewer to review or respond labels Nov 23, 2020
@tfeher tfeher force-pushed the fea-ext-svm-multiclass branch from 10e3bdb to a197b14 Compare November 25, 2020 20:51
@tfeher
Contributor Author

tfeher commented Nov 25, 2020

I have improved the docstring. Additionally, I have removed the n_jobs parameter because it might mislead people into thinking that these wrappers can do multi-GPU, which they cannot.

@tfeher tfeher added 4 - Waiting on Reviewer Waiting for reviewer to review or respond and removed 4 - Waiting on Author Waiting for author to respond to review labels Nov 25, 2020
@codecov-io

codecov-io commented Nov 25, 2020

Codecov Report

Merging #3092 (d57fa0b) into branch-0.18 (2e4388d) will increase coverage by 0.10%.
The diff coverage is 89.28%.

@@               Coverage Diff               @@
##           branch-0.18    #3092      +/-   ##
===============================================
+ Coverage        71.45%   71.56%   +0.10%     
===============================================
  Files              205      207       +2     
  Lines            16594    16702     +108     
===============================================
+ Hits             11858    11952      +94     
- Misses            4736     4750      +14     
Impacted Files                         Coverage Δ
python/cuml/multiclass/multiclass.py   84.21% <84.21%> (ø)
python/cuml/svm/svm_base.pyx           94.27% <91.30%> (-0.63%) ⬇️
python/cuml/svm/svc.pyx                95.16% <92.98%> (-0.62%) ⬇️
python/cuml/multiclass/__init__.py     100.00% <100.00%> (ø)

@tfeher
Contributor Author

tfeher commented Nov 26, 2020

Fixed failing tests:

  • The estimator arg has no default value, therefore the unfit_pickle and unfit_clone tests are set to xfail.
  • The docstring tests did not pass; I believe the parameter docs are not parsed correctly. For now I have skipped these tests. @mdemoret-nv could you have a look at why these fail?

@tfeher
Contributor Author

tfeher commented Nov 27, 2020

rerun tests

@tfeher
Contributor Author

tfeher commented Nov 27, 2020

rerun tests

@tfeher
Contributor Author

tfeher commented Nov 30, 2020

rerun tests

There seems to be an intermittent error with FilTests/PredictSparse16FilTest.Predict/17 on centos.

@tfeher tfeher added feature request New feature or request non-breaking Non-breaking change labels Dec 1, 2020
Contributor

@mdemoret-nv mdemoret-nv left a comment

Have a couple of suggestions regarding estimator usage, documentation, and decorators but overall it's nothing major.

I think before this gets merged you need to add the classes to the API doc so these new estimators will show up in our documentation.

python/cuml/multiclass/multiclass.py Outdated Show resolved Hide resolved
python/cuml/multiclass/multiclass.py Show resolved Hide resolved
python/cuml/multiclass/multiclass.py Outdated Show resolved Hide resolved
python/cuml/multiclass/multiclass.py Outdated Show resolved Hide resolved
python/cuml/multiclass/multiclass.py Outdated Show resolved Hide resolved
python/cuml/test/test_pickle.py Outdated Show resolved Hide resolved
python/cuml/test/test_svm.py Outdated Show resolved Hide resolved
python/cuml/multiclass/multiclass.py Show resolved Hide resolved
python/cuml/multiclass/multiclass.py Show resolved Hide resolved
python/cuml/multiclass/multiclass.py Show resolved Hide resolved
Contributor Author

@tfeher tfeher left a comment

Thanks @mdemoret-nv for the review. I appreciate your suggestions, they were very useful! I have addressed the issues.

I have added a link to the API doc (f36f2d5), probably @JohnZed should have a look if that is the right place.

python/cuml/multiclass/multiclass.py Outdated Show resolved Hide resolved
python/cuml/multiclass/multiclass.py Outdated Show resolved Hide resolved
python/cuml/multiclass/multiclass.py Outdated Show resolved Hide resolved
python/cuml/multiclass/multiclass.py Outdated Show resolved Hide resolved
python/cuml/multiclass/multiclass.py Show resolved Hide resolved
python/cuml/svm/svc.pyx Outdated Show resolved Hide resolved
python/cuml/svm/svc.pyx Outdated Show resolved Hide resolved
python/cuml/test/test_pickle.py Show resolved Hide resolved
python/cuml/test/test_pickle.py Outdated Show resolved Hide resolved
python/cuml/test/test_svm.py Outdated Show resolved Hide resolved
Contributor Author

@tfeher tfeher left a comment

Thanks again @mdemoret-nv for the comments, I have fixed the remaining issues.

python/cuml/multiclass/multiclass.py Outdated Show resolved Hide resolved
python/cuml/test/test_pickle.py Show resolved Hide resolved
Contributor

@mdemoret-nv mdemoret-nv left a comment

Updates look good. Changes LGTM

@tfeher tfeher changed the base branch from branch-0.17 to branch-0.18 December 4, 2020 09:38
@dantegd dantegd merged commit 639357e into rapidsai:branch-0.18 Jan 12, 2021
Labels
4 - Waiting on Reviewer Waiting for reviewer to review or respond feature request New feature or request non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEA] Multi-class classification in SVM
6 participants