Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FIL to import categorical models from treelite #4173

Merged
merged 157 commits into from
Sep 29, 2021

Conversation

levsnv
Copy link
Contributor

@levsnv levsnv commented Aug 23, 2021

  • importing forests with categorical nodes from Treelite
  • python tests for lightgbm forests fit on categorical features and containing categorical nodes as well as numerical nodes in the same forest
  • above tests include multiclass (GBDT) tests
  • sklearn internally transforms categorical features into numerical: RandomForestClassifier GradientBoostingClassifier
  • (cuml RF tests can come separately, if needed)
  • (xgboost tests can come separately, if needed)

levsnv added 30 commits June 1, 2021 14:24
This reverts commit 472575d.
cpu inference supports categorical nodes
data generation supports categorical features
moved is_categoricals_h generation from GPU to CPU (value flow more obvious)
fitted parameters to conventions
neater memory management
@levsnv levsnv added 3 - Ready for Review Ready for review by team and removed 4 - Waiting on Author Waiting for author to respond to review labels Sep 24, 2021
@levsnv levsnv requested a review from canonizer September 24, 2021 06:38
Copy link
Contributor

@canonizer canonizer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved, provided that the comments are addressed.

cpp/src/fil/fil.cu Outdated Show resolved Hide resolved
cpp/src/fil/fil.cu Outdated Show resolved Hide resolved
cpp/src/fil/fil.cu Outdated Show resolved Hide resolved
cpp/src/fil/fil.cu Outdated Show resolved Hide resolved
cpp/src/fil/fil.cu Outdated Show resolved Hide resolved
cpp/src/fil/internal.cuh Outdated Show resolved Hide resolved
cpp/src/fil/internal.cuh Show resolved Hide resolved
cpp/src/fil/internal.cuh Show resolved Hide resolved
cpp/test/sg/fil_test.cu Outdated Show resolved Hide resolved
cpp/test/sg/fil_test.cu Show resolved Hide resolved
@codecov-commenter
Copy link

Codecov Report

❗ No coverage uploaded for pull request base (branch-21.10@d0aaafc). Click here to learn what that means.
The diff coverage is n/a.

Impacted file tree graph

@@               Coverage Diff               @@
##             branch-21.10    #4173   +/-   ##
===============================================
  Coverage                ?   86.07%           
===============================================
  Files                   ?      231           
  Lines                   ?    18652           
  Branches                ?        0           
===============================================
  Hits                    ?    16055           
  Misses                  ?     2597           
  Partials                ?        0           
Flag Coverage Δ
dask 47.03% <0.00%> (?)
non-dask 78.74% <0.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.


Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d0aaafc...fb67202. Read the comment docs.

@levsnv levsnv requested a review from wphicks September 28, 2021 00:47
Copy link
Contributor

@wphicks wphicks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Great work!

@teju85
Copy link
Member

teju85 commented Sep 28, 2021

rerun tests

1 similar comment
@levsnv
Copy link
Contributor Author

levsnv commented Sep 28, 2021

rerun tests

@levsnv
Copy link
Contributor Author

levsnv commented Sep 28, 2021

rerun tests

conda_package_handling.exceptions.InvalidArchiveError: Error with archive /opt/conda/envs/rapids/pkgs/cudatoolkit-11.2.72-h2bc3f7f_0.tar.bz2.  You probably need to delete and re-download or re-create this file.  Message from libarchive 
("14:30:32 EOFError: Compressed file ended before the end-of-stream marker was reached"

@dantegd
Copy link
Member

dantegd commented Sep 29, 2021

@gpucibot merge

@dantegd
Copy link
Member

dantegd commented Sep 29, 2021

rerun tests

@dantegd
Copy link
Member

dantegd commented Sep 29, 2021

@gpucibot merge

1 similar comment
@dantegd
Copy link
Member

dantegd commented Sep 29, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 003f6fc into rapidsai:branch-21.10 Sep 29, 2021
vimarsh6739 pushed a commit to vimarsh6739/cuml that referenced this pull request Oct 9, 2023
* importing forests with categorical nodes from Treelite
* python tests for lightgbm forests fit on categorical features and containing categorical nodes as well as numerical nodes in the same forest
* above tests include multiclass (GBDT) tests
* sklearn internally transforms categorical features into numerical: [RandomForestClassifier](https://scikit-learn.org/0.24/modules/generated/sklearn.ensemble.RandomForestClassifier.html#sklearn.ensemble.RandomForestClassifier.fit) [GradientBoostingClassifier](https://scikit-learn.org/0.24/modules/generated/sklearn.ensemble.GradientBoostingClassifier.html#sklearn.ensemble.GradientBoostingClassifier.fit)
* (cuml RF tests can come separately, if needed)
* (xgboost tests can come separately, if needed)

Authors:
  - Levs Dolgovs (https://github.com/levsnv)
  - Dante Gama Dessavre (https://github.com/dantegd)

Approvers:
  - https://github.com/Salonijain27
  - Andy Adinets (https://github.com/canonizer)
  - William Hicks (https://github.com/wphicks)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: rapidsai#4173
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team CUDA/C++ Cython / Python Cython or Python issue feature request New feature or request non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants