[FEA] Ability to export cuML RF models and run prediction on machines without a GPU #3853

Closed
hcho3 opened this issue May 12, 2021 · 1 comment
Labels
feature request New feature or request

Comments

@hcho3
Contributor

hcho3 commented May 12, 2021

Is your feature request related to a problem? Please describe.
This is similar to #3556 and #3822, but for random forests specifically.

Describe the solution you'd like
When the latest Treelite (1.3.0) is brought into cuML, we will be able to convert cuML RF to a Treelite object and then serialize it as a checkpoint. Treelite now offers a binary checkpoint format so that tree models can be exchanged between different machines.

Running prediction on machines without GPUs will proceed as follows:

  1. Export the cuML RF model as a Treelite checkpoint, checkpoint.tl.
  2. Copy checkpoint.tl to the target machine (which has no GPU).
  3. Install the Treelite package on the target machine.
  4. Load checkpoint.tl on the target machine by calling treelite.Model.deserialize(...).
  5. Make predictions by calling treelite.gtil.predict(...).

Compatibility considerations.

A checkpoint is specific to the version of Treelite that produced it. Each version of cuML is pinned to a specific version of Treelite, and the target machine must have exactly that version of Treelite installed.

For example, the upcoming version of cuML will have Treelite 1.3.0, so the target machine must also have Treelite 1.3.0 installed.
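
One way to guard against a version mismatch is to check the installed Treelite version on the target machine before loading a checkpoint. The following is only a minimal sketch: the helper names and the `EXPECTED_TREELITE_VERSION` constant are hypothetical, and the value `"1.3.0"` is taken from the example above rather than read from the checkpoint itself.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: the Treelite version pinned by the cuML release that wrote the checkpoint.
EXPECTED_TREELITE_VERSION = "1.3.0"


def installed_treelite_version():
    """Return the installed Treelite version string, or None if Treelite is not installed."""
    try:
        return version("treelite")
    except PackageNotFoundError:
        return None


def treelite_version_matches(installed, expected=EXPECTED_TREELITE_VERSION):
    """Return True only when the installed version equals the expected one exactly."""
    return installed is not None and installed == expected


if __name__ == "__main__":
    found = installed_treelite_version()
    if treelite_version_matches(found):
        print(f"Treelite {found} matches; the checkpoint can be loaded.")
    else:
        print(f"Expected Treelite {EXPECTED_TREELITE_VERSION}, found {found}.")
```

An exact string comparison is used here because the compatibility note requires the exact version, not merely a compatible one.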

@hcho3 hcho3 added feature request New feature or request ? - Needs Triage Need team to review and classify labels May 12, 2021
@hcho3 hcho3 removed the ? - Needs Triage Need team to review and classify label May 13, 2021
rapids-bot bot pushed a commit that referenced this issue May 14, 2021
Upgrade to Treelite 1.3.0 to take advantage of the following new features:

* Faster model import for scikit-learn tree models (dmlc/treelite#264). Fixes #3768
* Binary serializer to a file stream (dmlc/treelite#270, dmlc/treelite#273)
* [EXPERIMENTAL] Add GTIL, reference inference backend (dmlc/treelite#274)

Make progress towards #3853

Depends on rapidsai/integration#270

Authors:
  - Philip Hyunsu Cho (https://github.com/hcho3)

Approvers:
  - William Hicks (https://github.com/wphicks)
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #3855
@hcho3
Contributor Author

hcho3 commented May 18, 2021

Closing, since it's now possible to export cuML RF models as a checkpoint file that can later be loaded on a machine without a GPU.

import numpy as np
from cuml.ensemble import RandomForestClassifier as cumlRandomForestClassifier
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
X, y = X.astype(np.float32), y.astype(np.int32)
clf = cumlRandomForestClassifier(max_depth=3, random_state=0, n_estimators=10)
clf.fit(X, y)

checkpoint_path = './checkpoint.tl'
# Export cuML RF model as Treelite checkpoint
clf.convert_to_treelite_model().to_treelite_checkpoint(checkpoint_path)

Later, on a machine without a GPU:

import treelite

# The checkpoint file has been copied over
checkpoint_path = './checkpoint.tl'
tl_model = treelite.Model.deserialize(checkpoint_path)
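# X is the feature matrix to score; it must be available on this machine as a
# NumPy array with the same columns (and, as in training above, float32 dtype)
out_prob = treelite.gtil.predict(tl_model, X)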

Note that only Treelite needs to be installed on the target machine; cuML is not required.

@hcho3 hcho3 closed this as completed May 18, 2021
rapids-bot bot pushed a commit that referenced this issue May 28, 2021
vimarsh6739 pushed a commit to vimarsh6739/cuml that referenced this issue Oct 9, 2023