sync with upstream #32

Merged
merged 158 commits into from
Dec 16, 2021

daxiongshu
Owner

No description provided.

trxcllnt and others added 30 commits August 25, 2021 14:15
Removes `-g` from the compile commands generated by distutils to compile Cython files.

This will make our container images, conda packages, and python wheels smaller.
Closes #4054

Authors:
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #4179
Summary of the changes:
- Remove some unused print functions
- Move validity checks into parameter construction, so parameters are checked by default
- Remove Node_ID_info struct, we can just use a std::pair
- Move builder_base.cuh into builder.cuh
- Remove node.cuh. Use InstanceRange to store this information.
- Builder.train() directly returns a DT::TreeMetaDataNode<DataT, LabelT> object
- computeQuantiles is made into a pure function. Some weird usages of smart pointers removed.
- Unused DataInfo struct removed
- DecisionTree class member variables removed, member functions made into pure functions (static)
- Some unnecessary RandomForest member variables removed, destructor removed
- Some instances of new/delete changed to use std containers
- Tests for instance counts moved from python to gtest
- Change indexing type from 32-bit integers to std::size_t
- Test FIL predictions against RF predictions; this fixes a case where ties in multi-class prediction are broken inconsistently in RF's CPU predictor
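
The tie-breaking item in the list above can be illustrated with a small sketch. It assumes a "lowest class index wins" rule purely for illustration; the rule actually adopted by the RF and FIL predictors is not stated here, and `predict_class` is a hypothetical helper.

```python
def predict_class(votes):
    """Return the index of the class with the most votes.

    Ties are resolved in favour of the lowest class index (an assumed rule),
    so two predictors using this function always agree on tied inputs.
    """
    best = 0
    for index, count in enumerate(votes):
        # Strict '>' keeps the earliest index when counts are equal.
        if count > votes[best]:
            best = index
    return best
```

Whatever rule is chosen, the point is that both predictors must apply the same one, otherwise a prediction-equality test fails on tied votes.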

Authors:
  - Rory Mitchell (https://github.com/RAMitchell)

Approvers:
  - Venkat (https://github.com/venkywonka)
  - Vinay Deshpande (https://github.com/vinaydes)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #4166
Change the error type when trying to predict before fitting SVM to match sklearn.

Fixes #4192
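
As a rough illustration of the behaviour being matched: scikit-learn raises `NotFittedError`, which derives from both `ValueError` and `AttributeError`, when `predict` is called before `fit`. The sketch below uses a stand-in exception class and a hypothetical `ToySVC`; it is not cuML's implementation.

```python
class NotFittedError(ValueError, AttributeError):
    """Stand-in with the same base classes as sklearn.exceptions.NotFittedError."""


class ToySVC:
    """Hypothetical estimator; only the unfitted-predict error path is modelled."""

    def __init__(self):
        self._coef = None  # populated by fit()

    def fit(self, X, y):
        # Placeholder "training": just record one weight per feature.
        self._coef = [0.0] * len(X[0])
        return self

    def predict(self, X):
        if self._coef is None:
            raise NotFittedError(
                "This ToySVC instance is not fitted yet. "
                "Call 'fit' before using this estimator."
            )
        # Placeholder prediction: every sample gets class 0.
        return [0 for _ in X]
```

Because the exception inherits from both base classes, callers that catch either `ValueError` or `AttributeError` keep working.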

Authors:
  - Artem M. Chirkin (https://github.com/achirkin)

Approvers:
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #4198
This is a continuation of PR #1763, #4053, and #4079, to add Categorical Naive Bayes.
This is supposed to be merged after #4079.
Linking issue #1666.

Authors:
  - Micka (https://github.com/lowener)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #4150
…ghbors Estimator (#4178)

This pull request partially solves [[FEA] #3461](#3461).

This quick fix enables cuML's NearestNeighbors estimator to gracefully accept sklearn's `n_jobs` parameter as a pass-through.

The purpose of this quick fix is to allow Imbalanced-Learn samplers to rely on cuML's NearestNeighbors estimator without producing an error when they set the estimator's `n_jobs` parameter via `.set_params(**{"n_jobs": self.n_jobs})` [1](https://github.com/scikit-learn-contrib/imbalanced-learn/blob/edf6eae2c00f7fa6d76ee381f5b625155061a725/imblearn/over_sampling/_adasyn.py#L112)
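
A minimal sketch of what a pass-through parameter means in practice, using a hypothetical `ToyNearestNeighbors` class rather than cuML's estimator: `n_jobs` is accepted and stored so that `set_params` succeeds, but nothing ever reads it.

```python
class ToyNearestNeighbors:
    """Hypothetical estimator illustrating an accepted-but-ignored parameter."""

    def __init__(self, n_neighbors=5, n_jobs=None):
        self.n_neighbors = n_neighbors
        # Stored only for API compatibility; never used by the estimator.
        self.n_jobs = n_jobs

    def get_params(self):
        return {"n_neighbors": self.n_neighbors, "n_jobs": self.n_jobs}

    def set_params(self, **params):
        valid = self.get_params()
        for name, value in params.items():
            if name not in valid:
                raise ValueError(f"Invalid parameter {name!r}")
            setattr(self, name, value)
        return self


# The call pattern quoted above now succeeds instead of raising.
nn = ToyNearestNeighbors().set_params(**{"n_jobs": 4})
```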

Authors:
  - https://github.com/NV-jpt

Approvers:
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #4178
Fixes the old build instructions for `cuml`

cc @dantegd

Authors:
  - https://github.com/shaneding

Approvers:
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #4200
…o distance metrics (#4155)

-- This PR depends on RAFT PR - rapidsai/raft#306
-- Adds cpp & python interfaces for these distance metrics with pytest support for each of them.
-- also remove redundant commented code in canberra distance metric

Authors:
  - Mahesh Doijade (https://github.com/mdoijade)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #4155
Simplify the type check

Authors:
  - Nanthini (https://github.com/Nanthini10)

Approvers:
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #4190
…rparts (#4130)

This looks to me like a typo, and may be problematic and confusing if the `n_rows` and `n_cols` members of the base class are accessed instead of those of the derived class.

Signed-off-by: Yitao Li <[email protected]>

Authors:
  - Yitao Li (https://github.com/yitao-li)

Approvers:
  - Micka (https://github.com/lowener)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #4130
This saves consumers from having to add Thrust explicitly when consuming cuML in CMake.

cc @shaneding

Authors:
  - Dante Gama Dessavre (https://github.com/dantegd)

Approvers:
  - Robert Maynard (https://github.com/robertmaynard)

URL: #4209
This doesn't include treelite import (export). That will come in #4041

Authors:
  - https://github.com/levsnv

Approvers:
  - Andy Adinets (https://github.com/canonizer)
  - Robert Maynard (https://github.com/robertmaynard)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #4092
Fixes #3764,#2518

To do:
- post charts confirming the improvement in accuracy
- address python tests
- benchmark

Authors:
  - Rory Mitchell (https://github.com/RAMitchell)

Approvers:
  - Vinay Deshpande (https://github.com/vinaydes)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #4191
This PR ⬇️ 
* fixes #4193 and fixes #4194, which relate to API incompatibility with dask-ml's GridSearchCV
* changes the behaviour of cuML RF in the following cases:
    * In the not-so-uncommon case when `n_bins` > number of rows in the training sample, instead of throwing an error and exiting, the estimator prints a warning and sets `n_bins` to the number of training samples.
    * When `.predict()` is called with `float64` data, instead of throwing an error asking the user to explicitly specify `predict_model="CPU"` and rerun, a warning is displayed and prediction implicitly falls back from the default GPU-based prediction to CPU-based prediction.
* adds corresponding tests to capture the warnings above
* the estimators now accept both numbers and strings for the `split_criterion` parameter, in parity with sklearn's API, which takes strings as the criterion
* the `split_algo` and `use_experimental_backend` parameters of the estimator class have been completely removed from both documentation and warnings after deprecation in previous releases (from both single-GPU and Dask RF)
* the `num_classes` parameter of the predict and score methods has similarly been removed
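
The `n_bins` and `split_criterion` behaviour in the list above could be sketched as follows. This is an illustration only: `resolve_n_bins`, `resolve_split_criterion`, and the numeric codes in `_CRITERION_CODES` are hypothetical and are not taken from cuML.

```python
import warnings

# Placeholder name-to-code table for illustration; not cuML's actual codes.
_CRITERION_CODES = {"gini": 0, "entropy": 1, "mse": 2}


def resolve_n_bins(n_bins, n_rows):
    """Clamp n_bins to the number of training rows, warning instead of raising."""
    if n_bins > n_rows:
        warnings.warn(
            f"n_bins={n_bins} exceeds the number of training rows ({n_rows}); "
            f"using n_bins={n_rows} instead.",
            UserWarning,
        )
        return n_rows
    return n_bins


def resolve_split_criterion(criterion):
    """Accept either a string name or an integer code for the criterion."""
    if isinstance(criterion, str):
        try:
            return _CRITERION_CODES[criterion.lower()]
        except KeyError:
            raise ValueError(f"Unknown split_criterion {criterion!r}") from None
    if criterion in _CRITERION_CODES.values():
        return int(criterion)
    raise ValueError(f"Unknown split_criterion {criterion!r}")
```

Warning and continuing, rather than raising, is what lets a parameter search proceed when some candidate values of `n_bins` are too large for the training sample.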

Authors:
  - Venkat (https://github.com/venkywonka)

Approvers:
  - Dante Gama Dessavre (https://github.com/dantegd)
  - Rory Mitchell (https://github.com/RAMitchell)

URL: #4207
When we make a new cuML version, we also need to bump the rapids-cmake version at the
same time. Otherwise we will get the previous release's dependencies by mistake.

Authors:
  - Robert Maynard (https://github.com/robertmaynard)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #4213
This PR adds support for missing observations and padding at the start for variable-length batches. Example:

![missing_obs_0](https://user-images.githubusercontent.com/17441062/125832072-1ff903c9-088e-4d77-9b17-be365890d982.png)

Note: I had to change the ARIMA tests because I compute the initial parameter estimates with a different method than statsmodels (which is used as a reference in tests). statsmodels cuts all missing observations for its initial least-squares estimation, whereas I fill them with naive replacements, which keeps the temporal relationships in the data and gives a much better initial estimate and often a better fit in the end, according to some MASE measurements I made. So I updated the integration test to use the MASE and pass if we are approximately the same _or better_ than statsmodels.
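
For readers unfamiliar with the metric, here is a minimal sketch of MASE (mean absolute scaled error) and of a "same or better than the reference" check. It uses the common definition that scales the forecast error by the in-sample error of a one-step naive forecast; the `tolerance` value and both function names are assumptions, not what the updated test uses.

```python
def mase(y_train, y_true, y_pred):
    """Mean absolute scaled error: forecast MAE divided by the in-sample
    MAE of the one-step naive forecast (previous value predicts the next)."""
    naive_errors = [abs(b - a) for a, b in zip(y_train, y_train[1:])]
    scale = sum(naive_errors) / len(naive_errors)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
    return mae / scale


def same_or_better(mase_model, mase_reference, tolerance=0.05):
    """Lower MASE is better; pass if the model is no worse than the
    reference by more than the (assumed) relative tolerance."""
    return mase_model <= mase_reference * (1 + tolerance)
```

A one-sided check like this is what allows the test to pass when the new initial estimate yields a better fit than the reference.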

Authors:
  - Louis Sugy (https://github.com/Nyrio)

Approvers:
  - Robert Maynard (https://github.com/robertmaynard)
  - Tamas Bela Feher (https://github.com/tfeher)
  - Dante Gama Dessavre (https://github.com/dantegd)
  - Ray Douglass (https://github.com/raydouglass)

URL: #4058
Forward-merge `branch-21.08` into `branch-21.10`
[gpuCI] Forward-merge branch-21.10 to branch-21.12 [skip gpuci]
* Adds the Poisson impurity criterion to RF, in parity with scikit-learn's RF regressor [[here](https://scikit-learn.org/stable/modules/tree.html#regression-criteria)]
EDIT:
* Also adds C++ level testing for RF Objective function gains of Poisson and Gini.
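
As a reference point for the criterion, the scikit-learn documentation linked above defines the Poisson impurity of a node as the half Poisson deviance, (1/n) Σ (y log(y/ȳ) − y + ȳ). The sketch below evaluates that formula and the resulting split gain in plain Python; it illustrates the formula only and is not the cuML objective-function code. The function names are hypothetical.

```python
import math


def half_poisson_deviance(y):
    """Half Poisson deviance of a node: mean of y*log(y/mean) - y + mean.

    The y*log(y/mean) term is taken as 0 when y == 0 (its limit).
    """
    n = len(y)
    mean = sum(y) / n
    if mean <= 0:
        raise ValueError("Poisson criterion needs a positive mean target")
    total = 0.0
    for value in y:
        if value < 0:
            raise ValueError("Poisson criterion needs non-negative targets")
        log_term = value * math.log(value / mean) if value > 0 else 0.0
        total += log_term - value + mean
    return total / n


def split_gain(parent, left, right):
    """Impurity decrease: parent impurity minus the size-weighted child impurities."""
    n = len(parent)
    return (
        half_poisson_deviance(parent)
        - len(left) / n * half_poisson_deviance(left)
        - len(right) / n * half_poisson_deviance(right)
    )
```

A node whose targets are all equal has zero impurity, so a split that separates `[1, 1]` from `[5, 5]` recovers the full parent impurity as gain.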

Authors:
  - Venkat (https://github.com/venkywonka)

Approvers:
  - Rory Mitchell (https://github.com/RAMitchell)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #4156
[gpuCI] Forward-merge branch-21.10 to branch-21.12 [skip gpuci]
The 2.1.0 version of Treelite incorporates the following major improvements:

* dmlc/treelite#311
* dmlc/treelite#302
* dmlc/treelite#303
* dmlc/treelite#296

In particular, dmlc/treelite#311 is a critical follow-up to #4191 and addresses a performance regression.

Requires rapidsai/integration#353

Authors:
  - Philip Hyunsu Cho (https://github.com/hcho3)

Approvers:
  - Jordan Jacobelli (https://github.com/Ethyling)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #4220
[gpuCI] Forward-merge branch-21.10 to branch-21.12 [skip gpuci]
Benchmarks show that RF performs consistently better with pinned host memory, while DBSCAN is sometimes better and sometimes not (within the margin of error), so this PR uses pinned host memory by default for both algorithms.

KMeans and LARS are ignored for now: both show slightly better performance with pinned host memory, but only as the number of columns increases. Since this needs more analysis and a decision on whether a heuristic is needed for selecting the memory type, it is deferred to 21.12.

Here are the raw numbers:
1. LARS
Normal memory:
```{'lars': {(100000, 10): 0.12429666519165039, (100000, 100): 0.015396833419799805, (100000, 250): 0.015408039093017578, (250000, 10): 0.00986933708190918, (250000, 100): 0.023822546005249023, (250000, 250): 0.03715157508850098, (500000, 10): 0.013423442840576172, (500000, 100): 0.044762372970581055, (500000, 250): 0.07782578468322754}```
Pinned memory:
```{'lars': {(100000, 10): 0.12958097457885742, (100000, 100): 0.01501011848449707, (100000, 250): 0.016597509384155273, (250000, 10): 0.01801013946533203, (250000, 100): 0.022644996643066406, (250000, 250): 0.037090301513671875, (500000, 10): 0.020437955856323242, (500000, 100): 0.044635772705078125, (500000, 250): 0.07696056365966797}```
2. RFR
Normal memory:
```'rfr': {(100000, 10): 1.1951744556427002, (100000, 100): 5.099738359451294, (100000, 250): 11.32804536819458, (250000, 10): 2.0097765922546387, (250000, 100): 9.109776496887207, (250000, 250): 21.058837890625, (500000, 10): 3.3387184143066406, (500000, 100): 15.802990436553955, (500000, 250): 36.80855870246887}```
Pinned memory:
```'rfr': {(100000, 10): 1.1727137565612793, (100000, 100): 4.804195880889893, (100000, 250): 11.621357917785645, (250000, 10): 1.8899295330047607, (250000, 100): 9.16961407661438, (250000, 250): 21.12194561958313, (500000, 10): 3.2937560081481934, (500000, 100): 15.66197681427002, (500000, 250): 36.6080117225647}```
3. KMeans
Normal memory:
```{(100000, 10): 0.11008882522583008, (100000, 100): 0.15475797653198242, (100000, 250): 0.15683507919311523, (250000, 10): 0.18775177001953125, (250000, 100): 0.25696277618408203, (250000, 250): 0.40389132499694824, (500000, 10): 0.4578282833099365, (500000, 100): 0.3917391300201416, (500000, 250): 0.6426849365234375}```
Pinned memory:
```'kmeans': {(100000, 10): 0.11982870101928711, (100000, 100): 0.16992664337158203, (100000, 250): 0.1021108627319336, (250000, 10): 0.16021251678466797, (250000, 100): 0.31025242805480957, (250000, 250): 0.298201322555542, (500000, 10): 0.21084189414978027, (500000, 100): 0.50473952293396, (500000, 250): 0.6191830635070801}```
4. DBSCAN
Normal memory:
```'dbscan': {(100000, 10): 0.4957292079925537, (100000, 100): 0.8680248260498047, (100000, 250): 1.585218906402588, (250000, 10): 4.52524995803833, (250000, 100): 7.175846099853516, (250000, 250): 12.135416269302368, (500000, 10): 26.427770853042603, (500000, 100): 37.57275915145874, (500000, 250): 57.98261737823486}}```
Pinned memory:
```'dbscan': {(100000, 10): 0.49578166007995605, (100000, 100): 0.8678708076477051, (100000, 250): 1.5854766368865967, (250000, 10): 4.526952505111694, (250000, 100): 7.172863006591797, (250000, 250): 12.145166397094727, (500000, 10): 26.422622680664062, (500000, 100): 37.56665277481079, (500000, 250): 58.02563738822937}}```
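
To make the raw numbers above easier to compare, the per-configuration ratio of pinned to normal timings can be computed directly. A minimal sketch using only the RFR figures copied from above; the dictionary keys are assumed to be (rows, columns) pairs, and a ratio below 1 means the pinned-memory run was faster.

```python
# RFR timings copied from the raw numbers; keys assumed to be (rows, columns).
normal = {
    (100000, 10): 1.1951744556427002,
    (100000, 100): 5.099738359451294,
    (100000, 250): 11.32804536819458,
    (250000, 10): 2.0097765922546387,
    (250000, 100): 9.109776496887207,
    (250000, 250): 21.058837890625,
    (500000, 10): 3.3387184143066406,
    (500000, 100): 15.802990436553955,
    (500000, 250): 36.80855870246887,
}
pinned = {
    (100000, 10): 1.1727137565612793,
    (100000, 100): 4.804195880889893,
    (100000, 250): 11.621357917785645,
    (250000, 10): 1.8899295330047607,
    (250000, 100): 9.16961407661438,
    (250000, 250): 21.12194561958313,
    (500000, 10): 3.2937560081481934,
    (500000, 100): 15.66197681427002,
    (500000, 250): 36.6080117225647,
}

# Ratio < 1 means the pinned-memory run was faster for that shape.
ratios = {shape: pinned[shape] / normal[shape] for shape in normal}

for shape, ratio in sorted(ratios.items()):
    print(f"{shape}: pinned/normal = {ratio:.3f}")
```

The same comparison applies to the other three algorithms by swapping in their dictionaries.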

Authors:
  - Divye Gala (https://github.com/divyegala)

Approvers:
  - Rory Mitchell (https://github.com/RAMitchell)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #4215
[gpuCI] Forward-merge branch-21.10 to branch-21.12 [skip gpuci]
[gpuCI] Forward-merge branch-21.10 to branch-21.12 [skip gpuci]
GPUtester and others added 24 commits November 19, 2021 17:38
[gpuCI] Forward-merge branch-21.12 to branch-22.02 [skip gpuci]
Changes to be in-line with: rapidsai/cudf#9734

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Dante Gama Dessavre (https://github.com/dantegd)
  - AJ Schmidt (https://github.com/ajschmidt8)

URL: #4390
[gpuCI] Forward-merge branch-21.12 to branch-22.02 [skip gpuci]
[gpuCI] Forward-merge branch-21.12 to branch-22.02 [skip gpuci]
[gpuCI] Forward-merge branch-21.12 to branch-22.02 [skip gpuci]
This PR uses Project Flash to build the cuML Python package, mirroring what the C++ flow looks like.

Note: This is currently changed only for the CUDA 11.0 GPU test, since that one uses Python 3.7. To cover the other jobs, we need to build the Python package twice on the CPU job.
[gpuCI] Forward-merge branch-21.12 to branch-22.02 [skip gpuci]
Authors:
  - Peter Andreas Entschev (https://github.com/pentschev)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #4396
Suggest using LinearSVM when the user chooses to use the linear kernel in SVM. The reason is that LinearSVM uses a specialized faster solver.

Closes #1664
Also partially addresses #2857

Authors:
  - Artem M. Chirkin (https://github.com/achirkin)

Approvers:
  - Tamas Bela Feher (https://github.com/tfeher)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #4382
There were actually 2 minor issues that prevented `UMAPAlgo::Optimize::find_params_ab()` from being ASAN-clean at the moment:

- One is the mem leaks, of course
- Another one is the `malloc()`-`delete` mismatch -- only memory allocated using `new` or equivalent should be freed with operator `delete` or `delete[]`

Another issue that was also addressed here: exception safety (i.e., by using `make_unique` from C++14)

Signed-off-by: Yitao Li <[email protected]>

Authors:
  - Yitao Li (https://github.com/yitao-li)

Approvers:
  - Zach Bjornson (https://github.com/zbjornson)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #4405
P_sum is equal to n. See #2622 where I made this change once before. #4208 changed it back while consolidating code.

Authors:
  - Zach Bjornson (https://github.com/zbjornson)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #4425
This PR separates the Decision tree kernels into separate Translation Units (TUs) and explicitly instantiates templates.
This is helpful in 2 ways:
1. Refactoring top-level RF/DT code no longer requires recompiling the kernels
2. Since the kernels are split across different TUs and linked, they can leverage build parallelism (4x improvement in rebuild times after touching kernel definitions)

Rebuild-time comparison after touching RF kernels, running `time ./build.sh libcuml -v -n PARALLEL_LEVEL=20`:
(Note: using `--ccache` doesn't matter here, assuming after touching RF kernels the state of the code-base is completely new and not part of ccache's hashed index)
<details><summary>This PR</summary>
  
  ```
  real    0m20.054s             
  user    2m28.436s                                               
  sys     0m14.241s 
  ```
</details>
<details><summary>branch-21.12</summary>
  
  ```
real    1m21.197s                                                                                                                    
user    2m5.751s                                                                                                                     
sys     0m6.050s
  ```
</details>

Some other changes include renaming and reorganizing files, pruning headers and cleaning up some code

Things to do: 
- [x] split DT Kernels
- [x] benchmark for regressions

Authors:
  - Venkat (https://github.com/venkywonka)

Approvers:
  - Rory Mitchell (https://github.com/RAMitchell)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #4299
Answers #4203
Just set in stone the warning filter for "Numerical issues".

Authors:
  - Victor Lafargue (https://github.com/viclafargue)

Approvers:
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #4408
* FIX Remove hard sklearn imports

* FIX Missing whitespace

* FIX minor error

* FIX PEP8 fixes
[gpuCI] Forward-merge branch-21.12 to branch-22.02 [skip gpuci]
Update `ucx-py` version on release using `rvc`

Authors:
  - Jordan Jacobelli (https://github.com/Ethyling)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)

URL: #4411
This PR updates the pinnings of the conda environment for CUDA 11.5 to use 22.02 packages. This resolves conflicts between a5e7cfb and #4364.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Dante Gama Dessavre (https://github.com/dantegd)
  - AJ Schmidt (https://github.com/ajschmidt8)

URL: #4450