cuML 0.10.0 (Date TBD)

New Features

PR #1071: Selective eigen solver of cuSolver
PR #1073: Updating RF wrappers to use FIL for GPU accelerated prediction
PR #1113: prims: new batched make-symmetric-matrix primitive
PR #1112: prims: new batched-gemv primitive
PR #855: Added benchmark tools

Improvements

PR #961: High Peformance RF; HIST algo
PR #1028: Dockerfile updates after dir restructure. Conda env yaml to add statsmodels as a dependency
PR #763: Add examples to train_test_split documentation
PR #1093: Unified inference kernels for different FIL algorithms
PR #1076: Paying off some UMAP / Spectral tech debt.
PR #1086: Ensure RegressorMixin scorer uses device arrays
PR #1114: K-means: Exposing useful params, removing unused params, proxying params in Dask

Bug Fixes

PR #1016: Use correct libcumlprims version in GPU CI
PR #1040: Update version of numba in development conda yaml files
PR #1043: Updates to accomodate cuDF python code reorganization
PR #1044: Remove nvidia driver installation from ci/cpu/build.sh
PR #991: Barnes Hut TSNE Memory Issue Fixes
PR #1075: Pinning Dask version for consistent CI results
PR #990: Barnes Hut TSNE Memory Issue Fixes
PR #1066: Using proper set of workers to destroy nccl comms
PR #1072: Remove pip requirements and setup
PR #1074: Fix flake8 CI style check
PR #1087: Accuracy improvement for sqrt/log in RF max_feature
PR #1088: Change straggling numba python allocations to use RMM
PR #1106: Pinning Distributed version to match Dask for consistent CI results

cuML 0.9.0 (21 Aug 2019)

New Features

PR #894: Convert RF to treelite format
PR #826: Jones transformation of params for ARIMA models timeSeries ml-prim
PR #697: Silhouette Score metric ml-prim
PR #674: KL Divergence metric ml-prim
PR #787: homogeneity, completeness and v-measure metrics ml-prim
PR #711: Mutual Information metric ml-prim
PR #724: Entropy metric ml-prim
PR #766: Expose score method based on inertia for KMeans
PR #823: prims: cluster dispersion metric
PR #816: Added inverse_transform() for LabelEncoder
PR #789: prims: sampling without replacement
PR #813: prims: Col major istance prim
PR #635: Random Forest & Decision Tree Regression (Single-GPU)
PR #819: Forest Inferencing Library (FIL)
PR #829: C++: enable nvtx ranges
PR #835: Holt-Winters algorithm
PR #837: treelite for decision forest exchange format
PR #871: Wrapper for FIL
PR #870: make_blobs python function
PR #881: wrappers for accuracy_score and adjusted_rand_score functions
PR #840: Dask RF classification and regression
PR #870: make_blobs python function
PR #879: import of treelite models to FIL
PR #892: General Gram matrices prim
PR #883: Adding MNMG Kmeans
PR #930: Dask RF
PR #882: TSNE - T-Distributed Stochastic Neighbourhood Embedding
PR #624: Internals API & Graph Based Dimensionality Reductions Callback
PR #926: Wrapper for FIL
PR #960: Enable using libcumlprims for MG algorithms/prims

Improvements

PR #822: build: build.sh update to club all make targets together
PR #807: Added development conda yml files
PR #840: Require cmake >= 3.14
PR #832: Stateless Decision Tree and Random Forest API
PR #857: Small modifications to comms for utilizing IB w/ Dask
PR #851: Random forest Stateless API wrappers
PR #865: High Performance RF
PR #895: Pretty prints arguments!
PR #920: Add an empty marker kernel for tracing purposes
PR #915: syncStream added to cumlCommunicator
PR #922: Random Forest support in FIL
PR #911: Update headers to credit CannyLabs BH TSNE implementation
PR #918: Streamline CUDA_REL environment variable
PR #924: kmeans: updated APIs to be stateless, refactored code for mnmg support
PR #950: global_bias support in FIL
PR #773: Significant improvements to input checking of all classes and common input API for Python
PR #957: Adding docs to RF & KMeans MNMG. Small fixes for release
PR #965: Making dask-ml a hard dependency
PR #976: Update api.rst for new 0.9 classes
PR #973: Use cudaDeviceGetAttribute instead of relying on cudaDeviceProp object being passed
PR #978: Update README for 0.9
PR #1009: Fix references to notebooks-contrib

Bug Fixes

PR #923: Fix misshapen level/trend/season HoltWinters output
PR #831: Update conda package dependencies to cudf 0.9
PR #772: Add missing cython headers to SGD and CD
PR #849: PCA no attribute trans_input_ transform bug fix
PR #869: Removing incorrect information from KNN Docs
PR #885: libclang installation fix for GPUCI
PR #896: Fix typo in comms build instructions
PR #921: Fix build scripts using incorrect cudf version
PR #928: TSNE Stability Adjustments
PR #934: Cache cudaDeviceProp in cumlHandle for perf reasons
PR #932: Change default param value for RF classifier
PR #949: Fix dtype conversion tests for unsupported cudf dtypes
PR #908: Fix local build generated file ownerships
PR #983: Change RF max_depth default to 16
PR #987: Change default values for knn
PR #988: Switch to exact tsne
PR #991: Cleanup python code in cuml.dask.cluster
PR #996: ucx_initialized being properly set in CommsContext
PR #1007: Throws a well defined error when mutigpu is not enabled
PR #1018: Hint location of nccl in build.sh for CI
PR #1022: Using random_state to make K-Means MNMG tests deterministic
PR #1034: Fix typos and formatting issues in RF docs

cuML 0.8.0 (27 June 2019)

New Features

PR #652: Adjusted Rand Index metric ml-prim
PR #679: Class label manipulation ml-prim
PR #636: Rand Index metric ml-prim
PR #515: Added Random Projection feature
PR #504: Contingency matrix ml-prim
PR #644: Add train_test_split utility for cuDF dataframes
PR #612: Allow Cuda Array Interface, Numba inputs and input code refactor
PR #641: C: Separate C-wrapper library build to generate libcuml.so
PR #631: Add nvcategory based ordinal label encoder
PR #681: Add MBSGDClassifier and MBSGDRegressor classes around SGD
PR #705: Quasi Newton solver and LogisticRegression Python classes
PR #670: Add test skipping functionality to build.sh
PR #678: Random Forest Python class
PR #684: prims: make_blobs primitive
PR #673: prims: reduce cols by key primitive
PR #812: Add cuML Communications API & consolidate Dask cuML

Improvements

PR #597: C++ cuML and ml-prims folder refactor
PR #590: QN Recover from numeric errors
PR #482: Introduce cumlHandle for pca and tsvd
PR #573: Remove use of unnecessary cuDF column and series copies
PR #601: Cython PEP8 cleanup and CI integration
PR #596: Introduce cumlHandle for ols and ridge
PR #579: Introduce cumlHandle for cd and sgd, and propagate C++ errors in cython level for cd and sgd
PR #604: Adding cumlHandle to kNN, spectral methods, and UMAP
PR #616: Enable clang-format for enforcing coding style
PR #618: CI: Enable copyright header checks
PR #622: Updated to use 0.8 dependencies
PR #626: Added build.sh script, updated CI scripts and documentation
PR #633: build: Auto-detection of GPU_ARCHS during cmake
PR #650: Moving brute force kNN to prims. Creating stateless kNN API.
PR #662: C++: Bulk clang-format updates
PR #671: Added pickle pytests and correct pickling of Base class
PR #675: atomicMin/Max(float, double) with integer atomics and bit flipping
PR #677: build: 'deep-clean' to build.sh to clean faiss build as well
PR #683: Use stateless c++ API in KNN so that it can be pickled properly
PR #686: Use stateless c++ API in UMAP so that it can be pickled properly
PR #695: prims: Refactor pairwise distance
PR #707: Added stress test and updated documentation for RF
PR #701: Added emacs temporary file patterns to .gitignore
PR #606: C++: Added tests for host_buffer and improved device_buffer and host_buffer implementation
PR #726: Updated RF docs and stress test
PR #730: Update README and RF docs for 0.8
PR #744: Random projections generating binomial on device. Fixing tests.
PR #741: Update API docs for 0.8
PR #754: Pickling of UMAP/KNN
PR #753: Made PCA and TSVD picklable
PR #746: LogisticRegression and QN API docstrings

Bug Fixes

PR #584: Added missing virtual destructor to deviceAllocator and hostAllocator
PR #620: C++: Removed old unit-test files in ml-prims
PR #627: C++: Fixed dbscan crash issue filed in 613
PR #640: Remove setuptools from conda run dependency
PR #646: Update link in contributing.md
PR #649: Bug fix to LinAlg::reduce_rows_by_key prim filed in issue #648
PR #666: fixes to gitutils.py to resolve both string decode and handling of uncommitted files
PR #676: Fix template parameters in bernoulli() implementation.
PR #685: Make CuPy optional to avoid nccl conda package conflicts
PR #687: prims: updated tolerance for reduce_cols_by_key unit-tests
PR #689: Removing extra prints from NearestNeighbors cython
PR #718: Bug fix for DBSCAN and increasing batch size of sgd
PR #719: Adding additional checks for dtype of the data
PR #736: Bug fix for RF wrapper and .cu print function
PR #547: Fixed issue if C++ compiler is specified via CXX during configure.
PR #759: Configure Sphinx to render params correctly
PR #762: Apply threshold to remove flakiness of UMAP tests.
PR #768: Fixing memory bug from stateless refactor
PR #782: Nearest neighbors checking properly whether memory should be freed
PR #783: UMAP was using wrong size for knn computation
PR #776: Hotfix for self.variables in RF
PR #777: Fix numpy input bug
PR #784: Fix jit of shuffle_idx python function
PR #790: Fix rows_sample input type for RF
PR #793: Fix for dtype conversion utility for numba arrays without cupy installed
PR #806: Add a seed for sklearn model in RF test file
PR #843: Rf quantile fix

cuML 0.7.0 (10 May 2019)

New Features

PR #405: Quasi-Newton GLM Solvers
PR #277: Add row- and column-wise weighted mean primitive
PR #424: Add a grid-sync struct for inter-block synchronization
PR #430: Add R-Squared Score to ml primitives
PR #463: Add matrix gather to ml primitives
PR #435: Expose cumlhandle in cython + developer guide
PR #455: Remove default-stream arguement across ml-prims and cuML
PR #375: cuml cpp shared library renamed to libcuml++.so
PR #460: Random Forest & Decision Trees (Single-GPU, Classification)
PR #491: Add doxygen build target for ml-prims
PR #505: Add R-Squared Score to python interface
PR #507: Add coordinate descent for lasso and elastic-net
PR #511: Add a minmax ml-prim
PR #516: Added Trustworthiness score feature
PR #520: Add local build script to mimic gpuCI
PR #503: Add column-wise matrix sort primitive
PR #525: Add docs build script to cuML
PR #528: Remove current KMeans and replace it with a new single GPU implementation built using ML primitives

Improvements

PR #481: Refactoring Quasi-Newton to use cumlHandle
PR #467: Added validity check on cumlHandle_t
PR #461: Rewrote permute and added column major version
PR #440: README updates
PR #295: Improve build-time and the interface e.g., enable bool-OutType, for distance()
PR #390: Update docs version
PR #272: Add stream parameters to cublas and cusolver wrapper functions
PR #447: Added building and running mlprims tests to CI
PR #445: Lower dbscan memory usage by computing adjacency matrix directly
PR #431: Add support for fancy iterator input types to LinAlg::reduce_rows_by_key
PR #394: Introducing cumlHandle API to dbscan and add example
PR #500: Added CI check for black listed CUDA Runtime API calls
PR #475: exposing cumlHandle for dbscan from python-side
PR #395: Edited the CONTRIBUTING.md file
PR #407: Test files to run stress, correctness and unit tests for cuml algos
PR #512: generic copy method for copying buffers between device/host
PR #533: Add cudatoolkit conda dependency
PR #524: Use cmake find blas and find lapack to pass configure options to faiss
PR #527: Added notes on UMAP differences from reference implementation
PR #540: Use latest release version in update-version CI script
PR #552: Re-enable assert in kmeans tests with xfail as needed
PR #581: Add shared memory fast col major to row major function back with bound checks
PR #592: More efficient matrix copy/reverse methods
PR #721: Added pickle tests for DBSCAN and Random Projections

Bug Fixes

PR #334: Fixed segfault in ML::cumlHandle_impl::destroyResources
PR #349: Developer guide clarifications for cumlHandle and cumlHandle_impl
PR #398: Fix CI scripts to allow nightlies to be uploaded
PR #399: Skip PCA tests to allow CI to run with driver 418
PR #422: Issue in the PCA tests was solved and CI can run with driver 418
PR #409: Add entry to gitmodules to ignore build artifacts
PR #412: Fix for svdQR function in ml-prims
PR #438: Code that depended on FAISS was building everytime.
PR #358: Fixed an issue when switching streams on MLCommon::device_buffer and MLCommon::host_buffer
PR #434: Fixing bug in CSR tests
PR #443: Remove defaults channel from ci scripts
PR #384: 64b index arithmetic updates to the kernels inside ml-prims
PR #459: Fix for runtime library path of pip package
PR #464: Fix for C++11 destructor warning in qn
PR #466: Add support for column-major in LinAlg::*Norm methods
PR #465: Fixing deadlock issue in GridSync due to consecutive sync calls
PR #468: Fix dbscan example build failure
PR #470: Fix resource leakage in Kalman filter python wrapper
PR #473: Fix gather ml-prim test for change in rng uniform API
PR #477: Fixes default stream initialization in cumlHandle
PR #480: Replaced qn_fit() declaration with #include of file containing definition to fix linker error
PR #495: Update cuDF and RMM versions in GPU ci test scripts
PR #499: DEVELOPER_GUIDE.md: fixed links and clarified ML::detail::streamSyncer example
PR #506: Re enable ml-prim tests in CI
PR #508: Fix for an error with default argument in LinAlg::meanSquaredError
PR #519: README.md Updates and adding BUILD.md back
PR #526: Fix the issue of wrong results when fit and transform of PCA are called separately
PR #531: Fixing missing arguments in updateDevice() for RF
PR #543: Exposing dbscan batch size through cython API and fixing broken batching
PR #551: Made use of ZLIB_LIBRARIES consistent between ml_test and ml_mg_test
PR #557: Modified CI script to run cuML tests before building mlprims and removed lapack flag
PR #578: Updated Readme.md to add lasso and elastic-net
PR #580: Fixing cython garbage collection bug in KNN
PR #577: Use find libz in prims cmake
PR #594: fixed cuda-memcheck mean_center test failures

cuML 0.6.1 (09 Apr 2019)

Bug Fixes

PR #462 Runtime library path fix for cuML pip package

cuML 0.6.0 (22 Mar 2019)

New Features

PR #249: Single GPU Stochastic Gradient Descent for linear regression, logistic regression, and linear svm with L1, L2, and elastic-net penalties.
PR #247: Added "proper" CUDA API to cuML
PR #235: NearestNeighbors MG Support
PR #261: UMAP Algorithm
PR #290: NearestNeighbors numpy MG Support
PR #303: Reusable spectral embedding / clustering
PR #325: Initial support for single process multi-GPU OLS and tSVD
PR #271: Initial support for hyperparameter optimization with dask for many models

Improvements

PR #144: Dockerfile update and docs for LinearRegression and Kalman Filter.
PR #168: Add /ci/gpu/build.sh file to cuML
PR #167: Integrating full-n-final ml-prims repo inside cuml
PR #198: (ml-prims) Removal of *MG calls + fixed a bug in permute method
PR #194: Added new ml-prims for supporting LASSO regression.
PR #114: Building faiss C++ api into libcuml
PR #64: Using FAISS C++ API in cuML and exposing bindings through cython
PR #208: Issue ml-common-3: Math.h: swap thrust::for_each with binaryOp,unaryOp
PR #224: Improve doc strings for readable rendering with readthedocs
PR #209: Simplify README.md, move build instructions to BUILD.md
PR #218: Fix RNG to use given seed and adjust RNG test tolerances.
PR #225: Support for generating random integers
PR #215: Refactored LinAlg::norm to Stats::rowNorm and added Stats::colNorm
PR #234: Support for custom output type and passing index value to main_op in *Reduction kernels
PR #230: Refactored the cuda_utils header
PR #236: Refactored cuml python package structure to be more sklearn like
PR #232: Added reduce_rows_by_key
PR #246: Support for 2 vectors in the matrix vector operator
PR #244: Fix for single GPU OLS and Ridge to support one column training data
PR #271: Added get_params and set_params functions for linear and ridge regression
PR #253: Fix for issue #250-reduce_rows_by_key failed memcheck for small nkeys
PR #269: LinearRegression, Ridge Python docs update and cleaning
PR #322: set_params updated
PR #237: Update build instructions
PR #275: Kmeans use of faster gpu_matrix
PR #288: Add n_neighbors to NearestNeighbors constructor
PR #302: Added FutureWarning for deprecation of current kmeans algorithm
PR #312: Last minute cleanup before release
PR #315: Documentation updating and enhancements
PR #330: Added ignored argument to pca.fit_transform to map to sklearn's implemenation
PR #342: Change default ABI to ON
PR #572: Pulling DBSCAN components into reusable primitives

Bug Fixes

PR #193: Fix AttributeError in PCA and TSVD
PR #211: Fixing inconsistent use of proper batch size calculation in DBSCAN
PR #202: Adding back ability for users to define their own BLAS
PR #201: Pass CMAKE CUDA path to faiss/configure script
PR #200 Avoid using numpy via cimport in KNN
PR #228: Bug fix: LinAlg::unaryOp with 0-length input
PR #279: Removing faiss-gpu references in README
PR #321: Fix release script typo
PR #327: Update conda requirements for version 0.6 requirements
PR #352: Correctly calculating numpy chunk sizing for kNN
PR #345: Run python import as part of package build to trigger compilation
PR #347: Lowering memory usage of kNN.
PR #355: Fixing issues with very large numpy inputs to SPMG OLS and tSVD.
PR #357: Removing FAISS requirement from README
PR #362: Fix for matVecOp crashing on large input sizes
PR #366: Index arithmetic issue fix with TxN_t class
PR #376: Disabled kmeans tests since they are currently too sensitive (see #71)
PR #380: Allow arbitrary data size on ingress for numba_utils.row_matrix
PR #385: Fix for long import cuml time in containers and fix for setup_pip
PR #630: Fixing a missing kneighbors in nearest neighbors python proxy

cuML 0.5.1 (05 Feb 2019)

Bug Fixes

PR #189 Avoid using numpy via cimport to prevent ABI issues in Cython compilation

cuML 0.5.0 (28 Jan 2019)

New Features

PR #66: OLS Linear Regression
PR #44: Distance calculation ML primitives
PR #69: Ridge (L2 Regularized) Linear Regression
PR #103: Linear Kalman Filter
PR #117: Pip install support
PR #64: Device to device support from cuML device pointers into FAISS

Improvements

PR #56: Make OpenMP optional for building
PR #67: Github issue templates
PR #44: Refactored DBSCAN to use ML primitives
PR #91: Pytest cleanup and sklearn toyset datasets based pytests for kmeans and dbscan
PR #75: C++ example to use kmeans
PR #117: Use cmake extension to find any zlib installed in system
PR #94: Add cmake flag to set ABI compatibility
PR #139: Move thirdparty submodules to root and add symlinks to new locations
PR #151: Replace TravisCI testing and conda pkg builds with gpuCI
PR #164: Add numba kernel for faster column to row major transform
PR #114: Adding FAISS to cuml build

Bug Fixes

PR #48: CUDA 10 compilation warnings fix
PR #51: Fixes to Dockerfile and docs for new build system
PR #72: Fixes for GCC 7
PR #96: Fix for kmeans stack overflow with high number of clusters
PR #105: Fix for AttributeError in kmeans fit method
PR #113: Removed old glm python/cython files
PR #118: Fix for AttributeError in kmeans predict method
PR #125: Remove randomized solver option from PCA python bindings

cuML 0.4.0 (05 Dec 2018)

New Features

Improvements

PR #42: New build system: separation of libcuml.so and cuml python package
PR #43: Added changelog.md

Bug Fixes

cuML 0.3.0 (30 Nov 2018)

New Features

PR #33: Added ability to call cuML algorithms using numpy arrays

Improvements

PR #24: Fix references of python package from cuML to cuml and start using versioneer for better versioning
PR #40: Added support for refactored cuDF 0.3.0, updated Conda files
PR #33: Major python test cleaning, all tests pass with cuDF 0.2.0 and 0.3.0. Preparation for new build system
PR #34: Updated batch count calculation logic in DBSCAN
PR #35: Beginning of DBSCAN refactor to use cuML mlprims and general improvements

Bug Fixes

PR #30: Fixed batch size bug in DBSCAN that caused crash. Also fixed various locations for potential integer overflows
PR #28: Fix readthedocs build documentation
PR #29: Fix pytests for cuml name change from cuML
PR #33: Fixed memory bug that would cause segmentation faults due to numba releasing memory before it was used. Also fixed row major/column major bugs for different algorithms
PR #36: Fix kmeans gtest to use device data
PR #38: cuda_free bug removed that caused google tests to sometimes pass and sometimes fail randomly
PR #39: Updated cmake to correctly link with CUDA libraries, add CUDA runtime linking and include source files in compile target

cuML 0.2.0 (02 Nov 2018)

New Features

PR #11: Kmeans algorithm added
PR #7: FAISS KNN wrapper added
PR #21: Added Conda install support

Improvements

PR #15: Added compatibility with cuDF (from prior pyGDF)
PR #13: Added FAISS to Dockerfile
PR #21: Added TravisCI build system for CI and Conda builds

Bug Fixes

PR #4: Fixed explained variance bug in TSVD
PR #5: Notebook bug fixes and updated results

cuML 0.1.0

Initial release including PCA, TSVD, DBSCAN, ml-prims and cython wrappers

Files

CHANGELOG.md

Latest commit

History

CHANGELOG.md

File metadata and controls

cuML 0.10.0 (Date TBD)

New Features

Improvements

Bug Fixes

cuML 0.9.0 (21 Aug 2019)

New Features

Improvements

Bug Fixes

cuML 0.8.0 (27 June 2019)

New Features

Improvements

Bug Fixes

cuML 0.7.0 (10 May 2019)

New Features

Improvements

Bug Fixes

cuML 0.6.1 (09 Apr 2019)

Bug Fixes

cuML 0.6.0 (22 Mar 2019)

New Features

Improvements

Bug Fixes

cuML 0.5.1 (05 Feb 2019)

Bug Fixes

cuML 0.5.0 (28 Jan 2019)

New Features

Improvements

Bug Fixes

cuML 0.4.0 (05 Dec 2018)

New Features

Improvements

Bug Fixes

cuML 0.3.0 (30 Nov 2018)

New Features

Improvements

Bug Fixes

cuML 0.2.0 (02 Nov 2018)

New Features

Improvements

Bug Fixes

cuML 0.1.0