Please see https://github.com/rapidsai/raft/releases/tag/v23.02.00a for the latest changes to this development branch.
- Make ucx linkage explicit and add a new CMake target for it (#1032) @vyasr
- IVF-Flat: make adaptive-centers behavior optional (#1019) @achirkin
- Remove make_mdspan template for memory_type enum (#1005) @wphicks
- ivf-pq performance tweaks (#926) @achirkin
- fusedL2NN: Add input alignment checks (#1045) @achirkin
- Fix fusedL2NN bug that can happen when the same point appears in both x and y (#1040) @Nyrio
- Fix trivial deprecated header includes (#1034) @achirkin
- Suppress ptxas stack size warning in Debug mode (#1033) @tfeher
- Don't use CMake 3.25.0 as it has a FindCUDAToolkit show stopping bug (#1029) @robertmaynard
- Fix for gemmi deprecation (#1020) @lowener
- Remove make_mdspan template for memory_type enum (#1005) @wphicks
- Add `except +` to cython extern cdef declarations (#1001) @benfred
- Changing Overloads for GCC 11/12 bug (#995) @divyegala
- Changing Overloads for GCC 11/12 bugs (#992) @divyegala
- Fix pylibraft docstring example code (#980) @benfred
- Update raft tests to compile with C++17 features enabled (#973) @robertmaynard
- Making ivf flat gtest invoke mdspanified APIs (#955) @cjnolet
- Updates to kmeans public API to fix cuml (#932) @cjnolet
- Fix logger (vsnprintf consumes args) (#917) @Nyrio
- Adding missing include for device mdspan in `mean_squared_error.cuh` (#906) @cjnolet
- Add links to the docs site in the README (#1042) @benfred
- Moving contributing and developer guides to main docs (#1006) @cjnolet
- Update compiler flags in build docs (#999) @cjnolet
- Updating minimum required gcc version (#993) @cjnolet
- important doc updates for core, cluster, and neighbors (#933) @cjnolet
- ANN refinement Python wrapper (#1052) @tfeher
- Add ANN refinement method (#1038) @tfeher
- IVF-Flat: make adaptive-centers behavior optional (#1019) @achirkin
- Add wheel builds (#1013) @vyasr
- Update cuSparse wrappers to avoid deprecated functions (#989) @wphicks
- Provide memory_type enum (#984) @wphicks
- Add Tests for kmeans API (#982) @lowener
- mdspanifying `weighted_mean` and add `raft::stats` tests (#910) @lowener
- Implement `raft::stats` API with mdspan (#802) @lowener
- Pin `dask` and `distributed` for release (#1062) @galipremsagar
- IVF-PQ: use device properties helper (#1035) @achirkin
- Make ucx linkage explicit and add a new CMake target for it (#1032) @vyasr
- Fixing broken doc functions and improving coverage (#1030) @cjnolet
- Expose cluster_cost to python (#1028) @benfred
- Adding lightweight cai_wrapper to reduce boilerplate (#1027) @cjnolet
- Change `raft` docs theme to `pydata-sphinx-theme` (#1026) @galipremsagar
- Revert "Pin `dask` and `distributed` for release" (#1023) @galipremsagar
- Pin `dask` and `distributed` for release (#1022) @galipremsagar
- Replace `dots_along_rows` with `rowNorm` and improve `coalescedReduction` performance (#1011) @Nyrio
- Moving TestDeviceBuffer to `pylibraft.common.device_ndarray` (#1008) @cjnolet
- Add codespell as a linter (#1007) @benfred
- Fix environment channels (#996) @bdice
- Automatically sync handle when not passed to pylibraft functions (#987) @benfred
- Replace `normalize_rows` in `ann_utils.cuh` by a new `rowNormalize` prim and improve performance for thin matrices (small `n_cols`) (#979) @Nyrio
- Forward merge 22.10 into 22.12 (#978) @vyasr
- Use new rapids-cmake functionality for rpath handling. (#976) @vyasr
- Update cuda-python dependency to 11.7.1 (#975) @galipremsagar
- IVF-PQ Python wrappers (#970) @tfeher
- Remove unnecessary requirements for raft-dask. (#969) @vyasr
- Expose `linalg::dot` in public API (#968) @benfred
- Fix kmeans cluster templates (#966) @lowener
- Run linters using pre-commit (#965) @benfred
- linewiseop padded span test (#964) @mfoerste4
- Add unittest for `linalg::mean_squared_error` (#961) @benfred
- Exposing fused l2 knn to public APIs (#959) @cjnolet
- Remove a left over print statement from pylibraft (#958) @betatim
- Switch to using rapids-cmake for gbench. (#954) @vyasr
- Some cleanup of k-means internals (#953) @cjnolet
- Remove stale labeler (#951) @raydouglass
- Adding optional handle to each public API function (along with example) (#947) @cjnolet
- Improving documentation across the board. Adding quick-start to breathe docs. (#943) @cjnolet
- Add unittest for `linalg::axpy` (#942) @benfred
- Add cutlass 3xTF32,DMMA based L2/cosine distance kernels for SM 8.0 or higher (#939) @mdoijade
- Calculate max cluster size correctly for IVF-PQ (#938) @tfeher
- Add tests for `raft::matrix` (#937) @lowener
- Add fusedL2NN benchmark (#936) @Nyrio
- ivf-pq performance tweaks (#926) @achirkin
- Adding `fused_l2_nn_argmin` wrapper to Pylibraft (#924) @cjnolet
- Moving kernel gramm primitives to `raft::distance::kernels` (#920) @cjnolet
- kmeans improvements: random initialization on GPU, NVTX markers, no batching when using fusedL2NN (#918) @Nyrio
- Moving `raft::spatial::knn` -> `raft::neighbors` (#914) @cjnolet
- Create cub-based argmin primitive and replace `argmin_along_rows` in ANN kmeans (#912) @Nyrio
- Replace `map_along_rows` with `matrixVectorOp` (#911) @Nyrio
- Integrate `accumulate_into_selected` from ANN utils into `linalg::reduce_rows_by_keys` (#909) @Nyrio
- Re-enabling Fused L2 NN specializations and renaming `cub::KeyValuePair` -> `raft::KeyValuePair` (#905) @cjnolet
- Unpin `dask` and `distributed` for development (#886) @galipremsagar
- Adding padded layout 'layout_padded_general' (#725) @mfoerste4
- Separating mdspan/mdarray infra into host_* and device_* variants (#810) @cjnolet
- Remove type punning from TxN_t (#781) @wphicks
- ivf_flat::index: hide implementation details (#747) @achirkin
- ivf-pq integration: hotfixes (#891) @achirkin
- Removing cub symbol from libraft-distance instantiation. (#887) @cjnolet
- ivf-pq post integration hotfixes (#878) @achirkin
- Fixing a few compile errors in new APIs (#874) @cjnolet
- Include knn.cuh in knn.cu benchmark source for finding brute_force_knn (#855) @teju85
- Do not use strcpy to copy 2 char (#848) @mhoemmen
- rng_state not including necessary cstdint (#839) @MatthiasKohl
- Fix integer overflow in ANN kmeans (#835) @Nyrio
- Add alignment to the TxN_t vectorized type (#792) @achirkin
- Fix adj_to_csr_kernel (#785) @ahendriksen
- Use rapids-cmake 22.10 best practice for RAPIDS.cmake location (#784) @robertmaynard
- Remove type punning from TxN_t (#781) @wphicks
- Various fixes for build.sh (#771) @vyasr
- Fix target names in build.sh help text (#879) @Nyrio
- Document that minimum required CMake version is now 3.23.1 (#841) @robertmaynard
- mdspanify raft::random functions uniformInt, normalTable, fill, bernoulli, and scaled_bernoulli (#897) @mhoemmen
- mdspan-ify several raft::random rng functions (#857) @mhoemmen
- Develop new mdspan-ified multi_variable_gaussian interface (#845) @mhoemmen
- Mdspanify permute (#834) @mhoemmen
- mdspan-ify rmat_rectangular_gen (#833) @mhoemmen
- mdspanify sampleWithoutReplacement (#830) @mhoemmen
- mdspan-ify make_regression (#811) @mhoemmen
- Updating `raft::linalg` APIs to use `mdspan` (#809) @divyegala
- Integrate KNN implementation: ivf-pq (#789) @achirkin
- Some fixes for build.sh (#901) @cjnolet
- Revert recent fused l2 nn instantiations (#899) @cjnolet
- Update Python build instructions (#898) @betatim
- Adding ninja and cxx compilers to conda dev dependencies (#893) @cjnolet
- Output non-normalized distances in IVF-PQ and brute-force KNN (#892) @Nyrio
- Readme updates for 22.10 (#884) @cjnolet
- Breaking apart benchmarks into individual binaries (#883) @cjnolet
- Pin `dask` and `distributed` for release (#858) @galipremsagar
- Mdspanifying (currently tested) `raft::matrix` (#846) @cjnolet
- Separating _RAFT_HOST and _RAFT_DEVICE macros (#836) @cjnolet
- Updating cpu job in hopes it speeds up python cpu builds (#828) @cjnolet
- Mdspan-ifying `raft::spatial` (#827) @cjnolet
- Fixing init.py for handle and stream (#826) @cjnolet
- Moving a few more things around (#822) @cjnolet
- Use fusedL2NN in ANN kmeans (#821) @Nyrio
- Separating test executables (#820) @cjnolet
- Separating mdspan/mdarray infra into host_* and device_* variants (#810) @cjnolet
- Fix malloc/delete mismatch (#808) @mhoemmen
- Renaming `pyraft` -> `raft-dask` (#801) @cjnolet
- Branch 22.10 merge 22.08 (#800) @cjnolet
- Statically link all CUDA toolkit libraries (#797) @trxcllnt
- Minor follow-up fixes for ivf-flat (#796) @achirkin
- KMeans benchmarks (cuML + ANN implementations) and fix for IndexT=int64_t (#795) @Nyrio
- Optimize fusedL2NN when data is skinny (#794) @ahendriksen
- Complete the deprecation of duplicated hpp headers (#793) @ahendriksen
- Prepare parts of the balanced kmeans for ivf-pq (#788) @achirkin
- Unpin `dask` and `distributed` for development (#783) @galipremsagar
- Exposing python wrapper for the RMAT generator logic (#778) @teju85
- Device, Host, Managed Accessor Types for `mdspan` (#776) @divyegala
- Fix Forward-Merger Conflicts (#768) @ajschmidt8
- Fea 2208 kmeans use specializations (#760) @cjnolet
- ivf_flat::index: hide implementation details (#747) @achirkin
- Update `mdspan` to account for changes to `extents` (#751) @divyegala
- Replace csr_adj_graph functions with faster equivalent (#746) @ahendriksen
- Integrate KNN implementation: ivf-flat (#652) @achirkin
- Moving kmeans from cuml to Raft (#605) @lowener
- Relax ivf-flat test recall thresholds (#766) @achirkin
- Restrict the use of `]` to CXX 20 only. (#764) @trivialfis
- Update rapids-cmake version for pyraft in update-version.sh (#749) @vyasr
- Use documented header template for doxygen (#773) @galipremsagar
- Switch `language` from `None` to `"en"` in docs build (#721) @galipremsagar
- Update `mdspan` to account for changes to `extents` (#751) @divyegala
- Implement matrix transpose with mdspan. (#739) @trivialfis
- Implement unravel_index for row-major array. (#723) @trivialfis
- Integrate KNN implementation: ivf-flat (#652) @achirkin
- Use common `js` and `css` code (#779) @galipremsagar
- Pin `dask` & `distributed` for release (#772) @galipremsagar
- Move cmake to the build section. (#763) @vyasr
- Adding old kmeans impl back in (as kmeans_deprecated) (#761) @cjnolet
- Fix for KMeans raw pointers API (#758) @lowener
- Fix KMeans (#756) @divyegala
- Add inline to nccl_sync_stream() (#750) @seunghwak
- Replace csr_adj_graph functions with faster equivalent (#746) @ahendriksen
- Add wrapper functions for ncclGroupStart() and ncclGroupEnd() (#742) @seunghwak
- Fix variadic template type check for mdarrays (#741) @hlinsen
- RMAT rectangular graph generator (#738) @teju85
- Update conda recipes to UCX 1.13.0 (#736) @pentschev
- Add warp-aggregated atomic increment (#735) @ahendriksen
- fix logic bug in include_checker.py utility (#734) @grlee77
- Support 32bit and unsigned indices in bruteforce KNN (#730) @achirkin
- Ability to use ccache to speedup local builds (#729) @teju85
- Pin max version of `cuda-python` to `11.7.0` (#728) @Ethyling
- Always add `raft::raft_nn_lib` and `raft::raft_distance_lib` aliases (#727) @trxcllnt
- Add several type aliases and helpers for creating mdarrays (#726) @achirkin
- fix nans in naive kl divergence kernel introduced by div by 0. (#724) @mdoijade
- Use rapids-cmake for cuco (#722) @vyasr
- Update Python classifiers. (#719) @bdice
- Fix sccache (#718) @Ethyling
- Introducing raft::mdspan as an alias (#715) @divyegala
- Update cuco version (#714) @vyasr
- Update conda environment pinnings and update-versions.sh. (#713) @bdice
- Branch 22.08 merge branch 22.06 (#712) @cjnolet
- Testing conda compilers (#705) @cjnolet
- Unpin `dask` & `distributed` for development (#704) @galipremsagar
- Avoid shadowing CMAKE_ARGS variable in build.sh (#701) @vyasr
- Use unique ptr in `print_device_vector` (#695) @lowener
- Add missing Thrust includes (#678) @bdice
- Consolidate C++ conda recipes and add libraft-tests package (#641) @Ethyling
- Moving kmeans from cuml to Raft (#605) @lowener
- Rng: removed cyclic dependency creating hard-to-debug compiler errors (#639) @MatthiasKohl
- Allow enabling NVTX markers by downstream projects after install (#610) @achirkin
- Rng: expose host-rng-state in host-only API (#609) @MatthiasKohl
- For fixing the cuGraph test failures with PCG (#690) @vinaydes
- Fix excessive memory used in selection test (#689) @achirkin
- Revert print vector changes because of std::vector<bool> (#681) @lowener
- fix race in fusedL2knn smem read/write by adding a syncwarp (#679) @mdoijade
- gemm: fix parameter C mistakenly set as const (#664) @achirkin
- Fix SelectionTest: allow different indices when keys are equal. (#659) @achirkin
- Revert recent cmake updates (#657) @cjnolet
- Don't install component dependency files in raft-header only mode (#655) @robertmaynard
- Rng: removed cyclic dependency creating hard-to-debug compiler errors (#639) @MatthiasKohl
- Fixing raft compile bug w/ RNG changes (#634) @cjnolet
- Get `libcudacxx` from `cuco` (#632) @trxcllnt
- RNG API fixes (#630) @MatthiasKohl
- Fix mdspan accessor mixin offset policy. (#628) @trivialfis
- Branch 22.06 merge 22.04 (#625) @cjnolet
- fix issue in fusedL2knn which happens when rows are multiple of 256 (#604) @mdoijade
- Restore changes from #653 and #655 and correct cmake component dependencies (#686) @robertmaynard
- Adding handle and stream to pylibraft (#683) @cjnolet
- Map CMake install components to conda library packages (#653) @robertmaynard
- Rng: expose host-rng-state in host-only API (#609) @MatthiasKohl
- mdspan/mdarray template functions and utilities (#601) @divyegala
- Change build.sh to find C++ library by default (#697) @vyasr
- Pin `dask` and `distributed` for release (#693) @galipremsagar
- Pin `dask` & `distributed` for release (#680) @galipremsagar
- Improve logging (#673) @achirkin
- Fix minor errors in CMake configuration (#662) @vyasr
- Pulling mdspan fork (from official rapids repo) into raft to remove dependency (#649) @cjnolet
- Fixing the unit test issue(s) in RAFT (#646) @vinaydes
- Build pyraft with scikit-build (#644) @vyasr
- Some fixes to pairwise distances for cupy integration (#643) @cjnolet
- Require UCX 1.12.1+ (#638) @jakirkham
- Updating raft rng host public API and adding docs (#636) @cjnolet
- Build pylibraft with scikit-build (#633) @vyasr
- Add `cuda_lib_dir` to `library_dirs`, allow changing `UCX`/`RMM`/`Thrust`/`spdlog` locations via envvars in `setup.py` (#624) @trxcllnt
- Remove perf prints from MST (#623) @divyegala
- Enable components installation using CMake (#621) @Ethyling
- Allow nullptr as input-indices argument of select_k (#618) @achirkin
- Update CMake pinning to allow newer CMake versions (#617) @vyasr
- Unpin `dask` & `distributed` for development (#616) @galipremsagar
- Improve performance of select-top-k RADIX implementation (#615) @achirkin
- Moving more prims benchmarks to RAFT (#613) @cjnolet
- Allow enabling NVTX markers by downstream projects after install (#610) @achirkin
- Improve performance of select-top-k WARP_SORT implementation (#606) @achirkin
- Enable building static libs (#602) @trxcllnt
- Update `ucx-py` version (#596) @ajschmidt8
- Fix merge conflicts (#587) @ajschmidt8
- Making cuco, thrust, and mdspan optional dependencies. (#585) @cjnolet
- Some RBC3D fixes (#530) @cjnolet
- Moving some of the remaining linalg prims from cuml (#502) @cjnolet
- Fix badly merged cublas wrappers (#492) @achirkin
- Hiding implementation details for lap, clustering, spectral, and label (#477) @cjnolet
- Adding destructor for std comms and using nccl allreduce for barrier in mpi comms (#473) @cjnolet
- Cleaning up cusparse_wrappers (#441) @cjnolet
- Improvements to RNG (#434) @vinaydes
- Remove RAFT memory management (#400) @viclafargue
- LinAlg impl in detail (#383) @divyegala
- Pin cmake in conda recipe to <3.23 (#600) @dantegd
- Fix make_device_vector_view (#595) @lowener
- Update cuco version. (#592) @vyasr
- Fixing raft headers dir (#574) @cjnolet
- Update update-version.sh (#560) @raydouglass
- find_package(raft) can now be called multiple times safely (#532) @robertmaynard
- Allocate sufficient memory for Hungarian if number of batches > 1 (#531) @ChuckHastings
- Adding lap.hpp back (with deprecation) (#529) @cjnolet
- raft-config is idempotent no matter RAFT_COMPILE_LIBRARIES value (#516) @robertmaynard
- Call initialize() in mpi_comms_t constructor. (#506) @seunghwak
- Improve row-major meanvar kernel via minimizing atomicCAS locks (#489) @achirkin
- Adding destructor for std comms and using nccl allreduce for barrier in mpi comms (#473) @cjnolet
- Add benchmarks (#549) @achirkin
- Unify weighted mean code (#514) @lowener
- single-pass raft::stats::meanvar (#472) @achirkin
- Move `random` package of cuML to RAFT (#449) @divyegala
- mdspan integration. (#437) @trivialfis
- Interruptible execution (#433) @achirkin
- make raft sources compilable with clang (#424) @MatthiasKohl
- Span implementation. (#399) @trivialfis
- Adding build script for docs (#589) @cjnolet
- Temporarily disable new `ops-bot` functionality (#586) @ajschmidt8
- Fix commands to get conda output files (#584) @Ethyling
- Link to `cuco` and add faiss `EXCLUDE_FROM_ALL` option (#583) @trxcllnt
- exposing faiss::faiss (#582) @cjnolet
- Pin `dask` and `distributed` version (#581) @galipremsagar
- removing exclude_from_all from cuco (#580) @cjnolet
- Adding INSTALL_EXPORT_SET for cuco, rmm, thrust (#579) @cjnolet
- Thrust package name case (#576) @trxcllnt
- Add missing thrust includes to transpose.cuh (#575) @zbjornson
- Use unanchored clang-format version check (#573) @zbjornson
- Fixing accidental removal of thrust target from cmakelists (#571) @cjnolet
- Don't add gtest to build export set or generate a gtest-config.cmake (#565) @trxcllnt
- Set `main` label by default (#559) @galipremsagar
- Add local conda channel while looking for conda outputs (#558) @Ethyling
- Updated dask and distributed to >=2022.02.1 (#557) @rlratzel
- Upload packages using testing label for nightlies (#556) @Ethyling
- Add `.github/ops-bot.yaml` config file (#554) @ajschmidt8
- Disabling benchmarks building by default. (#553) @cjnolet
- KNN select-top-k variants (#551) @achirkin
- Adding logger (#550) @cjnolet
- clang-tidy support: improved clang run scripts with latest changes (see cugraph-ops) (#548) @MatthiasKohl
- Pylibraft for pairwise distances (#540) @cjnolet
- mdspan PoC for distance make_blobs (#538) @cjnolet
- Include thrust/sort.h in ball_cover.cuh (#526) @akifcorduk
- Increase parallelism in allgatherv (#525) @seunghwak
- Moving device functions to cuh files and deprecating hpp (#524) @cjnolet
- Use `dynamic_extent` from `stdex`. (#523) @trivialfis
- Updating some of the ci check scripts (#522) @cjnolet
- Use shfl_xor in warpReduce for broadcast (#521) @akifcorduk
- Fixing Python conda package and installation (#520) @cjnolet
- Adding instructions to install from conda and build using CPM (#519) @cjnolet
- Implement span storage optimization. (#515) @trivialfis
- RNG test fixes and improvements (#513) @vinaydes
- Moving scores and metrics over to raft::stats (#512) @cjnolet
- Random ball cover in 3d (#510) @cjnolet
- Initializing memory in RBC (#509) @cjnolet
- Adjusting conda packaging to remove duplicate dependencies (#508) @cjnolet
- Moving remaining stats prims from cuml (#507) @cjnolet
- Correcting the namespace (#505) @vinaydes
- Passing stream through commsplit (#503) @cjnolet
- Moving some of the remaining linalg prims from cuml (#502) @cjnolet
- Fixing spectral APIs (#496) @cjnolet
- Fix badly merged cublas wrappers (#492) @achirkin
- Fix integer overflow in distances (#490) @RAMitchell
- Reusing shared libs in gpu ci builds (#487) @cjnolet
- Adding fatbin to shared libs and fixing conda paths in cpu build (#485) @cjnolet
- Add CMake `install` rule for tests (#483) @ajschmidt8
- Adding cpu ci for conda build (#482) @cjnolet
- Updating codeowners to use new raft codeowners (#480) @cjnolet
- Hiding implementation details for lap, clustering, spectral, and label (#477) @cjnolet
- Define PTDS via `-D` to fix cache misses in sccache (#476) @trxcllnt
- Unpin dask and distributed (#474) @galipremsagar
- Replace `ccache` with `sccache` (#471) @ajschmidt8
- More README updates (#467) @cjnolet
- CUBLAS wrappers with switchable host/device pointer mode (#453) @achirkin
- Cleaning up cusparse_wrappers (#441) @cjnolet
- Adding conda packaging for libraft and pyraft (#439) @cjnolet
- Improvements to RNG (#434) @vinaydes
- Hiding implementation details for comms (#409) @cjnolet
- Remove RAFT memory management (#400) @viclafargue
- LinAlg impl in detail (#383) @divyegala
- Simplify raft component CMake logic, and allow compilation without FAISS (#428) @robertmaynard
- One cudaStream_t instance per raft::handle_t (#291) @divyegala
- Removing extra logging from faiss mr (#463) @cjnolet
- Pin `dask` & `distributed` versions (#455) @galipremsagar
- Replace RMM CUDA Python bindings with those provided by CUDA-Python (#451) @shwina
- Fix comms memory leak (#436) @seunghwak
- Fix C++ doxygen documentation (#426) @achirkin
- Fix clang-format style errors (#425) @achirkin
- Fix using incorrect macro RAFT_CHECK_CUDA in place of RAFT_CUDA_TRY (#415) @achirkin
- Fix CUDA_CHECK_NO_THROW compatibility define (#414) @zbjornson
- Disabling fused l2 knn from bfknn (#407) @cjnolet
- Disabling expanded fused l2 knn to unblock cuml CI (#404) @cjnolet
- Reverting default knn distance to L2Unexpanded for now. (#403) @cjnolet
- README and build fixes before release (#459) @cjnolet
- Updates to Python and C++ Docs (#442) @cjnolet
- error macros: determining buffer size instead of fixed 2048 chars (#420) @MatthiasKohl
- NVTX range helpers (#416) @achirkin
- Splitting fused l2 knn specializations (#461) @cjnolet
- Update cuCollection git tag (#447) @seunghwak
- Remove libcudacxx patch needed for nvcc 11.4 (#446) @robertmaynard
- Unpin `dask` and `distributed` (#440) @galipremsagar
- Public apis for remainder of matrix and stats (#438) @divyegala
- Fix bug in producer-consumer buffer exchange which occurs in UMAP test on GV100 (#429) @mdoijade
- Simplify raft component CMake logic, and allow compilation without FAISS (#428) @robertmaynard
- Update ucx-py version on release using rvc (#422) @Ethyling
- Disabling fused l2 knn again. Not sure how this got added back. (#421) @cjnolet
- Adding no throw macro variants (#417) @cjnolet
- Remove `IncludeCategories` from `.clang-format` (#412) @codereport
- fix nan issues in L2 expanded sqrt KNN distances (#411) @mdoijade
- Consistent renaming of CHECK_CUDA and *_TRY macros (#410) @cjnolet
- Faster matrix-vector-ops (#401) @achirkin
- Adding dev conda environment files. (#397) @cjnolet
- Update to UCX-Py 0.24 (#392) @pentschev
- Branch 21.12 merge 22.02 (#386) @cjnolet
- Hiding implementation details for sparse API (#381) @cjnolet
- Adding distance specializations (#376) @cjnolet
- Use FAISS with RMM (#363) @viclafargue
- Add Fused L2 Expanded KNN kernel (#339) @mdoijade
- Update `.clang-format` to be consistent with all other RAPIDS repos (#300) @codereport
- One cudaStream_t instance per raft::handle_t (#291) @divyegala
- Fixing bad host->device copy (#375) @cjnolet
- Fix coalesced access checks in matrix_vector_op (#372) @achirkin
- Port libcudacxx patch from cudf (#370) @dantegd
- Fixing overflow in expanded distances (#365) @cjnolet
- Upgrade `clang` to `11.1.0` (#394) @galipremsagar
- Fix Changelog Merge Conflicts for `branch-21.12` (#390) @ajschmidt8
- Pin max `dask` & `distributed` (#388) @galipremsagar
- Removing conflict w/ CUDA_CHECK (#378) @cjnolet
- Update RAFT test directory (#359) @viclafargue
- Update to UCX-Py 0.23 (#358) @pentschev
- Hiding implementation details for random, stats, and matrix (#356) @divyegala
- README updates (#351) @cjnolet
- Use 64 bit CuSolver API for Eigen decomposition (#349) @lowener
- Hiding implementation details for distance primitives (dense + sparse) (#344) @cjnolet
- Unpin `dask` & `distributed` in CI (#338) @galipremsagar
- Miscellaneous tech debts/cleanups (#286) @viclafargue
- Accounting for rmm::cuda_stream_pool not having a constructor for 0 streams (#329) @divyegala
- Fix wrong lda parameter in gemv (#327) @achirkin
- Fix `matrixVectorOp` to verify promoted pointer type is still aligned to vectorized load boundary (#325) @viclafargue
- Pin rmm to branch-21.10 and remove warnings from kmeans.hpp (#322) @dantegd
- Temporarily pin RMM while refactor removes deprecated calls (#315) @dantegd
- Fix more warnings (#311) @harrism
- Add Hamming, Jensen-Shannon, KL-Divergence, Russell rao and Correlation distance metrics support (#306) @mdoijade
- Pin max `dask` and `distributed` versions to `2021.09.1` (#334) @galipremsagar
- Make sure we keep the rapids-cmake and raft cal version in sync (#331) @robertmaynard
- Add broadcast with const input iterator (#328) @seunghwak
- Fused L2 (unexpanded) kNN kernel for NN <= 64, without using temporary gmem to store intermediate distances (#324) @mdoijade
- Update with rapids cmake new features (#320) @robertmaynard
- Update to UCX-Py 0.22 (#319) @pentschev
- Fix Forward-Merge Conflicts (#318) @ajschmidt8
- Enable CUDA device code warnings as errors (#307) @harrism
- Remove max version pin for dask & distributed on development branch (#303) @galipremsagar
- Warnings are errors (#299) @harrism
- Use the new RAPIDS.cmake to fetch rapids-cmake (#298) @robertmaynard
- ENH Replace gpuci_conda_retry with gpuci_mamba_retry (#295) @dillon-cullinan
- Miscellaneous tech debts/cleanups (#286) @viclafargue
- Random Ball Cover Algorithm for 2D Haversine/Euclidean (#213) @cjnolet
- expose epsilon parameter to allow precision to be specified (#275) @ChuckHastings
- Fix support for different input and output types in linalg::reduce (#296) @Nyrio
- Const raft handle in sparse bfknn (#280) @cjnolet
- Add `cuco::cuco` to list of linked libraries (#279) @trxcllnt
- Use nested include in destination of install headers to avoid docker permission issues (#263) @dantegd
- Update UCX-Py version to 0.21 (#255) @pentschev
- Fix mst knn test build failure due to RMM device_buffer change (#253) @mdoijade
- Add chebyshev, canberra, minkowski and hellinger distance metrics (#276) @mdoijade
- Move FAISS ANN wrappers to RAFT (#265) @cjnolet
- Remaining sparse semiring distances (#261) @cjnolet
- removing divye from codeowners (#257) @divyegala
- Pinning cuco to a specific commit hash for release (#304) @rlratzel
- Pin max `dask` & `distributed` versions (#301) @galipremsagar
- Overlap epilog compute with ldg of next grid stride in pairwise distance & fusedL2NN kernels (#292) @mdoijade
- Always add faiss library alias if it's missing (#287) @trxcllnt
- Use `NVIDIA/cuCollections` repo again (#284) @trxcllnt
- Use the 21.08 branch of rapids-cmake as rmm requires it (#278) @robertmaynard
- expose epsilon parameter to allow precision to be specified (#275) @ChuckHastings
- Fix `21.08` forward-merge conflicts (#274) @ajschmidt8
- Add lds and sts inline ptx instructions to force vector instruction generation (#273) @mdoijade
- Move ANN to RAFT (additional updates) (#270) @cjnolet
- Sparse semirings cleanup + hash table & batching strategies (#269) @divyegala
- Revert "pin dask versions in CI (#260)" (#264) @ajschmidt8
- Pass stream to device_scalar::value() calls. (#259) @harrism
- Update get_rmm.cmake to better support CalVer (#258) @harrism
- Add Grid stride pairwise dist and fused L2 NN kernels (#250) @mdoijade
- Fix merge conflicts (#236) @ajschmidt8
- Update UCX-Py version to 0.20 (#254) @pentschev
- cuco git tag update (again) (#248) @seunghwak
- Revert PR #232 for 21.06 release (#246) @dantegd
- Python comms to hold onto server endpoints (#241) @cjnolet
- Fix Thrust 1.12 compile errors (#231) @trxcllnt
- Make sure we use CalVer when checking out rapids-cmake (#230) @robertmaynard
- Loss of Precision in MST weight alteration (#223) @divyegala
- cuco git tag update (#243) @seunghwak
- Update `CHANGELOG.md` links for calver (#233) @ajschmidt8
- Add Grid stride pairwise dist and fused L2 NN kernels (#232) @mdoijade
- Updates to enable HDBSCAN (#208) @cjnolet
- Exposing spectral random seed property (#193) @cjnolet
- Fix pointer arithmetic in spmv smem kernel (#183) @lowener
- Modify default value for rowMajorIndex and rowMajorQuery in bf-knn (#173) @viclafargue
- Remove setCudaMallocWarning() call for libfaiss@v1.7.0 (#167) @trxcllnt
- Add const to KNN handle (#157) @hlinsen
- Fixing codeowners (#194) @cjnolet
- Adjust Hellinger pairwise distance to avoid NaNs (#189) @lowener
- Add column major input support in contractions_nt kernels with new kernel policy for it (#188) @mdoijade
- Dice formula correction (#186) @lowener
- Scaling knn graph fix connectivities algorithm (#181) @cjnolet
- Fixing RAFT CI & a few small updates for SLHC Python wrapper (#178) @cjnolet
- Add Precomputed to the DistanceType enum (for cuML DBSCAN) (#177) @Nyrio
- Enable matrix::copyRows for row major input (#176) @tfeher
- Add Dice distance to distancetype enum (#174) @lowener
- Porting over recent updates to distance prim from cuml (#172) @cjnolet
- Update KNN (#171) @viclafargue
- Adding translations parameter to brute_force_knn (#170) @viclafargue
- Update Changelog Link (#169) @ajschmidt8
- Map operation (#168) @viclafargue
- Updating sparse prims based on recent changes (#166) @cjnolet
- Prepare Changelog for Automation (#164) @ajschmidt8
- Update 0.18 changelog entry (#163) @ajschmidt8
- MST symmetric/non-symmetric output for SLHC (#162) @divyegala
- Pass pre-computed colors to MST (#154) @divyegala
- Streams upgrade in RAFT handle (RMM backend + create handle from parent's pool) (#148) @afender
- Merge branch-0.18 into 0.19 (#146) @dantegd
- Add device_send, device_recv, device_sendrecv, device_multicast_sendrecv (#144) @seunghwak
- Adding SLHC prims. (#140) @cjnolet
- Moving cuml sparse prims to raft (#139) @cjnolet
- Make NCCL root initialization configurable. (#120) @drobison00
- Add idx_t template parameter to matrix helper routines (#131) @tfeher
- Eliminate CUDA 10.2 as valid for large svd solving (#129) @wphicks
- Update check to allow svd solver on CUDA>=10.2 (#125) @wphicks
- Updating gpu build.sh and debugging threads CI issue (#123) @dantegd
- Adding additional distances (#116) @cjnolet
- Update stale GHA with exemptions & new labels (#152) @mike-wendt
- Add GHA to mark issues/prs as stale/rotten (#150) @Ethyling
- Prepare Changelog for Automation (#135) @ajschmidt8
- Adding Jensen-Shannon and BrayCurtis to DistanceType for Nearest Neighbors (#132) @lowener
- Add brute force KNN (#126) @hlinsen
- Make NCCL root initialization configurable. (#120) @drobison00
- Auto-label PRs based on their content (#117) @jolorunyomi
- Add gather & gatherv to raft::comms::comms_t (#114) @seunghwak
- Adding canberra and chebyshev to distance types (#99) @cjnolet
- Gpuciscripts clean and update (#92) @msadang
- PR #65: Adding cuml prims that break circular dependency between cuml and cumlprims projects
- PR #101: MST core solver
- PR #93: Incorporate Date/Nagi implementation of Hungarian Algorithm
- PR #94: Allow generic reductions for the map then reduce op
- PR #95: Cholesky rank one update prim
- PR #108: Remove unused old-gpubuild.sh
- PR #73: Move DistanceType enum from cuML to RAFT
- PR #92: Cleanup gpuCI scripts
- PR #98: Adding InnerProduct to DistanceType
- PR #103: Epsilon parameter for Cholesky rank one update
- PR #100: Add divyegala as codeowner
- PR #111: Cleanup gpuCI scripts
- PR #120: Update NCCL init process to support root node placement.
- PR #106: Specify dependency branches to avoid pip resolver failure
- PR #77: Fixing CUB include for CUDA < 11
- PR #86: Missing headers for newly moved prims
- PR #102: Check alignment before binaryOp dispatch
- PR #104: Fix update-version.sh
- PR #109: Fixing Incorrect Deallocation Size and Count Bugs
- PR #63: Adding MPI comms implementation
- PR #70: Adding CUB to RAFT cmake
- PR #59: Adding csrgemm2 to cusparse_wrappers.h
- PR #61: Add cusparsecsr2dense to cusparse_wrappers.h
- PR #62: Adding `get_device_allocator` to `handle.pxd`
- PR #67: Remove dependence on run-time type info
- PR #56: Fix compiler warnings.
- PR #64: Remove `cublas_try` from `cusolver_wrappers.h`
- PR #66: Fixing typo `get_stream` to `getStream` in `handle.pyx`
- PR #68: Change the type of recvcounts & displs in allgatherv from size_t[] to size_t* and int[] to size_t*, respectively.
- PR #69: Updates for RMM being header only
- PR #74: Fix std_comms::comm_split bug
- PR #79: remove debug print statements
- PR #81: temporarily expose internal NCCL communicator
- PR #12: Spectral clustering.
- PR #7: Migrating cuml comms -> raft comms_t
- PR #18: Adding commsplit to cuml communicator
- PR #15: add exception based error handling macros
- PR #29: Add ceildiv functionality
- PR #44: Add get_subcomm and set_subcomm to handle_t
- PR #13: Add RMM_INCLUDE and RMM_LIBRARY options to allow linking to non-conda RMM
- PR #22: Preserve order in comms workers for rank initialization
- PR #38: Remove #include <cudar_utils.h> from `raft/mr/`
- PR #39: Adding a virtual destructor to `raft::handle_t` and `raft::comms::comms_t`
- PR #37: Clean-up CUDA related utilities
- PR #41: Upgrade to `cusparseSpMV()`, alg selection, and rectangular matrices.
- PR #45: Add Ampere target to cuda11 cmake
- PR #47: Use gtest conda package in CMake/build.sh by default
- PR #17: Make destructor inline to avoid redeclaration error
- PR #25: Fix bug in handle_t::get_internal_streams
- PR #26: Fix bug in RAFT_EXPECTS (add parentheses surrounding cond)
- PR #34: Fix issue with incorrect docker image being used in local build script
- PR #35: Remove #include <nccl.h> from `raft/error.hpp`
- PR #40: Preemptively fixed future CUDA 11 related errors.
- PR #43: Fixed CUDA version selection mechanism for SpMV.
- PR #46: Fix for cpp file extension issue (nvcc-enforced).
- PR #48: Fix gtest target names in cmake build gtest option.
- PR #49: Skip raft comms test if raft module doesn't exist
- Initial RAFT version
- PR #3: defining raft::handle_t, device_buffer, host_buffer, allocator classes
- PR #5: Small build.sh fixes