[REVIEW] Add chebyshev, canberra, minkowski and hellinger distance metrics #276

mdoijade · 2021-06-15T15:19:26Z

This change will be followed by a cuML-side change that uses these metrics and updates the pytests accordingly.

… for subsequent gridStrideX variations. this overall improves perf of fusedL2NN to 1.85x over previous version. --Also remove checking keys only check values in fusedL2nn test case, as it may happen a row has multiple keys with same min val

mdoijade · 2021-06-15T16:42:52Z

@teju85 @cjnolet please help with review.

cjnolet

Looks great. Just very minor things.

cjnolet · 2021-06-15T16:52:22Z

cpp/include/raft/distance/canberra.cuh

@@ -0,0 +1,159 @@
+/*
+ * Copyright (c) 2018-2021, NVIDIA CORPORATION.


Since this is a new file, we should just put 2021 here.

cjnolet · 2021-06-15T16:52:30Z

cpp/include/raft/distance/chebyshev.cuh

@@ -0,0 +1,156 @@
+/*
+ * Copyright (c) 2018-2021, NVIDIA CORPORATION.


Here as well

cjnolet · 2021-06-15T16:55:20Z

cpp/include/raft/distance/canberra.cuh

+ * @tparam OutType output data-type (for C and D matrices)
+ * @tparam FinalLambda user-defined epilogue lamba
+ * @tparam Index_ Index type
+ * @param m number of rows of A and C/D


Should we add the ins/outs here as well for consistency?

cjnolet · 2021-06-15T16:55:58Z

cpp/include/raft/distance/chebyshev.cuh

+ *  It computes the following equation: cij = max(cij, op(ai-bj))
+ * @tparam InType input data-type (for A and B matrices)
+ * @tparam AccType accumulation data-type
+ * @tparam OutType output data-type (for C and D matrices)


Same here, maybe add the ins/outs for consistency w/ the above docs (also apply to the distances below)
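
For readers following the review, the equation quoted from the chebyshev.cuh doc comment, cij = max(cij, op(ai-bj)), can be illustrated with a host-side reference. This is only a sketch: it assumes op is the absolute value and that both inputs are row-major, and the name `chebyshev_pairwise` and its signature are hypothetical, not the RAFT API.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference (not the RAFT kernel).
// a is m x k, b is n x k, both row-major. Returns the m x n matrix c with
// c[i][j] = max over p of |a[i][p] - b[j][p]|.
std::vector<float> chebyshev_pairwise(const std::vector<float>& a,
                                      const std::vector<float>& b,
                                      std::size_t m, std::size_t n,
                                      std::size_t k) {
  std::vector<float> c(m * n, 0.0f);
  for (std::size_t i = 0; i < m; ++i) {
    for (std::size_t j = 0; j < n; ++j) {
      float acc = 0.0f;  // distances are non-negative, so 0 is a safe start
      for (std::size_t p = 0; p < k; ++p) {
        // cij = max(cij, op(ai - bj)), with op assumed to be fabs
        acc = std::max(acc, std::fabs(a[i * k + p] - b[j * k + p]));
      }
      c[i * n + j] = acc;
    }
  }
  return c;
}
```

Calling it with two rows in A and one row in B yields a 2 x 1 result, one maximum absolute difference per (row of A, row of B) pair.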

mdoijade · 2021-06-23T16:57:49Z

@cjnolet @teju85 the cuML-side PR is passing: rapidsai/cuml#3990.
The failure in those tests is in dbscan and is due to a recent numpy update.
All tests pertaining to these distance metrics are passing.

cjnolet · 2021-06-24T20:45:39Z

@mdoijade, the dbscan / numpy issue should be fixed in this PR: rapidsai/cuml#4012.

Would it reduce the compile time at all if we just used function arguments for the individual distances and removed the explicit templates for them?
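
The trade-off behind this question can be sketched as follows. This is a minimal illustration, not RAFT code: `Metric`, `accumulate_t`, and `accumulate_rt` are hypothetical names. In the templated variant the compiler generates one copy of every enclosing kernel per metric; in the runtime variant a single copy is compiled and the metric is selected by a branch. Whether this would actually shorten RAFT's compile time is not answered in this thread.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical metric tag; not a RAFT type.
enum class Metric { L1, Chebyshev };

// Variant 1: metric as a template parameter. Any kernel templated on M is
// compiled once per metric, which multiplies the amount of generated code.
template <Metric M>
float accumulate_t(float acc, float x, float y) {
  const float d = std::fabs(x - y);
  return (M == Metric::L1) ? acc + d : std::max(acc, d);
}

// Variant 2: metric as a function argument. Only one copy is compiled, and
// the metric is chosen by a branch at run time.
float accumulate_rt(Metric m, float acc, float x, float y) {
  const float d = std::fabs(x - y);
  switch (m) {
    case Metric::L1: return acc + d;
    case Metric::Chebyshev: return std::max(acc, d);
  }
  return acc;  // not reached for valid enum values
}
```

Both variants compute the same values; the difference is where the metric is resolved, at compile time or at run time.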

cjnolet

LGTM

dantegd

Before we merge this PR, could you quantify the impact of the changes in cuML's compilation time (locally at least)?

mdoijade · 2021-06-25T16:31:57Z

> Before we merge this PR, could you quantify the impact of the changes in cuML's compilation time (locally at least)?

@dantegd
Without new distance metrics cuML total build time
real 9m24.016s
user 118m8.600s
sys 3m43.230s

With these 4 additional distance metrics cuML total build time
real 18m11.113s
user 120m28.920s
sys 3m34.880s

This is roughly a 2x increase in build time, and it is already after my change to use the compiled cuML pairwise distance kernels, wherever feasible, instead of calling the RAFT API each time. Without that change the build time is about 25 minutes.
The increase comes from pairwise_distance.cu.
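
As a quick check of the "roughly 2x" figure, the timings reported in this comment can be converted to seconds and compared. The helper names below are hypothetical; the numbers are the ones quoted above.

```cpp
// Convert the "XmY.YYYs" format printed by `time` into seconds.
constexpr double to_seconds(int minutes, double seconds) {
  return 60.0 * minutes + seconds;
}

// Ratio of wall-clock ("real") build time with and without the new metrics.
double real_time_ratio() {
  return to_seconds(18, 11.113) / to_seconds(9, 24.016);
}

// Ratio of total CPU ("user") time for the same two builds.
double user_time_ratio() {
  return to_seconds(120, 28.920) / to_seconds(118, 8.600);
}
```

The wall-clock ratio comes out at about 1.93, while the user-time ratio is only about 1.02. Total CPU work barely changes while elapsed time nearly doubles, which is consistent with the extra time being spent in one long-running translation unit rather than being spread across the parallel build.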

mdoijade · 2021-06-29T13:34:25Z

@dantegd @cjnolet @teju85
I split the pairwise_distance.cu compilation into multiple source files, one per distance metric, to enable a parallel build.
This has helped to some extent compared with the previous effort: around a 2-3 minute reduction in build time.
I am seeing that the minkowski kernels take longer to compile than the others with CUDA 11.0. The only difference is that they use the powf() and pow() functions.
After disabling the minkowski kernels (with the other 3 new kernels enabled), the time taken for a full cuML build with CUDA 11.0 is:
real 9m35.232s
user 113m56.715s
sys 3m44.966s
This is almost the same time as without these kernels.

With these 4 additional distance metrics cuML total build time without splitting - CUDA 11.0
real 18m11.113s
user 120m28.920s
sys 3m34.880s

Without new distance metrics cuML build time - CUDA 11.0 (current baseline)
real 9m24.016s
user 118m8.600s
sys 3m43.230s

After splitting the distance prims in pairwise_distance.cu into individual source files (with minkowski) - CUDA 11.0
real 17m44.648s
user 127m42.492s
sys 3m48.588s

Without minkowski - CUDA 11.0
real 9m35.232s
user 113m56.715s
sys 3m44.966s

With minkowski - CUDA 11.2 + split pairwise distance
real 9m57.291s
user 123m47.790s
sys 3m54.819s

With minkowski - CUDA 11.2 + without split pairwise distance
real 12m22.500s
user 118m22.608s
sys 3m45.721s

So it is evident that splitting pairwise_distance.cu does help reduce build time.
There also appears to be a compilation slowness issue with pow() and powf() in CUDA 11.0, which seems to be resolved in CUDA 11.2.
cuML PR - rapidsai/cuml#3990
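
One common way to split template-heavy code across source files is explicit instantiation: a header declares each instantiation `extern`, and each source file defines exactly one of them, so the build system can compile them in parallel. The sketch below shows the pattern in a single file for brevity. All names (`Metric`, `Vec`, `pairwise_row`) and the file names in the comments are hypothetical, and it does not reproduce how the cuML PR actually organises its sources.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// --- distance_impl.hpp (hypothetical): generic definition ---
enum class Metric { Chebyshev, Minkowski };
using Vec = std::vector<float>;

// Distance between two equally sized rows; p is only used by Minkowski.
template <Metric M>
float pairwise_row(const Vec& x, const Vec& y, float p) {
  float acc = 0.0f;
  for (std::size_t i = 0; i < x.size(); ++i) {
    const float d = std::fabs(x[i] - y[i]);
    if (M == Metric::Chebyshev) {
      acc = std::max(acc, d);
    } else {
      acc += std::pow(d, p);  // pow-based accumulation
    }
  }
  return (M == Metric::Minkowski) ? std::pow(acc, 1.0f / p) : acc;
}

// --- distance.hpp (hypothetical): tell users not to instantiate implicitly ---
extern template float pairwise_row<Metric::Chebyshev>(const Vec&, const Vec&, float);
extern template float pairwise_row<Metric::Minkowski>(const Vec&, const Vec&, float);

// --- chebyshev.cpp (hypothetical): compiled as its own job ---
template float pairwise_row<Metric::Chebyshev>(const Vec&, const Vec&, float);

// --- minkowski.cpp (hypothetical): compiled as its own job ---
template float pairwise_row<Metric::Minkowski>(const Vec&, const Vec&, float);
```

In a real build each explicit instantiation definition would sit in its own source file, so a slow instantiation (such as the pow-based one here) delays only its own compile job.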

mdoijade · 2021-07-05T12:56:53Z

@dantegd can we merge this now, along with the cuML PR rapidsai/cuml#3990?

dantegd · 2021-07-06T13:09:54Z

@gpucibot merge

This reverts commit 82061e0. This PR appears to have resulted in a substantial increase in compilation time, causing timeouts in CI for cuML PR 4029. These changes should be reinstated once the cause of the increased compilation time has been determined.

) This PR relies on RAFT PR rapidsai/raft#276 which adds these distance metrics support. Authors: - Mahesh Doijade (https://github.com/mdoijade) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) - Corey J. Nolet (https://github.com/cjnolet) - AJ Schmidt (https://github.com/ajschmidt8) URL: #3990

mdoijade added 30 commits May 12, 2021 22:03

Refactor fusedL2NN to use pairwiseDistance class. invert block y/x dir usage in all contraction based kernels so that n is along x dir and m is along y dir blocks

3a4ec66

-- add grid stride support to pairwise distance based cosine, l2, l1 kernels. --add launch config generator function to launch optimal grid size kernel for these pairwise dist kernels

76f9a72

--Add grid stride based fusedL2NN kernel, this gives approx 1.67x speed up over previous version. -- improve logic of the grid launch config generator for x-dir blocks

af89085

Add note on reason to use thread 0 from each warp to write final reduced val for pre-volta arch

9c71c4a

fix clangformat and copyright year

4d76b57

Merge branch 'branch-21.06' into gridStridedDist

da2d768

Use cudaOccupancyMaxActiveBlocksPerSM instead of hard-coded launch bound in launchConfigGenerator. --Use constexpr in shmemSize.

2e804c2

Merge branch 'branch-21.06' into gridStridedDist

3408a40

initialize regx and regy during each prolog call

69b316d

Add chebyshev distance metric support

6a64b7a

initialize ldgX, ldgY in prolog

9a30a87

Merge branch 'gridStridedDist' into chebyshevDist

14a2673

Add hellinger distance metric support

21577a4

Merge branch 'branch-21.08' into gridStridedDist

969c65a

add syncthreads post epilog calc for non-norm distance metrics to make sure next grid stride doesn't pollute shmem before completion of this calculation

9c4d5a0

Merge branch 'gridStridedDist' into chebyshevDist

df4ce55

remove syncthreads in epilog and instead use ping-pong buffers in next iteration of grid stride

4fb00e6

Add minkowski distance metric

5346232

use ping-pong buffers for safely grid striding

b5b3c51

Merge branch 'gridStridedDist' into chebyshevDist

0a0f964

Add canberra distance metric support

2fd7f4c

fix build failure of mst and knn test by adding cuda stream arg to rmm::device_buffer

0f2c03d

temp commit for test rerun

484b082

use ucx-py version 0.21 to temp resolve ci build failures

04f656f

Merge branch 'fix_mst_knn_test' into gridStridedDist

753f612

merge branch-21.08

f73471c

Merge branch 'fix_mst_knn_test' into gridStridedDist

8007d7a

Merge branch 'branch-21.08' into gridStridedDist

45dc556

Merge branch 'branch-21.08' into gridStridedDist

d0b8947

github-actions bot added CMake cpp labels Jun 15, 2021

mdoijade added 2 commits June 15, 2021 20:51

fix clang format issue

7673074

fix hellinger inputs to be only in range of 0 to 1 as hellinger is expected to work on pdf

476ed99

mdoijade mentioned this pull request Jun 15, 2021

[REVIEW] Use chebyshev, canberra, hellinger and minkowski distance metrics rapidsai/cuml#3990

Merged

cjnolet added feature request New feature or request non-breaking Non-breaking change labels Jun 15, 2021

cjnolet requested changes Jun 15, 2021

mdoijade added 5 commits June 16, 2021 17:23

fix doc issues in all dist functions, also fix copyright year to be only 2021

59e78e8

reduce sqrt in hellinger usage by overwriting input matrices by sqrt and reverting in post completion. also add unrolls to ldg arrays in contractions

7af5e32

fix clang format issues

0e99113

hellinger: only sqrt inputs when x & y are not same.

f4b8d33

fix clang format issues

a71e520

mdoijade changed the title ~~Add chebyshev, canberra, minkowksi and hellinger distance metrics~~ [REVIEW] Add chebyshev, canberra, minkowksi and hellinger distance metrics Jun 17, 2021

cjnolet approved these changes Jun 24, 2021

dantegd requested changes Jun 24, 2021

Merge branch 'branch-21.08' into chebyshevDist

842fcd0

dantegd approved these changes Jul 6, 2021

rapids-bot bot merged commit 82061e0 into rapidsai:branch-21.08 Jul 6, 2021

mdoijade mentioned this pull request Aug 12, 2021

[FEA] Add benchmark for newly added dense distance metrics as listed rapidsai/cuml#4159

Open
