Sparse TSNE #3293
Conversation
Codecov Report
@@            Coverage Diff             @@
##           branch-0.18    #3293      +/-   ##
===============================================
+ Coverage        71.48%   71.55%   +0.06%
===============================================
  Files              207      207
  Lines            16750    16787      +37
===============================================
+ Hits             11974    12012      +38
+ Misses            4776     4775       -1
Continue to review full report at Codecov.
rerun tests
Really glad to see this change coming in!
Most of the feedback is minor. However, adding this feature required also parametrizing the remaining functions in UMAP so that updating from int64_t and float is straightforward. We should use a parametrized type instead of value_t where possible.
This looks really good and it's almost there. My main concern is that I don't think all the non-pointer arguments need to be coupled to the 64-bit template types.
@cjnolet leaving this comment as a reference for post-vacation. I've updated this PR with your latest review feedback.
rerun tests
@@ -208,14 +208,16 @@ def extract_knn_graph(knn_graph, convert_dtype=True):
         knn_indices = knn_graph.col

     if knn_indices is not None:
         convert_to_dtype = None
         if convert_dtype:
             convert_to_dtype = np.int32 if sparse else np.int64
It's going to be important to change this when FAISS is updated (and the indices are 32-bit). Referencing the relevant issue: #2821
LGTM!
This PR allows TSNE to accept sparse inputs.

It also removes long-standing ptxas warnings from cuML builds, which were caused by invalid parameters to __launch_bounds__ in the TSNE kernels:

ptxas warning : Value of threads per SM for entry _ZN2ML4TSNE17IntegrationKernelEfffPfS1_PKfS3_S3_S3_S1_S1_S1_S1_S3_i is out of range. .minnctapersm will be ignored
ptxas warning : Value of threads per SM for entry _ZN2ML4TSNE15RepulsionKernelEffPKiS2_PKfS4_S4_PfS5_S5_fiiiS4_S2_ is out of range. .minnctapersm will be ignored
ptxas warning : Value of threads per SM for entry _ZN2ML4TSNE18TreeBuildingKernelEPiPKfS3_iiS1_S1_S3_ is out of range. .minnctapersm will be ignored
ptxas warning : Value of threads per SM for entry _ZN2ML4TSNE17BoundingBoxKernelEPiS1_PfS2_S2_S2_S2_S2_S2_iiiPjS2_ is out of range. .minnctapersm will be ignored

Furthermore, this PR introduces a TSNE_runner class that handles running the separate components of the algorithm and ensures the proper use of RAII buffers, so they are de-allocated once their use is done without being explicitly deleted.

Closes #2751