Fix UMAP and simplicial set functions metric #5490

Merged: 14 commits into rapidsai:branch-23.08 on Aug 3, 2023

Conversation

viclafargue
Contributor

Answers #5422

@viclafargue viclafargue requested a review from a team as a code owner July 4, 2023 11:57
@viclafargue viclafargue added the 3 - Ready for Review Ready for review by team label Jul 11, 2023
@viclafargue viclafargue requested a review from a team as a code owner July 13, 2023 13:22
@github-actions github-actions bot added the Cython / Python Cython or Python issue label Jul 13, 2023
@viclafargue viclafargue changed the title Fix UMAP metric Fix UMAP and simplicial set functions metric Jul 13, 2023
@viclafargue viclafargue requested a review from cjnolet July 13, 2023 15:05
Contributor

@csadorf csadorf left a comment

I have some suggestions, but my biggest concern is whether our current tests are sufficiently capturing the motivating bug.

Resolved review threads: python/cuml/manifold/simpl_set.pyx (4 threads, outdated), python/cuml/tests/test_umap.py (2 threads)
@csadorf csadorf added the improvement Improvement / enhancement to an existing function label Jul 14, 2023
Resolved review threads: python/cuml/manifold/simpl_set.pyx (2 threads, 1 outdated)
Contributor

@csadorf csadorf left a comment

LGTM!

@csadorf
Contributor

csadorf commented Jul 27, 2023

@viclafargue While this fix changes the behavior of the estimator class, I would consider the previous behavior broken. We are now moving towards the intended behavior, so I would not consider this a breaking change. What do you think?

@viclafargue viclafargue added non-breaking Non-breaking change bug Something isn't working and removed improvement Improvement / enhancement to an existing function labels Jul 28, 2023
@@ -62,6 +62,7 @@ inline void launcher(const raft::handle_t& handle,
ptrs[0] = inputsA.X;
sizes[0] = inputsA.n;

std::vector<int64_t>* translations = nullptr;
Contributor

Any reason to introduce a temporary for this?

Contributor Author

It seems the function template can't be instantiated when a nullptr is passed directly, unless a cast is used. I just switched to a cast.

Contributor

@wphicks wphicks left a comment

LGTM!

"correlation": DistanceType.CorrelationExpanded,
"hellinger": DistanceType.HellingerExpanded,
"hamming": DistanceType.HammingUnexpanded,
"jaccard": DistanceType.JaccardExpanded,
Member

Jaccard is supported in the sparse distances. Is there any reason we're not separating the sparse-supported metrics from the dense-supported ones? I can't see why we'd want to stop jaccard from being executed on sparse inputs just because it's not yet provided for dense.
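
For illustration, here is a minimal sketch of the separation being asked for. The `DistanceType` enum below is a stand-in whose member names are copied from the diff above, and the two tables plus the `resolve_metric` helper are hypothetical; apart from jaccard being sparse-only, which metric goes in which table is an assumption, not cuML's actual support matrix.

```python
from enum import Enum, auto


class DistanceType(Enum):
    # Stand-in enum (assumption); member names copied from the diff above.
    CorrelationExpanded = auto()
    HellingerExpanded = auto()
    HammingUnexpanded = auto()
    JaccardExpanded = auto()


# Hypothetical tables. Membership is illustrative, except that jaccard
# is listed for sparse inputs only, as described in the review comment.
DENSE_SUPPORTED_METRICS = {
    "correlation": DistanceType.CorrelationExpanded,
    "hellinger": DistanceType.HellingerExpanded,
}
SPARSE_SUPPORTED_METRICS = {
    "correlation": DistanceType.CorrelationExpanded,
    "hellinger": DistanceType.HellingerExpanded,
    "jaccard": DistanceType.JaccardExpanded,
}


def resolve_metric(metric, is_sparse):
    """Look up a metric name in the table that matches the input type."""
    table = SPARSE_SUPPORTED_METRICS if is_sparse else DENSE_SUPPORTED_METRICS
    try:
        return table[metric.lower()]
    except KeyError:
        kind = "sparse" if is_sparse else "dense"
        raise ValueError(
            f"Metric {metric!r} is not supported for {kind} inputs"
        ) from None
```

With this layout, a metric that is missing from the dense table stays available for sparse inputs, and an unsupported combination fails with a message that names the input type.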

Contributor

@cjnolet This should be addressed now.

Contributor Author

My bad, I didn't see it was used in the sparse case. Thanks for fixing this @csadorf.

Member

@cjnolet cjnolet left a comment

Meant to request changes for the comment above. We should avoid removing features.

@csadorf csadorf requested a review from cjnolet August 2, 2023 22:00
"jaccard",
"hamming",
"canberra",
("l2", True),
Member

We should only have to do the mappings once. If you use strings in SPARSE_SUPPORTED_METRICS and DENSE_SUPPORTED_METRICS, then you can just use the union of the two here instead of having to list them out at all.
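
A rough sketch of what the union approach could look like. `SPARSE_SUPPORTED_METRICS` and `DENSE_SUPPORTED_METRICS` are the names from the comment above; their contents here are illustrative assumptions, and `ALL_METRICS` and `metric_cases` are hypothetical helpers.

```python
# Hypothetical string sets; the members are illustrative, not cuML's lists.
DENSE_SUPPORTED_METRICS = {"l2", "canberra", "hamming"}
SPARSE_SUPPORTED_METRICS = {"l2", "canberra", "hamming", "jaccard"}

# One derived list instead of a second hand-written one.
ALL_METRICS = sorted(DENSE_SUPPORTED_METRICS | SPARSE_SUPPORTED_METRICS)


def metric_cases():
    """Return (metric, dense_ok, sparse_ok) tuples for test parametrization."""
    return [
        (m, m in DENSE_SUPPORTED_METRICS, m in SPARSE_SUPPORTED_METRICS)
        for m in ALL_METRICS
    ]
```

The resulting tuples could be passed to a parametrized test, so that adding a metric to either set automatically extends the test matrix.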

Contributor

Pulling those lists into the test code would be counter-productive IMO, since it couples the implementation to the test expectation, which makes breaking changes harder to detect.
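
To make the trade-off concrete, here is a small sketch under assumed names. `SPARSE_SUPPORTED_METRICS` stands for the implementation-side set with illustrative contents, while `EXPECTED_SPARSE_METRICS` and `missing_metrics` are hypothetical test-side names. Because the expectation is written out literally in the test, an accidental removal on the implementation side shows up as a failure.

```python
# Implementation side (hypothetical contents).
SPARSE_SUPPORTED_METRICS = {"jaccard", "hamming", "canberra", "l2"}

# Test side: expectations are written out literally and are NOT derived
# from the implementation's set.
EXPECTED_SPARSE_METRICS = ("jaccard", "hamming", "canberra", "l2")


def missing_metrics(expected, supported):
    """Return the expected metrics that the implementation no longer lists."""
    return [m for m in expected if m not in supported]


# Current state: nothing is missing.
assert missing_metrics(EXPECTED_SPARSE_METRICS, SPARSE_SUPPORTED_METRICS) == []

# Simulated regression: jaccard is dropped from the implementation.
regressed = SPARSE_SUPPORTED_METRICS - {"jaccard"}
print(missing_metrics(EXPECTED_SPARSE_METRICS, regressed))  # prints ['jaccard']
```

If the test instead iterated over `SPARSE_SUPPORTED_METRICS` itself, the simulated regression would shrink the test matrix silently and nothing would fail.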

Member

@cjnolet cjnolet left a comment

I still think it could be cleaned up a bit, but it's not an urgent issue so long as the rules aren't hardcoded and we aren't losing the jaccard functionality.

@csadorf
Contributor

csadorf commented Aug 2, 2023

/merge

@rapids-bot rapids-bot bot merged commit 6bf61ca into rapidsai:branch-23.08 Aug 3, 2023
Labels
3 - Ready for Review Ready for review by team bug Something isn't working CUDA/C++ Cython / Python Cython or Python issue non-breaking Non-breaking change
5 participants