Implement new All Pairs similarity algorithm #4134

Closed

ChuckHastings
Collaborator

Added a new entry point for similarity functionality that combines the functionality of k_hop_nbrs and similarity.

This entry point computes similarity for all pairs of vertices in the graph in a single call. It also adds an optional parameter, topk; when specified, only the highest-scoring vertex pairs are returned. When topk is specified on an all-pairs call, the scores are computed in batches and the top k results are extracted after each batch, which keeps the memory footprint low.
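For readers who want a concrete picture of the batching idea described above, here is a minimal pure-Python sketch. It is not the cuGraph implementation or its API: the function names (`jaccard`, `two_hop_pairs`, `all_pairs_topk`), the `batch_size` parameter, the use of Jaccard as the metric, and the choice to restrict candidates to 2-hop pairs are all assumptions made for illustration.

```python
import heapq
from itertools import islice


def jaccard(adj, u, v):
    """Jaccard similarity of the neighbor sets of u and v (assumed metric)."""
    a, b = adj[u], adj[v]
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def two_hop_pairs(adj):
    """Yield each unordered pair (u, v), u < v, that shares at least one neighbor."""
    for u in sorted(adj):
        candidates = set()
        for w in adj[u]:
            candidates |= adj[w]  # vertices reachable from u in exactly two steps
        for v in sorted(candidates):
            if u < v:
                yield (u, v)


def all_pairs_topk(adj, topk, batch_size=2):
    """Score candidate pairs batch by batch, keeping only the topk best so far.

    Hypothetical helper: only `topk` results plus one batch of pairs are
    held in memory at any time, instead of every pair's score.
    """
    best = []  # min-heap of (score, pair); never grows beyond topk entries
    pairs = two_hop_pairs(adj)
    while True:
        batch = list(islice(pairs, batch_size))
        if not batch:
            break
        for u, v in batch:
            item = (jaccard(adj, u, v), (u, v))
            if len(best) < topk:
                heapq.heappush(best, item)
            elif item > best[0]:
                heapq.heapreplace(best, item)  # evict the current weakest entry
    return sorted(best, reverse=True)


# Small undirected graph with edges 0-1, 0-2, 1-2, 2-3, stored as neighbor sets.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
result = all_pairs_topk(adj, topk=2)
```

In this sketch the result does not depend on `batch_size`; the batch size only bounds how many candidate pairs are materialized at once, which is the memory trade-off the description refers to.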

This PR also updates a FIXME in the C++ similarity test. The C++ similarity test had been written before we had a k_hop_nbrs call, so there was some inefficient test code to compute that. Now that we have a k_hop_nbrs call, the test code was refactored to use that call.

@ChuckHastings ChuckHastings self-assigned this Feb 1, 2024
@ChuckHastings ChuckHastings added the improvement (Improvement / enhancement to an existing function) and non-breaking (Non-breaking change) labels and removed the cuGraph and CMake labels Feb 1, 2024
@ChuckHastings ChuckHastings added this to the 24.04 milestone Feb 1, 2024
@@ -277,6 +277,7 @@ class Tests_Multithreaded

std::tie(std::ignore, modularity) = cugraph::louvain<vertex_t, edge_t, weight_t, true>(
thread_handle.raft_handle(),
std::nullopt,
Collaborator Author

We somehow missed this in the ECG update... we don't regularly compile the MTMG code in CI.

@github-actions github-actions bot added the python label Feb 7, 2024
@ChuckHastings
Collaborator Author

This branch got corrupted and pushed. Closing with a new PR from a cleaner branch.

rapids-bot bot pushed a commit that referenced this pull request Mar 7, 2024
Supersedes PR #4134

Authors:
  - Chuck Hastings (https://github.com/ChuckHastings)

Approvers:
  - Seunghwa Kang (https://github.com/seunghwak)

URL: #4158