-
Notifications
You must be signed in to change notification settings - Fork 310
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement eigenvector centrality #2287
Implement eigenvector centrality #2287
Conversation
Codecov Report
@@ Coverage Diff @@
## branch-22.06 #2287 +/- ##
================================================
- Coverage 63.82% 63.69% -0.14%
================================================
Files 100 100
Lines 4484 4481 -3
================================================
- Hits 2862 2854 -8
- Misses 1622 1627 +5
Continue to review full report at Codecov.
|
cpp/include/cugraph/algorithms.hpp
Outdated
void eigenvector_centrality( | ||
raft::handle_t const& handle, | ||
graph_view_t<vertex_t, edge_t, weight_t, true, multi_gpu> const& graph_view, | ||
raft::device_span<weight_t> centralities, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for the sake of discussion,
So, what do you think about passing raft::device_span<weight_t> centralities
as an input argument vs returning rmm::device_uvector<weight_t> holding centrality values?
The former might be more natural when we're passing initial values and we may be able to reduce memory allocations (when we are running PageRank with different personalization vectors, but with the rmm pool allocator, memory allocation overhead might be insignificant) while the latter might be more functional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got the idea of using the span from looking at your new triangle_count
implementation. The [in/out] of centralities is more consistent with what we have been doing. Our paradigm thus far has been to specify the output storage a priori if we can know it, and to allocate it dynamically if we can't know it.
What you are suggesting would be a paradigm shift for the API. I'm not opposed to changing the paradigm.
It seems to me the current paradigm has the following advantages:
- Less memory allocation. The new strategy would require temporarily having an extra vector of length V.
- The caller can use any memory allocator that they choose to allocate the device memory
The new paradigm would have the following advantages:
- More functional in nature
- More consistency (all algorithms would return results the same way, whether the size is predictable or not)
In the grand scheme of memory things, I'm not all that concerned over allocating an extra result array temporarily. It seems to me that the functional feel of the proposed paradigm is useful and consistency in how algorithms behave across the interface is always better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case I can certainly change raft::device_span<weight_t> centralities
to std::optional< raft::device_span<weight_t>> centralities
to support an optional input, and make the return value rmm::device_uvector<weight_t>
cpp/include/cugraph/algorithms.hpp
Outdated
* @param handle RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and | ||
* handles to various CUDA libraries) to run graph algorithms. | ||
* @param graph_view Graph view object. | ||
* @param centralities Device span where we should store the eigenvector centralities |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we pass initial values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add that support. Missed that.
#include <rmm/exec_policy.hpp> | ||
|
||
#include <thrust/fill.h> | ||
#include <thrust/for_each.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably not, copy/paste. I'll check all the headers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't forget to delete this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
thrust::fill(handle.get_thrust_policy(), | ||
centralities.begin(), | ||
centralities.end(), | ||
weight_t{1.0} / static_cast<weight_t>(num_vertices)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NetworkX supports passing initial values (https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.centrality.eigenvector_centrality.html). Shouldn't we support the same (we support initial values for PageRank).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will add, missed that.
Pushed an update to address @seunghwak comments |
@gpucibot merge |
This PR implements Eigenvector Centrality in C++ using the graph primitives. It also provides the C API implementation.
There are unit tests for C++ and C both SG and MG.
Partially addresses #2146