Define k-core API and tests #2712
```cpp
@@ -1606,6 +1606,37 @@ void core_number(raft::handle_t const& handle,
                  size_t k_last = std::numeric_limits<size_t>::max(),
                  bool do_expensive_check = false);

/**
 * @brief Extract K Core of a graph
 *
 * @throws cugraph::logic_error when an error occurs.
 *
 * @tparam vertex_t Type of vertex identifiers. Needs to be an integral type.
 * @tparam edge_t Type of edge identifiers. Needs to be an integral type.
 * @tparam weight_t Type of edge weights. Needs to be a floating point type.
 * @tparam multi_gpu Flag indicating whether template instantiation should target single-GPU (false)
 * @param graph_view Graph view object.
 * @param k Order of the core. This value must not be negative.
 * @param degree_type Optional parameter to dictate whether to compute the K-core decomposition
 * based on in-degrees, out-degrees, or in-degrees + out-degrees. One of @p
 * degree_type and @p core_numbers must be specified.
 * @param core_numbers Optional output from the core_number algorithm. If not specified then
 * k_core will call core_number itself using @p degree_type.
 * @param do_expensive_check A flag to run expensive checks for input arguments (if set to `true`).
 *
 * @return edge list for the graph
 */
template <typename vertex_t, typename edge_t, typename weight_t, bool multi_gpu>
std::tuple<rmm::device_uvector<vertex_t>,
           rmm::device_uvector<vertex_t>,
           std::optional<rmm::device_uvector<weight_t>>>
k_core(raft::handle_t const& handle,
       graph_view_t<vertex_t, edge_t, weight_t, false, multi_gpu> const& graph_view,
       size_t k,
       std::optional<k_core_degree_type_t> degree_type,
       std::optional<raft::device_span<edge_t const>> core_numbers,
       bool do_expensive_check = false);
```

Review thread on `k_core`:

Something to think about... We're passing core numbers here, so should we really have both this function and induced subgraph? Given core numbers, we can easily extract a vertex list in a single thrust call (e.g. `thrust::copy_if`), then pass that vertex list to induced subgraph. I'm not sure this function adds enough convenience to justify the increase in compile time/binary size.

Maybe OK: if this calls the explicitly instantiated induced_subgraph, the increase in compile time/binary size might be minimal... Still debating whether this function should take core_numbers or just call core_number internally (to maximize user convenience; if not, a user can just separately call core_number, something like `thrust::copy_if`, and induced_subgraph).

I had that debate internally. Ultimately I concluded that if we're going to expose a complete k-core implementation at the Python level, we should probably expose it at the lower levels as well. It seems like something that would be useful. I do wonder if we should make passing in

Yeah... +1 for taking

Still a bit unsure about "Ultimately I concluded that if we're going to expose a complete k-core implementation at the Python level we should probably expose it at the lower levels as well." At the Python layer, user convenience might be more important than anything else. And providing an additional Python function that internally calls core_number, finds the vertex list, and calls induced_subgraph has no toll on compile time/binary size, as Python is an interpreted language. I'm not sure we should provide the same level of convenience for C++ users, as we assume C++ users are more advanced and might be OK with finding the k-core by composing core_number and induced_subgraph. But this is a much bigger topic, and we may have this discussion again in the future.

And another thing to consider: this may incur a dependency within the same algorithm level of our C++ software hierarchy. Most algorithms are composed of primitives and thrust calls, but now algorithms would be composed of primitives, thrust calls, and other algorithms. I'm not sure what complications this could incur in the future. Composing C++ algorithms in the C layer might be OK, but I'm not sure whether we should support use cases that can be supported simply by composing a few existing algorithms at the C++ level.

Fair point. I will change
We can explore dropping the
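The composition the reviewers describe (compute core numbers, filter vertices with a `copy_if`-style pass, then take the induced subgraph) can be sketched on the host with standard C++. This is a minimal illustration of the pattern only, with hypothetical names; the actual library performs these steps on the GPU over device vectors.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Core numbers of an undirected graph via the standard peeling algorithm:
// repeatedly remove a minimum-degree vertex; its core number is the largest
// minimum degree seen so far.
std::vector<std::size_t> core_numbers(std::size_t n,
                                      std::vector<std::pair<int, int>> const& edges)
{
  std::vector<std::vector<int>> adj(n);
  for (auto const& [u, v] : edges) {
    adj[u].push_back(v);
    adj[v].push_back(u);
  }

  std::vector<std::size_t> degree(n);
  for (std::size_t v = 0; v < n; ++v) degree[v] = adj[v].size();

  std::vector<std::size_t> core(n, 0);
  std::vector<bool> removed(n, false);
  std::size_t k = 0;
  for (std::size_t step = 0; step < n; ++step) {
    // Pick the remaining vertex of minimum degree.
    std::size_t best = n;
    for (std::size_t v = 0; v < n; ++v)
      if (!removed[v] && (best == n || degree[v] < degree[best])) best = v;
    k = std::max(k, degree[best]);
    core[best] = k;
    removed[best] = true;
    for (int w : adj[best])
      if (!removed[w]) --degree[w];
  }
  return core;
}

// The "copy_if then induced subgraph" step: keep exactly the edges whose
// endpoints both have core number >= k.
std::vector<std::pair<int, int>> k_core_edges(std::vector<std::pair<int, int>> const& edges,
                                              std::vector<std::size_t> const& core,
                                              std::size_t k)
{
  std::vector<std::pair<int, int>> out;
  std::copy_if(edges.begin(), edges.end(), std::back_inserter(out),
               [&](auto const& e) { return core[e.first] >= k && core[e.second] >= k; });
  return out;
}
```

For example, on a triangle {0, 1, 2} with a pendant vertex 3 attached to 0, the triangle vertices get core number 2, the pendant gets 1, and the 2-core keeps only the three triangle edges.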
```cpp
/**
 * @brief Uniform Neighborhood Sampling.
 *
```
Review thread on the `store_transposed` orientation:

Would it be a better idea to pass `bool transpose = false` as a template parameter and then use transpose as a named parameter to the function, instead of an implicit false?

We try to avoid excess compilation of differently templated functions, as this increases compile time. Most of our functions are implemented based on an assumption about whether the graph is stored transposed or not. For example, `pagerank` assumes that `store_transposed = true`. Implementing `pagerank` with `store_transposed = false` would result in a great deal more synchronization (and communication in a multi-GPU environment). There are a few examples that will work on a graph in either orientation. Supporting this would require providing an implementation that works with either orientation of the graph. It seems that as long as this matches the requirement for `core_number` (which has to be called first), just one orientation is sufficient.