Define k-core API and tests #2712

Merged · 7 commits · Sep 26, 2022
Changes from 5 commits
3 changes: 3 additions & 0 deletions cpp/CMakeLists.txt
@@ -197,6 +197,8 @@ set(CUGRAPH_SOURCES
src/cores/legacy/core_number.cu
src/cores/core_number_sg.cu
src/cores/core_number_mg.cu
src/cores/k_core_sg.cu
src/cores/k_core_mg.cu
src/traversal/two_hop_neighbors.cu
src/components/legacy/connectivity.cu
src/centrality/betweenness_centrality.cu
@@ -335,6 +337,7 @@ add_library(cugraph_c
src/c_api/eigenvector_centrality.cpp
src/c_api/core_number.cpp
src/c_api/core_result.cpp
src/c_api/k_core.cpp
src/c_api/hits.cpp
src/c_api/bfs.cpp
src/c_api/sssp.cpp
31 changes: 31 additions & 0 deletions cpp/include/cugraph/algorithms.hpp
@@ -1606,6 +1606,37 @@ void core_number(raft::handle_t const& handle,
size_t k_last = std::numeric_limits<size_t>::max(),
bool do_expensive_check = false);

/**
* @brief Extract K Core of a graph
*
* @throws cugraph::logic_error when an error occurs.
*
* @tparam vertex_t Type of vertex identifiers. Needs to be an integral type.
* @tparam edge_t Type of edge identifiers. Needs to be an integral type.
* @tparam weight_t Type of edge weights. Needs to be a floating point type.
* @tparam multi_gpu Flag indicating whether template instantiation should target single-GPU (false)
* or multi-GPU (true)
* @param graph_view Graph view object.
* @param k Order of the core. This value must not be negative.
* @param degree_type Optional parameter to dictate whether to compute the K-core decomposition
* based on in-degrees, out-degrees, or in-degrees + out-degrees. One of @p
* degree_type and @p core_numbers must be specified.
* @param core_numbers Optional output from core_number algorithm. If not specified then
* k_core will call core_number itself using @p degree_type.
* @param do_expensive_check A flag to run expensive checks for input arguments (if set to `true`).
*
* @return edge list for the graph
*/
template <typename vertex_t, typename edge_t, typename weight_t, bool multi_gpu>
Contributor:
Would it be a better idea to pass bool transpose=false as a template parameter and then use transpose as a named parameter to the function, instead of an implicit false?

Collaborator Author:

We try to avoid excess compilation of different templated functions, as this increases compile time.

Most of our functions are implemented based on an assumption about whether the graph is stored transposed or not. For example, pagerank assumes that store_transposed=true. Implementing pagerank with store_transposed=false would result in a great deal more synchronization (and communication in a multi-GPU environment). There are a few examples that will work on a graph in either orientation.

Supporting this would require providing an implementation that works with either orientation of the graph. It seems like, as long as this matches the requirement for core_number (which has to be called first), just one orientation is sufficient.

std::tuple<rmm::device_uvector<vertex_t>,
rmm::device_uvector<vertex_t>,
std::optional<rmm::device_uvector<weight_t>>>
k_core(raft::handle_t const& handle,
Contributor:

Something to think about...

We're passing core numbers here. In that case, should we really have both this function and induced subgraph?

Given core numbers, we can easily extract a vertex list in a single thrust call (e.g. thrust::copy_if). Then we can pass the vertex list to induced subgraph.

Not sure this function adds enough convenience to justify the increase in compile time/binary size.

Contributor:

Maybe OK. If this calls the explicitly instantiated induced_subgraph, the increase in compile time/binary size might be minimal...

Still debating whether this function should take core_numbers or just call core_numbers internally (to maximize user convenience; if not, a user can separately call core_numbers, something like thrust::copy_if, and induced_subgraph).

Collaborator Author:

I had that debate internally. Ultimately I concluded that if we're going to expose a complete k-core implementation at the python level we should probably expose it at the lower levels as well. It seems like something that would be useful.

I do wonder if we should make passing in core_numbers optional and, if they are not passed, call core_number internally. This would allow a simple caller that just wants to extract a 3-core from the graph to call the function directly and get the complete answer, while a more sophisticated caller might call core_number once and reuse the result to extract different subgraphs as required.

Contributor:

Yeah... +1 for taking core_numbers as std::optional.

Still a bit unsure about "Ultimately I concluded that if we're going to expose a complete k-core implementation at the python level we should probably expose it at the lower levels as well."

At the python layer, user convenience might be more important than anything else. And providing an additional python function that internally calls core_number, finds a vertex list, and calls induced subgraph adds nothing to compile time/binary size, as python is an interpreted language.

Not sure we should provide the same level of convenience for C++ users, as we assume C++ users are more advanced and might be OK with finding the k-core by composing core_numbers and induced subgraph.

But this is a much bigger topic, and we may have this discussion again in the future.

Contributor:

And another thing to consider: this may incur a dependency within the same algorithm level in our C++ software hierarchy. Most algorithms are composed of primitives and thrust calls. But now algorithms would be composed of primitives, thrust calls, and other algorithms. I am not sure what kind of complications this could incur in the future.

Composing C++ algorithms in the C layer might be OK, but I am not sure whether we should support use cases that can be satisfied simply by composing a few existing algorithms at the C++ level.

Collaborator Author:

Fair point.

I will change core_numbers to be optional.

We can explore dropping k_core from the C++ layer if having algorithms depend on other algorithms becomes an issue. I see the point that adding it at the C layer may be sufficient.

graph_view_t<vertex_t, edge_t, weight_t, false, multi_gpu> const& graph_view,
size_t k,
std::optional<k_core_degree_type_t> degree_type,
std::optional<raft::device_span<edge_t const>> core_numbers,
bool do_expensive_check = false);

/**
* @brief Uniform Neighborhood Sampling.
*
18 changes: 18 additions & 0 deletions cpp/include/cugraph_c/array.h
@@ -56,6 +56,24 @@ cugraph_error_code_t cugraph_type_erased_device_array_create(
cugraph_type_erased_device_array_t** array,
cugraph_error_t** error);

/**
* @brief Create a type erased device array from a view
*
* Copies the data from the view into the new device array
*
* @param [in] handle Handle for accessing resources
* @param [in] view Type erased device array view to copy from
* @param [out] array Pointer to the location to store the pointer to the device array
* @param [out] error Pointer to an error object storing details of any error. Will
* be populated if error code is not CUGRAPH_SUCCESS
* @return error code
*/
cugraph_error_code_t cugraph_type_erased_device_array_create_from_view(
const cugraph_resource_handle_t* handle,
const cugraph_type_erased_device_array_view_t* view,
cugraph_type_erased_device_array_t** array,
cugraph_error_t** error);

/**
* @brief Destroy a type erased device array
*
89 changes: 88 additions & 1 deletion cpp/include/cugraph_c/core_algorithms.h
@@ -32,6 +32,31 @@ typedef struct {
int32_t align_;
} cugraph_core_result_t;

/**
* @brief Opaque k-core result type
*/
typedef struct {
int32_t align_;
} cugraph_k_core_result_t;

/**
* @brief Create a core_number result (in case it was previously extracted)
*
* @param [in] handle Handle for accessing resources
* @param [in] vertices Vertex ids from a prior core number computation
* @param [in] core_numbers Core numbers from a prior core number computation
* @param [out] core_result Opaque pointer to core number results
* @param [out] error Pointer to an error object storing details of any error. Will
* be populated if error code is not CUGRAPH_SUCCESS
* @return error code
*/
cugraph_error_code_t cugraph_core_result_create(
const cugraph_resource_handle_t* handle,
cugraph_type_erased_device_array_view_t* vertices,
cugraph_type_erased_device_array_view_t* core_numbers,
cugraph_core_result_t** core_result,
cugraph_error_t** error);

/**
* @brief Get the vertex ids from the core result
*
@@ -57,6 +82,42 @@ cugraph_type_erased_device_array_view_t* cugraph_core_result_get_core_numbers(
*/
void cugraph_core_result_free(cugraph_core_result_t* result);

/**
* @brief Get the src vertex ids from the k-core result
*
* @param [in] result The result from k-core
* @return type erased array of src vertex ids
*/
cugraph_type_erased_device_array_view_t* cugraph_k_core_result_get_src_vertices(
cugraph_k_core_result_t* result);

/**
* @brief Get the dst vertex ids from the k-core result
*
* @param [in] result The result from k-core
* @return type erased array of dst vertex ids
*/
cugraph_type_erased_device_array_view_t* cugraph_k_core_result_get_dst_vertices(
cugraph_k_core_result_t* result);

/**
* @brief Get the weights from the k-core result
*
* Returns NULL if the graph is unweighted
*
* @param [in] result The result from k-core
* @return type erased array of weights
*/
cugraph_type_erased_device_array_view_t* cugraph_k_core_result_get_weights(
cugraph_k_core_result_t* result);

/**
* @brief Free k-core result
*
* @param [in] result The result from k-core
*/
void cugraph_k_core_result_free(cugraph_k_core_result_t* result);

/**
* @brief Enumeration for computing core number
*/
@@ -74,7 +135,7 @@ typedef enum {
* @param [in] degree_type Compute core_number using in, out or both in and out edges
* @param [in] do_expensive_check A flag to run expensive checks for input arguments (if set to
* `true`).
* @param [out] result Opaque pointer to paths results
* @param [out] result Opaque pointer to core number results
* @param [out] error Pointer to an error object storing details of any error. Will
* be populated if error code is not CUGRAPH_SUCCESS
* @return error code
@@ -86,6 +147,32 @@ cugraph_error_code_t cugraph_core_number(const cugraph_resource_handle_t* handle,
cugraph_core_result_t** result,
cugraph_error_t** error);

/**
* @brief Perform k_core using output from core_number
*
* @param [in] handle Handle for accessing resources
* @param [in] graph Pointer to graph
* @param [in] k The value of k to use
* @param [in] degree_type Compute core_number using in, out or both in and out edges.
* Ignored if core_result is specified.
* @param [in] core_result Result from calling cugraph_core_number; if NULL, core
* numbers will be computed inside this function call.
* @param [in] do_expensive_check A flag to run expensive checks for input arguments (if set to
* `true`).
* @param [out] result Opaque pointer to k_core results
* @param [out] error Pointer to an error object storing details of any error. Will
* be populated if error code is not CUGRAPH_SUCCESS
* @return error code
*/
cugraph_error_code_t cugraph_k_core(const cugraph_resource_handle_t* handle,
cugraph_graph_t* graph,
size_t k,
cugraph_k_core_degree_type_t degree_type,
const cugraph_core_result_t* core_result,
bool_t do_expensive_check,
cugraph_k_core_result_t** result,
cugraph_error_t** error);

#ifdef __cplusplus
}
#endif
39 changes: 39 additions & 0 deletions cpp/src/c_api/array.cpp
@@ -31,6 +31,45 @@ size_t data_type_sz[] = {4, 8, 4, 8};
} // namespace c_api
} // namespace cugraph

extern "C" cugraph_error_code_t cugraph_type_erased_device_array_create_from_view(
const cugraph_resource_handle_t* handle,
const cugraph_type_erased_device_array_view_t* view,
cugraph_type_erased_device_array_t** array,
cugraph_error_t** error)
{
*array = nullptr;
*error = nullptr;

try {
if (!handle) {
*error = reinterpret_cast<cugraph_error_t*>(
new cugraph::c_api::cugraph_error_t{"invalid resource handle"});
return CUGRAPH_INVALID_HANDLE;
}

auto p_handle = reinterpret_cast<cugraph::c_api::cugraph_resource_handle_t const*>(handle);
auto internal_pointer =
reinterpret_cast<cugraph::c_api::cugraph_type_erased_device_array_view_t const*>(view);

size_t n_bytes =
internal_pointer->size_ * (cugraph::c_api::data_type_sz[internal_pointer->type_]);

auto ret_value = new cugraph::c_api::cugraph_type_erased_device_array_t(
internal_pointer->size_, n_bytes, internal_pointer->type_, p_handle->handle_->get_stream());

raft::copy(reinterpret_cast<byte_t*>(ret_value->data_.data()),
reinterpret_cast<byte_t const*>(internal_pointer->data_),
internal_pointer->num_bytes(),
p_handle->handle_->get_stream());

*array = reinterpret_cast<cugraph_type_erased_device_array_t*>(ret_value);
return CUGRAPH_SUCCESS;
} catch (std::exception const& ex) {
*error = reinterpret_cast<cugraph_error_t*>(new cugraph::c_api::cugraph_error_t{ex.what()});
return CUGRAPH_UNKNOWN_ERROR;
}
}

extern "C" cugraph_error_code_t cugraph_type_erased_device_array_create(
const cugraph_resource_handle_t* handle,
size_t n_elems,
12 changes: 12 additions & 0 deletions cpp/src/c_api/array.hpp
@@ -68,6 +68,18 @@ struct cugraph_type_erased_device_array_t {
{
}

template <typename T>
T* as_type()
{
return reinterpret_cast<T*>(data_.data());
}

template <typename T>
T const * as_type() const
{
return reinterpret_cast<T const*>(data_.data());
}

auto view()
{
return new cugraph_type_erased_device_array_view_t{data_.data(), size_, data_.size(), type_};
66 changes: 66 additions & 0 deletions cpp/src/c_api/core_result.cpp
@@ -18,6 +18,39 @@

#include <c_api/core_result.hpp>

extern "C" cugraph_error_code_t cugraph_core_result_create(
const cugraph_resource_handle_t* handle,
cugraph_type_erased_device_array_view_t* vertices,
cugraph_type_erased_device_array_view_t* core_numbers,
cugraph_core_result_t** core_result,
cugraph_error_t** error)
{
cugraph_error_code_t error_code{CUGRAPH_SUCCESS};

cugraph::c_api::cugraph_type_erased_device_array_t* vertices_copy;
cugraph::c_api::cugraph_type_erased_device_array_t* core_numbers_copy;

error_code = cugraph_type_erased_device_array_create_from_view(
handle,
vertices,
reinterpret_cast<cugraph_type_erased_device_array_t**>(&vertices_copy),
error);
if (error_code == CUGRAPH_SUCCESS) {
error_code = cugraph_type_erased_device_array_create_from_view(
handle,
core_numbers,
reinterpret_cast<cugraph_type_erased_device_array_t**>(&core_numbers_copy),
error);

if (error_code == CUGRAPH_SUCCESS) {
auto internal_pointer =
new cugraph::c_api::cugraph_core_result_t{vertices_copy, core_numbers_copy};
*core_result = reinterpret_cast<cugraph_core_result_t*>(internal_pointer);
}
}
return error_code;
}

extern "C" cugraph_type_erased_device_array_view_t* cugraph_core_result_get_vertices(
cugraph_core_result_t* result)
{
@@ -41,3 +74,36 @@ extern "C" void cugraph_core_result_free(cugraph_core_result_t* result)
delete internal_pointer->core_numbers_;
delete internal_pointer;
}

cugraph_type_erased_device_array_view_t* cugraph_k_core_result_get_src_vertices(
cugraph_k_core_result_t* result)
{
auto internal_pointer = reinterpret_cast<cugraph::c_api::cugraph_k_core_result_t*>(result);
return reinterpret_cast<cugraph_type_erased_device_array_view_t*>(
internal_pointer->src_vertices_->view());
}

cugraph_type_erased_device_array_view_t* cugraph_k_core_result_get_dst_vertices(
cugraph_k_core_result_t* result)
{
auto internal_pointer = reinterpret_cast<cugraph::c_api::cugraph_k_core_result_t*>(result);
return reinterpret_cast<cugraph_type_erased_device_array_view_t*>(
internal_pointer->dst_vertices_->view());
}

cugraph_type_erased_device_array_view_t* cugraph_k_core_result_get_weights(
cugraph_k_core_result_t* result)
{
auto internal_pointer = reinterpret_cast<cugraph::c_api::cugraph_k_core_result_t*>(result);
return reinterpret_cast<cugraph_type_erased_device_array_view_t*>(
internal_pointer->weights_->view());
}

void cugraph_k_core_result_free(cugraph_k_core_result_t* result)
{
auto internal_pointer = reinterpret_cast<cugraph::c_api::cugraph_k_core_result_t*>(result);
delete internal_pointer->src_vertices_;
delete internal_pointer->dst_vertices_;
delete internal_pointer->weights_;
delete internal_pointer;
}
6 changes: 6 additions & 0 deletions cpp/src/c_api/core_result.hpp
@@ -26,5 +26,11 @@ struct cugraph_core_result_t {
cugraph_type_erased_device_array_t* core_numbers_{};
};

struct cugraph_k_core_result_t {
cugraph_type_erased_device_array_t* src_vertices_{};
cugraph_type_erased_device_array_t* dst_vertices_{};
cugraph_type_erased_device_array_t* weights_{};
};

} // namespace c_api
} // namespace cugraph