Define and Implement C API for biased sampling #4535

Merged
2 changes: 1 addition & 1 deletion cpp/CMakeLists.txt
@@ -652,7 +652,7 @@ add_library(cugraph_c
src/c_api/lookup_src_dst.cpp
src/c_api/louvain.cpp
src/c_api/triangle_count.cpp
- src/c_api/uniform_neighbor_sampling.cpp
+ src/c_api/neighbor_sampling.cpp
src/c_api/labeling_result.cpp
src/c_api/weakly_connected_components.cpp
src/c_api/strongly_connected_components.cpp
146 changes: 146 additions & 0 deletions cpp/include/cugraph_c/properties.h
@@ -0,0 +1,146 @@
/*
* Copyright (c) 2024, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

#pragma once

//
// Speculative description of handling generic vertex and edge properties.
//
// If we have vertex properties and edge properties that we want to apply to an existing graph
// (after it was created) we could use these methods to construct C++ objects to represent these
// properties.
//
// These assume the use of external vertex ids and external edge ids as the mechanism for
// correlating a property to a particular vertex or edge.
//

#include <cugraph_c/resource_handle.h>

#ifdef __cplusplus
extern "C" {
#endif

typedef struct {
int32_t align_;
} cugraph_vertex_property_t;

typedef struct {
int32_t align_;
} cugraph_edge_property_t;

typedef struct {
int32_t align_;
} cugraph_vertex_property_view_t;

typedef struct {
int32_t align_;
} cugraph_edge_property_view_t;

#if 0
// Blocking out definition of these since this is speculative work.

/**
* @brief Create a vertex property
*
* @param [in] handle Handle for accessing resources
* @param [in] graph Pointer to graph.
* @param [in] vertex_ids Device array of vertex ids
* @param [in] properties Device array of vertex property values
* @param [out] result Pointer to the location to store the pointer to the vertex property object
* @param [out] error Pointer to an error object storing details of any error. Will
* be populated if error code is not CUGRAPH_SUCCESS
* @return error code
*/
cugraph_error_code_t cugraph_vertex_property_create(
const cugraph_resource_handle_t* handle,
const cugraph_graph_t* graph,
const cugraph_type_erased_device_array_t* vertex_ids,
const cugraph_type_erased_device_array_t* properties,
cugraph_vertex_property_t** result,
cugraph_error_t** error);

/**
* @brief Create an edge property
*
* @param [in] handle Handle for accessing resources
* @param [in] graph Pointer to graph.
* @param [in] lookup_container Lookup map
* @param [in] edge_ids Device array of edge ids
* @param [in] properties Device array of edge property values
* @param [out] result Pointer to the location to store the pointer to the edge property object
* @param [out] error Pointer to an error object storing details of any error. Will
* be populated if error code is not CUGRAPH_SUCCESS
* @return error code
*/
cugraph_error_code_t cugraph_edge_property_create(
const cugraph_resource_handle_t* handle,
const cugraph_graph_t* graph,
const cugraph_lookup_container_t* lookup_container,
const cugraph_type_erased_device_array_t* edge_ids,
const cugraph_type_erased_device_array_t* properties,
Contributor

What happens if edge IDs are provided only for a subset of edges? Will those edges have undefined values? Should we ask for default values?

And no need for an update function? (e.g. update bias values only for a small subset of edges).

Collaborator Author

Reasonable ideas. I'll look at defining update functions and a mechanism for a default value. I'll add it to both vertex and edge functions.

Contributor

Just FYI: at the C++ primitives level, we have fill_edge_property_value (which sets edge property values for the entire set of edges; we may add a function that works on a subset of the edges in the future) and update_edge_property, which works on either a subset of the edges or the entire set of edges. Following this may make implementing the API easier.

Collaborator Author

I pushed an update that adds these functions. Plan is to implement them at some point in the future.
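
The semantics discussed in this thread (a default value for edges that were not given one, plus an update that touches only a subset of edges) can be sketched on the host with plain arrays. This is only an illustration under stated assumptions: `fill_property` and `update_property` are hypothetical names, the property is a dense `double` array indexed directly by edge id, and nothing here is the cuGraph C API or the C++ primitives mentioned above.

```c
#include <stddef.h>

/* Hypothetical host-side stand-in for an edge property: one value per edge,
 * indexed directly by a dense edge id in [0, num_edges). */

/* Set every edge's property to a default value. */
void fill_property(double* values, size_t num_edges, double default_value)
{
  for (size_t e = 0; e < num_edges; ++e) {
    values[e] = default_value;
  }
}

/* Overwrite the property for a subset of edges. Returns 0 on success, or -1
 * if any edge id is out of range (in which case nothing is modified). */
int update_property(double* values,
                    size_t num_edges,
                    const size_t* edge_ids,
                    const double* new_values,
                    size_t num_updates)
{
  /* Validate first so a bad id cannot leave a partial update behind. */
  for (size_t i = 0; i < num_updates; ++i) {
    if (edge_ids[i] >= num_edges) { return -1; }
  }
  for (size_t i = 0; i < num_updates; ++i) {
    values[edge_ids[i]] = new_values[i];
  }
  return 0;
}
```

With this split, a caller that supplies values for only some edges would first fill the whole array with the default and then apply the subset update, so no edge is left with an undefined value.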

cugraph_edge_property_t** result,
cugraph_error_t** error);

/**
* @brief Create a vertex_property_view from a vertex property
*
* @param [in] vertex_property Pointer to the vertex property object
* @return Pointer to the view of the vertex property
*/
cugraph_vertex_property_view_t* cugraph_vertex_property_view(
cugraph_vertex_property_t* vertex_property);

/**
* @brief Create an edge_property_view from an edge property
*
* @param [in] edge_property Pointer to the edge property object
* @return Pointer to the view of the edge property
*/
cugraph_edge_property_view_t* cugraph_edge_property_view(
cugraph_edge_property_t* edge_property);

/**
* @brief Destroy a vertex_property object
*
* @param [in] p Pointer to the vertex_property object
*/
void cugraph_vertex_property_free(cugraph_vertex_property_t* p);

/**
* @brief Destroy an edge_property object
*
* @param [in] p Pointer to the edge_property object
*/
void cugraph_edge_property_free(cugraph_edge_property_t* p);

/**
* @brief Destroy a vertex_property_view object
*
* @param [in] p Pointer to the vertex_property_view object
*/
void cugraph_vertex_property_view_free(cugraph_vertex_property_view_t* p);

/**
* @brief Destroy an edge_property_view object
*
* @param [in] p Pointer to the edge_property_view object
*/
void cugraph_edge_property_view_free(cugraph_edge_property_view_t* p);
#endif

#ifdef __cplusplus
}
#endif
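
The placeholder structs in this header, each holding only an `align_` member, follow the opaque-handle pattern: callers hold a pointer to a public type that reveals nothing, and the implementation casts it back to its private type. A minimal sketch of that pattern in plain C follows. Every `example_*` name is hypothetical, and the private layout is an assumption for illustration, not what cuGraph stores behind these handles.

```c
#include <stdint.h>
#include <stdlib.h>

/* Public side: callers only ever see a pointer to this placeholder. */
typedef struct {
  int32_t align_;
} example_property_t;

/* Private side (hypothetical): the real storage behind the handle. */
typedef struct {
  size_t size;
  double* values;
} example_property_impl_t;

/* Allocate the private object and hand it out as an opaque pointer. */
example_property_t* example_property_create(size_t size, double init)
{
  example_property_impl_t* impl = malloc(sizeof *impl);
  if (impl == NULL) { return NULL; }
  impl->values = malloc(size * sizeof *impl->values);
  if (impl->values == NULL && size > 0) {
    free(impl);
    return NULL;
  }
  impl->size = size;
  for (size_t i = 0; i < size; ++i) {
    impl->values[i] = init;
  }
  return (example_property_t*)impl;
}

/* Accessors cast back to the private type; the placeholder is never read. */
size_t example_property_size(const example_property_t* p)
{
  return ((const example_property_impl_t*)p)->size;
}

double example_property_get(const example_property_t* p, size_t i)
{
  return ((const example_property_impl_t*)p)->values[i];
}

/* Release the private object; passing NULL is a no-op. */
void example_property_free(example_property_t* p)
{
  example_property_impl_t* impl = (example_property_impl_t*)p;
  if (impl != NULL) {
    free(impl->values);
    free(impl);
  }
}
```

The benefit of this design is that the private layout can change without breaking callers, since they never depend on the size or members of the public struct.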
60 changes: 60 additions & 0 deletions cpp/include/cugraph_c/sampling_algorithms.h
@@ -18,6 +18,7 @@

#include <cugraph_c/error.h>
#include <cugraph_c/graph.h>
#include <cugraph_c/properties.h>
#include <cugraph_c/random.h>
#include <cugraph_c/resource_handle.h>

@@ -373,6 +374,65 @@ cugraph_error_code_t cugraph_uniform_neighbor_sample(
cugraph_sample_result_t** result,
cugraph_error_t** error);

/**
* @brief Biased Neighborhood Sampling
*
* Returns a sample of the neighborhood around specified start vertices. Optionally, each
* start vertex can be associated with a label, allowing the caller to specify multiple batches
* of sampling requests in the same function call - which should improve GPU utilization.
*
* If label is NULL then all start vertices will be considered part of the same batch and the
* return value will not have a label column.
*
* @param [in] handle Handle for accessing resources
* @param [in] graph Pointer to graph. NOTE: Graph might be modified if the storage
* needs to be transposed
* @param [in] edge_biases Edge property view holding the edge biases to use for sampling. If
* NULL, use the edge weight as the bias. NOTE: This is a placeholder for future capability; the
* value for edge_biases should always be set to NULL at the moment.
* @param [in] start_vertices Device array of start vertices for the sampling
* @param [in] start_vertex_labels Device array of start vertex labels for the sampling. The
* labels associated with each start vertex will be included in the output associated with results
* that were derived from that start vertex. We only support label of type INT32. If label is
* NULL, the return data will not be labeled.
* @param [in] label_list Device array of the labels included in @p start_vertex_labels. If
* @p label_to_comm_rank is not specified this parameter is ignored. If specified, label_list
* must be sorted in ascending order.
* @param [in] label_to_comm_rank Device array identifying which comm rank the output for a
* particular label should be shuffled to. If specified, all data from @p label_list[i] will be
* shuffled to rank @p label_to_comm_rank[i]. This cannot be specified unless
* @p start_vertex_labels is also specified. If not specified, the output data will not be
* shuffled between ranks.
* @param [in] label_offsets Device array of the offsets for each label in the seed list. This
* parameter is only used with the retain_seeds option.
* @param [in] fan_out Host array defining the fan out at each step in the sampling algorithm.
* We only support fanout values of type INT32
* @param [in,out] rng_state State of the random number generator, updated with each call
* @param [in] options
* Opaque pointer defining the sampling options.
* @param [in] do_expensive_check
* A flag to run expensive checks for input arguments (if set to true)
* @param [out] result Output from the biased_neighbor_sample call
* @param [out] error Pointer to an error object storing details of any error. Will
* be populated if error code is not CUGRAPH_SUCCESS
* @return error code
*/
cugraph_error_code_t cugraph_biased_neighbor_sample(
const cugraph_resource_handle_t* handle,
cugraph_graph_t* graph,
const cugraph_edge_property_view_t* edge_biases,
const cugraph_type_erased_device_array_view_t* start_vertices,
const cugraph_type_erased_device_array_view_t* start_vertex_labels,
const cugraph_type_erased_device_array_view_t* label_list,
const cugraph_type_erased_device_array_view_t* label_to_comm_rank,
const cugraph_type_erased_device_array_view_t* label_offsets,
const cugraph_type_erased_host_array_view_t* fan_out,
cugraph_rng_state_t* rng_state,
const cugraph_sampling_options_t* options,
bool_t do_expensive_check,
cugraph_sample_result_t** result,
cugraph_error_t** error);
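
Biased sampling is taken here to mean that each outgoing edge of a vertex is chosen with probability proportional to its bias. A minimal host-side sketch of one such draw is shown below, assuming non-negative biases and a caller-supplied uniform number `u` in [0, 1). `pick_biased` is a hypothetical helper for illustration; it is not part of cuGraph, and the GPU implementation behind `cugraph_biased_neighbor_sample` is not shown in this diff.

```c
#include <stddef.h>

/* Return the index of the neighbor selected by a uniform draw u in [0, 1),
 * where neighbor i is chosen with probability bias[i] / sum(bias).
 * Returns n if no neighbor can be selected (n == 0 or all biases are zero). */
size_t pick_biased(const double* bias, size_t n, double u)
{
  double total = 0.0;
  for (size_t i = 0; i < n; ++i) {
    total += bias[i];
  }
  if (total <= 0.0) { return n; }

  /* Walk the cumulative sum until it exceeds the scaled draw. */
  double target        = u * total;
  double running       = 0.0;
  size_t last_positive = n;
  for (size_t i = 0; i < n; ++i) {
    if (bias[i] <= 0.0) { continue; } /* zero-bias edges are never picked */
    running += bias[i];
    last_positive = i;
    if (target < running) { return i; }
  }
  /* Only reached through floating-point round-off near u == 1. */
  return last_positive;
}
```

For biases `{1.0, 0.0, 3.0}`, draws with `u` below 0.25 select index 0, all other draws select index 2, and index 1 is never selected.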

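The relationship between `label_list` and `label_to_comm_rank` described in the parameter documentation above can be illustrated with a host-side lookup. Because `label_list` is required to be sorted in ascending order, a binary search finds the position of a label, and the same position in `label_to_comm_rank` gives the destination rank. `rank_for_label` is a hypothetical helper, and the `int32_t` rank type is an assumption; the library performs this mapping on the device.

```c
#include <stddef.h>
#include <stdint.h>

/* Find the comm rank for `label`, given label_list sorted in ascending order
 * and label_to_comm_rank[i] holding the destination rank for label_list[i].
 * Returns -1 if the label is not present. */
int32_t rank_for_label(const int32_t* label_list,
                       const int32_t* label_to_comm_rank,
                       size_t num_labels,
                       int32_t label)
{
  /* Lower-bound binary search: first position whose label is >= `label`. */
  size_t lo = 0;
  size_t hi = num_labels;
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (label_list[mid] < label) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  if (lo < num_labels && label_list[lo] == label) {
    return label_to_comm_rank[lo];
  }
  return -1;
}
```

With `label_list = {0, 2, 5}` and `label_to_comm_rank = {1, 0, 1}`, results for label 2 go to rank 0 and results for labels 0 and 5 go to rank 1.
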
/**
* @deprecated This call should be replaced with cugraph_sample_result_get_majors
* @brief Get the source vertices from the sampling algorithm result