C API for creating a graph #1907

Merged

Conversation

ChuckHastings
Collaborator

Partially addresses #1906

This PR defines the API for graph creation and the pagerank and bfs calls that we will use to experiment with transposing a graph.

Some notes on the design here.

  1. The intention is that the C API will handle renumbering (when renumbering is set to true on graph creation). The opaque cugraph_graph_t pointer populated by cugraph_sg_graph_create will therefore contain the renumbering device vector, and the C API implementations of the algorithms (pagerank and bfs are demonstrated here) will unrenumber the result before returning.
  2. The intention is that the C API will know whether an algorithm wants store_transposed=true or store_transposed=false and will call the transpose method if required.
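
To illustrate the first design note, here is a minimal host-only sketch of hiding renumbering behind an opaque handle. It is not the cuGraph implementation: `demo_graph_t`, `demo_graph_create`, `demo_unrenumber`, and `demo_graph_free` are hypothetical names, and plain host arrays stand in for the device vectors the real API would use.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an opaque graph handle (not the cuGraph type). */
typedef struct {
  int32_t* renumber_map; /* renumber_map[internal_id] == external_id */
  size_t num_vertices;
  int32_t* src; /* edge sources, stored with internal ids */
  int32_t* dst; /* edge destinations, stored with internal ids */
  size_t num_edges;
} demo_graph_t;

void demo_graph_free(demo_graph_t* g)
{
  if (g == NULL) return;
  free(g->renumber_map);
  free(g->src);
  free(g->dst);
  free(g);
}

/* Look up the internal id of an external id, assigning the next free id on
 * first sight. A linear scan keeps the demo short; it is not efficient. */
static int32_t demo_internal_id(demo_graph_t* g, int32_t external_id)
{
  for (size_t i = 0; i < g->num_vertices; ++i)
    if (g->renumber_map[i] == external_id) return (int32_t)i;
  g->renumber_map[g->num_vertices] = external_id;
  return (int32_t)g->num_vertices++;
}

/* Build the graph and keep the renumbering map inside the handle. */
demo_graph_t* demo_graph_create(const int32_t* src, const int32_t* dst, size_t num_edges)
{
  demo_graph_t* g = calloc(1, sizeof *g);
  if (g == NULL) return NULL;
  /* At most 2 * num_edges distinct vertices can appear in the edge list. */
  g->renumber_map = malloc(2 * num_edges * sizeof(int32_t));
  g->src          = malloc(num_edges * sizeof(int32_t));
  g->dst          = malloc(num_edges * sizeof(int32_t));
  if (!g->renumber_map || !g->src || !g->dst) {
    demo_graph_free(g);
    return NULL;
  }
  g->num_edges = num_edges;
  for (size_t e = 0; e < num_edges; ++e) {
    g->src[e] = demo_internal_id(g, src[e]);
    g->dst[e] = demo_internal_id(g, dst[e]);
  }
  return g;
}

/* Map algorithm output (internal ids) back to the caller's external ids. */
void demo_unrenumber(const demo_graph_t* g, int32_t* vertex_ids, size_t count)
{
  for (size_t i = 0; i < count; ++i) {
    assert(vertex_ids[i] >= 0 && (size_t)vertex_ids[i] < g->num_vertices);
    vertex_ids[i] = g->renumber_map[vertex_ids[i]];
  }
}
```

An algorithm wrapper in this style would run on the internal ids and call `demo_unrenumber` on its vertex output just before returning, so the caller only ever sees the ids it passed in.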

@ChuckHastings ChuckHastings requested review from a team as code owners October 26, 2021 20:54
@ChuckHastings ChuckHastings self-assigned this Oct 26, 2021
@ChuckHastings ChuckHastings added 3 - Ready for Review improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Oct 26, 2021
@ChuckHastings ChuckHastings added this to the 21.12 milestone Oct 26, 2021
@codecov-commenter

codecov-commenter commented Oct 27, 2021

Codecov Report

❗ No coverage uploaded for pull request base (branch-21.12@61b99df).
The diff coverage is n/a.

❗ Current head ed7b66a differs from pull request most recent head d4a20a7. Consider uploading reports for the commit d4a20a7 to get more accurate results

@@               Coverage Diff               @@
##             branch-21.12    #1907   +/-   ##
===============================================
  Coverage                ?   70.17%           
===============================================
  Files                   ?      143           
  Lines                   ?     8863           
  Branches                ?        0           
===============================================
  Hits                    ?     6220           
  Misses                  ?     2643           
  Partials                ?        0           

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 61b99df...d4a20a7. Read the comment docs.

Contributor

@seunghwak seunghwak left a comment

Looks good in general; I have a few minor comments.

bool has_initial_guess,
bool do_expensive_check,
cugraph_type_erased_device_array_t** vertex_ids,
cugraph_type_erased_device_array_t** pageranks);
Contributor

Why are these pointers to pointers?

Collaborator Author

Recent change.

My recent thought (mentioned in the PR description) is to have renumbering hidden in the C API. With renumbering hidden in the C API, my assumption (unvalidated) is that in multi-GPU mode I might not know a priori the exact size of the vertex_ids and pageranks arrays on each node. Thus the memory would be allocated internally.

The functions had been returning a cugraph_type_erased_device_array_t *, so making these parameters pointers to pointers was the easiest way to accommodate this change.

I could make the function require them to be preallocated as input, but you couldn't allocate them on the stack (as currently implemented) because the structure is opaque. They would have to be allocated with the cugraph_type_erased_device_array_create function and then passed in. If the shape of the structure changed (because of renumbering in an MNMG context) they could then be changed in place.

Returning a pointer to memory allocated inside the function felt more "C"-like.
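
The pointer-to-pointer pattern described above can be sketched as follows. This is only an illustration under assumptions: `demo_array_t`, `demo_compute_scores`, and the other `demo_` names are hypothetical, host memory stands in for device memory, and the uniform scores are placeholder values rather than a real PageRank result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef enum { DEMO_SUCCESS = 0, DEMO_INVALID_INPUT, DEMO_ALLOC_ERROR } demo_error_t;

/* In a public header only this forward declaration would be visible, so a
 * caller cannot put a demo_array_t on the stack or know its size. */
typedef struct demo_array_ demo_array_t;

/* The definition would normally live in the library's source file. */
struct demo_array_ {
  double* data;
  size_t size;
};

size_t demo_array_size(const demo_array_t* a) { return a->size; }

double demo_array_at(const demo_array_t* a, size_t i)
{
  assert(i < a->size);
  return a->data[i];
}

void demo_array_free(demo_array_t* a)
{
  if (a == NULL) return;
  free(a->data);
  free(a);
}

/* The callee decides how large the result is and allocates it. The caller
 * receives ownership through the output parameter `scores`. */
demo_error_t demo_compute_scores(size_t num_local_vertices, demo_array_t** scores)
{
  if (scores == NULL) return DEMO_INVALID_INPUT;
  *scores = NULL; /* leave a well-defined value on every error path */
  if (num_local_vertices == 0) return DEMO_INVALID_INPUT;

  demo_array_t* result = malloc(sizeof *result);
  if (result == NULL) return DEMO_ALLOC_ERROR;
  result->data = malloc(num_local_vertices * sizeof(double));
  if (result->data == NULL) {
    free(result);
    return DEMO_ALLOC_ERROR;
  }
  result->size = num_local_vertices;

  /* Placeholder values: a uniform score for every local vertex. */
  for (size_t i = 0; i < num_local_vertices; ++i)
    result->data[i] = 1.0 / (double)num_local_vertices;

  *scores = result;
  return DEMO_SUCCESS;
}
```

A caller declares `demo_array_t* scores = NULL;`, passes `&scores`, and later releases the result with `demo_array_free(scores)`. The caller never needs to know the result size in advance, which is the property the multi-GPU case relies on.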

* @param [out] pageranks Returns device pointer to pagerank scores
* @return error code
*/
cugraph_error_t pagerank(
Contributor

We discussed returning auxiliary information. It is not a task for this PR, but later we may consider returning something like pagerank_aux_info_t as a wrapper around cugraph_error_t to include additional information.

Collaborator Author

I'm leaning toward having all functions either return void or cugraph_error_t - for consistency.

We could pass a pointer to a pagerank_aux_info_t as a parameter for updating auxiliary information.
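
A rough sketch of that convention, assuming hypothetical names throughout (`demo_error_t`, `demo_aux_info_t`, and `demo_solve` are not part of the cuGraph C API, and the fields of the auxiliary struct are guesses at what might be useful). A toy fixed-point iteration stands in for the algorithm: the function returns only an error code, and the auxiliary information goes through an optional pointer parameter.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { DEMO_SUCCESS = 0, DEMO_INVALID_INPUT } demo_error_t;

/* Hypothetical auxiliary information an iterative algorithm might report. */
typedef struct {
  size_t iterations;     /* iterations actually performed */
  int converged;         /* 1 if the tolerance was reached, 0 otherwise */
  double final_residual; /* last observed change between iterates */
} demo_aux_info_t;

/* Iterates x <- 0.5 * x + 1 (fixed point at 2) as a stand-in for a real
 * algorithm. The return value is always an error code; `aux_info` is an
 * optional output and may be NULL if the caller does not need it. */
demo_error_t demo_solve(double tolerance,
                        size_t max_iterations,
                        double* result,
                        demo_aux_info_t* aux_info)
{
  if (result == NULL || tolerance <= 0.0) return DEMO_INVALID_INPUT;

  double x        = 0.0;
  double residual = 0.0;
  size_t iter     = 0;
  int converged   = 0;

  while (iter < max_iterations && !converged) {
    double next = 0.5 * x + 1.0;
    residual    = next - x;
    if (residual < 0.0) residual = -residual;
    x = next;
    ++iter;
    converged = (residual <= tolerance);
  }
  assert(iter <= max_iterations);

  *result = x;
  if (aux_info != NULL) {
    aux_info->iterations     = iter;
    aux_info->converged      = converged;
    aux_info->final_residual = residual;
  }
  return DEMO_SUCCESS;
}
```

Callers that only want the result pass `NULL` for `aux_info`. Callers that want the diagnostics pass the address of a `demo_aux_info_t`. Either way, the error handling at every call site looks the same.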

Contributor

Collaborator Author

Added some comments to the HITS PR.

That makes sense to me. I'll update the algorithm APIs to reflect this notion.

Collaborator Author

Just pushed an update that handles this.

cpp/src/c_api/array.cpp (outdated, resolved)
cpp/src/c_api/graph_mg.cpp (outdated, resolved)
cpp/src/c_api/graph_sg.cpp (outdated, resolved)
cpp/CMakeLists.txt (resolved)
Contributor

@rlratzel rlratzel left a comment

Looks good, I just had a few suggestions.

cpp/include/cugraph_c/algorithms.h (outdated, resolved)
cpp/include/cugraph_c/array.h (resolved)
cpp/src/c_api/array.cpp (resolved)
cpp/tests/c_api/c_test_utils.h (resolved)
cpp/include/cugraph_c/cugraph_api.h (outdated, resolved)
cpp/include/cugraph_c/cugraph_api.h (outdated, resolved)
@ChuckHastings
Collaborator Author

rerun tests

@BradReesWork
Member

@gpucibot merge

@ChuckHastings
Collaborator Author

rerun tests

1 similar comment
@rlratzel
Contributor

rlratzel commented Nov 4, 2021

rerun tests

@rapids-bot rapids-bot bot merged commit 61e8bad into rapidsai:branch-21.12 Nov 5, 2021
@ChuckHastings ChuckHastings deleted the fea_c_api_create_graph branch February 1, 2022 16:30

6 participants