
Implement induced subgraph extraction (SG C++) #1354

Merged: 28 commits into rapidsai:branch-0.18 on Jan 26, 2021

Conversation

@seunghwak (Contributor) commented Jan 22, 2021

Closes #1323

  • Add extract_induced_subgraphs() (see the usage sketch below)
  • Add C++ tests for extract_induced_subgraphs()
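
For orientation, a hedged usage sketch of the new entry point. The namespace, include paths, parameter list, and tuple ordering below are assumptions inferred from snippets quoted later in this conversation, not the authoritative declaration in the merged header.

// Hedged usage sketch -- the namespace, include paths, parameter list, and tuple
// ordering of extract_induced_subgraphs() are assumptions, not taken verbatim from
// the merged header (graph_functions.hpp, discussed further down this review).
#include <experimental/graph_functions.hpp>  // assumed include path
#include <experimental/graph_view.hpp>       // assumed include path

#include <raft/handle.hpp>
#include <rmm/device_uvector.hpp>

#include <cstddef>
#include <cstdint>
#include <tuple>

using graph_view_32f = cugraph::experimental::
  graph_view_t<int32_t, int32_t, float, false /* store_transposed */, false /* multi_gpu */>;

// Extract two induced subgraphs in a single call. The vertex sets are packed back
// to back: vertices of subgraph i occupy
// [subgraph_offsets[i], subgraph_offsets[i + 1]) of subgraph_vertices.
std::tuple<rmm::device_uvector<int32_t>,      // edge major (source) vertices (assumed ordering)
           rmm::device_uvector<int32_t>,      // edge minor (destination) vertices
           rmm::device_uvector<float>,        // edge weights
           rmm::device_uvector<std::size_t>>  // per-subgraph edge offsets
extract_two_subgraphs(raft::handle_t const& handle,
                      graph_view_32f const& graph_view,
                      rmm::device_uvector<std::size_t> const& subgraph_offsets,  // length 3 for 2 subgraphs
                      rmm::device_uvector<int32_t> const& subgraph_vertices)     // concatenated vertex IDs
{
  return cugraph::experimental::extract_induced_subgraphs(
    handle, graph_view, subgraph_offsets.data(), subgraph_vertices.data(), std::size_t{2});
}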

@seunghwak added the non-breaking (Non-breaking change) and feature request (New feature or request) labels on Jan 22, 2021
@afender (Member) left a comment:

It would be great to see some performance comparison against the COO-based implementation, along with a profile.
I am particularly curious about how parts like the big thrust::for_each and its nested thrust calls play out.

subgraph_edge_offsets.begin());

CUDA_TRY(
cudaStreamSynchronize(handle.get_stream())); // subgraph_vertex_output_offsets will become
@afender (Member):

Why are we doing cudaStreamSynchronize here?

@seunghwak (Contributor, Author):

I initially thought subgraph_vertex_output_offsets (
https://github.com/rapidsai/cugraph/pull/1354/files/038d2d782363c87ede499d0e8d441e4d091a3dac#diff-dfa26d738fa79bcd814644900db1536e42986eaebed75714da3dad8f993381e4R128)
would go out of scope once this function returns; its memory could then be reclaimed and reused elsewhere while operations using subgraph_vertex_output_offsets had not yet finished on the stream, which could lead to undefined behavior.

I need to double-check, but this may not be true: deallocate() is submitted on a stream (
https://github.com/rapidsai/rmm/blob/branch-0.18/include/rmm/device_buffer.hpp#L422), so the actual memory reclamation may not happen until all previous operations on the stream have finished. I will double-check this with the RMM folks and make a fix if this adds unnecessary stream synchronization.
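
For reference, a minimal sketch of the stream-ordering argument above; illustrative only, not code from this PR, and it assumes a stream-ordered memory resource behind rmm::device_uvector (per the rmm::device_buffer link above).

// Minimal sketch of the stream-ordering argument above; illustrative only, not
// code from this PR. It assumes a stream-ordered memory resource behind
// rmm::device_uvector, per the rmm::device_buffer link above.
#include <raft/handle.hpp>
#include <rmm/device_uvector.hpp>

#include <thrust/execution_policy.h>
#include <thrust/fill.h>

#include <cstddef>

void use_temporary_buffer(raft::handle_t const& handle)
{
  // Temporary device buffer tied to the handle's stream.
  rmm::device_uvector<int> tmp(std::size_t{1000}, handle.get_stream());

  // Work on `tmp` is enqueued on the same stream.
  thrust::fill(thrust::cuda::par.on(handle.get_stream()), tmp.begin(), tmp.end(), 0);

  // No cudaStreamSynchronize() is needed before returning: when `tmp` goes out of
  // scope, its deallocation is submitted on the stream it was created with, so the
  // free is ordered after the fill above and cannot reclaim memory still in use.
}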

@seunghwak (Contributor, Author):

OK, this is not necessary; I will remove it (and I did something like this in many places... so I need to remove all of those).


// explicit instantiation

template std::tuple<rmm::device_uvector<int32_t>,
@afender (Member):

Shall we add 64-bit edge and double-weight instantiations? I already accounted for these in Egonet.
It should just be a matter of instantiating them, right?

@seunghwak (Contributor, Author):

Yes, which type combinations do you need? We need to include all the types we actually use (for the obvious reason), but we should avoid instantiating types we don't use, as that increases compile time and binary size.

For (vertex_t, edge_t, weight_t) triplets, I guess (int32_t, int32_t, float), (int32_t, int64_t, float), and (int64_t, int64_t, float) must be instantiated. Are we actually using double weights?

@afender (Member):

Yes, I have instantiated these in Egonet: https://github.com/afender/cugraph/blob/7844fa4c35500fb85d9f38a9b2f74d640684fc9b/cpp/src/community/egonet.cu#L128
As for double weights, it is a good discussion. I don't think it is motivated by this algorithm, but since the graph and other algorithms accept it, we should probably instantiate it here as well.
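
For illustration, a hedged sketch of the per-type-combination explicit instantiations under discussion, written against a stand-in function template rather than the real extract_induced_subgraphs() signature (which is longer than the snippet quoted above shows).

// Illustrative only -- a stand-in function template, not the real
// extract_induced_subgraphs() declaration. It shows the per-(vertex_t, edge_t,
// weight_t) explicit instantiations under discussion: three vertex/edge
// combinations, each with float and double weights.
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <vector>

template <typename vertex_t, typename edge_t, typename weight_t>
std::tuple<std::vector<vertex_t>, std::vector<vertex_t>, std::vector<weight_t>>
extract_edges_stub(edge_t num_edges)
{
  // Trivial stand-in body so the explicit instantiations below compile.
  return std::make_tuple(std::vector<vertex_t>{},
                         std::vector<vertex_t>{},
                         std::vector<weight_t>(static_cast<std::size_t>(num_edges)));
}

// explicit instantiation -- each line adds compile time and binary size, which is
// why only combinations that are actually used should appear here.
template std::tuple<std::vector<int32_t>, std::vector<int32_t>, std::vector<float>>
extract_edges_stub<int32_t, int32_t, float>(int32_t);
template std::tuple<std::vector<int32_t>, std::vector<int32_t>, std::vector<double>>
extract_edges_stub<int32_t, int32_t, double>(int32_t);
template std::tuple<std::vector<int32_t>, std::vector<int32_t>, std::vector<float>>
extract_edges_stub<int32_t, int64_t, float>(int64_t);
template std::tuple<std::vector<int32_t>, std::vector<int32_t>, std::vector<double>>
extract_edges_stub<int32_t, int64_t, double>(int64_t);
template std::tuple<std::vector<int64_t>, std::vector<int64_t>, std::vector<float>>
extract_edges_stub<int64_t, int64_t, float>(int64_t);
template std::tuple<std::vector<int64_t>, std::vector<int64_t>, std::vector<double>>
extract_edges_stub<int64_t, int64_t, double>(int64_t);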

@@ -243,5 +243,49 @@ void relabel(raft::handle_t const& handle,
vertex_t num_labels,
bool do_expensive_check = false);

/**
@afender (Member):

How do you pick what goes in algorithms.hpp and what goes in graph_functions.hpp?

@seunghwak (Contributor, Author):

My current guideline (for myself) is to place graph analytics (e.g. PageRank, BFS, ...) in the existing algorithms.hpp, and to place operations on graphs that do not modify the graph object in graph_functions.hpp (anything that modifies the graph object needs to be a member function).

Do you have better suggestions for header file naming and for deciding what goes where?

@afender (Member):

I can see how there's a thin and somewhat subjective boundary between graph analytics algorithms and operations on a graph that do not modify it. We should identify whether there's a strong benefit to C++ API users (since it is an exposed header) and go from there.


matrix_partition_device_t<graph_view_t<vertex_t, edge_t, weight_t, store_transposed, multi_gpu>>
matrix_partition(graph_view, 0);
thrust::transform(
@afender (Member):

The two large thrust::transform calls in this file could come with some more explanation to facilitate future maintenance.
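
For illustration, a small standalone sketch (not code from this PR) of a thrust::transform carrying the kind of inline commentary being requested; compute_degrees() and its CSR inputs are hypothetical.

// Illustrative sketch only -- not code from this PR. compute_degrees() and its
// CSR inputs are hypothetical; the point is the explanatory comments around the
// thrust::transform call. (Building this requires nvcc's --extended-lambda for
// the __device__ lambda.)
#include <thrust/execution_policy.h>
#include <thrust/transform.h>

#include <cuda_runtime_api.h>

#include <cstddef>
#include <cstdint>

// offsets:  CSR row offsets of the graph (device memory, length num_vertices + 1)
// vertices: vertex IDs whose degrees are requested (device memory, length n)
// degrees:  output (device memory, length n); degrees[i] is the number of
//           neighbors of vertices[i]
void compute_degrees(cudaStream_t stream,
                     int64_t const* offsets,
                     int32_t const* vertices,
                     int64_t* degrees,
                     std::size_t n)
{
  thrust::transform(
    thrust::cuda::par.on(stream),
    vertices,
    vertices + n,
    degrees,
    // The degree of vertex v is the width of its adjacency range in the CSR
    // offsets array: offsets[v + 1] - offsets[v].
    [offsets] __device__(int32_t v) { return offsets[v + 1] - offsets[v]; });
}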

@codecov-io commented Jan 22, 2021

Codecov Report

Merging #1354 (a46f863) into branch-0.18 (2fb0725) will increase coverage by 0.33%.
The diff coverage is 57.84%.


@@               Coverage Diff               @@
##           branch-0.18    #1354      +/-   ##
===============================================
+ Coverage        60.38%   60.71%   +0.33%     
===============================================
  Files               67       67              
  Lines             3029     3060      +31     
===============================================
+ Hits              1829     1858      +29     
- Misses            1200     1202       +2     
Impacted Files Coverage Δ
python/cugraph/centrality/__init__.py 100.00% <ø> (ø)
python/cugraph/comms/comms.py 34.52% <25.00%> (ø)
python/cugraph/dask/common/input_utils.py 23.07% <28.57%> (+1.14%) ⬆️
python/cugraph/utilities/utils.py 67.18% <35.71%> (-4.37%) ⬇️
python/cugraph/dask/common/mg_utils.py 37.50% <38.09%> (-2.50%) ⬇️
python/cugraph/community/spectral_clustering.py 72.54% <38.46%> (-11.67%) ⬇️
python/cugraph/structure/number_map.py 58.12% <50.00%> (+2.16%) ⬆️
python/cugraph/structure/graph.py 68.75% <76.47%> (+1.95%) ⬆️
python/cugraph/__init__.py 100.00% <100.00%> (ø)
...ython/cugraph/centrality/betweenness_centrality.py 100.00% <100.00%> (ø)
... and 6 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@seunghwak (Contributor, Author) commented Jan 25, 2021

> It would be great to see some performance comparison against the COO-based implementation, along with a profile.
> I am particularly curious about how parts like the big thrust::for_each and its nested thrust calls play out.

I haven't rigorously compared the performance with the approach that scans the entire edge list, but extracting three unweighted subgraphs of (300, 20, 400) vertices from ljournal-2008.mtx took 1.7 ms, and extracting three weighted subgraphs of (9130, 1200, 300) vertices from ljournal-2008.mtx took 4.5 ms, so this runs faster for smaller subgraphs (I assume it would take longer if we scanned the entire set of edges).

The biggest performance issue with the current implementation is handling power-law graphs with wide variations in vertex degree, but this is a recurring issue in many implementations in the experimental space, and I plan to address them all at once in a separate PR.

And let me know if this becomes a performance bottleneck in your egonet testing.

@seunghwak (Contributor, Author) commented:

@afender I think I addressed all your comments, but let me know if you have any remaining concerns.

@BradReesWork added this to the 0.18 milestone on Jan 26, 2021
@BradReesWork merged commit 9820990 into rapidsai:branch-0.18 on Jan 26, 2021
@seunghwak deleted the fea_induced_subgraph branch on June 24, 2021
Labels: feature request (New feature or request), non-breaking (Non-breaking change)
Projects: None yet
Development: Successfully merging this pull request may close these issues: [FEA] subgraph extraction
6 participants