Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REVIEW] Subgraph extraction python bindings #246

Merged
merged 11 commits into from
May 3, 2019

Conversation

jwyles
Copy link
Contributor

@jwyles jwyles commented Apr 25, 2019

Adding in the python bindings for subgraph extraction after the fix in nvgraph was put in. Closes #60

@BradReesWork BradReesWork changed the title Subgraph extraction python bindings [REVIEW] Subgraph extraction python bindings Apr 26, 2019
Copy link
Member

@afender afender left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Build and Python style check should pass
Changelog should be updated

@ChuckHastings
Copy link
Collaborator

Looks good if you can make make the checks pass.

@ChuckHastings
Copy link
Collaborator

rerun tests

@ChuckHastings
Copy link
Collaborator

rerun tests


cpdef subgraph(G, vertices):
"""
Compute a clustering/partitioning of the given graph using the spectral balanced
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs to be updated to reflect subgraph extraction

Parameters
----------
G : cuGraph.Graph
cuGraph graph descriptor
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it work both on directed and undirected graphs? We may re-use nvgraph doc
https://docs.nvidia.com/cuda/nvgraph/index.html#function-nvgraphextractsubgraphbyvertex

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it works for both directed and undirected graphs. It doesn't actually traverse the edges, so direction is immaterial.

@BradReesWork BradReesWork requested a review from afender April 30, 2019 17:32
@BradReesWork BradReesWork requested a review from seunghwak May 2, 2019 16:20
* @param vertices Pointer to GDF column object which contains the list of vertices to extract
* @return Error code
*/
gdf_error gdf_extract_subgraph_vertex_nvgraph(gdf_graph* gdf_G,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So as I long as I understand based on comments,

gdf_graph* gdf_g and gdf_column* vertices are input parameters and gdf_graph* result is an output parameter. A function input parameter list having (input, output, input) is unorthodox. It seems like other functions in this file places input parameters before output parameters in the function input parameter lists, we should follow that convention. Also, if this function is not modifying gdf_G and vertices, we should enforce "const correctness".

So the input parameter should be
(const gdf_graph* gdf_G, const gdf_column* vertices, gdf_graph* result).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the order.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't make this const gdf_graph* gdf_G. The graph can be changed if there is no CSR representation.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing vertices to const makes sense. However, that would require changing the implementation in nvgraph. It seems to me like this is more than we want to address now. In fact, we probably should talk about how much we want to change nvgraph other than bug fixes.

@@ -515,6 +515,69 @@ gdf_error gdf_AnalyzeClustering_ratio_cut_nvgraph(gdf_graph* gdf_G,
return GDF_SUCCESS;
}


gdf_error gdf_extract_subgraph_vertex_nvgraph(gdf_graph* gdf_G,
gdf_graph* result,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation is weird, here. The input parameter list should be updated to match the suggested changes in the header file.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Converted tabs to spaces in the file. Should address the weird indentation.


// Call Nvgraph function to get subgraph (into nv_result descriptor)
NVG_TRY(nvgraphExtractSubgraphByVertex( nvg_handle,
nvgraph_G,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird indentation.


// Call Nvgraph function to get subgraph (into nv_result descriptor)
NVG_TRY(nvgraphExtractSubgraphByVertex( nvg_handle,
nvgraph_G,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

weird indentation

nvgraphTopologyType_t TT = NVGRAPH_CSR_32;
NVG_TRY(nvgraphGetGraphStructure(nvg_handle, nv_result, (void*)&topo, &TT));
if (TT != NVGRAPH_CSR_32)
return GDF_C_ERROR;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this is not a request to make change, but it seems like you're using tab for indentation. cugraph mostly uses 2 space or 4 space for indentation in C++. We really need a rule for this, but right at this moment, I am not sure I should ask you to use either 2 space or 4 space... so you may just keep using tab or switch to 2 space or 4 space :-(

result->adjList->indices = new gdf_column;
result->adjList->ownership = 0;
int *offsets, *indices;
CUDA_TRY(cudaMallocManaged((void**)&offsets, sizeof(int) * (num_verts + 1)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not directly invoke cudaMalloc/cudaMallocManaged. We are allocating memory through RMM, so we need to use RMM_ALLOC instead.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to use RMM

NVG_TRY(nvgraphCreateGraphDescr(nvg_handle, &nv_result));

// Call Nvgraph function to get subgraph (into nv_result descriptor)
NVG_TRY(nvgraphExtractSubgraphByVertex( nvg_handle,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to complain about this trivial thing... but in function calls, cugraph code base (mostly) does not place space after opening parenthesis. So better be "(nvg_handle" instead of "( nvg_handle"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

GDF_REQUIRE(!vertices->valid, GDF_VALIDITY_UNSUPPORTED);

// Initialize Nvgraph and wrap the graph
nvgraphHandle_t nvg_handle = nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and is there really a reason we are mixing nv_, nvg_, and nvgraph_ in variable names (e.g. nv_result, nvg_handle, and nvgraph_G)? I am guessing all three are for nvgraph data type variables.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed everything to use nvg_

result->adjList = new gdf_adj_list;
result->adjList->offsets = new gdf_column;
result->adjList->indices = new gdf_column;
result->adjList->ownership = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If ownership is set to 0, who will free the memory allocated in line 560 and 561?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Output of the analytic is a graph. This space is owned by the graph, the user will need to delete the graph when they are done with it.

import numpy as np
import networkx as nx
from scipy.io import mmread

Copy link
Contributor

@seunghwak seunghwak May 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless you have a strong reason to import in this order, we may better follow import orders in other files.

Import python standard modules first, then thirdparty modules, then rapids modules.

We may alphabetically sort within a group. And if NetworkX folks hasn't fixed this, importing networkx generates deprecation warnings and to avoid this, we need to do

# Temporarily suppress warnings till networkX fixes deprecation warnings
# (Using or importing the ABCs from 'collections' instead of from
# 'collections.abc' is deprecated, and in 3.8 it will stop working) for
# python 3.7.  Also, this import networkx needs to be relocated in the
# third-party group once this gets fixed.
import warnings
with warnings.catch_warnings():
    warnings.filterwarnings("ignore", category=DeprecationWarning)
    import networkx as nx

See other test files (e.g. test_sssp.py) for an example.


def compareEdges(cg, nxg, verts):
src, dest = cg.view_edge_list()
if (len(src) != nxg.size()):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not

assert len(src) == nxg.size()

?

Similarly in line 27 & 28,

assert nxg.has_edge(verts[src[i]], verts[dest[i]]) == True

Looks clearer to me.


def nx_call(M, verts):
G = nx.DiGraph(M)
Sg = nx.subgraph(G, verts)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the following for PEP8 naming rules.

https://www.python.org/dev/peps/pep-0008/#function-and-variable-names

Sg and cu_Sg violates

"Function names should be lowercase, with words separated by underscores as necessary to improve readability.

Variable names follow the same convention as function names."

unless "mixedCase is allowed only in contexts where that's already the prevailing style (e.g. threading.py), to retain backwards compatibility." and I don't think mixed case is already the prevailing style in cugraph.


@pytest.mark.parametrize('graph_file', datasets)
def test_subgraph_extraction(graph_file):
M = ReadMtxFile(graph_file)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ReadMtxFile also violates the PEP8 naming rule.

verts[2] = 17
cu_Sg = cugraph_call(M, verts)
nx_Sg = nx_call(M, verts)
assert compareEdges(cu_Sg, nx_Sg, verts)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

compareEdges also violate the PEP8 naming rule.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed these naming issues

@afender afender merged commit ac6fb3a into rapidsai:branch-0.7 May 3, 2019
ChuckHastings pushed a commit to ChuckHastings/cugraph that referenced this pull request Jun 14, 2021
After the merge of rapidsai#232, a few different tests failed in rapidsai/cuml#3891, given the timing I think it'd be best to target 232 (again) to 21.08 after triaging the issues.

Authors:
  - Dante Gama Dessavre (https://github.com/dantegd)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)
  - Divye Gala (https://github.com/divyegala)
  - Brad Rees (https://github.com/BradReesWork)

URL: rapidsai/raft#246
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEA] [NV] Fix and Add Python bindings for Subgraph Extraction
5 participants