[BUG]: Broken graph methods #3766

Closed
2 tasks done
jnke2016 opened this issue Aug 3, 2023 · 0 comments · Fixed by #3757
Assignees
Labels: ? - Needs Triage (Need team to review and classify), bug (Something isn't working)

Comments

@jnke2016
Contributor

jnke2016 commented Aug 3, 2023

Version

23.08

Which installation method(s) does this occur on?

Docker, Conda, Pip, Source

Describe the bug.

Several graph methods in both the SG and MG APIs are broken, some because of the migration away from cython.cu renumbering. These include:

  1. view_edge_list() and edges() return internal column names for SG.
  2. view_edge_list() doesn't return only the upper triangular part when creating an MG graph, and it returns internal column names when the vertices were renumbered.
  3. number_of_nodes() and number_of_vertices() are broken for MG when the graph is un-renumbered or has string vertices.
  4. number_of_edges() returns the wrong count for MG when the graph is undirected.
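
To make the expected counts in items 2 and 4 concrete, here is a minimal pure-Python sketch (not cuGraph's implementation; the helper names `upper_triangular_edges` and `symmetrized_edges` are hypothetical). It uses the string edge list from the reproducer below and shows why an undirected graph should report 5 edges, while counting the symmetrized edge list gives 10.

```python
def upper_triangular_edges(edges):
    """Keep one (u, v) pair per undirected edge, ordered so that u <= v."""
    return sorted({(min(u, v), max(u, v)) for u, v in edges})


def symmetrized_edges(edges):
    """Store every undirected edge in both directions, without duplicates."""
    forward = {(u, v) for u, v in edges}
    backward = {(v, u) for u, v in edges}
    return sorted(forward | backward)


# Same source/target pairs as the string-vertex reproducer.
edges = [("a", "f"), ("b", "g"), ("x", "a"), ("f", "a"), ("b", "v"), ("g", "h")]

upper = upper_triangular_edges(edges)
symmetric = symmetrized_edges(edges)

print(len(upper))      # 5: ("a", "f") and ("f", "a") are the same undirected edge
print(len(symmetric))  # 10: each of the 5 undirected edges appears twice
```

An undirected edge count taken from the symmetrized storage has to be reduced to the upper triangular part, otherwise every edge is counted twice.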

Minimum reproducible example

import cudf
import dask_cudf
import cugraph

# Note: the MG calls assume a Dask cluster and cuGraph comms are already initialized.


def get_graph_0(directed):
    # MG graph with string vertices
    df = cudf.DataFrame()
    df["source"] = cudf.Series(["a", "b", "x", "f", "b", "g"], dtype=str)
    df["target"] = cudf.Series(["f", "g", "a", "a", "v", "h"], dtype=str)
    df["value"] = cudf.Series([1.0, 1.0, 1.0, 1.0, 1.0, 1.0], dtype="float32")

    ddf = dask_cudf.from_cudf(df, npartitions=2)

    mG = cugraph.Graph(directed=directed)
    mG.from_dask_cudf_edgelist(ddf, source="source", destination="target", weight="value", renumber=True)
    return mG


def get_graph_1(directed):
    # SG and MG graphs with multi-column vertices
    df = cudf.DataFrame()
    df["src_0"] = [1, 2, 3, 4, 5]
    df["src_1"] = [2, 1, 5, 7, 8]
    df["dst_0"] = [4, 1, 5, 6, 7]
    df["dst_1"] = [9, 4, 10, 5, 7]
    df["value"] = [1.0, 1.0, 1.0, 1.0, 1.0]

    ddf = dask_cudf.from_cudf(df, npartitions=2)

    sG = cugraph.Graph(directed=directed)
    mG = cugraph.Graph(directed=directed)

    sG.from_cudf_edgelist(df, source=["src_0", "src_1"], destination=["dst_0", "dst_1"], weight="value", renumber=True)
    mG.from_dask_cudf_edgelist(ddf, source=["src_0", "src_1"], destination=["dst_0", "dst_1"], weight="value", renumber=True)
    return sG, mG


if __name__ == "__main__":
    directed = False
    mG = get_graph_0(directed)
    # Reproducer 1: Getting the number of nodes from a graph having string vertices
    mG.number_of_nodes()
    # Reproducer 2: The number of edges must be 5 because only the upper triangular
    # part must be returned. Instead, it returns 10
    assert mG.number_of_edges() == 5


if __name__ == "__main__":
    directed = False
    sG, mG = get_graph_1(directed)
    # Reproducer 3: Retrieving the edge list view from the SG graph
    sG.view_edge_list()
    # Reproducer 4: Retrieving the edge list view from the MG graph.
    # Returns an edge list with the renumbered vertices instead of the user input dataframe
    mG.view_edge_list()
    

Relevant log output

Reproducer 1 log
File "/home/nfs/jnke/miniconda3/envs/branch23.08_/lib/python3.10/site-packages/cugraph/structure/graph_implementation/simpleDistributedGraph.py", line 431, in number_of_nodes
    return self.number_of_vertices()
  File "/home/nfs/jnke/miniconda3/envs/branch23.08_/lib/python3.10/site-packages/cugraph/structure/graph_implementation/simpleDistributedGraph.py", line 422, in number_of_vertices
    self.properties.node_count = ddf.max().max().compute() + 1
  File "/home/nfs/jnke/miniconda3/envs/branch23.08_/lib/python3.10/site-packages/cudf/utils/utils.py", line 232, in __getattr__
    raise AttributeError(
AttributeError: DataFrame object has no attribute infer_objects
Exception ignored in: <function Comms.__del__ at 0x7f89a0ed1990>
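
The traceback above ends in `ddf.max().max().compute() + 1`, which derives the vertex count from the largest vertex ID. As a minimal pure-Python sketch (not cuGraph code; `count_vertices` is a hypothetical helper), the snippet below shows why "max ID + 1" only works for contiguous integer IDs starting at 0, and how counting distinct vertices gives a well-defined answer for the string-vertex graph in Reproducer 1.

```python
def count_vertices(sources, destinations):
    """Count distinct vertex IDs appearing on either end of an edge."""
    return len(set(sources) | set(destinations))


# String vertices from the reproducer's first graph.
sources = ["a", "b", "x", "f", "b", "g"]
destinations = ["f", "g", "a", "a", "v", "h"]

num_vertices = count_vertices(sources, destinations)
print(num_vertices)  # 7 distinct vertices: a, b, f, g, h, v, x

# "max ID + 1" matches the distinct count only for contiguous integer IDs from 0.
int_sources = [0, 1, 2]
int_destinations = [1, 2, 0]
max_plus_one = max(int_sources + int_destinations) + 1

# With string IDs the same expression is not even defined.
try:
    max(sources + destinations) + 1
    string_max_failed = False
except TypeError:
    string_max_failed = True
```

This is only meant to illustrate the failure mode; how the actual fix computes the count in the distributed case is not shown in this report.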


Reproducer 3 log
  File "/home/nfs/jnke/miniconda3/envs/branch23.08_/lib/python3.10/site-packages/cugraph/structure/graph_implementation/simpleGraph.py", line 421, in view_edge_list
    edgelist_df[simpleGraphImpl.srcCol]
  File "/home/nfs/jnke/miniconda3/envs/branch23.08_/lib/python3.10/site-packages/nvtx/nvtx.py", line 101, in inner
    result = func(*args, **kwargs)
  File "/home/nfs/jnke/miniconda3/envs/branch23.08_/lib/python3.10/site-packages/cudf/core/dataframe.py", line 1159, in __getitem__
    return self._get_columns_by_label(arg, downcast=True)
  File "/home/nfs/jnke/miniconda3/envs/branch23.08_/lib/python3.10/site-packages/nvtx/nvtx.py", line 101, in inner
    result = func(*args, **kwargs)
  File "/home/nfs/jnke/miniconda3/envs/branch23.08_/lib/python3.10/site-packages/cudf/core/dataframe.py", line 1792, in _get_columns_by_label
    new_data = super()._get_columns_by_label(labels, downcast)
  File "/home/nfs/jnke/miniconda3/envs/branch23.08_/lib/python3.10/site-packages/nvtx/nvtx.py", line 101, in inner
    result = func(*args, **kwargs)
  File "/home/nfs/jnke/miniconda3/envs/branch23.08_/lib/python3.10/site-packages/cudf/core/frame.py", line 425, in _get_columns_by_label
    return self._data.select_by_label(labels)
  File "/home/nfs/jnke/miniconda3/envs/branch23.08_/lib/python3.10/site-packages/cudf/core/column_accessor.py", line 357, in select_by_label
    return self._select_by_label_grouped(key)
  File "/home/nfs/jnke/miniconda3/envs/branch23.08_/lib/python3.10/site-packages/cudf/core/column_accessor.py", line 512, in _select_by_label_grouped
    result = self._grouped_data[key]
KeyError: 'src'

Code of Conduct

  • I agree to follow cuGraph's Code of Conduct
  • I have searched the open bugs and have found no duplicates for this bug report
@jnke2016 jnke2016 added the "? - Needs Triage" (Need team to review and classify) and "bug" (Something isn't working) labels Aug 3, 2023
@jnke2016 jnke2016 changed the title from "[BUG]: Broken cugraph methods" to "[BUG]: Broken graph methods" Aug 3, 2023
@rapids-bot rapids-bot bot closed this as completed in #3757 Aug 4, 2023
rapids-bot bot pushed a commit that referenced this issue Aug 4, 2023
Several graph methods are failing, some as a result of migrating away from cython.cu renumbering.
This PR fixes a couple of graph methods and resolves the inconsistency between the results returned by the SG and MG APIs.


closes #3740 
closes #3766

Authors:
  - Joseph Nke (https://github.com/jnke2016)

Approvers:
  - Brad Rees (https://github.com/BradReesWork)
  - Rick Ratzel (https://github.com/rlratzel)

URL: #3757