[FEA]: Topic: SG Egograph (Batched vs. Single Vertex) #4191

nv-rliu · 2024-02-23T22:15:41Z

Is this a new feature, an improvement, or a change to existing functionality?

Change

How would you describe the priority of this feature request

High

Please provide a clear description of problem this feature solves

Upon closer examination of Egonet.py, it has been discovered that there are a few inconsistencies with the behavior of ego_graph and batched_ego_graph based on what the documentation says.

According to the docs, the ego_graph function is supposed to compute the induced sub-graph of neighbors centered at a single node, n, and return the sub-graph. The batched_ego_graph function (in congruence with its design in the legacy API) is supposed to accept a List of n values, or seeds, and compute each induced sub-graph in "batch" mode (parallel).

However, currently both Python functions just call the same PLC algorithm, which raises two concerns:

The PLC function doesn't actually run ego_graph in "batch" mode like it's designed to
The single ego_graph function can be called with multiple n values and returns a graph that doesn't specify the cut-offs (offsets).

Describe your ideal solution

The proposed solution for the Python API is to combined both functions into a single ego_graph function, which will recognize when it is given a single n or multiple seeds to compute.

Down the line, another parameter can be added which allows for proper "batch" mode, similar to how cugraph handles batched betweenness centrality.

def ego_graph(
    G,
    n,
    radius=1,
    center=True,
    undirected=None,
    distance=None,
    batch_mode=False
):

New Parameters
    ----------
    n : integer or List, cudf.Series, cudf.DataFrame
    < update docstring to specify that it can compute a single n or multiple seeds >

    batch_mode : Boolean
    Can be set to True or False

Returns
    ----------
A Graph or Edge Lists with Offsets

Tasks:

current batched_ego_graphs algorithm can be given a DeprecationWarning.
Add true batch-mode support to ego_graph

Describe any alternatives you have considered

The two functions could also remain the same, but then ego_graph would need to not allow multiple seeds in order to be differentiated from batched_ego_graph, aka, multiple ego_graph calls.

Additional context

This was discovered while looking into this bug on the MG implementation of ego_graph

Further context may be included in the discussion down below.

Code of Conduct

I agree to follow cuGraph's Code of Conduct
I have searched the open feature requests and have found no duplicates for this feature request

The text was updated successfully, but these errors were encountered:

This PR addresses #4191 Since the plan is to add "batched" support to the regular `ego_graph` method, a `DeprecationWarning` has been added to `batched_ego_graphs`. -- Minor change: use proper indexing when accessing a pd.Series value. Authors: - Ralph Liu (https://github.com/nv-rliu) Approvers: - Rick Ratzel (https://github.com/rlratzel) URL: #4209

ChuckHastings · 2024-05-22T23:54:31Z

I'm not sure this analysis is correct - or I'm misinterpreting what your asserting.

Looking here:

cugraph/python/pylibcugraph/pylibcugraph/egonet.pyx

Line 89 in ddfaacf

A tuple of device arrays containing the sources, destinations, edge_weights

It appears to me that the PLC function ego_graph operates on multiple seeds and does return the offsets array that allows the decomposition of the resulting graphs into separate graphs.

I do see that the python code is ignoring the offsets value in the ego_graph call and using it in the deprecated batched_ego_graph call. I see several TODO items in the current python code that are describing some of the issues you see here.

The consequence of ignoring the offsets value in ego_graph is that if you pass multiple seeds to ego_graph, the result will be a single graph that is the composite of all of the resulting graphs, which is incorrect behavior.

So I believe this is entirely a python issue to resolve. Although I'll investigate further if we can identify a PLC call that is generating an answer other than what is expected/required to handle this correctly.

nv-rliu added feature request New feature or request ? - Needs Triage Need team to review and classify labels Feb 23, 2024

nv-rliu added this to the 24.04 milestone Feb 23, 2024

nv-rliu assigned nv-rliu, rlratzel, ChuckHastings and jnke2016 Feb 23, 2024

BradReesWork removed the ? - Needs Triage Need team to review and classify label Feb 26, 2024

nv-rliu mentioned this issue Mar 1, 2024

Add DeprecationWarning to batched_ego_graphs #4209

Merged

BradReesWork modified the milestones: 24.04, 24.12 Sep 30, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[FEA]: Topic: SG Egograph (Batched vs. Single Vertex) #4191

[FEA]: Topic: SG Egograph (Batched vs. Single Vertex) #4191

nv-rliu commented Feb 23, 2024 •

edited

Loading

ChuckHastings commented May 22, 2024 •

edited

Loading

[FEA]: Topic: SG Egograph (Batched vs. Single Vertex) #4191

[FEA]: Topic: SG Egograph (Batched vs. Single Vertex) #4191

Comments

nv-rliu commented Feb 23, 2024 • edited Loading

Is this a new feature, an improvement, or a change to existing functionality?

How would you describe the priority of this feature request

Please provide a clear description of problem this feature solves

Describe your ideal solution

Describe any alternatives you have considered

Additional context

Code of Conduct

ChuckHastings commented May 22, 2024 • edited Loading

nv-rliu commented Feb 23, 2024 •

edited

Loading

ChuckHastings commented May 22, 2024 •

edited

Loading