Is this a new feature, an improvement, or a change to existing functionality?
Change
How would you describe the priority of this feature request
High
Please provide a clear description of problem this feature solves
A closer look at Egonet.py shows that the behavior of ego_graph and batched_ego_graphs is inconsistent with what the documentation says.
According to the docs, the ego_graph function is supposed to compute the induced subgraph of neighbors centered at a single node, n, and return that subgraph. The batched_ego_graphs function (in line with its design in the legacy API) is supposed to accept a list of n values, or seeds, and compute each induced subgraph in "batch" mode (in parallel).
However, currently both Python functions just call the same PLC algorithm, which raises two concerns:
- The PLC function doesn't actually run ego_graph in "batch" mode like it's designed to.
- The single ego_graph function can be called with multiple n values and returns a graph that doesn't specify the cut-offs (offsets).
Describe your ideal solution
The proposed solution for the Python API is to combine both functions into a single ego_graph function, which will recognize whether it has been given a single n or multiple seeds to compute.
Down the line, another parameter can be added which allows for proper "batch" mode, similar to how cugraph handles batched betweenness centrality.
def ego_graph(
    G,
    n,
    radius=1,
    center=True,
    undirected=None,
    distance=None,
    batch_mode=False
):
    New Parameters
    ----------
    n : integer or List, cudf.Series, cudf.DataFrame
        < update docstring to specify that it can compute a single n or multiple seeds >
    batch_mode : Boolean
        Can be set to True or False
    Returns
    ----------
    A Graph or Edge Lists with Offsets
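As a rough illustration of how a combined function might tell a single n from multiple seeds, here is a minimal sketch in plain Python. The helper name `normalize_seeds` is hypothetical and is not part of cuGraph. The sketch covers only scalars and ordinary Python containers; cudf.Series and cudf.DataFrame inputs would need their own handling.

```python
from collections.abc import Iterable


def normalize_seeds(n):
    """Hypothetical helper: return (seeds, is_single) for a scalar or a collection.

    This is an illustrative assumption, not the cuGraph implementation.
    """
    # Strings are iterable but are treated here as a single vertex label.
    if isinstance(n, (str, bytes)) or not isinstance(n, Iterable):
        return [n], True
    return list(n), False


# A single vertex is wrapped into a one-element seed list.
single_seeds, single_flag = normalize_seeds(3)

# A list of vertices is passed through and flagged as multi-seed.
multi_seeds, multi_flag = normalize_seeds([0, 5, 9])
```

With a flag like `is_single`, the caller could decide whether to return one Graph or edge lists with offsets, which matches the two return shapes listed above.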
Tasks:
- The current batched_ego_graphs algorithm can be given a DeprecationWarning.
- Add true batch-mode support to ego_graph.
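For the first task, a deprecation could look roughly like the sketch below, which uses Python's standard `warnings` module. Both functions here are simplified stand-ins with reduced signatures, and the warning message is an assumption rather than the text used in cuGraph.

```python
import warnings


def ego_graph(G, n, radius=1):
    """Stand-in for the unified function; returns a placeholder tuple."""
    return ("ego_graph", n, radius)


def batched_ego_graphs(G, seeds, radius=1):
    """Stand-in for the deprecated function; warns, then forwards to ego_graph."""
    warnings.warn(
        "batched_ego_graphs is deprecated; pass multiple seeds to ego_graph instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not this wrapper
    )
    return ego_graph(G, seeds, radius=radius)
```

Existing callers keep working while the warning points them at ego_graph.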
Describe any alternatives you have considered
The two functions could also remain separate, but then ego_graph would need to reject multiple seeds in order to be differentiated from batched_ego_graphs; handling several seeds would then take multiple ego_graph calls.
Additional context
This was discovered while looking into this bug on the MG implementation of ego_graph
Further context may be included in the discussion down below.
Code of Conduct
I agree to follow cuGraph's Code of Conduct
I have searched the open feature requests and have found no duplicates for this feature request
This PR addresses #4191
Since the plan is to add "batched" support to the regular `ego_graph` method, a `DeprecationWarning` has been added to `batched_ego_graphs`.
--
Minor change: use proper indexing when accessing a pd.Series value.
Authors:
- Ralph Liu (https://github.com/nv-rliu)
Approvers:
- Rick Ratzel (https://github.com/rlratzel)
URL: #4209
A tuple of device arrays containing the sources, destinations, edge_weights
It appears to me that the PLC function ego_graph operates on multiple seeds and does return the offsets array that allows the decomposition of the resulting graphs into separate graphs.
I do see that the Python code ignores the offsets value in the ego_graph call and uses it in the deprecated batched_ego_graphs call. Several TODO items in the current Python code describe some of the issues you see here.
The consequence of ignoring the offsets value in ego_graph is that if you pass multiple seeds to ego_graph, the result will be a single graph that is the composite of all of the resulting graphs, which is incorrect behavior.
So I believe this is entirely a Python issue to resolve, although I'll investigate further if we can identify a PLC call that generates an answer other than what is expected/required to handle this correctly.
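To make the role of the offsets array concrete, here is a minimal sketch of splitting a concatenated result into one edge list per seed. It uses plain Python lists rather than device arrays, and the helper `split_by_offsets` is hypothetical. It assumes the offsets array has one more entry than there are seeds, with the edges for seed i stored between offsets[i] and offsets[i + 1].

```python
def split_by_offsets(sources, destinations, weights, offsets):
    """Hypothetical helper: split a concatenated edge list into per-seed edge lists.

    Assumes offsets has len(seeds) + 1 entries and that the edges of seed i
    occupy positions offsets[i] up to (but not including) offsets[i + 1].
    """
    per_seed = []
    for start, stop in zip(offsets[:-1], offsets[1:]):
        edges = list(
            zip(sources[start:stop], destinations[start:stop], weights[start:stop])
        )
        per_seed.append(edges)
    return per_seed


# Made-up data: two seeds whose ego networks hold 3 and 2 edges respectively.
sources = [0, 0, 1, 5, 5]
destinations = [1, 2, 2, 6, 7]
weights = [1.0, 1.0, 1.0, 1.0, 1.0]
offsets = [0, 3, 5]

ego_edge_lists = split_by_offsets(sources, destinations, weights, offsets)
```

Dropping the offsets, as described above, is equivalent to treating all five edges as one graph instead of two.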