Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Create function to extract paths from BFS output #1753

Closed
ChuckHastings opened this issue Aug 3, 2021 · 0 comments · Fixed by #1838
Closed

[ENH] Create function to extract paths from BFS output #1753

ChuckHastings opened this issue Aug 3, 2021 · 0 comments · Fixed by #1838
Assignees
Labels
improvement Improvement / enhancement to an existing function
Milestone

Comments

@ChuckHastings
Copy link
Collaborator

BFS output is a data frame that contains, for every vertex in the graph, the distance from the seed and a back pointer to the previous element in the path. Extracting the path from the seed to a particular vertex requires looking up values in the data frame to back track back to the seed.

If there are multiple vertices that you want to extract paths for this can be a very inefficient mechanism.

What would be helpful would be to create an efficient function that takes the data frame from BFS and a list of vertices and returns a list of the paths to the vertices.

Input:

  • Dataframe from BFS call
  • A list of vertices for which we want the path from the seed to each vertex
    Output:
  • Dataframe with one row for each vertex in the input list containing the path
  • Columns (labeled something simple, like '0', '1', '2', ... 'n') representing step in the path
  • Nulls (at the python level) indicating entries in the Dataframe are invalid

A few notes.

  1. Not all paths will be the same length. If the maximum path length is 5 and a path is only length 2, the unused vertices will be null
  2. There is no requirement that there be a path to all vertices in the specified set in the input. If there is no path then all elements in that row will be null. Note that if the first column (column '0' as suggested above) is null then there is no path to the vertex for that row.
  3. Observe that all nulls will be on the "right". That is, in no row will there be a column 'i' that is null followed by a column 'j' that is not null (where j > i)

Implementation thoughts:

  • Need a C++ function that does this. Should be fairly straightforward
  • Need a Python wrapper to encapsulate the C++ function.
@ChuckHastings ChuckHastings added the ? - Needs Triage Need team to review and classify label Aug 3, 2021
@ChuckHastings ChuckHastings added this to the 21.10 milestone Aug 3, 2021
@BradReesWork BradReesWork added improvement Improvement / enhancement to an existing function and removed ? - Needs Triage Need team to review and classify labels Aug 5, 2021
@BradReesWork BradReesWork linked a pull request Sep 29, 2021 that will close this issue
@BradReesWork BradReesWork modified the milestones: 21.10, 21.12 Sep 29, 2021
rapids-bot bot pushed a commit that referenced this issue Oct 14, 2021
Partially addresses #1753 

Creates a new C++ function to extract BFS paths.

This PR includes the SG implementation and the SG unit tests.  A separate PR will provide the MG unit test.

Authors:
  - Chuck Hastings (https://github.com/ChuckHastings)

Approvers:
  - Seunghwa Kang (https://github.com/seunghwak)

URL: #1838
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improvement / enhancement to an existing function
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants