[BUG] Failure when running MG algos on very small datasets #2196

Closed
jnke2016 opened this issue Apr 6, 2022 · 2 comments · Fixed by #2216

@jnke2016
Contributor

jnke2016 commented Apr 6, 2022

Our MG implementation assumes that each worker has a partition, which is unlikely to hold for a very small dataset spread across a large number of GPUs.

Steps/Code to reproduce bug
Run MG Neighborhood sampling with 3+ GPUs on the small_tree.csv dataset.
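
To make the failure condition concrete, here is a minimal pure-Python sketch of how a tiny dataset can leave some workers without any partition. It does not use cuGraph or Dask; the round-robin placement, the worker names, and the helper functions `assign_partitions` and `idle_workers` are all hypothetical and only illustrate the situation described above.

```python
from collections import defaultdict


def assign_partitions(num_partitions, workers):
    """Round-robin partition ids onto workers (hypothetical placement policy)."""
    placement = defaultdict(list)
    for part_id in range(num_partitions):
        placement[workers[part_id % len(workers)]].append(part_id)
    return dict(placement)


def idle_workers(placement, workers):
    """Return the workers that received no partition at all."""
    return [w for w in workers if w not in placement]


workers = ["gpu0", "gpu1", "gpu2"]  # hypothetical worker names

# Assumption: a very small edge list ends up in a single partition.
placement = assign_partitions(num_partitions=1, workers=workers)

print(placement)                         # {'gpu0': [0]}
print(idle_workers(placement, workers))  # ['gpu1', 'gpu2']
```

Any code that looks up a partition for every worker would fail for `gpu1` and `gpu2` in this setup, which is the assumption the report says breaks down.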

@jnke2016 added the "? - Needs Triage" and "bug" labels Apr 6, 2022
@jnke2016 changed the title from "[BUG] Failure occurring when running MG algos on very small datasets" to "[BUG] Failure when running MG algos on very small datasets" Apr 6, 2022
@seunghwak
Contributor

Does this happen with C++ benchmarking as well? (It shouldn't, but if you can reproduce it in MG C++ testing, please let me know.) Our MG C++ implementation assumes that each process has partition(s), but a partition can contain no vertices or edges.

@rlratzel rlratzel added this to the 22.06 milestone Apr 7, 2022
@jnke2016
Contributor Author

jnke2016 commented Apr 7, 2022

@seunghwak This bug is on the Python side, and I am fixing it. However, I may have uncovered another bug that seems to be C/C++ related (still with small datasets), where I get this error:
Exception: "RuntimeError('non-success value returned from uniform_nbr_sample: CUGRAPH_UNKNOWN_ERROR')"

I will provide more updates about this.

@rlratzel rlratzel added the python label Apr 7, 2022
rapids-bot bot pushed a commit that referenced this issue Apr 20, 2022
This PR enables MG support for very small datasets where the number of partitions is smaller than the number of workers

Depends on issue #2217 being resolved before merging
closes #2196

Authors:
  - Joseph Nke (https://github.com/jnke2016)

Approvers:
  - Rick Ratzel (https://github.com/rlratzel)

URL: #2216
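
For readers wondering what handling "fewer partitions than workers" can look like, here is a minimal pure-Python sketch of one possible approach: give every worker that has no data a single empty partition, mirroring the C++ side's stated assumption that each process has partition(s) that may contain no vertices or edges. This is an illustration only and not the actual change made in #2216; the helper `pad_with_empty_partitions`, the worker names, and the sample edges are hypothetical.

```python
def pad_with_empty_partitions(placement, workers):
    """Return a copy of placement in which every worker has at least one partition.

    Workers that hold no data get a single empty edge list, so code that
    expects one partition per worker no longer fails on a missing entry.
    """
    padded = {w: list(parts) for w, parts in placement.items()}
    for w in workers:
        padded.setdefault(w, [[]])
    return padded


workers = ["gpu0", "gpu1", "gpu2"]  # hypothetical worker names

# Hypothetical tiny edge list that landed entirely on one worker.
placement = {"gpu0": [[(0, 1), (1, 2), (1, 3)]]}

padded = pad_with_empty_partitions(placement, workers)
edge_counts = {w: sum(len(p) for p in parts) for w, parts in padded.items()}

print(edge_counts)  # {'gpu0': 3, 'gpu1': 0, 'gpu2': 0}
```

The design choice here is to keep the "one partition per worker" invariant and relax only the "partition is non-empty" one, so downstream code needs no special case for idle workers.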