[BUG]: vertex pair not properly shuffled #3001

jnke2016 · 2022-11-30T00:09:57Z

Version

22.12

Which installation method(s) does this occur on?

Docker, Conda, Pip, Source

Describe the bug.

The vertex pairs are not shuffled to the appropriate GPUs, which leads to undefined behavior such as illegal memory accesses. As a result, all the MG similarity algorithms (jaccard, sorensen, overlap) fail at 8+ GPUs.

Minimum reproducible example

Run the script below, or any MG similarity algorithm test, with 8+ GPUs.

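    import cugraph
    import cugraph.dask as dcg
    import dask_cudf

    # setup(), teardown() and the karate dataset object come from the
    # reporter's test environment and are not shown in this report.
    setup_objs = setup()
    client = setup_objs[0]
    num_workers = len(client.scheduler_info()['workers'])

    df = karate.get_edgelist()

    # One partition per Dask worker (one worker per GPU)
    ddf = dask_cudf.from_cudf(df, npartitions=num_workers)

    # Create MG Graph
    dg = cugraph.Graph(directed=False)
    dg.from_dask_cudf_edgelist(
        ddf, source='src', destination='dst',
        legacy_renum_only=True)

    # Get vertex_pair by computing the two_hop_neighbors
    vertex_pair = dg.get_two_hop_neighbors()
    # compute() is a method and must be called before head()
    vertex_pair = vertex_pair.compute().head()

    # Call jaccard
    df = dcg.jaccard(dg, vertex_pair)

    teardown(*setup_objs)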

Relevant log output

Exception: "RuntimeError('non-success value returned from cugraph_jaccard_coefficients: CUGRAPH_UNKNOWN_ERROR std::bad_alloc: out_of_memory: CUDA error at: /gpfs/fs1/projects/sw_rapids/users/jnke/miniconda3/envs/cugraph_test/include/rmm/mr/device/cuda_memory_resource.hpp')"

Code of Conduct

I agree to follow cuGraph's Code of Conduct
I have searched the open bugs and have found no duplicates for this bug report


An illegal memory access occurs when running the MG similarity algorithms at certain scales. This is caused by vertex pairs not being shuffled appropriately. This PR:

1. Shuffle the vertex pairs based on the edge partitioning
2. Update the vertex pairs column names, which are not necessarily edgelists
3. Update the docstrings, tests and notebooks accordingly

closes #3001

Authors:
- Joseph Nke (https://github.com/jnke2016)
- Chuck Hastings (https://github.com/ChuckHastings)

Approvers:
- Rick Ratzel (https://github.com/rlratzel)
- Chuck Hastings (https://github.com/ChuckHastings)

URL: #3002
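To illustrate what "shuffling the vertex pairs based on the edge partitioning" means, here is a minimal pure-Python sketch. It is not cuGraph's implementation: the `owner_rank` function and the contiguous range partitioning by source vertex are assumptions made only for this example, and cuGraph performs the real shuffle internally according to its own partitioning scheme. The point is only that each pair has to be routed to the rank holding the matching part of the graph before the similarity kernel runs there.

```python
from collections import defaultdict


def owner_rank(src, num_vertices, num_gpus):
    """Hypothetical partitioning: each rank owns a contiguous block of source vertices."""
    block = -(-num_vertices // num_gpus)  # ceiling division
    return src // block


def shuffle_vertex_pairs(pairs, num_vertices, num_gpus):
    """Route every (src, dst) pair to the rank that owns src under owner_rank."""
    buckets = defaultdict(list)
    for src, dst in pairs:
        buckets[owner_rank(src, num_vertices, num_gpus)].append((src, dst))
    # Return one (possibly empty) bucket per rank
    return {rank: buckets.get(rank, []) for rank in range(num_gpus)}


# 34 vertices (the size of the karate graph) spread over 8 ranks
pairs = [(0, 5), (33, 2), (17, 8), (4, 30), (20, 21)]
shuffled = shuffle_vertex_pairs(pairs, num_vertices=34, num_gpus=8)
for rank, local_pairs in shuffled.items():
    print(rank, local_pairs)
```

Without such a step, a rank can receive a pair whose vertices fall outside the portion of the graph it holds, which is consistent with the out-of-range memory accesses described in this report.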

jnke2016 added the "? - Needs Triage" (Need team to review and classify) and "bug" (Something isn't working) labels Nov 30, 2022

jnke2016 mentioned this issue Nov 30, 2022

Shuffle the vertex pair #3002

Merged

BradReesWork removed the "? - Needs Triage" (Need team to review and classify) label Nov 30, 2022

BradReesWork added this to the 22.12 milestone Nov 30, 2022

rapids-bot bot closed this as completed in #3002 Nov 30, 2022
