[BUG] importing cugraph in Amazon SageMaker Studio Lab cause import error libnuma.so.1 #2113

Closed
lhyuen opened this issue Mar 14, 2022 · 5 comments · Fixed by #2241


lhyuen commented Mar 14, 2022

Describe the bug
Importing cugraph in Amazon SageMaker Studio Lab fails with an import error: libnuma.so.1: cannot open shared object file: No such file or directory

Steps/Code to reproduce bug
Run import cugraph in a cell of a Jupyter notebook.

Expected behavior
successful import of cugraph

Environment overview (please complete the following information)
Followed the exact installation from the doc:
conda create -n rapids-22.02 -c rapidsai -c nvidia -c conda-forge rapids=22.02 python=3.8 cudatoolkit=11.4 dask-sql ipykernel -y

Environment details
cugraph cannot be imported, so I can't collect the environment details.

Additional context
I have no issue importing other RAPIDS libraries such as cudf and dask_cudf.
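
To narrow down which package fails and with what error, a small probe along these lines can help. It is a minimal sketch using only the standard library; the helper name `probe_imports` is made up for illustration, and the module names in the usage note are simply the ones mentioned in this report.

```python
import importlib


def probe_imports(module_names):
    """Try to import each module and record the outcome.

    Returns a dict mapping each module name to "ok" or to a short
    "ExceptionType: message" string.
    """
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError as err:  # also covers ModuleNotFoundError
            results[name] = f"{type(err).__name__}: {err}"
        else:
            results[name] = "ok"
    return results
```

In a notebook cell, `probe_imports(["cudf", "dask_cudf", "cugraph", "ucp"])` would show at a glance whether only cugraph and ucp are affected.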

lhyuen added the "? - Needs Triage" and "bug" labels Mar 14, 2022
BradReesWork added this to the 22.04 milestone Mar 14, 2022
BradReesWork removed the "? - Needs Triage" label Mar 14, 2022
BradReesWork changed the milestone from 22.04 to 22.06 Mar 31, 2022
Contributor

rlratzel commented Apr 6, 2022

@taureandyernv is helping look into this. Thanks, Taurean.

@taureandyernv

TLDR:
UCX now implicitly requires libnuma (and thus libnuma.so.1). SageMaker Studio Lab doesn't have libnuma or its shared library installed, and we currently can't install it because Studio Lab doesn't let you apt install. We also shouldn't need it: libnuma is only called by ucp (ucx-py), Studio Lab is a single-GPU instance, and cuml has worked there before without ucx-py (rapidsai/cuml#4616). However, when we remove ucx-py, as per rapidsai/ucx-py#790, the import still throws an error, because cugraph has a raft dependency that requires dask, which requires ucp (ucx-py).
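
A quick way to confirm whether libnuma is visible to the current process is to ask the dynamic loader directly. This is a minimal sketch using only `ctypes` from the standard library; the helper name `can_load_shared_library` is made up for illustration. It checks only the loader's default search path for this process, so a copy of the library that is reachable through some other mechanism (for example an RPATH baked into another library) may not be reflected here.

```python
import ctypes
import ctypes.util


def can_load_shared_library(soname):
    """Return True if the dynamic loader can open `soname`, else False."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        # Raised when the loader cannot find or open the library.
        return False
    return True


# find_library returns a library name/path, or None if nothing was found.
print("find_library('numa'):", ctypes.util.find_library("numa"))
print("libnuma.so.1 loadable:", can_load_shared_library("libnuma.so.1"))
```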

Okay, and now for the long version.
Here is the traceback after removing ucx-py:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Input In [10], in <cell line: 1>()
----> 1 import cugraph

File ~/.conda/envs/rapids-22.0-b/lib/python3.8/site-packages/cugraph/__init__.py:14, in <module>
      1 # Copyright (c) 2019-2022, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
---> 14 from cugraph.community import (
     15     ecg,
     16     ktruss_subgraph,
     17     k_truss,
     18     louvain,
     19     leiden,
     20     spectralBalancedCutClustering,
     21     spectralModularityMaximizationClustering,
     22     analyzeClustering_modularity,
     23     analyzeClustering_edge_cut,
     24     analyzeClustering_ratio_cut,
     25     subgraph,
     26     triangles,
     27     ego_graph,
     28     batched_ego_graphs,
     29 )
     31 from cugraph.structure import (
     32     Graph,
     33     DiGraph,
   (...)
     56     is_bipartite,
     57     is_multipartite)
     59 from cugraph.centrality import (
     60     betweenness_centrality,
     61     edge_betweenness_centrality,
     62     katz_centrality,
     63 )

File ~/.conda/envs/rapids-22.0-b/lib/python3.8/site-packages/cugraph/community/__init__.py:14, in <module>
      1 # Copyright (c) 2019-2021, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
---> 14 from cugraph.community.louvain import louvain
     15 from cugraph.community.leiden import leiden
     16 from cugraph.community.ecg import ecg

File ~/.conda/envs/rapids-22.0-b/lib/python3.8/site-packages/cugraph/community/louvain.py:14, in <module>
      1 # Copyright (c) 2019-2022, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
---> 14 from cugraph.community import louvain_wrapper
     15 from cugraph.structure.graph_classes import Graph
     16 from cugraph.utilities import (ensure_cugraph_obj_for_nx,
     17                                df_score_to_dictionary,
     18                                )

File cugraph/community/louvain_wrapper.pyx:21, in init cugraph.community.louvain_wrapper()

File ~/.conda/envs/rapids-22.0-b/lib/python3.8/site-packages/cugraph/structure/__init__.py:14, in <module>
      1 # Copyright (c) 2019-2022, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
---> 14 from cugraph.structure.graph_classes import (Graph,
     15                                              DiGraph,
     16                                              MultiGraph,
     17                                              MultiDiGraph,
     18                                              BiPartiteGraph,
     19                                              BiPartiteDiGraph)
     20 from cugraph.structure.graph_classes import (is_weighted,
     21                                              is_directed,
     22                                              is_multigraph,
     23                                              is_bipartite,
     24                                              is_multipartite)
     25 from cugraph.structure.number_map import NumberMap

File ~/.conda/envs/rapids-22.0-b/lib/python3.8/site-packages/cugraph/structure/graph_classes.py:15, in <module>
      1 # Copyright (c) 2021-2022, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
     14 import numpy as np
---> 15 from .graph_implementation import (simpleGraphImpl,
     16                                    simpleDistributedGraphImpl,
     17                                    npartiteGraphImpl)
     18 import cudf
     19 import warnings

File ~/.conda/envs/rapids-22.0-b/lib/python3.8/site-packages/cugraph/structure/graph_implementation/__init__.py:14, in <module>
      1 # Copyright (c) 2021, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
---> 14 from .simpleGraph import simpleGraphImpl
     15 from .simpleDistributedGraph import simpleDistributedGraphImpl
     16 from .npartiteGraph import npartiteGraphImpl

File ~/.conda/envs/rapids-22.0-b/lib/python3.8/site-packages/cugraph/structure/graph_implementation/simpleGraph.py:14, in <module>
      1 # Copyright (c) 2021-2022, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
---> 14 from cugraph.structure import graph_primtypes_wrapper
     15 from cugraph.structure.graph_primtypes_wrapper import Direction
     16 from cugraph.structure.symmetrize import symmetrize

File cugraph/structure/graph_primtypes_wrapper.pyx:27, in init cugraph.structure.graph_primtypes_wrapper()

File ~/.conda/envs/rapids-22.0-b/lib/python3.8/site-packages/cugraph/comms/comms.py:14, in <module>
      1 # Copyright (c) 2018-2022, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
---> 14 from cugraph.raft.dask.common.comms import Comms as raftComms
     15 from cugraph.raft.dask.common.comms import get_raft_comm_state
     16 from cugraph.raft.common.handle import Handle

File ~/.conda/envs/rapids-22.0-b/lib/python3.8/site-packages/cugraph/raft/dask/__init__.py:16, in <module>
      1 # Copyright (c) 2020, NVIDIA CORPORATION.
      2 #
      3 # Licensed under the Apache License, Version 2.0 (the "License");
   (...)
     13 # limitations under the License.
     14 #
---> 16 from .common.comms import Comms

File ~/.conda/envs/rapids-22.0-b/lib/python3.8/site-packages/cugraph/raft/dask/common/__init__.py:16, in <module>
      1 # Copyright (c) 2020, NVIDIA CORPORATION.
      2 #
      3 # Licensed under the Apache License, Version 2.0 (the "License");
   (...)
     13 # limitations under the License.
     14 #
---> 16 from .comms import Comms
     17 from .comms import local_handle
     19 from .comms_utils import inject_comms_on_handle

File ~/.conda/envs/rapids-22.0-b/lib/python3.8/site-packages/cugraph/raft/dask/common/comms.py:17, in <module>
      1 # Copyright (c) 2020, NVIDIA CORPORATION.
      2 #
      3 # Licensed under the Apache License, Version 2.0 (the "License");
   (...)
     13 # limitations under the License.
     14 #
     16 from .nccl import nccl
---> 17 from .ucx import UCX
     19 from .comms_utils import inject_comms_on_handle
     20 from .comms_utils import inject_comms_on_handle_coll_only

File ~/.conda/envs/rapids-22.0-b/lib/python3.8/site-packages/cugraph/raft/dask/common/ucx.py:16, in <module>
      1 # Copyright (c) 2020, NVIDIA CORPORATION.
      2 #
      3 # Licensed under the Apache License, Version 2.0 (the "License");
   (...)
     13 # limitations under the License.
     14 #
---> 16 import ucp
     19 async def _connection_func(ep):
     20     UCX.get().add_server_endpoint(ep)

ModuleNotFoundError: No module named 'ucp'

And for reference, here is the traceback from importing cugraph before we tried removing ucx-py:

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
Input In [1], in <cell line: 1>()
----> 1 import cugraph

File ~/.conda/envs/rapids-22.0-b/lib/python3.8/site-packages/cugraph/__init__.py:14, in <module>
      1 # Copyright (c) 2019-2022, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
---> 14 from cugraph.community import (
     15     ecg,
     16     ktruss_subgraph,
     17     k_truss,
     18     louvain,
     19     leiden,
     20     spectralBalancedCutClustering,
     21     spectralModularityMaximizationClustering,
     22     analyzeClustering_modularity,
     23     analyzeClustering_edge_cut,
     24     analyzeClustering_ratio_cut,
     25     subgraph,
     26     triangles,
     27     ego_graph,
     28     batched_ego_graphs,
     29 )
     31 from cugraph.structure import (
     32     Graph,
     33     DiGraph,
   (...)
     56     is_bipartite,
     57     is_multipartite)
     59 from cugraph.centrality import (
     60     betweenness_centrality,
     61     edge_betweenness_centrality,
     62     katz_centrality,
     63 )

File ~/.conda/envs/rapids-22.0-b/lib/python3.8/site-packages/cugraph/community/__init__.py:14, in <module>
      1 # Copyright (c) 2019-2021, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
---> 14 from cugraph.community.louvain import louvain
     15 from cugraph.community.leiden import leiden
     16 from cugraph.community.ecg import ecg

File ~/.conda/envs/rapids-22.0-b/lib/python3.8/site-packages/cugraph/community/louvain.py:14, in <module>
      1 # Copyright (c) 2019-2022, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
---> 14 from cugraph.community import louvain_wrapper
     15 from cugraph.structure.graph_classes import Graph
     16 from cugraph.utilities import (ensure_cugraph_obj_for_nx,
     17                                df_score_to_dictionary,
     18                                )

File cugraph/community/louvain_wrapper.pyx:21, in init cugraph.community.louvain_wrapper()

File ~/.conda/envs/rapids-22.0-b/lib/python3.8/site-packages/cugraph/structure/__init__.py:14, in <module>
      1 # Copyright (c) 2019-2022, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
---> 14 from cugraph.structure.graph_classes import (Graph,
     15                                              DiGraph,
     16                                              MultiGraph,
     17                                              MultiDiGraph,
     18                                              BiPartiteGraph,
     19                                              BiPartiteDiGraph)
     20 from cugraph.structure.graph_classes import (is_weighted,
     21                                              is_directed,
     22                                              is_multigraph,
     23                                              is_bipartite,
     24                                              is_multipartite)
     25 from cugraph.structure.number_map import NumberMap

File ~/.conda/envs/rapids-22.0-b/lib/python3.8/site-packages/cugraph/structure/graph_classes.py:15, in <module>
      1 # Copyright (c) 2021-2022, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
     14 import numpy as np
---> 15 from .graph_implementation import (simpleGraphImpl,
     16                                    simpleDistributedGraphImpl,
     17                                    npartiteGraphImpl)
     18 import cudf
     19 import warnings

File ~/.conda/envs/rapids-22.0-b/lib/python3.8/site-packages/cugraph/structure/graph_implementation/__init__.py:14, in <module>
      1 # Copyright (c) 2021, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
---> 14 from .simpleGraph import simpleGraphImpl
     15 from .simpleDistributedGraph import simpleDistributedGraphImpl
     16 from .npartiteGraph import npartiteGraphImpl

File ~/.conda/envs/rapids-22.0-b/lib/python3.8/site-packages/cugraph/structure/graph_implementation/simpleGraph.py:14, in <module>
      1 # Copyright (c) 2021-2022, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
---> 14 from cugraph.structure import graph_primtypes_wrapper
     15 from cugraph.structure.graph_primtypes_wrapper import Direction
     16 from cugraph.structure.symmetrize import symmetrize

File cugraph/structure/graph_primtypes_wrapper.pyx:27, in init cugraph.structure.graph_primtypes_wrapper()

File ~/.conda/envs/rapids-22.0-b/lib/python3.8/site-packages/cugraph/comms/comms.py:14, in <module>
      1 # Copyright (c) 2018-2022, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
---> 14 from cugraph.raft.dask.common.comms import Comms as raftComms
     15 from cugraph.raft.dask.common.comms import get_raft_comm_state
     16 from cugraph.raft.common.handle import Handle

File ~/.conda/envs/rapids-22.0-b/lib/python3.8/site-packages/cugraph/raft/dask/__init__.py:16, in <module>
      1 # Copyright (c) 2020, NVIDIA CORPORATION.
      2 #
      3 # Licensed under the Apache License, Version 2.0 (the "License");
   (...)
     13 # limitations under the License.
     14 #
---> 16 from .common.comms import Comms

File ~/.conda/envs/rapids-22.0-b/lib/python3.8/site-packages/cugraph/raft/dask/common/__init__.py:16, in <module>
      1 # Copyright (c) 2020, NVIDIA CORPORATION.
      2 #
      3 # Licensed under the Apache License, Version 2.0 (the "License");
   (...)
     13 # limitations under the License.
     14 #
---> 16 from .comms import Comms
     17 from .comms import local_handle
     19 from .comms_utils import inject_comms_on_handle

File ~/.conda/envs/rapids-22.0-b/lib/python3.8/site-packages/cugraph/raft/dask/common/comms.py:17, in <module>
      1 # Copyright (c) 2020, NVIDIA CORPORATION.
      2 #
      3 # Licensed under the Apache License, Version 2.0 (the "License");
   (...)
     13 # limitations under the License.
     14 #
     16 from .nccl import nccl
---> 17 from .ucx import UCX
     19 from .comms_utils import inject_comms_on_handle
     20 from .comms_utils import inject_comms_on_handle_coll_only

File ~/.conda/envs/rapids-22.0-b/lib/python3.8/site-packages/cugraph/raft/dask/common/ucx.py:16, in <module>
      1 # Copyright (c) 2020, NVIDIA CORPORATION.
      2 #
      3 # Licensed under the Apache License, Version 2.0 (the "License");
   (...)
     13 # limitations under the License.
     14 #
---> 16 import ucp
     19 async def _connection_func(ep):
     20     UCX.get().add_server_endpoint(ep)

File ~/.conda/envs/rapids-22.0-b/lib/python3.8/site-packages/ucp/__init__.py:21, in <module>
     18     os.environ["UCX_MEMTYPE_CACHE"] = "n"
     20 from ._version import get_versions as _get_versions  # noqa
---> 21 from .core import *  # noqa
     22 from .core import get_ucx_version  # noqa
     23 from .utils import get_ucxpy_logger  # noqa

File ~/.conda/envs/rapids-22.0-b/lib/python3.8/site-packages/ucp/core.py:16, in <module>
     13 from functools import partial
     14 from os import close as close_fd
---> 16 from . import comm
     17 from ._libs import ucx_api
     18 from ._libs.arr import Array

File ~/.conda/envs/rapids-22.0-b/lib/python3.8/site-packages/ucp/comm.py:8, in <module>
      5 import asyncio
      6 from typing import Union
----> 8 from ._libs import arr, ucx_api
     11 def _cb_func(request, exception, event_loop, future):
     12     if event_loop.is_closed() or future.done():

ImportError: libnuma.so.1: cannot open shared object file: No such file or directory

rlratzel added the python label Apr 7, 2022
rapids-bot closed this as completed in #2241 May 4, 2022
rapids-bot pushed a commit that referenced this issue May 4, 2022
This PR aims to directly resolve #2113, and allows for cugraph to be imported without errors in environments with only a single GPU

What this PR does:

- Moves `comms` module into the `dask` module, renaming imports across `cugraph` accordingly
- Adds mock class in case ucp is not available or usable in SG-only environments (note ucp requires libnuma.so)
- Renames `tests/dask` to `tests/mg` to better reflect the fact that those tests are for mg algs and utils
- Separates symmetrize testing into `tests/mg` and `tests` for separation of SG and MG testing
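
The mock-class idea in the list above can be sketched roughly as follows. This is not the code from the PR: `MissingDependency` and `optional_import` are hypothetical names chosen for illustration, and the sketch assumes the goal is simply to let the import succeed and raise only when the missing dependency is actually used.

```python
import importlib


class MissingDependency:
    """Placeholder that raises only when the missing module is actually used.

    Hypothetical helper for illustration; not cugraph's actual class.
    """

    def __init__(self, name, error):
        self._name = name
        self._error = error

    def __getattr__(self, attr):
        # Only called for attributes that are not found normally.
        raise RuntimeError(
            f"{self._name} is required for this feature but could not be "
            f"imported: {self._error}"
        )


def optional_import(name):
    """Import `name`, or return a MissingDependency placeholder on failure."""
    try:
        return importlib.import_module(name)
    except ImportError as err:
        # ImportError also covers ModuleNotFoundError and shared-library
        # load failures such as a missing libnuma.so.1.
        return MissingDependency(name, err)


ucp = optional_import("ucp")  # does not raise ImportError at import time
```

With this shape, single-GPU code paths never touch `ucp`, and multi-GPU code paths fail with a message that names the missing dependency.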

Authors:
  - https://github.com/betochimas

Approvers:
  - Rick Ratzel (https://github.com/rlratzel)
  - AJ Schmidt (https://github.com/ajschmidt8)

URL: #2241

taureandyernv commented Jun 22, 2022

Finally got an instance in SageMaker Studio Lab (SMSL), but no dice. Importing cugraph still fails:

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
Input In [1], in <cell line: 1>()
----> 1 import cugraph

File ~/.conda/envs/rapids-22.06/lib/python3.8/site-packages/cugraph/__init__.py:14, in <module>
      1 # Copyright (c) 2019-2022, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
---> 14 from cugraph.community import (
     15     ecg,
     16     ktruss_subgraph,
     17     k_truss,
     18     louvain,
     19     leiden,
     20     spectralBalancedCutClustering,
     21     spectralModularityMaximizationClustering,
     22     analyzeClustering_modularity,
     23     analyzeClustering_edge_cut,
     24     analyzeClustering_ratio_cut,
     25     subgraph,
     26     triangles,
     27     ego_graph,
     28     batched_ego_graphs,
     29 )
     31 from cugraph.structure import (
     32     Graph,
     33     DiGraph,
   (...)
     56     is_bipartite,
     57     is_multipartite)
     59 from cugraph.centrality import (
     60     betweenness_centrality,
     61     edge_betweenness_centrality,
   (...)
     64     eigenvector_centrality,
     65 )

File ~/.conda/envs/rapids-22.06/lib/python3.8/site-packages/cugraph/community/__init__.py:14, in <module>
      1 # Copyright (c) 2019-2021, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
---> 14 from cugraph.community.louvain import louvain
     15 from cugraph.community.leiden import leiden
     16 from cugraph.community.ecg import ecg

File ~/.conda/envs/rapids-22.06/lib/python3.8/site-packages/cugraph/community/louvain.py:14, in <module>
      1 # Copyright (c) 2019-2022, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
---> 14 from cugraph.community import louvain_wrapper
     15 from cugraph.utilities import (ensure_cugraph_obj_for_nx,
     16                                df_score_to_dictionary,
     17                                )
     20 def louvain(G, max_iter=100, resolution=1.):

File cugraph/community/louvain_wrapper.pyx:21, in init cugraph.community.louvain_wrapper()

File ~/.conda/envs/rapids-22.06/lib/python3.8/site-packages/cugraph/structure/__init__.py:14, in <module>
      1 # Copyright (c) 2019-2022, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
---> 14 from cugraph.structure.graph_classes import (Graph,
     15                                              DiGraph,
     16                                              MultiGraph,
     17                                              MultiDiGraph,
     18                                              BiPartiteGraph,
     19                                              BiPartiteDiGraph)
     20 from cugraph.structure.graph_classes import (is_weighted,
     21                                              is_directed,
     22                                              is_multigraph,
     23                                              is_bipartite,
     24                                              is_multipartite)
     25 from cugraph.structure.number_map import NumberMap

File ~/.conda/envs/rapids-22.06/lib/python3.8/site-packages/cugraph/structure/graph_classes.py:15, in <module>
      1 # Copyright (c) 2021-2022, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
     14 import numpy as np
---> 15 from .graph_implementation import (simpleGraphImpl,
     16                                    simpleDistributedGraphImpl,
     17                                    npartiteGraphImpl)
     18 import cudf
     19 import dask_cudf

File ~/.conda/envs/rapids-22.06/lib/python3.8/site-packages/cugraph/structure/graph_implementation/__init__.py:14, in <module>
      1 # Copyright (c) 2021, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
---> 14 from .simpleGraph import simpleGraphImpl
     15 from .simpleDistributedGraph import simpleDistributedGraphImpl
     16 from .npartiteGraph import npartiteGraphImpl

File ~/.conda/envs/rapids-22.06/lib/python3.8/site-packages/cugraph/structure/graph_implementation/simpleGraph.py:14, in <module>
      1 # Copyright (c) 2021-2022, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
---> 14 from cugraph.structure import graph_primtypes_wrapper
     15 from cugraph.structure.graph_primtypes_wrapper import Direction
     16 from cugraph.structure.symmetrize import symmetrize

File cugraph/structure/graph_primtypes_wrapper.pyx:27, in init cugraph.structure.graph_primtypes_wrapper()

File ~/.conda/envs/rapids-22.06/lib/python3.8/site-packages/cugraph/dask/__init__.py:14, in <module>
      1 # Copyright (c) 2020-2022, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
---> 14 from .link_analysis.pagerank import pagerank
     15 from .link_analysis.hits import hits
     16 from .traversal.bfs import bfs

File ~/.conda/envs/rapids-22.06/lib/python3.8/site-packages/cugraph/dask/link_analysis/pagerank.py:17, in <module>
      1 # Copyright (c) 2019-2022, NVIDIA CORPORATION.
      2 #
      3 # Licensed under the Apache License, Version 2.0 (the "License");
   (...)
     13 # limitations under the License.
     14 #
     16 from dask.distributed import wait, default_client
---> 17 from cugraph.dask.common.input_utils import (get_distributed_data,
     18                                              get_vertex_partition_offsets)
     19 from cugraph.dask.link_analysis import mg_pagerank_wrapper as mg_pagerank
     20 import cugraph.dask.comms.comms as Comms

File ~/.conda/envs/rapids-22.06/lib/python3.8/site-packages/cugraph/dask/common/input_utils.py:22, in <module>
     19 from dask_cudf.core import DataFrame as dcDataFrame
     20 from dask_cudf.core import Series as daskSeries
---> 22 import cugraph.dask.comms.comms as Comms
     23 # FIXME: this raft import breaks the library if ucx-py is
     24 # not available. They are necessary only when doing MG work.
     25 from cugraph.dask.common.read_utils import MissingUCXPy

File ~/.conda/envs/rapids-22.06/lib/python3.8/site-packages/cugraph/dask/comms/comms.py:18, in <module>
     16 from cugraph.dask.common.read_utils import MissingUCXPy
     17 try:
---> 18     from raft.dask.common.comms import Comms as raftComms
     19     from raft.dask.common.comms import get_raft_comm_state
     20 except ModuleNotFoundError as err:

File ~/.conda/envs/rapids-22.06/lib/python3.8/site-packages/raft/dask/__init__.py:16, in <module>
      1 # Copyright (c) 2020-2022, NVIDIA CORPORATION.
      2 #
      3 # Licensed under the Apache License, Version 2.0 (the "License");
   (...)
     13 # limitations under the License.
     14 #
---> 16 from .common.comms import Comms

File ~/.conda/envs/rapids-22.06/lib/python3.8/site-packages/raft/dask/common/__init__.py:16, in <module>
      1 # Copyright (c) 2020-2022, NVIDIA CORPORATION.
      2 #
      3 # Licensed under the Apache License, Version 2.0 (the "License");
   (...)
     13 # limitations under the License.
     14 #
---> 16 from .comms import Comms
     17 from .comms import local_handle
     19 from .comms_utils import inject_comms_on_handle

File ~/.conda/envs/rapids-22.06/lib/python3.8/site-packages/raft/dask/common/comms.py:17, in <module>
      1 # Copyright (c) 2020-2022, NVIDIA CORPORATION.
      2 #
      3 # Licensed under the Apache License, Version 2.0 (the "License");
   (...)
     13 # limitations under the License.
     14 #
     16 from .nccl import nccl
---> 17 from .ucx import UCX
     19 from .comms_utils import inject_comms_on_handle
     20 from .comms_utils import inject_comms_on_handle_coll_only

File ~/.conda/envs/rapids-22.06/lib/python3.8/site-packages/raft/dask/common/ucx.py:16, in <module>
      1 # Copyright (c) 2020-2022, NVIDIA CORPORATION.
      2 #
      3 # Licensed under the Apache License, Version 2.0 (the "License");
   (...)
     13 # limitations under the License.
     14 #
---> 16 import ucp
     19 async def _connection_func(ep):
     20     UCX.get().add_server_endpoint(ep)

File ~/.conda/envs/rapids-22.06/lib/python3.8/site-packages/ucp/__init__.py:20, in <module>
     17     os.environ["UCX_MEMTYPE_CACHE"] = "n"
     19 from ._version import get_versions as _get_versions  # noqa
---> 20 from .core import *  # noqa
     21 from .core import get_ucx_version  # noqa
     22 from .utils import get_ucxpy_logger  # noqa

File ~/.conda/envs/rapids-22.06/lib/python3.8/site-packages/ucp/core.py:16, in <module>
     13 from functools import partial
     14 from os import close as close_fd
---> 16 from . import comm
     17 from ._libs import ucx_api
     18 from ._libs.arr import Array

File ~/.conda/envs/rapids-22.06/lib/python3.8/site-packages/ucp/comm.py:8, in <module>
      5 import asyncio
      6 from typing import Union
----> 8 from ._libs import arr, ucx_api
     11 def _cb_func(request, exception, event_loop, future):
     12     if event_loop.is_closed() or future.done():

ImportError: libnuma.so.1: cannot open shared object file: No such file or directory

@rlratzel @BradReesWork @betochimas
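
The final `ImportError` above comes from the dynamic loader rather than from Python's module lookup, so one way to confirm the diagnosis is to ask the loader directly whether it can open the library. This is a minimal sketch using only the standard library; `can_load_shared_library` is a hypothetical helper name, not part of cugraph, raft, or ucx-py.

```python
import ctypes


def can_load_shared_library(soname):
    """Return True if the dynamic loader can open `soname` (hypothetical helper)."""
    try:
        # ctypes.CDLL asks the system loader to open the library by name.
        ctypes.CDLL(soname)
    except OSError:
        # Raised when the library cannot be found or cannot be loaded.
        return False
    return True


# The result depends on the machine; on the environment described in
# this issue it would be expected to report False.
print("libnuma.so.1 loadable:", can_load_shared_library("libnuma.so.1"))
```

If this prints `False` inside the same conda environment, any extension module linked against `libnuma.so.1` will fail to import in the same way.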

@taureandyernv

taureandyernv commented Jun 22, 2022

I think I see the issue: one FIXME didn't make it into 22.06 somehow. When I removed ucx-py, I got an error, and the following was part of the stack trace (see lines 23 and 24). It appears when you call `from cugraph.structure import graph_primtypes_wrapper`, which still runs `import cugraph.dask.comms.comms as Comms`:

File ~/.conda/envs/rapids-22.06/lib/python3.8/site-packages/cugraph/dask/common/input_utils.py:22, in <module>
     19 from dask_cudf.core import DataFrame as dcDataFrame
     20 from dask_cudf.core import Series as daskSeries
---> 22 import cugraph.dask.comms.comms as Comms
     23 # FIXME: this raft import breaks the library if ucx-py is
     24 # not available. They are necessary only when doing MG work.
     25 from cugraph.dask.common.read_utils import MissingUCXPy

Here is the entire error stack trace:

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
Input In [5], in <cell line: 1>()
----> 1 import cugraph

File ~/.conda/envs/rapids-22.06/lib/python3.8/site-packages/cugraph/__init__.py:14, in <module>
      1 # Copyright (c) 2019-2022, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
---> 14 from cugraph.community import (
     15     ecg,
     16     ktruss_subgraph,
     17     k_truss,
     18     louvain,
     19     leiden,
     20     spectralBalancedCutClustering,
     21     spectralModularityMaximizationClustering,
     22     analyzeClustering_modularity,
     23     analyzeClustering_edge_cut,
     24     analyzeClustering_ratio_cut,
     25     subgraph,
     26     triangles,
     27     ego_graph,
     28     batched_ego_graphs,
     29 )
     31 from cugraph.structure import (
     32     Graph,
     33     DiGraph,
   (...)
     56     is_bipartite,
     57     is_multipartite)
     59 from cugraph.centrality import (
     60     betweenness_centrality,
     61     edge_betweenness_centrality,
   (...)
     64     eigenvector_centrality,
     65 )

File ~/.conda/envs/rapids-22.06/lib/python3.8/site-packages/cugraph/community/__init__.py:14, in <module>
      1 # Copyright (c) 2019-2021, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
---> 14 from cugraph.community.louvain import louvain
     15 from cugraph.community.leiden import leiden
     16 from cugraph.community.ecg import ecg

File ~/.conda/envs/rapids-22.06/lib/python3.8/site-packages/cugraph/community/louvain.py:14, in <module>
      1 # Copyright (c) 2019-2022, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
---> 14 from cugraph.community import louvain_wrapper
     15 from cugraph.utilities import (ensure_cugraph_obj_for_nx,
     16                                df_score_to_dictionary,
     17                                )
     20 def louvain(G, max_iter=100, resolution=1.):

File cugraph/community/louvain_wrapper.pyx:21, in init cugraph.community.louvain_wrapper()

File ~/.conda/envs/rapids-22.06/lib/python3.8/site-packages/cugraph/structure/__init__.py:14, in <module>
      1 # Copyright (c) 2019-2022, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
---> 14 from cugraph.structure.graph_classes import (Graph,
     15                                              DiGraph,
     16                                              MultiGraph,
     17                                              MultiDiGraph,
     18                                              BiPartiteGraph,
     19                                              BiPartiteDiGraph)
     20 from cugraph.structure.graph_classes import (is_weighted,
     21                                              is_directed,
     22                                              is_multigraph,
     23                                              is_bipartite,
     24                                              is_multipartite)
     25 from cugraph.structure.number_map import NumberMap

File ~/.conda/envs/rapids-22.06/lib/python3.8/site-packages/cugraph/structure/graph_classes.py:15, in <module>
      1 # Copyright (c) 2021-2022, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
     14 import numpy as np
---> 15 from .graph_implementation import (simpleGraphImpl,
     16                                    simpleDistributedGraphImpl,
     17                                    npartiteGraphImpl)
     18 import cudf
     19 import dask_cudf

File ~/.conda/envs/rapids-22.06/lib/python3.8/site-packages/cugraph/structure/graph_implementation/__init__.py:14, in <module>
      1 # Copyright (c) 2021, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
---> 14 from .simpleGraph import simpleGraphImpl
     15 from .simpleDistributedGraph import simpleDistributedGraphImpl
     16 from .npartiteGraph import npartiteGraphImpl

File ~/.conda/envs/rapids-22.06/lib/python3.8/site-packages/cugraph/structure/graph_implementation/simpleGraph.py:14, in <module>
      1 # Copyright (c) 2021-2022, NVIDIA CORPORATION.
      2 # Licensed under the Apache License, Version 2.0 (the "License");
      3 # you may not use this file except in compliance with the License.
   (...)
     11 # See the License for the specific language governing permissions and
     12 # limitations under the License.
---> 14 from cugraph.structure import graph_primtypes_wrapper
     15 from cugraph.structure.graph_primtypes_wrapper import Direction
     16 from cugraph.structure.symmetrize import symmetrize

File cugraph/structure/graph_primtypes_wrapper.pyx:29, in init cugraph.structure.graph_primtypes_wrapper()

File ~/.conda/envs/rapids-22.06/lib/python3.8/site-packages/cugraph/dask/common/input_utils.py:22, in <module>
     19 from dask_cudf.core import DataFrame as dcDataFrame
     20 from dask_cudf.core import Series as daskSeries
---> 22 import cugraph.dask.comms.comms as Comms
     23 # FIXME: this raft import breaks the library if ucx-py is
     24 # not available. They are necessary only when doing MG work.
     25 from cugraph.dask.common.read_utils import MissingUCXPy

ImportError: cannot import name 'dask' from partially initialized module 'cugraph' (most likely due to a circular import) (/home/studio-lab-user/.conda/envs/rapids-22.06/lib/python3.8/site-packages/cugraph/__init__.py)

@betochimas
Contributor

The modules that depend on ucx-py come from raft.dask.common.utils or raft.dask.common.comms, and they are imported in 3 files in the cugraph.dask module. In an environment that is missing ucx-py, the relevant helpers/classes are replaced by a MissingUCXPy object. The issue here looks like a circular import error, which I sometimes hit while working on the bug but which stopped once I implemented MissingUCXPy. The FIXMEs aren't really FIXMEs in the typical sense; they explain why MissingUCXPy and the try block were needed. I tried to reproduce this error in both cugraph 22.06 and 22.08 but couldn't.
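
The placeholder approach described above can be sketched roughly as follows. This is an illustration of the general pattern only, not cugraph's actual `MissingUCXPy` code: the names `MissingDependency` and `optional_import`, and the module name being imported, are all hypothetical.

```python
import importlib


class MissingDependency:
    """Stand-in for an optional module that failed to import (hypothetical).

    Importing succeeds; the error is deferred until the placeholder is used.
    """

    def __init__(self, name, reason):
        self._name = name
        self._reason = reason

    def _fail(self):
        raise RuntimeError(
            f"{self._name} is required for this feature but could not be "
            f"imported: {self._reason}"
        )

    def __getattr__(self, attr):
        # Only called for attributes that are not found normally,
        # so _name and _reason remain accessible.
        self._fail()

    def __call__(self, *args, **kwargs):
        self._fail()


def optional_import(module_name):
    """Return the module, or a placeholder if it cannot be imported."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        return MissingDependency(module_name, exc)


# Hypothetical module name chosen so that the import fails.
comms = optional_import("hypothetical_comms_backend_not_installed")
print(type(comms).__name__)
```

With this shape, single-GPU code paths that never touch the placeholder keep working, and only multi-GPU code that actually uses it gets an error naming the missing dependency.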

raydouglass pushed a commit that referenced this issue Jul 12, 2022
…rror for Amazon SMS libnuma.so.1 bug (#2385)

Generalize ModuleNotFoundError exception handling to ImportError for Amazon SMS libnuma.so.1 bug
#2113

This allows cugraph to be imported in a SageMaker environment without having to remove `ucx-py`

Authors:
   - Dylan Chima-Sanchez (https://github.com/betochimas)
   - Rick Ratzel (https://github.com/rlratzel)

Approvers:
   - Brad Rees (https://github.com/BradReesWork)
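
To illustrate why the handler had to be generalized: `ModuleNotFoundError` is a subclass of `ImportError`, so a handler for the subclass does not catch the plain `ImportError` seen in the `libnuma.so.1` traceback. The sketch below simulates both failures with stub functions; `handled_by` and the two `raise_*` helpers are hypothetical and do not come from the cugraph source.

```python
def raise_missing_module():
    # Simulates the failure when a Python package is not installed.
    raise ModuleNotFoundError("No module named 'ucp'")


def raise_missing_shared_library():
    # Simulates the failure from this issue: the package is installed,
    # but a shared library it links against cannot be opened.
    raise ImportError(
        "libnuma.so.1: cannot open shared object file: "
        "No such file or directory"
    )


def handled_by(handler_type, failing_import):
    """Return True if `except handler_type` would catch the failure."""
    try:
        failing_import()
    except handler_type:
        return True
    except ImportError:
        # Reached only when the narrower handler above did not match.
        return False
    return False


print(issubclass(ModuleNotFoundError, ImportError))                  # True
print(handled_by(ModuleNotFoundError, raise_missing_module))         # True
print(handled_by(ModuleNotFoundError, raise_missing_shared_library)) # False
print(handled_by(ImportError, raise_missing_module))                 # True
print(handled_by(ImportError, raise_missing_shared_library))         # True
```

Catching `ImportError` therefore covers both the case where `ucx-py` is absent and the case where it is present but cannot load its native dependencies.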