A user reported a segmentation fault when initializing the comms while using one of our latest nightlies. This bug is not currently reproducible by any of our nightly tests.
Minimum reproducible example
Not reproducible yet
Relevant log output
stcomp>():222] - 2024-08-25 11:15:35,660 - distributed.core - INFO - Starting established connection to tcp://10.174.164.228:43037
24/08/25 11:16:45 WARN python: [dask_cluster.py:():222] - 2024-08-25 11:16:22,812 - distributed.worker - INFO - Run out-of-band function '_get_nvml_device_index'
24/08/25 11:16:45 WARN python: [dask_cluster.py:():222] - 2024-08-25 11:16:22,812 - distributed.worker - INFO - Run out-of-band function '_get_nvml_device_index'
24/08/25 11:16:45 WARN python: [dask_cluster.py:():222] - 2024-08-25 11:16:22,835 - distributed.worker - INFO - Run out-of-band function '_func_ucp_listener_port'
24/08/25 11:16:45 WARN python: [dask_cluster.py:():222] - 2024-08-25 11:16:22,835 - distributed.worker - INFO - Run out-of-band function '_func_ucp_listener_port'
24/08/25 11:16:45 WARN python: [dask_cluster.py:():222] - 2024-08-25 11:16:22,881 - distributed.worker - INFO - Run out-of-band function '_func_init_all'
24/08/25 11:16:45 WARN python: [dask_cluster.py:():222] - 2024-08-25 11:16:22,881 - distributed.worker - INFO - Run out-of-band function '_func_init_all'
24/08/25 11:16:45 WARN python: [dask_cluster.py:():222] - 2024-08-25 11:16:30,551 - distributed.worker - INFO - Run out-of-band function '_subcomm_init'
24/08/25 11:16:45 WARN python: [dask_cluster.py:():222] - 2024-08-25 11:16:30,585 - distributed.worker - INFO - Run out-of-band function '_subcomm_init'
24/08/25 11:16:45 WARN python: [dask_cluster.py:():222] - [1724584582.860222] [cgjben2-a6wj25tiqsi5u-w-10:8866 :0] parser.c:2036 UCX WARN unused environment variable: UCX_MEMTYPE_CACHE (maybe: UCX_MEMTYPE_CACHE?)
24/08/25 11:16:45 WARN python: [dask_cluster.py:():222] - [1724584582.860222] [cgjben2-a6wj25tiqsi5u-w-10:8866 :0] parser.c:2036 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning)
24/08/25 11:16:45 WARN python: [dask_cluster.py:():222] - [cgjben2-a6wj25tiqsi5u-w-10:8866 :0:8866] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
24/08/25 11:16:45 WARN python: [dask_cluster.py:():222] - ==== backtrace (tid: 8866) ====
24/08/25 11:16:45 WARN python: [dask_cluster.py:():222] - 0 /mnt/1/python_env/lib/python3.10/site-packages/raft_dask/common/../../../.././libucs.so.0(ucs_handle_error+0x2fd) [0x7f1228e5a06d]
24/08/25 11:16:45 WARN python: [dask_cluster.py:():222] - 1 /mnt/1/python_env/lib/python3.10/site-packages/raft_dask/common/../../../.././libucs.so.0(+0x2a264) [0x7f1228e5a264]
24/08/25 11:16:45 WARN python: [dask_cluster.py:():222] - 2 /mnt/1/python_env/lib/python3.10/site-packages/raft_dask/common/../../../.././libucs.so.0(+0x2a42a) [0x7f1228e5a42a]
24/08/25 11:16:45 WARN python: [dask_cluster.py:():222] - 3 /lib/x86_64-linux-gnu/libpthread.so.0(+0x12980) [0x7f129dad7980]
24/08/25 11:16:45 WARN python: [dask_cluster.py:():222] - =================================
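Since there is no reproducer yet, the backtrace is the main lead. Below is a minimal triage sketch, not part of the original report, that strips the logging prefix and pulls the frames out of output like the above. The names `FRAME_RE`, `parse_backtrace`, and `SAMPLE` are hypothetical, and the regular expression assumes the frame layout seen in this particular log.

```python
import re

# One backtrace frame as it appears after the logging prefix, e.g.
#   "... - 1 /path/to/libucs.so.0(+0x2a264) [0x7f1228e5a264]"
# Assumption: the layout is "<index> <library>(<symbol>+<offset>) [<address>]".
FRAME_RE = re.compile(
    r"-\s+(?P<index>\d+)\s+(?P<library>/\S+?)"
    r"\((?P<symbol>[^()+]*)\+(?P<offset>0x[0-9a-fA-F]+)\)"
    r"\s+\[(?P<address>0x[0-9a-fA-F]+)\]\s*$"
)


def parse_backtrace(log_text):
    """Return one dict per backtrace frame found in log_text."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match is None:
            continue  # not a frame line (INFO message, separator, ...)
        frames.append({
            "index": int(match["index"]),
            "library": match["library"],
            # Empty symbol means the frame was reported as a bare offset.
            "symbol": match["symbol"] or None,
            "offset": int(match["offset"], 16),
            "address": int(match["address"], 16),
        })
    return frames


# Lines copied from the log above.
SAMPLE = """\
24/08/25 11:16:45 WARN python: [dask_cluster.py:():222] - ==== backtrace (tid: 8866) ====
24/08/25 11:16:45 WARN python: [dask_cluster.py:():222] - 0 /mnt/1/python_env/lib/python3.10/site-packages/raft_dask/common/../../../.././libucs.so.0(ucs_handle_error+0x2fd) [0x7f1228e5a06d]
24/08/25 11:16:45 WARN python: [dask_cluster.py:():222] - 1 /mnt/1/python_env/lib/python3.10/site-packages/raft_dask/common/../../../.././libucs.so.0(+0x2a264) [0x7f1228e5a264]
24/08/25 11:16:45 WARN python: [dask_cluster.py:():222] - 3 /lib/x86_64-linux-gnu/libpthread.so.0(+0x12980) [0x7f129dad7980]
"""

frames = parse_backtrace(SAMPLE)
for frame in frames:
    print(frame["index"], frame["symbol"], hex(frame["offset"]))
```

Frames without a symbol name appear to carry offsets relative to the library's load address, so they could be resolved with a tool such as `addr2line` if the same `libucs.so.0` build is available. That step is not shown here because the exact UCX build is not stated in the report.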
Environment details
No response
Other/Misc.
No response
Code of Conduct
I agree to follow cuGraph's Code of Conduct
I have searched the open bugs and have found no duplicates for this bug report
Version
24.10
Which installation method(s) does this occur on?
No response