11.x builds failing on system without nvcc #297

Open
charlesbluca opened this issue Oct 7, 2024 · 14 comments

Comments

@charlesbluca
Member

When attempting to build UCXX with the CUDA 11.8 conda environment on a system without nvcc pre-installed (i.e. all CTK components being installed through conda), I get the following error at build configuration:

CMake Error at /home/charlesb/miniforge3/envs/ucxx-cuda-118/lib/cmake/rmm/rmm-targets.cmake:61 (set_target_properties):
  The link interface of target "rmm::rmm" contains:

    CUDA::cudart

  but the target was not found.  Possible reasons include:

    * There is a typo in the target name.
    * A find_package call is missing for an IMPORTED target.
    * An ALIAS target is missing.

Call Stack (most recent call first):
  /home/charlesb/miniforge3/envs/ucxx-cuda-118/lib/cmake/rmm/rmm-config.cmake:75 (include)
  build/cmake/CPM_0.40.0.cmake:249 (find_package)
  build/cmake/CPM_0.40.0.cmake:303 (cpm_find_package)
  build/_deps/rapids-cmake-src/rapids-cmake/cpm/find.cmake:189 (CPMFindPackage)
  build/_deps/rapids-cmake-src/rapids-cmake/cpm/rmm.cmake:75 (rapids_cpm_find)
  cmake/thirdparty/get_rmm.cmake:20 (rapids_cpm_rmm)
  cmake/thirdparty/get_rmm.cmake:24 (find_and_configure_rmm)
  CMakeLists.txt:112 (include)

This was somewhat confusing, as the conda install itself raised a warning message that implied I should have libcudart in the conda environment:

To enable CUDA support, UCX requires the CUDA Runtime library (libcudart).
The library can be installed with the appropriate command below:

* For CUDA 11, run:    conda install cudatoolkit cuda-version=11
* For CUDA 12, run:    conda install cuda-cudart cuda-version=12
→ conda list cuda
# packages in environment at /home/charlesb/miniforge3/envs/ucxx-cuda-118:
#
# Name                    Version                   Build  Channel
cuda-version              11.8                 h70ddcb2_3    conda-forge
cudatoolkit               11.8.0              h4ba93d1_13    conda-forge
→ find $CONDA_PREFIX -name "libcudart.so*"
/home/charlesb/miniforge3/envs/ucxx-cuda-118/lib/libcudart.so
/home/charlesb/miniforge3/envs/ucxx-cuda-118/lib/libcudart.so.11.0
/home/charlesb/miniforge3/envs/ucxx-cuda-118/lib/libcudart.so.11.8.89

I saw that these failures came up while configuring RMM, so I tried building RMM with its accompanying 11.8 conda environment and got a somewhat clearer error: it could not find a CTK installation on my system (the nvcc binary was missing):

/home/charlesb/miniforge3/envs/rmm-cuda-118/bin/nvcc: line 9: /bin/nvcc: No such file or directory
...
CMake Error at /home/charlesb/miniforge3/envs/rmm-cuda-118/share/cmake-3.30/Modules/FindPackageHandleStandardArgs.cmake:233 (message):
  Could NOT find CUDAToolkit (missing: CUDAToolkit_INCLUDE_DIRECTORIES)
Call Stack (most recent call first):
  /home/charlesb/miniforge3/envs/rmm-cuda-118/share/cmake-3.30/Modules/FindPackageHandleStandardArgs.cmake:603 (_FPHSA_FAILURE_MESSAGE)
  /home/charlesb/miniforge3/envs/rmm-cuda-118/share/cmake-3.30/Modules/FindCUDAToolkit.cmake:1048 (find_package_handle_standard_args)
  build/_deps/rapids-cmake-src/rapids-cmake/find/package.cmake:125 (find_package)
  CMakeLists.txt:62 (rapids_find_package)

Was unable to reproduce this with the 12.5 environment, which does pull a conda installation of nvcc.

I'm about to do a system installation of CTK to see if this unblocks the build.

@pentschev
Member

As far as I remember, you need to install nvcc_linux-64=11.8, could you check if that works?

@charlesbluca
Member Author

Installing that, it seems like the resulting nvcc bin is just wrapping what I assume is a system installation of nvcc?

→ conda list nvcc
# packages in environment at /home/charlesb/miniforge3/envs/ucxx-cuda-118:
#
# Name                    Version                   Build  Channel
nvcc_linux-64             11.8                h9852d18_24    conda-forge

→ nvcc
/home/charlesb/miniforge3/envs/ucxx-cuda-118/bin/nvcc: line 9: /bin/nvcc: No such file or directory

→ cat /home/charlesb/miniforge3/envs/ucxx-cuda-118/bin/nvcc
#!/bin/bash

for arg in "${@}" ; do
  case ${arg} in -ccbin)
    # If -ccbin argument is already provided, don't add an additional one.
    exec "${CUDA_HOME}/bin/nvcc" "${@}"
  esac
done
exec "${CUDA_HOME}/bin/nvcc" -ccbin "${CXX}" "${@}"
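For what it's worth, the `/bin/nvcc` in the error above is consistent with `CUDA_HOME` being empty when the wrapper runs: `"${CUDA_HOME}/bin/nvcc"` then collapses to `/bin/nvcc`. A minimal sketch of that expansion follows; `wrapper_target` is a hypothetical helper written for illustration, not part of the conda package:

```shell
#!/bin/bash
# Hypothetical helper: print the path the wrapper would exec
# for a given CUDA_HOME value (mirrors "${CUDA_HOME}/bin/nvcc").
wrapper_target() {
  local CUDA_HOME="$1"
  printf '%s\n' "${CUDA_HOME}/bin/nvcc"
}

wrapper_target ""                      # empty CUDA_HOME, as in the failing environment
wrapper_target "/usr/local/cuda-11.8"  # example path for a system install (assumed)
```

With an empty value the first call prints `/bin/nvcc`, which matches the "No such file or directory" message.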

@pentschev
Member

IIRC, with CUDA 11.x CUDA_HOME is redefined during conda activate. Can you check if deactivating and reactivating your environment changes the behavior?

@pentschev
Member

By "redefined" I mean it should be redefined to $CONDA_PREFIX.

@charlesbluca
Member Author

charlesbluca commented Oct 7, 2024

Ah thanks for that tip - this highlights what I assume is an underlying issue here, in that we aren't able to locate the CUDA_HOME during environment activation:

→ conda activate ucxx-cuda-118
Cannot determine CUDA_HOME: cuda-gdb not in PATH

This warning only starts appearing once nvcc_linux-64 is installed in the environment.

@pentschev
Member

I had this discussion with @robertmaynard in the past, his answer was:

The conda nvcc script uses cuda-gdb to determine the CUDA install location if CUDA_HOME hasn't been explicitly set beforehand, so if the machine doesn't have cuda-gdb, the conda activation scripts will fail to set up CUDA_HOME, and you will need to set it manually.

So yeah, I think you need a system install of CTK for CUDA 11.x to be able to compile.
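As a rough illustration of the behavior described in the quote, the sketch below derives a CUDA root from wherever `cuda-gdb` is found and falls back to a message when it is missing. This is not the actual conda activation script. `guess_cuda_home` is a hypothetical name, and the assumption that `cuda-gdb` lives in `<root>/bin` comes from the usual toolkit layout:

```shell
#!/bin/bash
# Hypothetical helper: guess the CUDA root from cuda-gdb's location.
# Assumes cuda-gdb is installed as <root>/bin/cuda-gdb.
guess_cuda_home() {
  local gdb
  gdb="$(command -v cuda-gdb)" || return 1
  # Strip "/bin/cuda-gdb" by taking the parent of the parent directory.
  dirname "$(dirname "$gdb")"
}

# Only try to detect a value when CUDA_HOME was not set explicitly.
if [ -z "${CUDA_HOME:-}" ]; then
  if detected="$(guess_cuda_home)"; then
    export CUDA_HOME="$detected"
  else
    echo "cuda-gdb not in PATH; set CUDA_HOME manually before activating" >&2
  fi
fi
```

Exporting `CUDA_HOME` yourself before `conda activate` skips the detection entirely, which is the manual step the quote refers to.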

@charlesbluca
Member Author

charlesbluca commented Oct 7, 2024

Thanks @pentschev, installed CTK 12.5 on my system (seemingly the oldest version available for ubuntu24.04 right now), and that unblocked builds.

Moving forward, can or should we explicitly encode a CTK dependency similar to what RMM is doing in its CMakeLists.txt?

https://github.com/rapidsai/rmm/blob/c494395e58288cac16321ce90e9b15f3508ae89a/CMakeLists.txt#L62-L65

Or is that too brittle a solution, and would it make more sense to simply document that 11.x builds need a system installation of CTK?

@charlesbluca
Member Author

Also worth noting that it seems like there's more required than just proper setting of CUDA_HOME here, as even manually setting it to the CONDA_PREFIX above that I can see contains libcudart seems to raise the same failures

@robertmaynard
Contributor

Also worth noting that it seems like there's more required than just proper setting of CUDA_HOME here, as even manually setting it to the CONDA_PREFIX above that I can see contains libcudart seems to raise the same failures

Can you try setting the environment variable CUDA_PATH? That is what CMake uses (not CUDA_HOME).

@charlesbluca
Member Author

Thanks for the tip - looks like that's still failing. For reference the command I'm working with:

$ CUDA_PATH=/home/charlesb/miniforge3/envs/ucxx-cuda-118 ./build.sh
-- The C compiler identification is GNU 13.3.0
-- The CXX compiler identification is GNU 13.3.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /home/charlesb/miniforge3/envs/ucxx-cuda-118/bin/x86_64-conda-linux-gnu-cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /home/charlesb/miniforge3/envs/ucxx-cuda-118/bin/x86_64-conda-linux-gnu-c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Check if compiler accepts -pthread
-- Check if compiler accepts -pthread - yes
-- Found Threads: TRUE
-- CPM: Using local package [email protected]
-- Configuring done (2.1s)
CMake Error at /home/charlesb/miniforge3/envs/ucxx-cuda-118/lib/cmake/rmm/rmm-targets.cmake:61 (set_target_properties):
  The link interface of target "rmm::rmm" contains:

    CUDA::cudart

  but the target was not found.  Possible reasons include:

    * There is a typo in the target name.
    * A find_package call is missing for an IMPORTED target.
    * An ALIAS target is missing.

Call Stack (most recent call first):
  /home/charlesb/miniforge3/envs/ucxx-cuda-118/lib/cmake/rmm/rmm-config.cmake:75 (include)
  build/cmake/CPM_0.40.0.cmake:249 (find_package)
  build/cmake/CPM_0.40.0.cmake:303 (cpm_find_package)
  build/_deps/rapids-cmake-src/rapids-cmake/cpm/find.cmake:189 (CPMFindPackage)
  build/_deps/rapids-cmake-src/rapids-cmake/cpm/rmm.cmake:75 (rapids_cpm_find)
  cmake/thirdparty/get_rmm.cmake:20 (rapids_cpm_rmm)
  cmake/thirdparty/get_rmm.cmake:24 (find_and_configure_rmm)
  CMakeLists.txt:112 (include)


-- Generating done (0.0s)
CMake Generate step failed.  Build files cannot be regenerated correctly.

Output of conda list:

# packages in environment at /home/charlesb/miniforge3/envs/ucxx-cuda-118:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       2_gnu    conda-forge
attrs                     24.2.0             pyh71513ae_0    conda-forge
autoconf                  2.71            pl5321h2b4cb7a_1    conda-forge
automake                  1.17            pl5321ha770c72_0    conda-forge
aws-c-auth                0.7.31               h57bd9a3_0    conda-forge
aws-c-cal                 0.7.4                hfd43aa1_1    conda-forge
aws-c-common              0.9.28               hb9d3cd8_0    conda-forge
aws-c-compression         0.2.19               h756ea98_1    conda-forge
aws-c-event-stream        0.4.3                h29ce20c_2    conda-forge
aws-c-http                0.8.10               h5e77a74_0    conda-forge
aws-c-io                  0.14.18             h4e6ae90_11    conda-forge
aws-c-mqtt                0.10.6               h02abb05_0    conda-forge
aws-c-s3                  0.6.6                h834ce55_0    conda-forge
aws-c-sdkutils            0.1.19               h756ea98_3    conda-forge
aws-checksums             0.1.20               h756ea98_0    conda-forge
aws-crt-cpp               0.28.3               h469002c_5    conda-forge
aws-sdk-cpp               1.11.407             h9f1560d_0    conda-forge
azure-core-cpp            1.13.0               h935415a_0    conda-forge
azure-identity-cpp        1.9.0                hd126650_0    conda-forge
azure-storage-blobs-cpp   12.13.0              h1d30c4a_0    conda-forge
azure-storage-common-cpp  12.8.0               ha3822c6_0    conda-forge
azure-storage-files-datalake-cpp 12.12.0              h0f25b8a_0    conda-forge
binutils                  2.43                 h4852527_1    conda-forge
binutils_impl_linux-64    2.43                 h4bf12b8_1    conda-forge
binutils_linux-64         2.43                 h4852527_1    conda-forge
bokeh                     3.5.2              pyhd8ed1ab_0    conda-forge
brotli-python             1.1.0           py312h2ec8cdc_2    conda-forge
bzip2                     1.0.8                h4bc722e_7    conda-forge
c-ares                    1.33.1               heb4867d_0    conda-forge
c-compiler                1.8.0                h2b85faf_0    conda-forge
ca-certificates           2024.8.30            hbcca054_0    conda-forge
cachetools                5.5.0              pyhd8ed1ab_0    conda-forge
cffi                      1.17.1          py312h06ac9bb_0    conda-forge
cfgv                      3.3.1              pyhd8ed1ab_0    conda-forge
click                     8.1.7           unix_pyh707e725_0    conda-forge
cloudpickle               3.0.0              pyhd8ed1ab_0    conda-forge
cmake                     3.30.4               hf9cb763_0    conda-forge
colorama                  0.4.6              pyhd8ed1ab_0    conda-forge
contourpy                 1.3.0           py312h68727a3_2    conda-forge
cubinlinker               0.3.0           py312hbe86355_1    rapidsai
cuda-python               11.8.3          py312h32b3722_2    conda-forge
cuda-version              11.8                 h70ddcb2_3    conda-forge
cudatoolkit               11.8.0              h4ba93d1_13    conda-forge
cudf                      24.12.00a150    cuda11_py312_241007_gfcff2b6ef7_150    rapidsai-nightly
cupy                      13.3.0          py312h8e83189_0    conda-forge
cupy-core                 13.3.0          py312h53955ab_0    conda-forge
cxx-compiler              1.8.0                h1a2810e_0    conda-forge
cython                    3.0.11          py312h8fd2918_3    conda-forge
cytoolz                   1.0.0           py312h66e93f0_0    conda-forge
dask                      2024.9.0           pyhd8ed1ab_0    conda-forge
dask-core                 2024.9.0           pyhd8ed1ab_0    conda-forge
dask-cuda                 24.12.00a2      py312_241007_gfe16796_2    rapidsai-nightly
dask-cudf                 24.12.00a150    cuda11_py312_241007_gfcff2b6ef7_150    rapidsai-nightly
dask-expr                 1.1.14             pyhd8ed1ab_0    conda-forge
distlib                   0.3.8              pyhd8ed1ab_0    conda-forge
distributed               2024.9.0           pyhd8ed1ab_0    conda-forge
dlpack                    0.8                  h59595ed_3    conda-forge
doxygen                   1.9.1                hb166930_1    conda-forge
exceptiongroup            1.2.2              pyhd8ed1ab_0    conda-forge
fastrlock                 0.8.2           py312h30efb56_2    conda-forge
filelock                  3.16.1             pyhd8ed1ab_0    conda-forge
fmt                       11.0.2               h434a139_0    conda-forge
freetype                  2.12.1               h267a509_2    conda-forge
fsspec                    2024.9.0           pyhff2d567_0    conda-forge
gcc                       13.3.0               h9576a4e_1    conda-forge
gcc_impl_linux-64         13.3.0               hfea6d02_1    conda-forge
gcc_linux-64              13.3.0               hc28eda2_4    conda-forge
gflags                    2.2.2             h5888daf_1005    conda-forge
glog                      0.7.1                hbabe93e_0    conda-forge
gxx                       13.3.0               h9576a4e_1    conda-forge
gxx_impl_linux-64         13.3.0               hdbfa832_1    conda-forge
gxx_linux-64              13.3.0               h6834431_4    conda-forge
h2                        4.1.0              pyhd8ed1ab_0    conda-forge
hpack                     4.0.0              pyh9f0ad1d_0    conda-forge
hyperframe                6.0.1              pyhd8ed1ab_0    conda-forge
icu                       75.1                 he02047a_0    conda-forge
identify                  2.6.1              pyhd8ed1ab_0    conda-forge
importlib-metadata        8.5.0              pyha770c72_0    conda-forge
importlib-resources       6.4.5              pyhd8ed1ab_0    conda-forge
importlib_metadata        8.5.0                hd8ed1ab_0    conda-forge
importlib_resources       6.4.5              pyhd8ed1ab_0    conda-forge
iniconfig                 2.0.0              pyhd8ed1ab_0    conda-forge
jinja2                    3.1.4              pyhd8ed1ab_0    conda-forge
jsonschema                4.23.0             pyhd8ed1ab_0    conda-forge
jsonschema-specifications 2023.12.1          pyhd8ed1ab_0    conda-forge
kernel-headers_linux-64   3.10.0              he073ed8_17    conda-forge
keyutils                  1.6.1                h166bdaf_0    conda-forge
krb5                      1.21.3               h659f571_0    conda-forge
lcms2                     2.16                 hb7c19ff_0    conda-forge
ld_impl_linux-64          2.43                 h712a8e2_1    conda-forge
lerc                      4.0.0                h27087fc_0    conda-forge
libabseil                 20240722.0      cxx17_h5888daf_1    conda-forge
libarrow                  17.0.0          h364f349_19_cpu    conda-forge
libarrow-acero            17.0.0          h5888daf_19_cpu    conda-forge
libarrow-dataset          17.0.0          h5888daf_19_cpu    conda-forge
libarrow-substrait        17.0.0          he882d9a_19_cpu    conda-forge
libblas                   3.9.0           24_linux64_openblas    conda-forge
libbrotlicommon           1.1.0                hb9d3cd8_2    conda-forge
libbrotlidec              1.1.0                hb9d3cd8_2    conda-forge
libbrotlienc              1.1.0                hb9d3cd8_2    conda-forge
libcblas                  3.9.0           24_linux64_openblas    conda-forge
libcrc32c                 1.1.2                h9c3ff4c_0    conda-forge
libcudf                   24.12.00a150    cuda11_241007_gfcff2b6ef7_150    rapidsai-nightly
libcufile                 1.4.0.31                      0    nvidia
libcufile-dev             1.4.0.31                      0    nvidia
libcurl                   8.10.1               hbbe4b11_0    conda-forge
libdeflate                1.22                 hb9d3cd8_0    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 hd590300_2    conda-forge
libevent                  2.1.12               hf998b51_1    conda-forge
libexpat                  2.6.3                h5888daf_0    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc                    14.1.0               h77fa898_1    conda-forge
libgcc-devel_linux-64     13.3.0             h84ea5a7_101    conda-forge
libgcc-ng                 14.1.0               h69a702a_1    conda-forge
libgfortran               14.1.0               h69a702a_1    conda-forge
libgfortran-ng            14.1.0               h69a702a_1    conda-forge
libgfortran5              14.1.0               hc5f4f2c_1    conda-forge
libgomp                   14.1.0               h77fa898_1    conda-forge
libgoogle-cloud           2.29.0               h438788a_1    conda-forge
libgoogle-cloud-storage   2.29.0               h0121fbd_1    conda-forge
libgrpc                   1.65.5               hf5c653b_0    conda-forge
libiconv                  1.17                 hd590300_2    conda-forge
libjpeg-turbo             3.0.0                hd590300_1    conda-forge
libkvikio                 24.12.00a       cuda11_241007_ge64c363_20    rapidsai-nightly
liblapack                 3.9.0           24_linux64_openblas    conda-forge
libllvm14                 14.0.6               hcd5def8_4    conda-forge
libnghttp2                1.58.0               h47da74e_1    conda-forge
libnl                     3.10.0               h4bc722e_0    conda-forge
libnsl                    2.0.1                hd590300_0    conda-forge
libopenblas               0.3.27          pthreads_hac2b453_1    conda-forge
libparquet                17.0.0          h6bd9018_19_cpu    conda-forge
libpng                    1.6.44               hadc24fc_0    conda-forge
libprotobuf               5.27.5               h5b01275_2    conda-forge
libre2-11                 2023.09.01           hbbce691_3    conda-forge
librmm                    24.12.00a9      cuda11_241007_gc494395e_9    rapidsai-nightly
libsanitizer              13.3.0               heb74ff8_1    conda-forge
libsqlite                 3.46.1               hadc24fc_0    conda-forge
libssh2                   1.11.0               h0841786_0    conda-forge
libstdcxx                 14.1.0               hc0a3c3a_1    conda-forge
libstdcxx-devel_linux-64  13.3.0             h84ea5a7_101    conda-forge
libstdcxx-ng              14.1.0               h4852527_1    conda-forge
libthrift                 0.21.0               h0e7cc3e_0    conda-forge
libtiff                   4.7.0                he137b08_1    conda-forge
libtool                   2.4.7                he02047a_1    conda-forge
libutf8proc               2.8.0                h166bdaf_0    conda-forge
libuuid                   2.38.1               h0b41bf4_0    conda-forge
libuv                     1.49.0               hb9d3cd8_0    conda-forge
libwebp-base              1.4.0                hd590300_0    conda-forge
libxcb                    1.17.0               h8a09558_0    conda-forge
libxcrypt                 4.4.36               hd590300_1    conda-forge
libxml2                   2.12.7               he7c6b58_4    conda-forge
libzlib                   1.3.1                hb9d3cd8_2    conda-forge
llvmlite                  0.43.0          py312h374181b_1    conda-forge
locket                    1.0.0              pyhd8ed1ab_0    conda-forge
lz4                       4.3.3           py312hb3f7f12_1    conda-forge
lz4-c                     1.9.4                hcb278e6_0    conda-forge
m4                        1.4.18            h516909a_1001    conda-forge
markdown-it-py            3.0.0              pyhd8ed1ab_0    conda-forge
markupsafe                2.1.5           py312h66e93f0_1    conda-forge
mdurl                     0.1.2              pyhd8ed1ab_0    conda-forge
msgpack-python            1.1.0           py312h68727a3_0    conda-forge
ncurses                   6.5                  he02047a_1    conda-forge
ninja                     1.12.1               h297d8ca_0    conda-forge
nodeenv                   1.9.1              pyhd8ed1ab_0    conda-forge
numba                     0.60.0          py312h83e6fd3_0    conda-forge
numba-cuda                0.0.15             pyh267e887_1    conda-forge
numpy                     2.0.2           py312h58c1407_0    conda-forge
nvcomp                    4.0.1                hee583db_0    conda-forge
nvtx                      0.2.10          py312h66e93f0_2    conda-forge
openjpeg                  2.5.2                h488ebb8_0    conda-forge
openssl                   3.3.2                hb9d3cd8_0    conda-forge
orc                       2.0.2                h690cf93_1    conda-forge
packaging                 24.1               pyhd8ed1ab_0    conda-forge
pandas                    2.2.3           py312hf9745cd_1    conda-forge
partd                     1.4.2              pyhd8ed1ab_0    conda-forge
pathspec                  0.12.1             pyhd8ed1ab_0    conda-forge
perl                      5.32.1          7_hd590300_perl5    conda-forge
pillow                    10.4.0          py312h56024de_1    conda-forge
pip                       24.2               pyh8b19718_1    conda-forge
pkg-config                0.29.2            h4bc722e_1009    conda-forge
pkgutil-resolve-name      1.3.10             pyhd8ed1ab_1    conda-forge
platformdirs              4.3.6              pyhd8ed1ab_0    conda-forge
pluggy                    1.5.0              pyhd8ed1ab_0    conda-forge
pre-commit                4.0.0              pyha770c72_0    conda-forge
psutil                    6.0.0           py312h66e93f0_1    conda-forge
pthread-stubs             0.4               hb9d3cd8_1002    conda-forge
ptxcompiler               0.8.1           py312h32b3722_4    conda-forge
pyarrow                   17.0.0          py312h9cebb41_1    conda-forge
pyarrow-core              17.0.0          py312h9cafe31_1_cpu    conda-forge
pyarrow-hotfix            0.6                pyhd8ed1ab_0    conda-forge
pycparser                 2.22               pyhd8ed1ab_0    conda-forge
pygments                  2.18.0             pyhd8ed1ab_0    conda-forge
pylibcudf                 24.12.00a150    cuda11_py312_241007_gfcff2b6ef7_150    rapidsai-nightly
pynvml                    11.4.1             pyhd8ed1ab_0    conda-forge
pysocks                   1.7.1              pyha2e5f31_6    conda-forge
pytest                    7.4.4              pyhd8ed1ab_0    conda-forge
pytest-asyncio            0.23.8             pyhd8ed1ab_0    conda-forge
pytest-rerunfailures      14.0               pyhd8ed1ab_0    conda-forge
python                    3.12.7          hc5c86c4_0_cpython    conda-forge
python-dateutil           2.9.0              pyhd8ed1ab_0    conda-forge
python-tzdata             2024.2             pyhd8ed1ab_0    conda-forge
python_abi                3.12                    5_cp312    conda-forge
pytz                      2024.1             pyhd8ed1ab_0    conda-forge
pyyaml                    6.0.2           py312h66e93f0_1    conda-forge
rapids-build-backend      0.3.2                      py_0    rapidsai
rapids-dask-dependency    24.12.00a6                 py_0    rapidsai-nightly
rapids-dependency-file-generator 1.15.0                     py_0    rapidsai
rdma-core                 54.0                 h5888daf_0    conda-forge
re2                       2023.09.01           h77b4e00_3    conda-forge
readline                  8.2                  h8228510_1    conda-forge
referencing               0.35.1             pyhd8ed1ab_0    conda-forge
rhash                     1.4.4                hd590300_0    conda-forge
rich                      13.9.2             pyhd8ed1ab_0    conda-forge
rmm                       24.12.00a9      cuda11_py312_241007_gc494395e_9    rapidsai-nightly
rpds-py                   0.20.0          py312h12e396e_1    conda-forge
s2n                       1.5.4                h1380c3d_0    conda-forge
scikit-build-core         0.10.7             pyh4afc917_0    conda-forge
setuptools                75.1.0             pyhd8ed1ab_0    conda-forge
six                       1.16.0             pyh6c4a22f_0    conda-forge
snappy                    1.2.1                ha2e4443_0    conda-forge
sortedcontainers          2.4.0              pyhd8ed1ab_0    conda-forge
spdlog                    1.14.1               hed91bc2_1    conda-forge
sysroot_linux-64          2.17                h4a8ded7_17    conda-forge
tblib                     3.0.0              pyhd8ed1ab_0    conda-forge
tk                        8.6.13          noxft_h4845f30_101    conda-forge
tomli                     2.0.2              pyhd8ed1ab_0    conda-forge
tomlkit                   0.13.2             pyha770c72_0    conda-forge
toolz                     1.0.0              pyhd8ed1ab_0    conda-forge
tornado                   6.4.1           py312h66e93f0_1    conda-forge
typing-extensions         4.12.2               hd8ed1ab_0    conda-forge
typing_extensions         4.12.2             pyha770c72_0    conda-forge
tzdata                    2024b                hc8b5060_0    conda-forge
ucx                       1.17.0               h0104b51_3    conda-forge
ukkonen                   1.0.1           py312h68727a3_5    conda-forge
urllib3                   2.2.3              pyhd8ed1ab_0    conda-forge
virtualenv                20.26.6            pyhd8ed1ab_0    conda-forge
wheel                     0.44.0             pyhd8ed1ab_0    conda-forge
xorg-libxau               1.0.11               hb9d3cd8_1    conda-forge
xorg-libxdmcp             1.1.5                hb9d3cd8_0    conda-forge
xyzservices               2024.9.0           pyhd8ed1ab_0    conda-forge
xz                        5.2.6                h166bdaf_0    conda-forge
yaml                      0.2.5                h7f98852_2    conda-forge
zict                      3.0.0              pyhd8ed1ab_0    conda-forge
zipp                      3.20.2             pyhd8ed1ab_0    conda-forge
zstandard                 0.23.0          py312hef9b889_1    conda-forge
zstd                      1.5.6                ha6fb4c9_0    conda-forge

@robertmaynard
Contributor

I would need a full trace log from CMake to see what is exactly going wrong.

IIRC the command line would be:

CUDA_PATH=/home/charlesb/miniforge3/envs/ucxx-cuda-118 ./build.sh --cmake-args=\"--trace\" > log
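A small sketch for narrowing such a trace down to the relevant lines, assuming the output ended up in a file named `log` as in the command above. As far as I know CMake writes trace output to stderr, so the capture may need `2>&1` (or CMake's `--trace-redirect=<file>` option) to end up in the file. `filter_trace` is a hypothetical helper name:

```shell
#!/bin/bash
# Hypothetical helper: show only trace lines that mention
# FindCUDAToolkit or nvcc, prefixed with their line numbers.
filter_trace() {
  grep -nE 'FindCUDAToolkit|nvcc' "$1"
}

# Usage, assuming the trace was saved to a file named "log".
if [ -f log ]; then
  filter_trace log || echo "no matching trace lines found"
fi
```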

@charlesbluca
Member Author

Here's a log with CMake traces enabled:

failure.log

@robertmaynard
Contributor

Some clarification: the cuda-gdb detection logic is what conda uses to find a local install of CUDA 11.x.

CMake uses different logic for finding nvcc and, from there, locating the rest of the CUDA Toolkit libraries and headers. @charlesbluca In the trace you provided, FindCUDAToolkit is failing because it can't find nvcc or the sentinel version files inside the CUDA Toolkit.

I think the primary issue is that CUDA_PATH needs to point not to your conda env but to the local install of the CUDA Toolkit, e.g. /usr/local/cuda-11.8/
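A rough pre-flight check along these lines can tell a usable toolkit root apart from a directory that only holds the conda wrapper. This is only a sketch and not CMake's actual search logic. `usable_cuda_root` is a hypothetical helper, and the candidate paths are examples:

```shell
#!/bin/bash
# Hypothetical helper: succeed only if <root>/bin/nvcc exists AND runs.
# Checking that the file exists is not enough, because the conda
# wrapper is present but fails when it cannot find the real nvcc.
usable_cuda_root() {
  local root="$1"
  [ -x "$root/bin/nvcc" ] && "$root/bin/nvcc" --version >/dev/null 2>&1
}

# Example candidates: the current CUDA_PATH, the conda env, a system install.
for candidate in "${CUDA_PATH:-}" "${CONDA_PREFIX:-}" /usr/local/cuda-11.8; do
  [ -n "$candidate" ] || continue
  if usable_cuda_root "$candidate"; then
    echo "usable: $candidate"
  else
    echo "not usable (nvcc missing or failing): $candidate"
  fi
done
```

If only the system path is reported as usable, that is the value to pass as `CUDA_PATH` when invoking `./build.sh`.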

@pentschev
Member

@charlesbluca do you think there's still anything we should do in UCXX for better UX?
