Build fails for `-DUSE_CUDA=1` #5785

jmakov · 2023-03-15T16:46:02Z

Description

#5089 is marked as resolved, but the build still fails when I try it in a RAPIDS Docker container:

#0 153.8 /usr/include/c++/11/bits/std_function.h:435:145: note:         '_ArgTypes'
#0 153.8 /usr/include/c++/11/bits/std_function.h:530:146: error: parameter packs not expanded with '...':
#0 153.8   530 |         operator=(_Functor&& __f)
#0 153.8       |                                                                                                                                                  ^ 
#0 153.8 /usr/include/c++/11/bits/std_function.h:530:146: note:         '_ArgTypes'
#0 154.6 make[2]: *** [CMakeFiles/lightgbm_objs.dir/build.make:734: CMakeFiles/lightgbm_objs.dir/src/treelearner/cuda/cuda_best_split_finder.cu.o] Error 1
#0 154.6 make[1]: *** [CMakeFiles/Makefile2:257: CMakeFiles/lightgbm_objs.dir/all] Error 2
#0 154.6 make: *** [Makefile:136: all] Error 2

Reproducible example

Environment info

LightGBM version or commit hash:

Command(s) you used to install LightGBM

mkdir /tmp/lib && cd /tmp/lib  \
    && git clone --recursive https://github.com/microsoft/LightGBM \
    && mkdir /tmp/lib/LightGBM/build && cd /tmp/lib/LightGBM/build \
    && cmake -DUSE_CUDA=1 .. && make -j \
    && pip uninstall -y lightgbm \
    && cd ../python-package/ && python setup.py install --precompile

Build in docker FROM rapidsai/rapidsai-core:23.02-cuda11.8-runtime-ubuntu22.04-py3.10
GCC 11.3

Additional Comments

shiyu1994 · 2023-03-16T05:36:58Z

@jmakov Is it possible to see more of the error message? For example, why does the compilation of cuda_best_split_finder.cu fail?

jmakov · 2023-03-16T13:04:02Z

@shiyu1994 there seems to be only 1 type of error:

/tmp/lib/LightGBM/include/LightGBM/utils/../../../external_libs/fmt/include/fmt/format-inl.h(85): here

/usr/include/c++/11/bits/std_function.h:435:145: error: parameter packs not expanded with '...':
  435 |         function(_Functor&& __f)
      |                                                                                                                                                 ^
/usr/include/c++/11/bits/std_function.h:435:145: note:         '_ArgTypes'
/usr/include/c++/11/bits/std_function.h:530:146: error: parameter packs not expanded with '...':
  530 |         operator=(_Functor&& __f)
      |                                                                                                                                                  ^
/usr/include/c++/11/bits/std_function.h:530:146: note:         '_ArgTypes'

whole log:
build_fail.txt

jmakov · 2023-03-31T14:47:24Z

This is kinda a blocker for me. Would be great to have some more insight into what can be done about it.

domtisdell · 2023-07-19T22:24:16Z

I think I've been having similar problems when trying to install v4.0. Builds were failing until I switched gcc (and g++ for good measure) to version 10 for compiling.

Found the solution in this reference: NVIDIA/nccl#650
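
For anyone wanting to try the same workaround without changing the system default compiler, one possible sketch follows. It assumes a Debian/Ubuntu-based image where `gcc-10` and `g++-10` can be installed via apt (those package names are an assumption, not something confirmed in this thread), and it relies on CMake's standard `CC`, `CXX`, and `CUDAHOSTCXX` environment variables. The helper name `cuda_configure_cmd` is made up for illustration; it only prints the configure command so it can be inspected before running.

```shell
# Hypothetical helper: print (not run) a configure command that pins the
# host compiler used by both CMake and nvcc. Defaults assume GCC 10.
cuda_configure_cmd() {
    host_cc="${1:-gcc-10}"
    host_cxx="${2:-g++-10}"
    printf 'CC=%s CXX=%s CUDAHOSTCXX=%s cmake -B build -S . -DUSE_CUDA=1\n' \
        "$host_cc" "$host_cxx" "$host_cxx"
}

# Show the command for the GCC 10 workaround described above.
cuda_configure_cmd

# To actually run it from the LightGBM source root (assumes gcc-10/g++-10 exist):
#   apt-get install -y gcc-10 g++-10
#   eval "$(cuda_configure_cmd)"
```

Printing the command first keeps the sketch safe to paste into any shell; whether GCC 10 is the right choice depends on which host compilers the installed CUDA toolkit supports.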

jameslamb · 2024-04-24T04:11:23Z

Sorry for the long delay in response. I believe recent changes in LightGBM have fixed this.

I was able to build latest LightGBM (1443548) in the latest stable rapidsai/base image.

(rapidsai/rapidsai-core images were removed as part of rapidsai/docker#539)

docker run \
    --rm \
    --user root \
    -it rapidsai/base:24.04-cuda12.0-py3.10 \
    bash

mkdir /tmp/lib
cd /tmp/lib 

# install build tools (rapidsai/base doesn't ship these)
apt-get update
apt-get install -y \
    build-essential \
    cmake \
    git

# build LightGBM
git clone --recursive https://github.com/microsoft/LightGBM

cd ./LightGBM
cmake -B build -S . -DUSE_CUDA=1
cmake --build build --target _lightgbm -j2
sh build-python.sh install --precompile

That built successfully for me.

Full logs:

Configure step:

-- The C compiler identification is GNU 11.4.0
-- The CXX compiler identification is GNU 11.4.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- The CUDA compiler identification is NVIDIA 12.0.76
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /opt/conda/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Found CUDA: /opt/conda/targets/sbsa-linux (found suitable version "12.0", minimum required is "11.0")
-- CMAKE_CUDA_FLAGS:  -Xcompiler=-fopenmp -Xcompiler=-fPIC -Xcompiler=-Wall -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_62,code=sm_62 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_87,code=sm_87 -gencode arch=compute_89,code=sm_89 -gencode arch=compute_90,code=sm_90 -gencode arch=compute_90,code=compute_90 -O3 -lineinfo
-- ALLFEATS_DEFINES: -DPOWER_FEATURE_WORKGROUPS=12;-DUSE_CONSTANT_BUF=0;-DENABLE_ALL_FEATURES
-- FULLDATA_DEFINES: -DPOWER_FEATURE_WORKGROUPS=12;-DUSE_CONSTANT_BUF=0;-DENABLE_ALL_FEATURES;-DIGNORE_INDICES
-- Performing Test MM_PREFETCH
-- Performing Test MM_PREFETCH - Failed
-- Performing Test MM_MALLOC
-- Performing Test MM_MALLOC - Failed
-- Configuring done
-- Generating done
-- Build files have been written to: /tmp/lib/LightGBM/build

Build step:

[  1%] Building CUDA object CMakeFiles/histo_16_64_256_sp.dir/src/treelearner/kernels/histogram_16_64_256.cu.o
[  2%] Building CUDA object CMakeFiles/histo_16_64_256-fulldata_sp.dir/src/treelearner/kernels/histogram_16_64_256.cu.o
[  2%] Built target histo_16_64_256-fulldata_sp
[  2%] Built target histo_16_64_256_sp
[  4%] Building CXX object CMakeFiles/lightgbm_capi_objs.dir/src/c_api.cpp.o
[  5%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/boosting/boosting.cpp.o
[  6%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/boosting/gbdt.cpp.o
[  8%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/boosting/gbdt_model_text.cpp.o
[  8%] Built target lightgbm_capi_objs
[  9%] Building CUDA object CMakeFiles/histo_16_64_256_sp_const.dir/src/treelearner/kernels/histogram_16_64_256.cu.o
[ 10%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/boosting/gbdt_prediction.cpp.o
[ 12%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/boosting/prediction_early_stop.cpp.o
[ 13%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/boosting/sample_strategy.cpp.o
[ 13%] Built target histo_16_64_256_sp_const
[ 14%] Building CUDA object CMakeFiles/histo_16_64_256-fulldata_sp_const.dir/src/treelearner/kernels/histogram_16_64_256.cu.o
[ 16%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/io/bin.cpp.o
[ 16%] Built target histo_16_64_256-fulldata_sp_const
[ 17%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/io/config.cpp.o
[ 18%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/io/config_auto.cpp.o
[ 20%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/io/dataset.cpp.o
[ 21%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/io/dataset_loader.cpp.o
[ 22%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/io/file_io.cpp.o
[ 24%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/io/json11.cpp.o
[ 25%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/io/metadata.cpp.o
[ 27%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/io/parser.cpp.o
[ 28%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/io/train_share_states.cpp.o
[ 29%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/io/tree.cpp.o
[ 31%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/metric/dcg_calculator.cpp.o
[ 32%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/metric/metric.cpp.o
[ 33%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/network/linker_topo.cpp.o
[ 35%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/network/linkers_mpi.cpp.o
[ 36%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/network/linkers_socket.cpp.o
[ 37%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/network/network.cpp.o
[ 39%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/objective/objective_function.cpp.o
[ 40%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/treelearner/data_parallel_tree_learner.cpp.o
[ 41%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/treelearner/feature_histogram.cpp.o
[ 43%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/treelearner/feature_parallel_tree_learner.cpp.o
[ 44%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/treelearner/gpu_tree_learner.cpp.o
[ 45%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/treelearner/gradient_discretizer.cpp.o
[ 47%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/treelearner/linear_tree_learner.cpp.o
In file included from /tmp/lib/LightGBM/external_libs/eigen/Eigen/Core:214,
                 from /tmp/lib/LightGBM/external_libs/eigen/Eigen/Dense:1,
                 from /tmp/lib/LightGBM/src/treelearner/linear_tree_learner.cpp:7:
/tmp/lib/LightGBM/external_libs/eigen/Eigen/src/Core/arch/NEON/PacketMath.h: In function 'Packet Eigen::internal::pload(const typename Eigen::internal::unpacket_traits<T>::type*) [with Packet = Eigen::internal::eigen_packet_wrapper<int, 2>; typename Eigen::internal::unpacket_traits<T>::type = signed char]':
/tmp/lib/LightGBM/external_libs/eigen/Eigen/src/Core/arch/NEON/PacketMath.h:1671:9: warning: 'void* memcpy(void*, const void*, size_t)' copying an object of non-trivial type 'Eigen::internal::Packet4c' {aka 'struct Eigen::internal::eigen_packet_wrapper<int, 2>'} from an array of 'const int8_t' {aka 'const signed char'} [-Wclass-memaccess]
 1671 |   memcpy(&res, from, sizeof(Packet4c));
      |   ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /tmp/lib/LightGBM/external_libs/eigen/Eigen/Core:172,
                 from /tmp/lib/LightGBM/external_libs/eigen/Eigen/Dense:1,
                 from /tmp/lib/LightGBM/src/treelearner/linear_tree_learner.cpp:7:
/tmp/lib/LightGBM/external_libs/eigen/Eigen/src/Core/GenericPacketMath.h:159:8: note: 'Eigen::internal::Packet4c' {aka 'struct Eigen::internal::eigen_packet_wrapper<int, 2>'} declared here
  159 | struct eigen_packet_wrapper
      |        ^~~~~~~~~~~~~~~~~~~~
In file included from /tmp/lib/LightGBM/external_libs/eigen/Eigen/Core:214,
                 from /tmp/lib/LightGBM/external_libs/eigen/Eigen/Dense:1,
                 from /tmp/lib/LightGBM/src/treelearner/linear_tree_learner.cpp:7:
/tmp/lib/LightGBM/external_libs/eigen/Eigen/src/Core/arch/NEON/PacketMath.h: In function 'Packet Eigen::internal::ploadu(const typename Eigen::internal::unpacket_traits<T>::type*) [with Packet = Eigen::internal::eigen_packet_wrapper<int, 2>; typename Eigen::internal::unpacket_traits<T>::type = signed char]':
/tmp/lib/LightGBM/external_libs/eigen/Eigen/src/Core/arch/NEON/PacketMath.h:1716:9: warning: 'void* memcpy(void*, const void*, size_t)' copying an object of non-trivial type 'Eigen::internal::Packet4c' {aka 'struct Eigen::internal::eigen_packet_wrapper<int, 2>'} from an array of 'const int8_t' {aka 'const signed char'} [-Wclass-memaccess]
 1716 |   memcpy(&res, from, sizeof(Packet4c));
      |   ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /tmp/lib/LightGBM/external_libs/eigen/Eigen/Core:172,
                 from /tmp/lib/LightGBM/external_libs/eigen/Eigen/Dense:1,
                 from /tmp/lib/LightGBM/src/treelearner/linear_tree_learner.cpp:7:
/tmp/lib/LightGBM/external_libs/eigen/Eigen/src/Core/GenericPacketMath.h:159:8: note: 'Eigen::internal::Packet4c' {aka 'struct Eigen::internal::eigen_packet_wrapper<int, 2>'} declared here
  159 | struct eigen_packet_wrapper
      |        ^~~~~~~~~~~~~~~~~~~~
[ 48%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/treelearner/serial_tree_learner.cpp.o
[ 50%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/treelearner/tree_learner.cpp.o
[ 51%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/treelearner/voting_parallel_tree_learner.cpp.o
[ 52%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/utils/openmp_wrapper.cpp.o
[ 54%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/boosting/cuda/cuda_score_updater.cpp.o
[ 55%] Building CUDA object CMakeFiles/lightgbm_objs.dir/src/boosting/cuda/cuda_score_updater.cu.o
[ 56%] Building CUDA object CMakeFiles/lightgbm_objs.dir/src/cuda/cuda_algorithms.cu.o
[ 58%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/cuda/cuda_utils.cpp.o
[ 59%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/io/cuda/cuda_column_data.cpp.o
[ 60%] Building CUDA object CMakeFiles/lightgbm_objs.dir/src/io/cuda/cuda_column_data.cu.o
[ 62%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/io/cuda/cuda_metadata.cpp.o
[ 63%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/io/cuda/cuda_row_data.cpp.o
[ 64%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/io/cuda/cuda_tree.cpp.o
[ 66%] Building CUDA object CMakeFiles/lightgbm_objs.dir/src/io/cuda/cuda_tree.cu.o
[ 67%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/metric/cuda/cuda_binary_metric.cpp.o
[ 68%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/metric/cuda/cuda_pointwise_metric.cpp.o
[ 70%] Building CUDA object CMakeFiles/lightgbm_objs.dir/src/metric/cuda/cuda_pointwise_metric.cu.o
[ 71%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/metric/cuda/cuda_regression_metric.cpp.o
[ 72%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/objective/cuda/cuda_binary_objective.cpp.o
[ 74%] Building CUDA object CMakeFiles/lightgbm_objs.dir/src/objective/cuda/cuda_binary_objective.cu.o
[ 75%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/objective/cuda/cuda_multiclass_objective.cpp.o
[ 77%] Building CUDA object CMakeFiles/lightgbm_objs.dir/src/objective/cuda/cuda_multiclass_objective.cu.o
[ 78%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/objective/cuda/cuda_rank_objective.cpp.o
[ 79%] Building CUDA object CMakeFiles/lightgbm_objs.dir/src/objective/cuda/cuda_rank_objective.cu.o
[ 81%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/objective/cuda/cuda_regression_objective.cpp.o
[ 82%] Building CUDA object CMakeFiles/lightgbm_objs.dir/src/objective/cuda/cuda_regression_objective.cu.o
[ 83%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/treelearner/cuda/cuda_best_split_finder.cpp.o
[ 85%] Building CUDA object CMakeFiles/lightgbm_objs.dir/src/treelearner/cuda/cuda_best_split_finder.cu.o
[ 86%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/treelearner/cuda/cuda_data_partition.cpp.o
[ 87%] Building CUDA object CMakeFiles/lightgbm_objs.dir/src/treelearner/cuda/cuda_data_partition.cu.o
[ 89%] Building CUDA object CMakeFiles/lightgbm_objs.dir/src/treelearner/cuda/cuda_gradient_discretizer.cu.o
[ 90%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/treelearner/cuda/cuda_histogram_constructor.cpp.o
[ 91%] Building CUDA object CMakeFiles/lightgbm_objs.dir/src/treelearner/cuda/cuda_histogram_constructor.cu.o
[ 93%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/treelearner/cuda/cuda_leaf_splits.cpp.o
[ 94%] Building CUDA object CMakeFiles/lightgbm_objs.dir/src/treelearner/cuda/cuda_leaf_splits.cu.o
[ 95%] Building CXX object CMakeFiles/lightgbm_objs.dir/src/treelearner/cuda/cuda_single_gpu_tree_learner.cpp.o
[ 97%] Building CUDA object CMakeFiles/lightgbm_objs.dir/src/treelearner/cuda/cuda_single_gpu_tree_learner.cu.o
[ 97%] Built target lightgbm_objs
[ 98%] Linking CUDA device code CMakeFiles/_lightgbm.dir/cmake_device_link.o
[100%] Linking CXX shared library ../lib_lightgbm.so
[100%] Built target _lightgb

Python build + install logs.

building lightgbm
Collecting build>=0.10.0
  Downloading build-1.2.1-py3-none-any.whl.metadata (4.3 kB)
Requirement already satisfied: packaging>=19.1 in /opt/conda/lib/python3.10/site-packages (from build>=0.10.0) (24.0)
Collecting pyproject_hooks (from build>=0.10.0)
  Downloading pyproject_hooks-1.0.0-py3-none-any.whl.metadata (1.3 kB)
Collecting tomli>=1.1.0 (from build>=0.10.0)
  Downloading tomli-2.0.1-py3-none-any.whl.metadata (8.9 kB)
Downloading build-1.2.1-py3-none-any.whl (21 kB)
Downloading tomli-2.0.1-py3-none-any.whl (12 kB)
Downloading pyproject_hooks-1.0.0-py3-none-any.whl (9.3 kB)
Installing collected packages: tomli, pyproject_hooks, build
Successfully installed build-1.2.1 pyproject_hooks-1.0.0 tomli-2.0.1
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
found pre-compiled lib_lightgbm.so
--- building sdist ---
* Creating isolated environment: venv+pip...
* Installing packages in isolated environment:
  - setuptools
* Getting build dependencies for sdist...
running egg_info
creating lightgbm.egg-info
writing lightgbm.egg-info/PKG-INFO
writing dependency_links to lightgbm.egg-info/dependency_links.txt
writing requirements to lightgbm.egg-info/requires.txt
writing top-level names to lightgbm.egg-info/top_level.txt
writing manifest file 'lightgbm.egg-info/SOURCES.txt'
reading manifest file 'lightgbm.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no files found matching '*.dll' under directory 'lightgbm'
warning: no files found matching '*.dylib' under directory 'lightgbm'
adding license file 'LICENSE'
writing manifest file 'lightgbm.egg-info/SOURCES.txt'
* Building sdist...
running sdist
running egg_info
writing lightgbm.egg-info/PKG-INFO
writing dependency_links to lightgbm.egg-info/dependency_links.txt
writing requirements to lightgbm.egg-info/requires.txt
writing top-level names to lightgbm.egg-info/top_level.txt
reading manifest file 'lightgbm.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no files found matching '*.dll' under directory 'lightgbm'
warning: no files found matching '*.dylib' under directory 'lightgbm'
adding license file 'LICENSE'
writing manifest file 'lightgbm.egg-info/SOURCES.txt'
running check
creating lightgbm-4.3.0.99
creating lightgbm-4.3.0.99/lightgbm
creating lightgbm-4.3.0.99/lightgbm.egg-info
creating lightgbm-4.3.0.99/lightgbm/lib
copying files to lightgbm-4.3.0.99...
copying LICENSE -> lightgbm-4.3.0.99
copying MANIFEST.in -> lightgbm-4.3.0.99
copying README.rst -> lightgbm-4.3.0.99
copying pyproject.toml -> lightgbm-4.3.0.99
copying setup.cfg -> lightgbm-4.3.0.99
copying lightgbm/__init__.py -> lightgbm-4.3.0.99/lightgbm
copying lightgbm/basic.py -> lightgbm-4.3.0.99/lightgbm
copying lightgbm/callback.py -> lightgbm-4.3.0.99/lightgbm
copying lightgbm/compat.py -> lightgbm-4.3.0.99/lightgbm
copying lightgbm/dask.py -> lightgbm-4.3.0.99/lightgbm
copying lightgbm/engine.py -> lightgbm-4.3.0.99/lightgbm
copying lightgbm/libpath.py -> lightgbm-4.3.0.99/lightgbm
copying lightgbm/plotting.py -> lightgbm-4.3.0.99/lightgbm
copying lightgbm/py.typed -> lightgbm-4.3.0.99/lightgbm
copying lightgbm/sklearn.py -> lightgbm-4.3.0.99/lightgbm
copying lightgbm.egg-info/PKG-INFO -> lightgbm-4.3.0.99/lightgbm.egg-info
copying lightgbm.egg-info/SOURCES.txt -> lightgbm-4.3.0.99/lightgbm.egg-info
copying lightgbm.egg-info/dependency_links.txt -> lightgbm-4.3.0.99/lightgbm.egg-info
copying lightgbm.egg-info/requires.txt -> lightgbm-4.3.0.99/lightgbm.egg-info
copying lightgbm.egg-info/top_level.txt -> lightgbm-4.3.0.99/lightgbm.egg-info
copying lightgbm/lib/lib_lightgbm.so -> lightgbm-4.3.0.99/lightgbm/lib
copying lightgbm.egg-info/SOURCES.txt -> lightgbm-4.3.0.99/lightgbm.egg-info
Writing lightgbm-4.3.0.99/setup.cfg
Creating tar archive
removing 'lightgbm-4.3.0.99' (and everything under it)
Successfully built lightgbm-4.3.0.99.tar.gz
--- installing lightgbm ---
WARNING: Skipping lightgbm as it is not installed.
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Looking in links: .
Processing ./lightgbm-4.3.0.99.tar.gz
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: numpy in /opt/conda/lib/python3.10/site-packages (from lightgbm) (1.26.4)
Requirement already satisfied: scipy in /opt/conda/lib/python3.10/site-packages (from lightgbm) (1.13.0)
Building wheels for collected packages: lightgbm
  Building wheel for lightgbm (pyproject.toml) ... done
  Created wheel for lightgbm: filename=lightgbm-4.3.0.99-py3-none-any.whl size=62203670 sha256=ea5fe085de440887522cfa4a3b9f9ee1b076bc93be325cd1a3f068471d73bdf8
  Stored in directory: /tmp/pip-ephem-wheel-cache-_be0h8ev/wheels/97/06/d4/842e2ab3fea42d639f11ba3250fbe19b540afb7108b58b2cfc
Successfully built lightgbm
Installing collected packages: lightgbm
Successfully installed lightgbm-4.3.0.99
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
cleaning up

github-actions · 2024-05-25T04:03:16Z

This issue has been automatically closed because it has been awaiting a response for too long. When you have time to work with the maintainers to resolve this issue, please post a new comment and it will be re-opened. If the issue has been locked for editing by the time you return to it, please open a new issue and reference this one. Thank you for taking the time to improve LightGBM!

fingoldo · 2024-07-09T09:53:26Z

I'm wondering whether it's possible to force the target GPU architecture somehow. Trying to reproduce your commands on an NVIDIA RTX 6000 Ada (SM 8.9) with CUDA 12.4 on Ubuntu 20.04.6 LTS leads to:

#$ "/usr/bin"/c++ -D__CUDA_ARCH__=300 -E -x c++
-DCUDA_DOUBLE_MATH_FUNCTIONS -D__CUDACC__ -D__NVCC__
-D__CUDACC_VER_MAJOR__=10 -D__CUDACC_VER_MINOR__=1
-D__CUDACC_VER_BUILD__=243 -include "cuda_runtime.h" -m64
"CMakeCUDACompilerId.cu" > "tmp/CMakeCUDACompilerId.cpp1.ii"

#$ cicc --c++14 --gnu_version=80400 --allow_managed -arch compute_30 -m64
-ftz=0 -prec_div=1 -prec_sqrt=1 -fmad=1 --include_file_name
"CMakeCUDACompilerId.fatbin.c" -tused -nvvmir-library
"/usr/lib/nvidia-cuda-toolkit/libdevice/libdevice.10.bc"
--gen_module_id_file --module_id_file_name
"tmp/CMakeCUDACompilerId.module_id" --orig_src_file_name
"CMakeCUDACompilerId.cu" --gen_c_file_name
"tmp/CMakeCUDACompilerId.cudafe1.c" --stub_file_name
"tmp/CMakeCUDACompilerId.cudafe1.stub.c" --gen_device_file_name
"tmp/CMakeCUDACompilerId.cudafe1.gpu" "tmp/CMakeCUDACompilerId.cpp1.ii" -o
"tmp/CMakeCUDACompilerId.ptx"

#$ ptxas -arch=sm_30 -m64 "tmp/CMakeCUDACompilerId.ptx" -o
"tmp/CMakeCUDACompilerId.sm_30.cubin"

ptxas fatal : Value 'sm_30' is not defined for option 'gpu-name'

fingoldo · 2024-07-09T10:25:25Z

Never mind. I had to remove nvidia-cuda-toolkit (which I had installed because it allowed the OpenCL version of LightGBM to work, only to find out it's buggy on big datasets and overall an abandoned branch).
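
The `-D__CUDACC_VER_MAJOR__=10` in the log above suggests CMake picked up an older `nvcc` (10.1, under the `nvidia-cuda-toolkit` path) rather than the CUDA 12.4 toolkit, while the successful configure log earlier in this thread reports a minimum required CUDA version of 11.0. A rough way to check which `nvcc` is first on PATH before configuring is sketched below. The function name `nvcc_release_major` is hypothetical, and it assumes `nvcc --version` prints a line containing "release X.Y".

```shell
# Hypothetical helper: read `nvcc --version` output on stdin and print the
# major number of the "release X.Y" field (first match only).
nvcc_release_major() {
    sed -n 's/.*release \([0-9][0-9]*\)\.[0-9][0-9]*.*/\1/p' | head -n 1
}

# Report the nvcc found on PATH, if any.
if command -v nvcc >/dev/null 2>&1; then
    major="$(nvcc --version | nvcc_release_major)"
    echo "nvcc at $(command -v nvcc), release major: ${major}"
    # 11 is the minimum reported by the configure log earlier in the thread.
    if [ "${major:-0}" -lt 11 ]; then
        echo "this nvcc looks older than CUDA 11.0" >&2
    fi
else
    echo "nvcc not found on PATH"
fi
```

If several toolkits are installed, CMake's standard `CMAKE_CUDA_COMPILER` variable (or the `CUDACXX` environment variable) can point the build at a specific `nvcc`; the right path depends on the machine.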

Currently stuck at

found pre-compiled lib_lightgbm.so
--- building sdist ---
build-python.sh: 347: python: not found

Why is it so hard to get lightgbm working with GPU? Catboost & Xgboost teams somehow managed to solve it with single "pip install" command ;-)

jameslamb · 2024-07-09T13:17:22Z

build-python.sh: 347: python: not found

You have to have Python installed and a python executable available on PATH to build LightGBM's Python package.
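
A quick way to confirm this before running `build-python.sh` is sketched below. The helper name `have_python` is hypothetical. On some systems only `python3` is installed, so the sketch checks both names; how best to provide a plain `python` (a virtual environment, a conda environment, or something else) depends on the setup.

```shell
# Hypothetical helper: succeed if the named executable is on PATH.
have_python() {
    command -v "${1:-python}" >/dev/null 2>&1
}

if have_python python; then
    echo "python found: $(command -v python)"
elif have_python python3; then
    # build-python.sh reported "python: not found", so python3 alone may
    # not be enough; an activated virtual environment provides `python`.
    echo "only python3 found; consider: python3 -m venv .venv && . .venv/bin/activate"
else
    echo "no Python interpreter found on PATH"
fi
```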

I strongly suspect that you aren't using the exact example I provided in #5785 (comment), but you haven't described your setup here so it's not possible to help much more.

Why is it so hard to get lightgbm working with GPU? Catboost & Xgboost teams somehow managed to solve it with single "pip install" command

We're doing the best we can with a much smaller amount of maintainer availability. Those projects both have multiple maintainers being paid to work on them full-time... LightGBM does not.

You're welcome to come contribute here any time.

fingoldo · 2024-07-09T13:26:30Z

Yeah, I know. Thanks a lot for your hard work, guys. I hope easier access to GPU training is on the roadmap. I'm not experienced in that myself; otherwise I would contribute for sure.

jameslamb added the bug label Jun 26, 2023

jameslamb mentioned this issue Jan 8, 2024

[cmake] [c++] require CMake 3.18+ #6260

Merged

jameslamb added the awaiting response label Apr 24, 2024

github-actions bot closed this as completed May 25, 2024
