[QST] Error building cudf #8617
Comments
Hi, thank you for reporting! Could you try disabling S3 support as described here?
Thanks @shwina, it looks like the compilation process now gets stuck after a few minutes:
`
home/eyal/ThirdParties/cudf/cpp/src/io/json/reader_impl.cu:499:54: required from here
/usr/include/c++/9/bits/stl_tree.h:2199:8: error: no matching function for call to ‘std::pair<std::_Rb_tree_node_base*,
2199 | return _Res(0, _M_rightmost());
/usr/include/c++/9/bits/stl_pair.h:434:1: note: candidate: ‘template<class ... _Args1, long unsigned int ..._Indexes1, class ...
434 | template<typename... _Args1, std::size_t... _Indexes1,
/usr/include/c++/9/bits/stl_tree.h:2208:8: error: no matching function for call to ‘std::pair<std::_Rb_tree_node_base*,
`
@shwina I'm also getting the following error message when I try to use conda:
conda install -c rapidsai -c nvidia -c numba -c conda-forge cudf=21.06 python=3.7 cudatoolkit=11.0
InvalidSpecError: Invalid spec: =1.2.2.5
We recommend creating a fresh environment, as installing into an existing environment can sometimes cause tricky dependency resolution problems: https://rapids.ai/start#get-rapids. For the community to best assist you with compiling (if that is your goal), please include all of your environment details as noted in the build guide.
Thanks @beckernick
We've also been seeing errors building cudf with CUDA 11.3. We're building cudf and all of its dependencies from source (not using conda at all, to avoid increased container sizes), using the DLFW containers. This build process worked fine on CUDA 11.2 with the DLFW 21.03 containers, but we're trying to update to the 21.06 DLFW version, which is on CUDA 11.3, and are seeing issues. There are at least three issues that I've hit so far:
This seems to be working in 21.08 though, and I can make 21.06 compile by changing those two lines to match 21.08:
edit: fix in 21.08 is #8525
Note: we first built RMM 21.06 with the patch from rapidsai/rmm#809 applied on top. Edit: filed a PR for the first error here: #8635
@benfred Thanks for the answer. So what is the suggested way to compile the code for 11.3 now? Any idea why gather.cu takes hours? Is there some sort of multi-template pattern hidden somewhere?
@robertmaynard tried to build with 11.3 and saw many of the same issues that @benfred described (long compile times on some files). I believe there were some compiler bugs in 11.3 that will be solved in 11.4.
Correct, the regressions in 11.3 (as outlined by @benfred above) have been resolved in 11.4.
Compiler issue 2, about 'std::move' with temporaries as referenced by @benfred, isn't fixed in CUDA 11.4, but it works with the latest fixes in the pull request mentioned.
I think it would be best to cap the supported CUDA version in the READMEs of the stable builds until either this fix is moved to the stable builds or the CUDA Toolkit fixes std::move with temporaries (in this case, mention that it doesn't support at least 11.3 and 11.4).
Closing this issue as cudf now requires CUDA Toolkit 11.5+ |
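For anyone landing here with an older toolkit, a quick way to compare the installed compiler against the 11.5 minimum mentioned above is to parse the output of `nvcc --version`. The following is only a minimal sketch: the helper names `parse_nvcc_release` and `meets_minimum` are made up for illustration, and it assumes the output contains the usual `release X.Y` text.

```python
import re
import shutil
import subprocess

# Minimum taken from the closing comment of this issue (CUDA Toolkit 11.5+).
MIN_CUDA = (11, 5)


def parse_nvcc_release(version_output):
    """Return (major, minor) found in `nvcc --version` text, or None."""
    match = re.search(r"release\s+(\d+)\.(\d+)", version_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version_output, minimum=MIN_CUDA):
    """True when a release number is found and is at least `minimum`."""
    release = parse_nvcc_release(version_output)
    return release is not None and release >= minimum


if __name__ == "__main__":
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        print("nvcc not found on PATH")
    else:
        output = subprocess.run(
            [nvcc, "--version"], capture_output=True, text=True
        ).stdout
        status = "ok" if meets_minimum(output) else "older than 11.5"
        print(parse_nvcc_release(output), status)
```

Tuples are compared element by element, so a 12.x toolkit also passes the check.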
This PR is a breaking change that disables Arrow S3 support by default. Enabling this feature by default has caused build issues for many downstream consumers, all of whom (to my knowledge) manually disable support for this feature. Most commonly, that build error appears as `fatal error: aws/core/Aws.h: No such file or directory`. In my understanding, several downstream consumers of cudf no longer rely on Arrow S3 support from this library and instead get S3 access via fsspec. I am not aware of any users of libcudf who rely on this being enabled by default (or enabled at all).
See related issues and discussions: #8617, #11333, #8867, #10644 (comment), NVIDIA/spark-rapids#2827. Build errors caused by this default behavior have also been reported internally.
cc: @rjzamora @beckernick @jdye64 @randerzander @robertmaynard @jlowe @quasiben if you have comments following our previous discussion.
Authors:
- Bradley Dice (https://github.com/bdice)
Approvers:
- Nghia Truong (https://github.com/ttnghia)
- GALI PREM SAGAR (https://github.com/galipremsagar)
- Vyas Ramasubramani (https://github.com/vyasr)
- AJ Schmidt (https://github.com/ajschmidt8)
URL: #11470
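Since the PR description names one specific failure signature, a saved build log can be checked for it mechanically. This is a rough sketch, not part of cudf: `KNOWN_SIGNATURES` and `diagnose` are hypothetical names, and the CMake option `CUDF_ENABLE_ARROW_S3` mentioned in the hint is an assumption that should be confirmed against `cpp/CMakeLists.txt` for the release being built.

```python
# Map of log substrings to hints. Only the signature quoted in the PR
# description is listed; the option name in the hint is an assumption.
KNOWN_SIGNATURES = {
    "aws/core/Aws.h: No such file or directory": (
        "Arrow S3 support appears to be enabled but the AWS SDK headers are "
        "missing. Either install the AWS SDK or disable S3 support "
        "(assumed option name: -DCUDF_ENABLE_ARROW_S3=OFF)."
    ),
}


def diagnose(log_text):
    """Return the hints whose signature occurs anywhere in the build log."""
    return [hint for sig, hint in KNOWN_SIGNATURES.items() if sig in log_text]


if __name__ == "__main__":
    sample = (
        "s3fs.cc:38:10: fatal error: aws/core/Aws.h: "
        "No such file or directory"
    )
    for hint in diagnose(sample):
        print(hint)
```

A plain substring match is used on purpose, because compiler paths and line numbers in front of the message differ from machine to machine.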
I'm running ./build.sh libcudf and get the following error:
`
[ 33%] Building CXX object _deps/arrow-build/src/arrow/CMakeFiles/arrow_objlib.dir/filesystem/s3fs.cc.o
/home/eyal/ThirdParties/cudf/cpp/build/_deps/arrow-src/cpp/src/arrow/filesystem/s3fs.cc:38:10: fatal error: aws/core/Aws.h: No such file or directory
38 | #include <aws/core/Aws.h>
| ^~~~~~~~~~~~~~~~
compilation terminated.
make[2]: *** [_deps/arrow-build/src/arrow/CMakeFiles/arrow_objlib.dir/build.make:1885: _deps/arrow-build/src/arrow/CMakeFiles/arrow_objlib.dir/filesystem/s3fs.cc.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/Makefile2:1199: _deps/arrow-build/src/arrow/CMakeFiles/arrow_objlib.dir/all] Error 2
make: *** [Makefile:156: all] Error 2
`
cmake --version
cmake version 3.20.5
Seems like the cmake process is fine
`
-- Found OpenSSL Crypto Library: /usr/lib/x86_64-linux-gnu/libcrypto.so
-- Building with OpenSSL (Version: 1.1.1f) support
-- Found hdfs.h at: /home/eyal/ThirdParties/cudf/cpp/build/_deps/arrow-src/cpp/thirdparty/hadoop/include/hdfs.h
-- Found AWS SDK headers:
-- Found AWS SDK libraries:
-- All bundled static libraries:
`
Any idea?