[REVIEW] Upgrade arrow & pyarrow to 6.0.1 #9686

Merged
46 commits merged into rapidsai:branch-22.04 on Feb 9, 2022

Conversation

galipremsagar
Contributor

@galipremsagar galipremsagar commented Nov 15, 2021

Resolves: #9645
This PR upgrades arrow & pyarrow to 6.0.1 from 5.0.0.

@galipremsagar galipremsagar added the improvement (Improvement / enhancement to an existing function) and breaking (Breaking change) labels Nov 15, 2021
@galipremsagar galipremsagar requested review from a team as code owners November 15, 2021 15:59
@galipremsagar galipremsagar changed the base branch from branch-21.12 to branch-22.02 November 15, 2021 15:59
@github-actions github-actions bot added the CMake (CMake build issue), conda, and libcudf (Affects libcudf C++/CUDA code) labels Nov 15, 2021
@galipremsagar galipremsagar added the 5 - DO NOT MERGE (Hold off on merging; see PR for details) label Nov 15, 2021
@galipremsagar galipremsagar self-assigned this Nov 15, 2021
@galipremsagar
Contributor Author

rerun tests

@beckernick
Member

Is there any risk to us using the constraint >= arrow/pyarrow 5 rather than requiring 6? This would likely make it much easier to interact with tools that are slower to upgrade.

@kkraus14
Collaborator

Is there any risk to us using the constraint >= arrow/pyarrow 5 rather than requiring 6? This would likely make it much easier to interact with tools that are slower to upgrade.

We can't do that. We link against the libarrow shared library in libcudf, and libarrow 5.0.0, for example, sets its SONAME to libarrow.so.500. This adds a NEEDED entry for libarrow.so.500 to libcudf. If libarrow 6.0.0 is installed instead, the dynamic linker will fail to find libarrow.so.500 and libcudf will fail to load.
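
To illustrate the SONAME point, here is a minimal sketch. It assumes libarrow derives its SONAME number as major * 100 + minor, which matches the two file names that appear in this thread (libarrow.so.500 for 5.0.0, libarrow.so.600.1.0 for 6.0.1) but is not confirmed here. The helper names arrow_soname and can_load are hypothetical.

```python
def arrow_soname(version):
    """Return the SONAME assumed for a given Arrow release.

    Assumption: the SONAME number is major * 100 + minor.
    """
    major, minor, _patch = (int(part) for part in version.split("."))
    return f"libarrow.so.{major * 100 + minor}"


def can_load(built_against, installed):
    """A NEEDED entry is matched by exact SONAME, not by a version range."""
    return arrow_soname(built_against) == arrow_soname(installed)


print(arrow_soname("5.0.0"))       # libarrow.so.500
print(can_load("5.0.0", "6.0.0"))  # False
print(can_load("6.0.0", "6.0.1"))  # True
```

Under that assumption, a ">= 5" constraint cannot work for a prebuilt libcudf, because the binary records one exact SONAME at link time.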

@galipremsagar
Contributor Author

rerun tests

1 similar comment
@galipremsagar
Contributor Author

rerun tests

@nirandaperera

nirandaperera commented Dec 6, 2021

Arrow has a new 6.0.1 bug-fix release. Could you please use that instead of 6.0.0? I believe it would be a trivial change.

Contributor

@robertmaynard robertmaynard left a comment

CMake changes LGTM

@kkraus14
Collaborator

kkraus14 commented Feb 8, 2022

Python / CMake / Conda changes all LGTM, thanks @galipremsagar!

Contributor

@rgsl888prabhu rgsl888prabhu left a comment

Looks good; you just need to update the year in the copyright notice.

@galipremsagar galipremsagar added the 5 - Ready to Merge (Testing and reviews complete, ready to merge) label Feb 8, 2022
@github-actions github-actions bot removed the gpuCI label Feb 8, 2022
@galipremsagar
Contributor Author

Thanks for the reviews, everyone! @robertmaynard @rgsl888prabhu @bdice

Thanks @kkraus14 for the help with runtime gxx & cxx libraries.
Thanks @Ethyling for publishing arm packages and @ajschmidt8 for helping with conda issues in CI.
Thanks @jakirkham for the help in resolving pyorc packaging issues.

@galipremsagar
Contributor Author

rerun tests

2 similar comments
@galipremsagar
Contributor Author

rerun tests

@jjacobelli
Contributor

rerun tests

@davidwendt
Contributor

@galipremsagar Not sure you are aware of the build error for QUANTILES_TEST

13:24:38 FAILED: gtests/QUANTILES_TEST 
13:24:38 : && /usr/local/gcc9/bin/g++ -O3 -DNDEBUG  tests/CMakeFiles/QUANTILES_TEST.dir/quantiles/percentile_approx_test.cu.o tests/CMakeFiles/QUANTILES_TEST.dir/quantiles/quantile_test.cpp.o tests/CMakeFiles/QUANTILES_TEST.dir/quantiles/quantiles_test.cpp.o -o gtests/QUANTILES_TEST -L/usr/local/cuda/targets/x86_64-linux/lib/stubs   -L/usr/local/cuda/targets/x86_64-linux/lib -Wl,-rpath,$PREFIX/lib:/opt/conda/envs/rapids/lib:$SRC_DIR/cpp/build:  libcudftestutil.a  /opt/conda/envs/rapids/lib/libgmock_main.so  /opt/conda/envs/rapids/lib/libgtest_main.so  libcudf.so  $PREFIX/lib/libarrow.so.600.1.0  $PREFIX/lib/libarrow_cuda.so.600.1.0  -ldl  $PREFIX/lib/libcudart.so  /usr/lib64/libcuda.so  /opt/conda/envs/rapids/lib/libgmock.so  /opt/conda/envs/rapids/lib/libgtest.so  -pthread  -lcudadevrt  -lcudart_static  -lrt  -lpthread  -ldl && :
13:24:38 tests/CMakeFiles/QUANTILES_TEST.dir/quantiles/percentile_approx_test.cu.o: In function `arrow_percentile_approx(cudf::column_view const&, int, std::vector<double, std::allocator<double> > const&)':
13:24:38 tmpxft_00002e7a_00000000-6_percentile_approx_test.compute_86.cudafe1.cpp:(.text+0x1284): undefined reference to `arrow::internal::TDigest::Quantile(double)'
13:24:38 tmpxft_00002e7a_00000000-6_percentile_approx_test.compute_86.cudafe1.cpp:(.text+0x1488): undefined reference to `arrow::internal::TDigest::MergeInput()'
13:24:38 collect2: error: ld returned 1 exit status

Looks like you will need to update this test code before the build will work?
I apologize if this is already known.
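
For triaging a log like the one above, a small sketch along these lines can pull out the distinct undefined symbols. It only assumes the "undefined reference to `...'" message format visible in the log; the function name undefined_symbols is hypothetical.

```python
import re

# Matches the symbol between the backtick and the closing quote printed by ld.
UNDEFINED_REF = re.compile(r"undefined reference to `([^']+)'")


def undefined_symbols(log_text):
    """Return the distinct undefined symbols in order of first appearance."""
    seen = {}
    for match in UNDEFINED_REF.finditer(log_text):
        seen.setdefault(match.group(1), None)
    return list(seen)


sample = (
    "x.cpp:(.text+0x1284): undefined reference to "
    "`arrow::internal::TDigest::Quantile(double)'\n"
    "x.cpp:(.text+0x1488): undefined reference to "
    "`arrow::internal::TDigest::MergeInput()'\n"
)
for symbol in undefined_symbols(sample):
    print(symbol)
```

Here both symbols belong to arrow::internal::TDigest, which points at the Arrow library seen by the linker rather than at cudf code.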

@galipremsagar
Contributor Author

We have seen this undefined-symbol issue, but we don't see the error locally. It appears to happen in CI when arrow is upgraded from 5.0 to 6.0 in place; if we force-remove arrow 5.0 and then install 6.0, the problem does not occur. So @Ethyling upgraded our build images to arrow 6.0 as well and reran the tests.
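
One way to spot an environment that still carries libraries from both Arrow releases is to scan the file names in its lib directory. The sketch below is an illustration only: the thread does not confirm that leftover 5.0 files were the exact cause, and the names arrow_so_versions and has_mixed_versions are hypothetical.

```python
import re
from collections import defaultdict

# Matches names such as libarrow.so.600.1.0 or libarrow_cuda.so.500
ARROW_LIB = re.compile(r"^(libarrow\w*)\.so\.(\d+)(?:\.\d+)*$")


def arrow_so_versions(filenames):
    """Map each Arrow library name to the set of SO versions present."""
    found = defaultdict(set)
    for name in filenames:
        match = ARROW_LIB.match(name)
        if match:
            found[match.group(1)].add(int(match.group(2)))
    return dict(found)


def has_mixed_versions(filenames):
    """True if any Arrow library appears with more than one SO version."""
    versions = arrow_so_versions(filenames).values()
    return any(len(v) > 1 for v in versions)


print(has_mixed_versions(["libarrow.so.500", "libarrow.so.600.1.0"]))
```

In practice the input could come from os.listdir on the environment's lib directory, for example the $PREFIX/lib path that appears in the build log.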

@galipremsagar
Contributor Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 19dc46f into rapidsai:branch-22.04 Feb 9, 2022
Labels
5 - Ready to Merge: Testing and reviews complete, ready to merge
breaking: Breaking change
CMake: CMake build issue
improvement: Improvement / enhancement to an existing function
libcudf: Affects libcudf (C++/CUDA) code.
Python: Affects Python cuDF API.
Development

Successfully merging this pull request may close these issues.

[FEA] Upgrade cudf to use arrow-6.0.0