[FEA] Upgrade to Arrow 14 #14370
Comments
This update is important because if an environment already contains a package that depends on (or solves with) pyarrow or libarrow version 14, it's not possible for the conda/mamba solver to downgrade from libarrow 14 to libarrow 13 due to the differences in package structure. We're starting to see this issue pop up across RAPIDS. Pinning this to Arrow 13 everywhere this problem shows up is not a good strategy because this should be solved within cudf's pinnings.
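As a rough illustration of the situation described above, the sketch below checks whether a Python environment already has `pyarrow` at a newer major version than a given pin. It is only a simplified model: the helper names `installed_major` and `conflicts_with_pin` are hypothetical, `importlib.metadata` only sees Python distributions such as `pyarrow` (not the conda `libarrow` packages), and a real conda/mamba solve considers far more than a major-version comparison.

```python
from importlib import metadata


def installed_major(dist_name):
    """Return the major version of an installed Python distribution, or None if absent."""
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None
    # Assumes a plain "MAJOR.MINOR.PATCH" style version string, as pyarrow uses.
    return int(version.split(".")[0])


def conflicts_with_pin(installed, pinned_major):
    """Simplified check: an installed major newer than the pin would need a downgrade."""
    return installed is not None and installed > pinned_major


if __name__ == "__main__":
    major = installed_major("pyarrow")
    print("pyarrow major version:", major)
    print("would need a downgrade for an Arrow 13 pin:", conflicts_with_pin(major, 13))
```

If `pyarrow` is not installed, `installed_major` returns `None` and no conflict is reported.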
#14371 updated cudf to use Arrow 14. The next step I would take before closing this is to identify which subpackages we rely on and only install those subpackages where needed, rather than `libarrow-all`. Based on this list: cudf/cpp/cmake/thirdparty/get_arrow.cmake, lines 181 to 182 in 723f0e4
I think we want
Just a note,
This is something we want to change, and which may well still happen in the 14 series. The first iteration only split the libraries, but getting the installation size of pyarrow down (where the whole enchilada isn't needed) has been something arrow wanted to do for a while, and a bunch of the necessary work already landed in 14.
Glad to hear that! Thanks Axel 🙏
This PR splits the libarrow build dependencies, rather than using `libarrow-all`. This implements the proposal in #14370 (comment) and closes #14370.
Authors:
- Bradley Dice (https://github.com/bdice)
Approvers:
- GALI PREM SAGAR (https://github.com/galipremsagar)
- Ray Douglass (https://github.com/raydouglass)
URL: #14506
This issue tracks the migration of Arrow to major version 14.

The conda-forge packaging of `libarrow` has been split up (conda-forge/arrow-cpp-feedstock#1201), resulting in this list of packages split from `libarrow`:

- `libarrow-all` is a metapackage which depends on all the following packages
- `libarrow`
- `libarrow-acero`
- `libarrow-dataset`
- `libarrow-flight`
- `libarrow-flight-sql`
- `libarrow-gandiva`
- `libarrow-substrait`
- `libparquet`

The `pyarrow` package has `run` dependencies on all of the above packages (excluding `libarrow-all`).

As a first attempt at migration, I think we can use `libarrow-all` in place of `libarrow` in our conda builds, and if that works, we can try just using the components that we need. My best guess is that this includes only `libarrow` and `libarrow-dataset`, and maybe `libparquet`.

cc: @galipremsagar
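
The two-step plan above (start with `libarrow-all`, then narrow to the needed components) can be sketched as a small helper that emits conda-style specs. This is only an illustration: the function name `arrow_specs` and the `name=14` pin format are assumptions made for the example, not cudf's actual pinnings.

```python
# Split packages listed in the issue; libarrow-all is the metapackage over them.
SPLIT_PACKAGES = (
    "libarrow",
    "libarrow-acero",
    "libarrow-dataset",
    "libarrow-flight",
    "libarrow-flight-sql",
    "libarrow-gandiva",
    "libarrow-substrait",
    "libparquet",
)


def arrow_specs(components=None, major=14):
    """Return conda-style specs (hypothetical pin format) for Arrow components.

    With no components given, fall back to the libarrow-all metapackage,
    which mirrors the "first attempt" described in the issue.
    """
    if components is None:
        return [f"libarrow-all={major}"]
    wanted = set(components)
    unknown = sorted(wanted - set(SPLIT_PACKAGES))
    if unknown:
        raise ValueError(f"not a split Arrow package: {unknown}")
    # Follow the order of SPLIT_PACKAGES so the output is deterministic.
    return [f"{name}={major}" for name in SPLIT_PACKAGES if name in wanted]


if __name__ == "__main__":
    print(arrow_specs())
    print(arrow_specs(["libparquet", "libarrow", "libarrow-dataset"]))
```

Calling `arrow_specs()` corresponds to the first migration step; passing the guessed component list corresponds to the narrower follow-up.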