Is it necessary to install cudatoolkit with pyarrow 11 on Linux? #962
Comments
ucx indeed got added for arrow 11 (on the feedstock, haven't looked at backporting this yet), though I agree that cudatoolkit is a bit heavier than expected. Are you using the CPU or CUDA-builds for arrow? I guess we could restrict it to the CUDA builds. That said, there's a larger theme here that arrow keeps growing non-trivial dependencies. I guess we could introduce a separate output for a "minimal" arrow ( CC @conda-forge/arrow-cpp |
I just ran into the same issue. This now gets installed for a normal CPU build (I was using a simple
Is it on the arrow-cpp side that you would not depend on UCX for CPU builds, or it is that It seems that ucx already has some logic around this: https://github.com/conda-forge/ucx-split-feedstock/blob/dd05d5af3a6c4902093ecff980564b6605633124/recipe/meta.yaml#L59-L60 (maybe I am interpreting it wrongly, but it seems it should already not depend on cudatoolkit if
That's indeed something we have to look at long term, but probably good for a separate issue? (it seems to me that regardless of that, cudatoolkit should never be a dependency for the CPU version?) |
Digging a little bit further, it seems that it is the older versions of ucx that depend on cudatoolkit; the more recent versions indeed avoid it:
But a plain pyarrow install is, for some reason, not getting the latest ucx version by default:
But also asking for the latest version still gives a ucx build with cudatoolkit:
Explicitly asking for the ucx build that doesn't depend on cudatoolkit:
So it seems there is something wrong with the last ucx build? (the cudatoolkit dependency got added again?) conda-forge/ucx-split-feedstock#114 is the PR that bumped the build number, but I don't directly see how that changed this dependency. |
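To compare ucx builds as described above, one option is to split the channel's package records by whether they list cudatoolkit as a dependency. The following is only a sketch: it assumes that `conda search <pkg> -c <channel> --json` returns a mapping from package name to a list of records with `version`, `build` and `depends` fields. The helper names and the sample records are made up for illustration and are not real ucx builds.

```python
import json
import subprocess


def split_by_cuda_dependency(records):
    """Split package records into (with_cudatoolkit, without_cudatoolkit) labels."""
    with_cuda, without_cuda = [], []
    for rec in records:
        # Each "depends" entry is a match spec such as "cudatoolkit >=11.2,<12";
        # the first whitespace-separated token is the package name.
        dep_names = {dep.split()[0] for dep in rec.get("depends", []) if dep.strip()}
        label = f'{rec["version"]}-{rec["build"]}'
        if "cudatoolkit" in dep_names:
            with_cuda.append(label)
        else:
            without_cuda.append(label)
    return with_cuda, without_cuda


def fetch_records(package="ucx", channel="conda-forge"):
    """Query the channel index (assumed JSON layout; needs conda and network access)."""
    out = subprocess.run(
        ["conda", "search", package, "-c", channel, "--json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out).get(package, [])


# Made-up records for illustration only; these are not real ucx builds.
sample = [
    {"version": "1.0", "build": "h_cuda_0",
     "depends": ["libgcc-ng >=12", "cudatoolkit >=11.2"]},
    {"version": "1.0", "build": "h_cpu_1", "depends": ["libgcc-ng >=12"]},
]
print(split_by_cuda_dependency(sample))
```

With a working conda installation, `split_by_cuda_dependency(fetch_records())` would apply the same split to the actual channel index.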
I opened conda-forge/ucx-split-feedstock#115 for the underlying issue with ucx (also installing ucx on its own has the same issue). Until that is solved, temporarily removing the |
Is the dependency issue here with
|
It's very possible that ongoing migrations (e.g. libabseil) play a role in solvability issues. Re2 should be finished though, as in: everything should have been rebuilt for the newest version |
Also newer versions of ucx have the issue, so the above might be the reason we currently get ucx 1.12. But also when forcing it to be 1.13, cudatoolkit still gets installed (see one of the outputs in #962 (comment)) |
I'm now at this point:
So I'm currently suspecting libgoogle-cloud? But I trust @h-vetinari's assessment way more than mine! |
This could be if any other package in the environment pins google-cloud-cpp, so it doesn't get recognised (and updated) by the migrator. Will have a look later. That's not it, arrow hasn't been rebuilt for the newest abseil yet either. |
FWIW, this issue is now gone for me with updated conda-forge builds no longer requiring ucx+cudatoolkit. |
Closing this issue now; let us know if there are problems like this again please (planning to reinstate ucx-support as soon as that feedstock has separated out the cudatoolkit dependency more cleanly) |
Comment:
When installing pyarrow 11 on Linux, cudatoolkit is installed as well. It is a pretty big dependency:
This is not downloaded with pyarrow=10.
When installing the environment I can see this is because of ucx:
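As a rough illustration of how such a transitive dependency can be traced, the sketch below runs a breadth-first search over a dependency mapping and reports the shortest chain from one package to another. The graph is hypothetical and heavily simplified: it only encodes the relationship described in this thread (arrow pulling in ucx, and ucx pulling in cudatoolkit), not real package metadata.

```python
from collections import deque


def dependency_chain(graph, root, target):
    """Return the shortest chain of packages from root to target, or None."""
    queue = deque([[root]])
    seen = {root}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for dep in graph.get(path[-1], []):
            if dep not in seen:
                seen.add(dep)
                queue.append(path + [dep])
    return None


# Hypothetical, simplified graph; real environments have many more edges.
graph = {
    "pyarrow": ["arrow-cpp", "numpy"],
    "arrow-cpp": ["ucx", "re2"],
    "ucx": ["cudatoolkit"],
}
print(" -> ".join(dependency_chain(graph, "pyarrow", "cudatoolkit")))
```

In practice the mapping would have to be built from the solver output or the channel metadata; the search itself stays the same.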