Limit runtime dependency to libarrow>=16.0.0,<16.1.0a0 #15782

Merged
raydouglass merged 4 commits into rapidsai:branch-24.06 from the limit-pyarrow-16.0 branch on May 20, 2024

Conversation

pentschev
Member

Fix the libarrow runtime dependency, which is currently broken due to the release of libarrow=16.1.0:

$ python -c "import cudf"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/cudf/__init__.py", line 9, in <module>
    _setup_numba()
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/cudf/utils/_numba.py", line 124, in _setup_numba
    _get_cc_60_ptx_file()
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/cudf/utils/_numba.py", line 16, in _get_cc_60_ptx_file
    from cudf._lib import strings_udf
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/cudf/_lib/__init__.py", line 4, in <module>
    from . import (
ImportError: libarrow.so.1600: cannot open shared object file: No such file or directory
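
To reproduce the failure without importing cudf, one can ask the dynamic loader directly whether the shared object named in the traceback can be opened. This is only a minimal sketch using the standard-library `ctypes` module; the helper name `can_load` is hypothetical and not part of cudf.

```python
import ctypes


def can_load(soname: str) -> bool:
    """Return True if the dynamic loader can open `soname`, else False.

    `can_load` is an illustrative helper, not a cudf or Arrow API.
    """
    try:
        ctypes.CDLL(soname)
    except OSError:
        # Raised when the library cannot be found or opened, which is the
        # same condition behind the ImportError in the traceback above.
        return False
    return True
```

In an affected environment, `can_load("libarrow.so.1600")` would be expected to return `False`, matching the `ImportError` shown in the traceback.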

@pentschev pentschev requested a review from a team as a code owner May 18, 2024 09:25
@pentschev pentschev requested a review from jameslamb May 18, 2024 09:25
@pentschev pentschev added bug Something isn't working 3 - Ready for Review Ready for review by team non-breaking Non-breaking change labels May 18, 2024
Contributor

@bdice bdice left a comment

Run-exports were recently fixed on conda-forge and should save us here. This was done less than 12 hours ago so a rebuild of nightlies across RAPIDS should fix it.

Fix for 16.1: conda-forge/arrow-cpp-feedstock#1409

Backport of this fix to 16.0: conda-forge/conda-forge-repodata-patches-feedstock@6155ba0

@pentschev
Member Author

That doesn’t seem to do the trick; the build is still picking up the latest release:

libarrow                  16.1.0           hbcc2d42_2_cpu    conda-forge
libarrow-acero            16.1.0           hac33072_2_cpu    conda-forge
libarrow-dataset          16.1.0           hac33072_2_cpu    conda-forge
libarrow-substrait        16.1.0           h7e0c224_2_cpu    conda-forge

still fails:

ImportError: libarrow.so.1600: cannot open shared object file: No such file or directory

The latest build is number 2, published 10h ago, whereas build 1 was published 13h ago, as seen in https://anaconda.org/conda-forge/libarrow/files .
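
A quick way to spot this kind of drift is to scan a `conda list` style listing for Arrow packages outside the 16.0 series. The sketch below is only an approximation: `parse_conda_list`, `in_16_0_series` and `offending` are hypothetical helpers, and the version check compares `major.minor` only rather than implementing conda's full version ordering for `>=16.0.0,<16.1.0a0`.

```python
from typing import NamedTuple

# Listing copied from the comment above.
SAMPLE = """\
libarrow                  16.1.0           hbcc2d42_2_cpu    conda-forge
libarrow-acero            16.1.0           hac33072_2_cpu    conda-forge
libarrow-dataset          16.1.0           hac33072_2_cpu    conda-forge
libarrow-substrait        16.1.0           h7e0c224_2_cpu    conda-forge
"""


class CondaRow(NamedTuple):
    name: str
    version: str
    build: str
    channel: str


def parse_conda_list(text: str) -> list[CondaRow]:
    """Parse whitespace-separated rows of name, version, build, channel."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split()
        if len(fields) < 3:
            continue  # not a package row
        channel = fields[3] if len(fields) > 3 else ""
        rows.append(CondaRow(fields[0], fields[1], fields[2], channel))
    return rows


def in_16_0_series(version: str) -> bool:
    """Simplified stand-in for the pin: True when major.minor is 16.0."""
    return version.split(".")[:2] == ["16", "0"]


def offending(rows: list[CondaRow]) -> list[CondaRow]:
    """Return the rows whose version falls outside the 16.0 series."""
    return [row for row in rows if not in_16_0_series(row.version)]
```

Calling `offending(parse_conda_list(SAMPLE))` on the listing above flags all four packages, since each one resolved to 16.1.0.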

@github-actions github-actions bot added the conda label May 20, 2024
@jameslamb jameslamb self-assigned this May 20, 2024
@jameslamb
Member

Summarizing some conversations from other places... in short, the root cause for the error reported in this PR's description is the mix of the following:

  • the libcudf conda package builds against the libarrow supplied via conda-forge, so it inherits that libarrow's run_exports (conda docs)

The ideal fix for this is to update the run exports for conda-forge's libarrow=16.0 packages. That's documented at conda-forge/arrow-cpp-feedstock#1418.

To hopefully unblock cudf and other projects using it in the interim, I just pushed a commit (ff82c0e) that does the following:

  • ignore libarrow's run_exports when building libcudf
  • manually specify the tighter runtime pins we want in libcudf's run dependencies
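
The effect of those two steps can be pictured with a toy model. This is not how conda-build is implemented, and `run_requirements` is a hypothetical function; the loose exported spec `libarrow >=16.0.0,<17.0a0` is an assumed example of an over-permissive run export, while the manual pins are the ones from this PR.

```python
def run_requirements(run_exports, ignore_from, manual_pins):
    """Toy model of how a package's final run requirements are assembled.

    run_exports: dict mapping a host package name to the run specs it exports
    ignore_from: set of host package names whose run exports are discarded
    manual_pins: run specs written directly in the recipe
    """
    inherited = [
        spec
        for package, specs in run_exports.items()
        if package not in ignore_from
        for spec in specs
    ]
    return inherited + list(manual_pins)


# Assumed, illustrative run export that would still admit libarrow 16.1.0.
LOOSE_EXPORTS = {"libarrow": ["libarrow >=16.0.0,<17.0a0"]}

# Pins added by this PR.
MANUAL_PINS = [
    "libarrow>=16.0.0,<16.1.0a0",
    "libarrow-acero>=16.0.0,<16.1.0a0",
    "libarrow-dataset>=16.0.0,<16.1.0a0",
    "libparquet>=16.0.0,<16.1.0a0",
]

# With libarrow's run exports ignored, only the tighter manual pins remain.
pinned = run_requirements(LOOSE_EXPORTS, {"libarrow"}, MANUAL_PINS)
```

In this model, ignoring the run exports alone would leave libcudf with no libarrow constraint at all, which is why the manual pins have to be added in the same change.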

@jameslamb
Member

It looks to me like this fixes the libarrow issue... I see libarrow=16.0 being pulled in in both conda builds (conda-cpp-build job) and tests (conda-cpp-tests job).

But the Python tests appear to still be failing with the other, Dask-related issue @rjzamora is working on in #15788.

x......[gw2] node down: Not properly terminated
F
replacing crashed worker gw2
Fatal Python error: Segmentation fault

(wheel-tests job)

So @vyasr @bdice @pentschev @rjzamora I think we'll either need to admin merge this or decide on one PR (#15788 or this one) to fix both issues at the same time.

@jameslamb jameslamb requested a review from bdice May 20, 2024 18:18
Comment on lines +111 to +116
# TODO: start taking libarrow's run exports again when they're correct for 16.0
# ref: https://github.com/conda-forge/arrow-cpp-feedstock/issues/1418
- libarrow>=16.0.0,<16.1.0a0
- libarrow-acero>=16.0.0,<16.1.0a0
- libarrow-dataset>=16.0.0,<16.1.0a0
- libparquet>=16.0.0,<16.1.0a0
Member

Is the libarrow pin sufficient? The rest are pinned exactly to it (for example)

Suggested change
# TODO: start taking libarrow's run exports again when they're correct for 16.0
# ref: https://github.com/conda-forge/arrow-cpp-feedstock/issues/1418
- libarrow>=16.0.0,<16.1.0a0
- libarrow-acero>=16.0.0,<16.1.0a0
- libarrow-dataset>=16.0.0,<16.1.0a0
- libparquet>=16.0.0,<16.1.0a0
# TODO: start taking libarrow's run exports again when they're correct for 16.0
# ref: https://github.com/conda-forge/arrow-cpp-feedstock/issues/1418
- libarrow>=16.0.0,<16.1.0a0

Member

@jameslamb jameslamb May 20, 2024

You're probably right that it would be enough. I went for all of them just mimicking what I already saw in dependencies.yaml here for building the test environment.

Alright if we still merge this, and then I push a follow-up PR trying to pare this back down to just libarrow? That way we wouldn't have to wait for another CI cycle to get the fix onto branch-24.06.

Contributor

I don't think it's worth changing at this point. Once the upstream arrow packages are updated we'll just be dropping this altogether anyway.

Member

Ok works for me, I won't touch this unless @jakirkham disagrees and thinks we should pursue it.

@raydouglass raydouglass merged commit 16e8625 into rapidsai:branch-24.06 May 20, 2024
67 of 69 checks passed
@pentschev pentschev deleted the limit-pyarrow-16.0 branch November 19, 2024 18:12
7 participants