[FEA] Remove cudf Cython bindings (cudf._lib) in favor of pylibcudf #17317

Open
3 of 4 tasks
mroeschke opened this issue Nov 14, 2024 · 0 comments · May be fixed by #17760
Assignees
Labels
feature request New feature or request Python Affects Python cuDF API.


mroeschke commented Nov 14, 2024

Is your feature request related to a problem? Please describe.
Now that pylibcudf contains most of the APIs that cudf Python needs (xref #15162), the cudf Cython layer (cudf._lib) should be largely unnecessary, since pylibcudf provides a public interface to libcudf.

Describe the solution you'd like
Ideally, fully remove the cudf._lib directory and stop developing Cython bindings in it.

Each Cython file in cudf._lib should be convertible to a Python module in cudf.core._internals (or cudf.core._plc?).

Additional context

Open Questions/Considerations

  1. cudf Cython acquires a spill lock whenever it calls pylibcudf algorithms that operate on columns (e.g. nans_to_nulls), but there is currently no way to replicate spill locking using pylibcudf alone.

  2. cudf Cython also defines Cython classes that back cudf Python columns and scalars. Ideally cudf Python should just be able to define these objects in Python, but it's not clear yet (to me) if there's some aspect that needs to be defined in Cython.
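
To make the first consideration concrete, here is a minimal, pure-Python sketch of what spill locking protects against. It is a toy model, not cudf's implementation: `ToyBuffer`, `toy_nans_to_nulls`, and `locked_call` are hypothetical names invented for illustration, and no real cudf or pylibcudf API is used. The assumption is that a buffer must stay resident (unspillable) for as long as a low-level algorithm holds a raw view of its data.

```python
import contextlib


class ToyBuffer:
    """Hypothetical stand-in for a spillable device buffer."""

    def __init__(self, data):
        self._data = list(data)
        self._lock_count = 0
        self.on_device = True

    @property
    def spillable(self):
        # A buffer may only be spilled while nobody holds a lock on it.
        return self._lock_count == 0

    def spill(self):
        """Try to move the buffer off the device; refuse while locked."""
        if not self.spillable:
            return False
        self.on_device = False
        return True

    @contextlib.contextmanager
    def spill_locked(self):
        """Bring the data back if needed and pin it for the block's duration."""
        self.on_device = True
        self._lock_count += 1
        try:
            yield self._data
        finally:
            self._lock_count -= 1


def toy_nans_to_nulls(values):
    """Stand-in for a low-level algorithm that only sees raw data."""
    return [None if v != v else v for v in values]  # NaN != NaN


def locked_call(algorithm, buffer):
    """Run `algorithm` on the raw data while the buffer cannot be spilled."""
    with buffer.spill_locked() as raw:
        return algorithm(raw)
```

In this model the wrapper layer, not the algorithm, is responsible for the lock. That mirrors the open question above: if the algorithm is called directly, nothing pins the buffer while the raw data is in use.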

Projects using cudf Cython layer

@mroeschke mroeschke added the feature request New feature or request label Nov 14, 2024
@vyasr vyasr added the Python Affects Python cuDF API. label Nov 14, 2024
@mroeschke mroeschke self-assigned this Nov 14, 2024
rapids-bot bot pushed a commit that referenced this issue Nov 15, 2024
rapids-bot bot pushed a commit that referenced this issue Nov 18, 2024
rapids-bot bot pushed a commit that referenced this issue Nov 18, 2024
rapids-bot bot pushed a commit that referenced this issue Nov 18, 2024
rapids-bot bot pushed a commit that referenced this issue Nov 18, 2024
rapids-bot bot pushed a commit that referenced this issue Nov 18, 2024
rapids-bot bot pushed a commit that referenced this issue Nov 22, 2024
rapids-bot bot pushed a commit that referenced this issue Dec 9, 2024
rapids-bot bot pushed a commit that referenced this issue Dec 9, 2024
rapids-bot bot pushed a commit that referenced this issue Dec 12, 2024
rapids-bot bot pushed a commit that referenced this issue Dec 13, 2024
rapids-bot bot pushed a commit that referenced this issue Dec 13, 2024
rapids-bot bot pushed a commit that referenced this issue Dec 13, 2024
rapids-bot bot pushed a commit that referenced this issue Dec 13, 2024
rapids-bot bot pushed a commit that referenced this issue Dec 16, 2024
Contributes to #17317

More can be removed once my other cudf._lib PRs are in

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #17586
rapids-bot bot pushed a commit that referenced this issue Dec 16, 2024
Contributes to #17317

Also, I found that `PackedColumns` was not being used anywhere. It appears it was added back in #8153 for dask_cudf, but I cannot see it being used there anymore.

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #17548
rapids-bot bot pushed a commit that referenced this issue Dec 17, 2024
rapids-bot bot pushed a commit that referenced this issue Dec 17, 2024
rapids-bot bot pushed a commit that referenced this issue Dec 17, 2024
rapids-bot bot pushed a commit that referenced this issue Dec 19, 2024
rapids-bot bot pushed a commit that referenced this issue Dec 20, 2024
Contributes to #17317

Dependent on #17582

Did a search across RAPIDS and Morpheus and didn't find usage of these methods.

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #17625
rapids-bot bot pushed a commit that referenced this issue Dec 20, 2024
Contributes to #17317

The primary change is to use `pylibcudf.TypeId` instead of an ad-hoc one defined in `cudf._lib.types`. Additionally, this uses pylibcudf more consistently and inlines or removes some seldom-used or dead code.

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

URL: #17619
rapids-bot bot pushed a commit that referenced this issue Jan 9, 2025
Contributes to #17317

1. Moves some Python routines/objects to `cudf/utils/dtypes.py`
2. Moves column-only routines directly to `cudf/_lib/column.pyx`

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)

Approvers:
  - Lawrence Mitchell (https://github.com/wence-)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #17665
rapids-bot bot pushed a commit to nv-morpheus/Morpheus that referenced this issue Jan 16, 2025
Related to rapidsai/cudf#17317, `cudf._lib.column.Column` is being phased out in favor of using pylibcudf.

I found some locations where `cudf._lib.column.Column` was used as a type annotation in Cython. However, the annotated objects either only had Python-defined attributes/methods accessed on them, or were returned and then used inside other Python objects. Therefore, I don't expect the annotations to be performance critical here.

## By Submitting this PR I confirm:
- I am familiar with the [Contributing Guidelines](https://github.com/nv-morpheus/Morpheus/blob/main/docs/source/developer_guide/contributing.md).
- When the PR is ready for review, new or existing tests cover these changes.
- When the PR is ready for review, the documentation is up to date with these changes.

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)

Approvers:
  - David Gardner (https://github.com/dagardner-nv)

URL: #2109
rapids-bot bot pushed a commit to rapidsai/cuspatial that referenced this issue Jan 23, 2025
…mn.Column (#1514)

Related to rapidsai/cudf#17317, the Cython bindings do not appear to depend on any methods specific to `cudf._lib.column.Column`. Most routines operate on a `column_view`, which a `pylibcudf.Column` can provide.

This PR essentially makes the routines defined in Cython accept a `pylibcudf.Column` and return a `cudf.core.column.Column` object (which should help cudf transition away from its `cudf._lib.column.Column`).

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)

Approvers:
  - Michael Wang (https://github.com/isVoid)

URL: #1514
@mroeschke mroeschke linked a pull request Jan 29, 2025 that will close this issue
3 tasks
@GPUtester GPUtester moved this from Todo to In Progress in cuDF Python Jan 29, 2025