Allow to_pandas to return pandas.ArrowDtype #15182

Merged 7 commits on Mar 4, 2024

mroeschke
Contributor

Description

Adds an arrow_type: bool parameter to to_pandas so that the conversion can return data backed by pandas.ArrowDtype.

(Opens up the dream of cudf-to-pandas round-tripping via Arrow-formatted data.)
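
A minimal sketch of how the new flag might be used. The helper name to_arrow_backed_pandas and the demo function are hypothetical and exist only for illustration; the only thing taken from this PR is the arrow_type keyword on to_pandas. Running demo() assumes a machine with cudf, pandas, and a CUDA-capable GPU.

```python
def to_arrow_backed_pandas(cudf_obj):
    """Convert a cudf Series/DataFrame to pandas, asking for ArrowDtype results.

    `to_arrow_backed_pandas` is a hypothetical wrapper; `arrow_type=True`
    is the parameter described in this PR.
    """
    return cudf_obj.to_pandas(arrow_type=True)


def demo():
    """Illustrative only: needs cudf and a GPU, so it is not called on import."""
    import cudf
    import pandas as pd

    ser = cudf.Series([1, None, 3])
    result = to_arrow_backed_pandas(ser)
    # With arrow_type=True the resulting dtype is expected to be a pandas.ArrowDtype.
    assert isinstance(result.dtype, pd.ArrowDtype)
    return result
```

Keeping the cudf import inside demo() means the helper itself can be imported and tested on a machine without a GPU.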

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@mroeschke added labels Feb 28, 2024: Python (Affects Python cuDF API.), improvement (Improvement / enhancement to an existing function), non-breaking (Non-breaking change)
@mroeschke requested a review from a team as a code owner February 28, 2024 21:52
@mroeschke requested review from vyasr and isVoid February 28, 2024 21:52
def test_series_to_pandas_arrow_type_nullable_raises(scalar):
    pa_array = pa.array([scalar, None])
    ser = cudf.Series(pa_array)
    with pytest.raises(ValueError):

For readability, perhaps specify the reason argument reason="cannot both be set"?
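
For reference, the keyword on pytest.raises that checks the error message is match (a regular expression applied with re.search to the string form of the exception); reason belongs to markers such as skip and xfail. The sketch below reproduces that check with the standard library only. Both function names and the exact error text are assumptions made for illustration and are not taken from the cudf source.

```python
import re


def check_to_pandas_flags(nullable=False, arrow_type=False):
    """Hypothetical stand-in for the argument validation in to_pandas."""
    if nullable and arrow_type:
        # Assumed message, based on the wording in the review comment.
        raise ValueError("nullable and arrow_type cannot both be set.")


def raises_matching(func, exc_type, pattern, **kwargs):
    """Return True if func(**kwargs) raises exc_type with a message matching pattern.

    Mirrors what pytest.raises(exc_type, match=pattern) verifies.
    """
    try:
        func(**kwargs)
    except exc_type as err:
        return re.search(pattern, str(err)) is not None
    return False
```

In the test under review, the equivalent would be pytest.raises(ValueError, match="cannot both be set") wrapped around a call that sets both flags, assuming the error message contains that phrase.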

@galipremsagar left a comment

LGTM! A minor comment..

python/cudf/cudf/core/series.py (resolved)
@mroeschke
Contributor Author

/merge

@rapids-bot (bot) merged commit f12b8e1 into rapidsai:branch-24.04 Mar 4, 2024
73 checks passed
@mroeschke deleted the enh/to_pandas/arrow branch March 4, 2024 23:18
rapids-bot (bot) pushed a commit that referenced this pull request Mar 5, 2024
I think there will be a mypy error on main soon, as #15182 and #15155 were merged in close succession (my fault for not rebasing first).

Also addresses a review comment I forgot in https://github.com/rapidsai/cudf/pull/15182/files#r1507154770

cc @galipremsagar

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

URL: #15228