Fix ListColumn.to_pandas() to retain list type #15155

Merged: 5 commits into rapidsai:branch-24.04 on Mar 4, 2024

galipremsagar
Contributor

Description

Fixes: #14568

This PR fixes `ListColumn.to_pandas()` by calling the Arrow array's `to_pylist()` method, so that list values are retained as Python lists in the resulting pandas Series.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@galipremsagar added the bug, 3 - Ready for Review, Python, and non-breaking labels on Feb 27, 2024
@galipremsagar self-assigned this on Feb 27, 2024
@galipremsagar requested review from a team as code owners on February 27, 2024 17:33
Contributor

@vyasr vyasr left a comment

Approved, but with one potential follow-up request.

Comment on lines +298 to +299
# Can't rely on Column.to_pandas implementation for lists.
# Need to perform `to_pylist` to preserve list types.

Presumably this is some bug in how list dtypes are handled in arrow? Do we need to report something upstream?

Contributor Author

There is already an open issue: apache/arrow#34574

Co-authored-by: Matthew Roeschke
@galipremsagar added the 5 - Ready to Merge label and removed the 3 - Ready for Review label on Feb 28, 2024
@galipremsagar requested a review from rjzamora on March 4, 2024 18:11
@galipremsagar
Contributor Author

/merge

@rapids-bot merged commit c3cad1d into rapidsai:branch-24.04 on Mar 4, 2024
73 checks passed
@rapids-bot pushed a commit that referenced this pull request on Mar 5, 2024
I think there will be a mypy error on main soon, as #15182 and #15155 were merged in close succession (my fault for not rebasing first).

Also addresses a review comment I forgot in https://github.com/rapidsai/cudf/pull/15182/files#r1507154770

cc @galipremsagar

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

URL: #15228
Labels
5 - Ready to Merge, bug, non-breaking, Python
Development

Successfully merging this pull request may close these issues.

[BUG] Conversion of list column to pandas produces array(..., dtype=object) entries
4 participants