[REVIEW] Deprecate na_sentinel in factorize #12817

Merged: 18 commits, Mar 7, 2023

galipremsagar
Contributor

Description

This PR:

  • Deprecates na_sentinel in factorize.
  • Introduces use_na_sentinel as an alternative.
  • Introduces sort in factorize.
  • Fixes up the related docs.

The above changes are required as part of enabling pandas 2.0 support.
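
For readers unfamiliar with the parameters, here is a minimal pure-Python sketch of the behaviour the list above describes. It is not the cuDF implementation: `factorize` here is a toy function, `None` stands in for a missing value, `_NO_DEFAULT` is a hypothetical marker, and the mapping from the deprecated `na_sentinel` to `use_na_sentinel` is an assumption modelled on the pandas deprecation (custom integer sentinels are ignored in this toy).

```python
import warnings

_NO_DEFAULT = object()  # hypothetical marker meaning "argument was not passed"


def factorize(values, sort=False, use_na_sentinel=True, na_sentinel=_NO_DEFAULT):
    """Toy factorize returning (codes, uniques); None plays the role of NA."""
    if na_sentinel is not _NO_DEFAULT:
        warnings.warn(
            "na_sentinel is deprecated; use use_na_sentinel instead",
            FutureWarning,
            stacklevel=2,
        )
        # Assumption: na_sentinel=None used to mean "treat NA as a category".
        use_na_sentinel = na_sentinel is not None

    # Collect unique values in order of first appearance.
    uniques = []
    for v in values:
        if v is None and use_na_sentinel:
            continue  # NA gets the sentinel code and is not a category
        if v not in uniques:
            uniques.append(v)

    if sort:
        # Sort the non-missing categories; keep NA (if present) last.
        non_null = sorted(u for u in uniques if u is not None)
        uniques = non_null + [None] * (len(uniques) - len(non_null))

    index = {u: i for i, u in enumerate(uniques)}
    codes = [-1 if (v is None and use_na_sentinel) else index[v] for v in values]
    return codes, uniques
```

With `use_na_sentinel=True` missing values are coded as `-1` and left out of the uniques; with `use_na_sentinel=False` they become a category of their own.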

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@galipremsagar galipremsagar added 3 - Ready for Review Ready for review by team Python Affects Python cuDF API. 4 - Needs cuDF (Python) Reviewer labels Feb 21, 2023
@galipremsagar galipremsagar requested a review from a team as a code owner February 21, 2023 20:13
@galipremsagar galipremsagar self-assigned this Feb 21, 2023
@galipremsagar galipremsagar added improvement Improvement / enhancement to an existing function breaking Breaking change labels Feb 21, 2023
Comment on lines 1346 to 1355
dtype = min_scalar_type(
    max(
        len(cats),
        -1
        if isinstance(na_sentinel, cudf.Scalar)
        and na_sentinel.value is cudf.NA
        else na_sentinel,
    ),
    8,
)
Contributor

I wonder if it's cleaner for internal methods like this to always accept a cudf.Scalar when a scalar is expected.

(in the same way that we moved towards internal methods always accepting a ColumnBase, rather than any column-like object).

@vyasr do you have any thoughts?

Contributor Author

I think that's a better idea too. I've switched this to accept only cudf.Scalar objects here.

Contributor

Yes, I would favor that approach.

@galipremsagar galipremsagar requested a review from shwina February 22, 2023 23:44
Contributor

@shwina shwina left a comment

Small change potentially requested, but approving otherwise.

@galipremsagar galipremsagar added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 3 - Ready for Review Ready for review by team 4 - Needs cuDF (Python) Reviewer labels Mar 6, 2023
@galipremsagar
Contributor Author

/merge

@rapids-bot rapids-bot bot merged commit 6d1f8e3 into rapidsai:branch-23.04 Mar 7, 2023
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge breaking Breaking change improvement Improvement / enhancement to an existing function Python Affects Python cuDF API.